PHP Simple HTML DOM parsing to get to elements inside a href

<div>
text1
</div>
<div>
text2
</div>
<div>
text3
</div>
Hello,
I am trying to parse the links in the HTML code. The id of each "thread_title" link ends in a different number, and I could not work out how to match them all.
Thanks
$html = new simple_html_dom();
$html->load($input);
foreach($html->find('a[id=thread_title_([^\"]*)]') as $link)
echo $link->outertext . '<br>';

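The attribute selector does not accept a regular expression. Use the starts-with operator ^= to match every id that begins with thread_title:
$html = new simple_html_dom();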
$html->load($input);
foreach($html->find('a[id^=thread_title]') as $link)
echo $link->outertext . '<br>';
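For readers unfamiliar with ^=, it compares only the beginning of the attribute value, so the number after thread_title never has to be spelled out. A minimal sketch of that comparison in plain JavaScript follows; the ids and the helper name matchesPrefix are made up for illustration and this is not Simple HTML DOM code:

```javascript
// Hypothetical id values, standing in for the id attributes of the <a> tags.
const ids = ["thread_title_101", "thread_title_2045", "post_7", "thread_gotonew_101"];

// [id^=thread_title] keeps a value only if it starts with the given prefix.
function matchesPrefix(value, prefix) {
  return value.startsWith(prefix);
}

const matched = ids.filter((id) => matchesPrefix(id, "thread_title"));
console.log(matched); // the two thread_title_* ids
```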


sessionStorage is 1/2 working

pag1.php
<div id="guest_name"></div>
<?
//Some code
echo '<script>';
echo 'var guest_name = ' . json_encode($guest_name) . ';';
echo '</script>';
echo "<a href=main.php>Enter</a>";
?>
<script>
//This works great
$("#guest_name").text(sessionStorage.getItem("guest_name"));
</script>
When I click the "Enter" link and go to main.php, it stops working!!
<div id="guest_name"></div><!-- Nothing is shown in this div -->
<script>
$("#guest_name").text(sessionStorage.getItem("guest_name")); //Does not work!!
</script>
main.php is NOT loaded in a new window or a new tab! Thank you

Remove extra spaces and \n from Nokogiri result [duplicate]

I have the following HTML in a variable named html_data. I want to replace the <img> tags with <a> tags, with the src attribute of each "img" tag becoming the href of the "a" tag.
Existing HTML:
<!DOCTYPE html>
<html>
<head>
<title>Learning Nokogiri</title>
</head>
<body marginwidth="6">
<div valign="top">
<div class="some_class">
<div class="test">
<img src="apple.png" alt="Apple" height="42" width="42">
<div style="white-space: pre-wrap;"></div>
</div>
</div>
</div>
</body>
</html>
This is my solution A:
nokogiri_html = Nokogiri::HTML(html_data)
nokogiri_html.css("img").each { |tag|
a_tag = Nokogiri::XML::Node.new("a", nokogiri_html)
a_tag["href"] = tag["src"]
tag.add_next_sibling(a_tag)
tag.remove()
}
puts 'nokogiri_html is', nokogiri_html
This is my solution B:
nokogiri_html = Nokogiri::HTML(html_data)
nokogiri_html.css("img").each { |tag|
tag.name= "a";
tag.set_attribute("href" , tag["src"])
}
puts 'nokogiri_html is', nokogiri_html
While solution A works fine, I would like to know whether there is a quicker or more direct way to replace the tags using Nokogiri. With solution B, my "img" tag does get replaced with the "a" tag, but the attributes of the "img" tag still remain inside the "a" tag. Below is the result of solution B:
<!DOCTYPE html>
<html>
<body>
<p>["\n", "\n", " </p>
\n", "
<title>Learning Nokogiri</title>
\n", " \n", " \n", "
<div valign='\"top\"'>
\n", "
<div class='\"some_class\"'>
\n", "
<div class='\"test\"'>
\n", " <a src="%5C%22apple.png%5C%22" alt='\"Apple\"' height='\"42\"' width='\"42\"' href="%5C%22apple.png%5C%22"></a>\n", "
<div style='\"white-space:' pre-wrap></div>
\n", "
</div>
\n", "
</div>
\n", "
</div>
\n", " \n", ""]
</body>
</html>
Is there a faster way to replace the tags in HTML using Nokogiri? Also, how can I remove the "\n"s I am getting in the result?
First, please strip your sample data (HTML) to the barest amount necessary to demonstrate the problem.
Here are the basics of doing what you want:
require 'nokogiri'
doc = Nokogiri::HTML(<<EOT)
<!DOCTYPE html>
<html>
<body>
<img src="apple.png" alt="Apple" height="42" width="42">
</body>
</html>
EOT
doc.search('img').each do |img|
src, alt = %w[src alt].map{ |p| img[p] }
img.replace("<a href='#{ src }'>#{ alt }</a>")
end
doc.to_html
# => "<!DOCTYPE html>\n<html>\n <body>\n <a href=\"apple.png\">Apple</a>\n </body>\n</html>\n"
puts doc.to_html
# >> <!DOCTYPE html>
# >> <html>
# >> <body>
# >> <a href="apple.png">Apple</a>
# >> </body>
# >> </html>
Doing it this way allows Nokogiri to replace nodes cleanly.
It's not necessary to do all this rigamarole:
a_tag = Nokogiri::XML::Node.new("a", nokogiri_html)
a_tag["href"] = tag["src"]
tag.add_next_sibling(a_tag)
tag.remove()
Instead, create a string that is the tag you want to use and let Nokogiri convert the string to a node and replace the old node:
src, alt = %w[src alt].map{ |p| img[p] }
img.replace("<a href='#{ src }'>#{ alt }</a>")
It's not necessary to strip extraneous whitespace between nodes. It can affect the look of the HTML but browsers will gobble that extra whitespace and not display it.
Nokogiri can be told to not output the inter-node whitespace, resulting in a compressed/fugly output, but how to do that is a separate question.

c# HtmlAgilityPack HTML parsing issue

I have this html
<div class="postrow firs">
<h2 class="title icon">
This is the title
</h2>
<div class="content">
<div id="post_message_1668079">
<blockquote class="postcontent restore ">
<div>Category</div>
<div>Authour: Kim</div>
line 1<br /> line2
</blockquote>
</div>
</div>
</div> <div class="postrow">
<h2 class="title icon">
This is the title
</h2>
<div class="content">
<div id="post_message_1668079">
<blockquote class="postcontent restore ">
<div>Category</div>
line 1<br /> line2
</blockquote>
</div>
</div>
</div>
I want to extract the following from each div whose class starts with "postrow". The div may also carry other classes, as in <div class="postrow first">, so the class "first" is not my concern; I only need "postrow" at the beginning.
The content inside the tag with class "title".
The HTML from the "blockquote" tag, but not any div within this tag.
Code I tried:
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml("http://localhost/vanilla/");
List<string> facts = new List<string>();
foreach (HtmlNode li in doc.DocumentNode.SelectNodes("//div[@class='postrow']"))
{
facts.Add(li.InnerHtml);
foreach (String s in facts)
{
textBox1.Text += s + "/n";
}
}
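Your code has an issue: LoadHtml expects the HTML itself as a string, not a URL or path. So this does not fetch the page:
doc.LoadHtml("http://localhost/vanilla/");
Instead, download the page first and pass its content to LoadHtml:
// requires: using System.IO; using System.Net;
var request = (HttpWebRequest)WebRequest.Create("http://localhost/vanilla/");
// GetResponse() returns a WebResponse, so read its stream into a string first
using (var response = request.GetResponse())
using (var reader = new StreamReader(response.GetResponseStream()))
    doc.LoadHtml(reader.ReadToEnd());
Now iterate over the parsed HTML.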

JQuery dialog display according to html

The jQuery call $('#dialog').html(htmlContent) displays the HTML as the dialog's body.
I want to display it the way a Twitter Bootstrap modal does.
For example:
<div class="ui-dialog-titlebar ui-widget-header">
some content...
</div>
<div class='ui-dialog-content ui-widget-content'>
some content...
</div>
When the dialog is displayed, the content of the 'ui-dialog-titlebar' div should appear in the dialog's title header, and the dialog body should show the content of the 'ui-dialog-content' div.
Is there any way to show the dialog this way?
This might do what you're trying to achieve:
$("#dialog").remove(); // Remove any existing dialog
var $html = '<div id="dialog"><div class="ui-dialog-titlebar ui-widget-header">' + headerContent + '</div>' +
'<div class="ui-dialog-content ui-widget-content">' + htmlContent + '</div></div>';
// Append modal HTML to the page
$("body").append( $html );
You'll need to dress it up with some CSS of course.

JQuery Sortable\Serialize issue

I have a document that uses PHP to generate a list from a DB
$q = "SELECT * FROM display ORDER BY diplay_order ASC";
$r = @mysqli_query ($dbc, $q);
echo "<ul id='categoryorder'>";
while ($item = mysqli_fetch_array($r)) {
echo "<li><div id='" . $item['ad_type'] . "_" . $item['unique_id'] . "'>
<form name='remove' action='saveable.php' method='get'>
<input type='submit' value='remove' />
<input type='hidden' name='id' value='" . $item['unique_id'] . "'>
</form>" .$item['unique_id'] . "</div></li>";
}
echo "</ul>";
I have then made these sortable, which works fine, but I can't get serialize to work: it just produces an empty result.
$(document).ready(function() {
$('ul#categoryorder').sortable({
update: function() {
var order = $("ul#categoryorder").sortable("serialize");
alert(order);
}
});
});
Then the alert box just comes up empty. I am very new to this and any help would be much appreciated.
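You need to put the id on the <li> tag, because sortable("serialize") reads the id of each sortable item, and here the items are the <li> elements. Change this part of the PHP code: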
echo "<li id='" . $item['ad_type'] . "_" . $item['unique_id'] . "'><div><form name='remove' action='saveable.php' method='get'><input type='submit' value='remove' /><input type='hidden' name='id' value='" . $item['unique_id'] . "'></form>" .$item['unique_id'] . "</div></li>";
The id is now on the <li> instead of the inner <div>.
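To see why the id matters: by default, sortable("serialize") takes the id of each sortable item, splits it at the last "-", "=" or "_", and emits name[]=value pairs. Items whose id is missing or does not fit that pattern are skipped, which is consistent with the empty alert. A rough plain-JavaScript sketch of that default behaviour follows. The function name serializeIds and the sample ids are made up for illustration, and the widget's key, attribute and expression options are ignored here:

```javascript
// Sketch of the default id parsing done by sortable("serialize").
// serializeIds is a hypothetical helper, not part of jQuery UI.
function serializeIds(ids) {
  const pairs = [];
  for (const id of ids) {
    // Default pattern: "<name>", then one of - = _, then "<value>"
    const match = (id || "").match(/(.+)[-=_](.+)/);
    if (match) {
      pairs.push(match[1] + "[]=" + match[2]);
    }
  }
  return pairs.join("&");
}

// <li> elements without an id contribute nothing, so the result is empty.
console.log(serializeIds(["", "", ""])); // prints an empty line

// Hypothetical ids in the ad_type + "_" + unique_id format from the PHP above.
console.log(serializeIds(["banner_7", "banner_3", "text_12"]));
// prints banner[]=7&banner[]=3&text[]=12
```

If that string is sent to PHP as a query string, each name arrives as an array in the new order, for example $_GET['banner'].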
