HTML Agility Pack not behaving as expected - html-parsing

I am trying to parse an HTML page with HTML Agility Pack, but I cannot get it to work as expected.
This is my page (shortened):
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="de-ch" xml:lang="de-ch">
<head>
</head>
<body id="Adressservices">
<div id="page">
<div id="page-544">
<table class="full">
<thead>
<tr>
<th class="first" scope="col" style="width: 18%;">Type</th>
<th class="col" style="width: 20%;">Name</th>
<th class="col">Date</th>
<th class="col" style="text-align: right; width: 10%;">Size</th>
</tr>
</thead>
<tbody>
<tr>
<td class="first">Change</td>
<td>somefile01.zip</td>
<td style="width: 5%;"><b class="filesize">2012-03-01</b></td>
<td style="text-align: right;"><b class="filesize">881.00</b></td>
</tr>
<tr>
<td class="first">Change</td>
<td>somefile02.zip</td>
<td style="width: 5%;"><b class="filesize">2012-02-01</b></td>
<td style="text-align: right;"><b class="filesize">1400.00</b></td>
</tr>
<tr>.....</tr>
</tbody>
</table>
</div>
</div>
</body>
</html>
The real page has quite a few more <tr>....</tr> rows in that table.
I was able to download the page just fine with HTML Agility Pack using this code snippet:
HtmlWeb web = new HtmlWeb();
HtmlDocument archiveDoc = web.Load(_archiveUrl);
var tables = archiveDoc.DocumentNode.SelectNodes("//table");
So I get a handle on my <table> element, works just fine.
Now I was trying to get the first <tr> element from within that table, and I tried this:
HtmlNode node = tables[0];
var allTRNodes = node.SelectNodes("tbody/tr");
var firstTR = allTRNodes[0];
Here, I'm not getting the n <tr> nodes as expected - but just two. And the first of those doesn't contain a list of y child nodes of type <td> either ...
Then I tried Linq-to-"HTML":
HtmlNode node = tables[0];
var firstTR = node.Element("tbody").Element("tr");
but again: I'm not getting the first <tr> node containing a list of y child nodes of type <td> either ...
Trying to get the list of all <td> nodes inside the first <tr> also didn't work quite as expected:
HtmlNode node = tables[0];
var allTDNodes = node.SelectNodes("tbody/tr/td");
var firstTD = allTDNodes[0];
Instead of the y <td> nodes expected, I'm getting just three child nodes: two of type #text and the last one of type <td>. Why?
Seems like HTML Agility Pack is misinterpreting the list of <td> nodes as nested nodes......
Any ideas? Thoughts? Hints how to solve this?

Use the descendant axis, as in this example:
var linkNode = doc.DocumentNode.SelectSingleNode("//div[@id=\"content-wrapper\"]/dl/dd");
var hrefNode = linkNode.SelectSingleNode("descendant::a");
One thing I don't like about HTML Agility Pack is that a node.SelectNodes or node.SelectSingleNode call with an expression that starts with // searches the whole document from the top, not from the current node.
Here's a sample adapted to your case:
// table
var tableNode = docNode.SelectSingleNode("//table");
// first tr
var trNode = tableNode.SelectSingleNode("descendant::tr");
// you can also be explicit, but it's overkill (XPath positions start at 1, not 0)
var trNode1 = tableNode.SelectSingleNode("descendant::tr[1]");
// then your td
var tdNode = trNode.SelectSingleNode("descendant::td");

Related

Hide/show results based on filter

I have an ASP.NET MVC web application.
In the view I have a List of records from my database.
The records are displayed in the following format
if (List!=null)
{
<table>
<thead>
<th></th>
</thead>
<tbody>
foreach (item in List)
if (item.startWith("AA"))
{
hide the item originally. and add class to be used by javascript/jquery to show/hide element
}
<td>Item</td>
</tbody>
</table>
}
What I want to do is put a "Show/hide" button above the table
that hides or shows some of the results when clicked.
This is an oversimplified skeleton of my code. My actual table has much more information in it.
Mark the items that should be shown/hidden, e.g. with a marker CSS class. You can also set the display CSS property to hide them initially.
@{ const string markerCssClass = "js-hideable"; }
<table>
<thead>
<tr><th></th></tr>
</thead>
<tbody>
@foreach (var item in List) {
// items starting with "AA" get the marker class and start out hidden
var isHideable = item.StartsWith("AA");
<tr>
<td class="@(isHideable ? markerCssClass : string.Empty)"
style="@(isHideable ? "display: none" : string.Empty)">Item</td>
</tr>
}
</tbody>
</table>
<button id="show">Show</button>
Then use jquery to show/hide these items using the marker class as selector.
<script type="text/javascript">
$('#show').on('click', function() {
var affectedElements = $('.@(markerCssClass)');
affectedElements.show();
});
</script>
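The marker-class idea above can also be sketched in plain JavaScript, independent of Razor. This is only an illustration: `renderRows` and `isHideable` are hypothetical helper names, the sample items are made up, and the item text is assumed to need no HTML escaping.

```javascript
// Marker class taken from the answer above.
const MARKER_CLASS = "js-hideable";

// Decide which items get the marker class (here: text starting with "AA").
function isHideable(item, prefix = "AA") {
  return item.startsWith(prefix);
}

// Build one table row per item. Hideable rows carry the marker class
// and start out hidden through an inline display: none.
// Assumption: item text contains no characters that need HTML escaping.
function renderRows(items) {
  return items
    .map((item) => {
      const attrs = isHideable(item)
        ? ` class="${MARKER_CLASS}" style="display: none"`
        : "";
      return `<tr${attrs}><td>${item}</td></tr>`;
    })
    .join("\n");
}

console.log(renderRows(["AA-001", "BB-002"]));
```

With jQuery on the page, a click handler that calls `$("." + MARKER_CLASS).toggle()` instead of `.show()` would probably give the show/hide behaviour the question asks for, since `.toggle()` flips visibility on each click.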
You can do the filtering with pure CSS if you put the button BEFORE the table and either at the same level or higher.
<html>
<head>
<style>
#filter-toggle:checked ~ .filterable-table .filterable { display: none; }
</style>
</head>
<body>
<input id="filter-toggle" type="checkbox">
<label for="filter-toggle"> Filter</label>
<table class="filterable-table">
<tr class="filterable"><td>Hide me when filtered</td></tr>
<tr><td>Show me when filtered</td></tr>
</table>
</body>
</html>

Displaying information in a grid on my ASP.NET MVC page

There is a "Weekly Monitoring tool" on the intranet I'm working on that displays the following information (supposedly) in a grid : the company, the employee (name), his expected work time, and his current weekly activity.
The Data is saved and filtered week by week, it's all sent to my WeeklyMonitoring.aspx view.
The issue is that I can't seem to find a way to display it as a grid, cleanly, separating each information to make it easier to read.
Here is the relevant part of my View :
<table width="300" border="1" cellpadding="0" cellspacing="0">
<tr>
<th>Company</th>
<th>Employee</th>
<th>Expected Time</th>
<th>Activity</th>
</tr>
<%
int i = 0;
foreach (var name in ViewBag.Names)
{
if ((string)ViewBag.Names[i] != null)
{ %>
<td><%= (string)ViewBag.Company[i]%></td>
<td><%= (string)ViewBag.Names[i]%></td>
<td> --- </td>
<td><%= (string)ViewBag.RecTime[i]%></td>
<%} %>
<%i++; %>
<%
}
%>
</table>
It does display the data as a board, but with the values separated only by spaces. How can I create a clean-looking grid, as it is supposed to be?
jQuery DataTables is an awesome plugin for this, and it's very simple to use. First, add the plugin:
<link rel="stylesheet" type="text/css" href="//cdn.datatables.net/1.10.9/css/jquery.dataTables.css">
<script type="text/javascript" charset="utf8" src="//code.jquery.com/jquery-1.10.2.min.js"></script>
<script type="text/javascript" charset="utf8" src="//cdn.datatables.net/1.10.9/js/jquery.dataTables.js"></script>
Then format the table properly, with a <thead>, a <tbody>, and one <tr> per record:
<table id="table_id" class="display">
<thead>
<tr>
<th>Company</th>
<th>Employee</th>
<th>Expected Time</th>
<th>Activity</th>
</tr>
</thead>
<tbody>
<%
int i = 0;
foreach (var name in ViewBag.Names)
{
if ((string)ViewBag.Names[i] != null)
{ %>
<tr>
<td><%= (string)ViewBag.Company[i]%></td>
<td><%= (string)ViewBag.Names[i]%></td>
<td> --- </td>
<td><%= (string)ViewBag.RecTime[i]%></td>
</tr>
<% }
i++;
}
%>
</tbody>
</table>
Now, apply the DataTable properties on the table:
<script>
$(document).ready(function () {
$('#table_id').DataTable();
});
</script>
and see the charm.
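The view above indexes three parallel collections with the same counter. As a rough sketch of the same row-building logic on the client side, the snippet below zips parallel arrays into row objects and skips entries without a name. The function name `toRows`, the property names, and the sample data are all made up for illustration.

```javascript
// Combine parallel arrays into one object per table row.
// Entries whose name is null or undefined are skipped, mirroring
// the null check in the view.
function toRows(companies, names, recTimes) {
  const rows = [];
  for (let i = 0; i < names.length; i++) {
    if (names[i] == null) continue;
    rows.push({
      company: companies[i],
      employee: names[i],
      activity: recTimes[i],
    });
  }
  return rows;
}

// Made-up sample data; the second entry has no name and is dropped.
const rows = toRows(
  ["Acme", "Acme", "Globex"],
  ["Ann", null, "Bob"],
  ["12:00", "03:30", "08:15"]
);
console.log(rows);
```

DataTables 1.10 can be initialised from such an array through its `data` option together with `columns: [{ data: "company" }, ...]`, which would be one way to avoid generating the rows in the view at all.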

Make an image in a table cell into a link

I have created an e-commerce website and am using a Label to display the products from my SQL database. The product images it shows are not hyperlinks, but that is what I need them to be. I think I have written the right code, but I get a "parentControl" error. Could someone help, please?
Below is also a link showing visually what I am asking for. Remember, the picture is currently just an image, but it needs to be a hyperlink!
private void FillPage()
{
ArrayList teesList = new ArrayList();
if (!IsPostBack)
{
teesList = ConnectionClass.GetTeesBySize("%");
}
else
{
teesList = ConnectionClass.GetTeesBySize(DropDownList1.SelectedValue);
}
StringBuilder sb = new StringBuilder();
HyperLink link = new HyperLink();
link.NavigateUrl = "http://google.com";
parentControl.Controls.Add(link);
foreach (Tees tees in teesList)
{
sb.Append(string.Format(@"<table class='TeesTable'>
<tr>
<th rowspan='1' width='150px'><img runat='server' src='{6}' /</th>
<th width='50px'>Name: </th>
<td>{0}</td>
</tr>
<tr>
<th>Size:</th>
<td>{1}</td>
</tr>
<tr>
<th>Price:</th>
<td>{2}</td>
</tr>
</table>",
tees.name, tees.size, tees.price, tees.id, tees.id, tees.id, tees.image));
LblOutput.Text = sb.ToString();
(https://dl-web.dropbox.com/get/CompetitiveStreakTemplate/Pic.png?_subject_uid=9403629&w=AAD63dzqPQcNMNSU0OwbVBrGjNGFvtt7VWJ6DKwlu4UoPw).
You need to close your <img> tag properly and wrap it in an <a> to make it a hyperlink. Also use <td> rather than <th> for data cells; <th> is meant for headers.
Something a bit more like:
<td rowspan='1' width='150px'>
<a href='{0}.aspx'>
<img runat='server' src='{6}' />
</a>
</td>
You should probably also tidy up the rest of the markup appended to your table, checking which cells are headers and which are data, and use a <tbody> tag inside the <table>.
Edit: looking closer, it seems you aren't creating the table properly in the first place. Change this chunk of your code so the loop is done properly:
parentControl.Controls.Add(link);
sb.Append("<table class='TeesTable'><tbody>");
foreach (Tees tees in teesList)
{
sb.Append(string.Format(@"
<tr>
<td rowspan='1' width='150px'>
<a href='{0}.aspx'>
<img runat='server' src='{6}' />
</a>
</td>
<td width='50px'>Name: </td>
<td>{0}</td>
</tr>
<tr>
<td>Size:</td>
<td colspan='2'>{1}</td>
</tr>
<tr>
<td>Price:</td>
<td colspan='2'>{2}</td>
</tr>",
tees.name, tees.size, tees.price, tees.id, tees.id, tees.id, tees.image));
}
sb.Append("</tbody></table>");
LblOutput.Text = sb.ToString();
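When markup is assembled from strings like this, values from the database should be escaped before they are placed into attributes or text. The following is a minimal sketch of the linked image cell in JavaScript, assuming a product object with `name` and `image` properties. The helper names `escapeHtml` and `imageLinkCell` and the sample product are hypothetical.

```javascript
// Escape a value for safe use in HTML text and quoted attribute values.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Build the image cell: a properly closed <img> wrapped in an <a>.
// The link target follows the "{name}.aspx" pattern from the answer above.
function imageLinkCell(product) {
  const href = `${encodeURIComponent(product.name)}.aspx`;
  return (
    `<td><a href='${escapeHtml(href)}'>` +
    `<img src='${escapeHtml(product.image)}' alt='${escapeHtml(product.name)}' />` +
    `</a></td>`
  );
}

console.log(imageLinkCell({ name: "Plain Tee", image: "img/plain.png" }));
```

The same idea carries over to the C# code: encoding each value before passing it to `string.Format` keeps a product name that contains a quote or angle bracket from breaking the table.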

How to Use/ Display a Asp.net DataList in MVC Razor View

I have this list below, which I am generating on the fly using Razor view:
<table id="contractCoverablesGrid">
<tbody>
@foreach (var coverableItem in Model.ContractCoverablesList)
{
<tr>
<td>
@Html.Hidden("coverID",coverableItem.CoverID)
</td>
<td>
@Html.Label(coverableItem.Name)
</td>
<td>
<input type='checkbox' class='chkboxActive' checked='checked' />
</td>
</tr>
}
</tbody>
</table>
As you can imagine the outcome of this is a vertical list, which keeps extending downwards.
I would like to have this list 'wrapped' into three adjacent columns instead of one single long column; just as you would do this in ASP.net Datalist server control (in Webforms).
My initial thought was to limit the width and height of the table to set values and then float the td(s) left. But how do I float a td? I cannot turn this into a div.
Any thoughts?
Please let me know.
You might try a for loop instead of foreach. You would step through the list three items at a time, starting a new <tr> for each group of three and a new <td> for each item.
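The grouping suggested here can be sketched as follows. It is written in JavaScript only to show the logic, which translates directly to a `for` loop in the Razor view. `chunk` and `renderGrid` are hypothetical helper names, and the items are assumed to be plain text that needs no escaping.

```javascript
// Split a list into consecutive groups of at most `size` items.
function chunk(items, size) {
  const groups = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}

// Render each group as one <tr>, with one <td> per item.
function renderGrid(items, columns = 3) {
  return chunk(items, columns)
    .map((row) => `<tr>${row.map((item) => `<td>${item}</td>`).join("")}</tr>`)
    .join("\n");
}

console.log(renderGrid(["A", "B", "C", "D", "E"]));
// <tr><td>A</td><td>B</td><td>C</td></tr>
// <tr><td>D</td><td>E</td></tr>
```

The last row may hold fewer than three cells. If the columns need to line up exactly, the short row could be padded with empty `<td>` elements.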
This is how I resolved it:
<table id="contractCoverablesGrid" width="600" align="center">
<tbody>
<tr>
@foreach (var coverableItem in Model.ContractCoverablesList)
{
<td>
<div id="dataListItem">
@Html.Hidden("coverableID", coverableItem.CoverID)
@Html.Label(coverableItem.Name)
<input type='checkbox' name="coverableItemCheckBox" id="coverableItemCheckBox" />
</div>
</td>
}
</tr>
</tbody>
</table>
<style>
#dataListItem {
float: left;
font-weight: normal;
}
</style>

jQuery UI dialog - cannot insert table in dialog contents

I have a simple web page where, for each row of data, I can pop up a jQuery UI dialog with the details of that row. Since the sub-query can return multiple rows, a table is the best choice for the dialog contents. The result is that I get an empty dialog box, and the table contained in the dialog's div appears at the bottom of the page, whether or not the row is clicked to activate the dialog. Everything else works perfectly: the click event, the dialog popup, and the passing of the right id for the div.
But the dang table (the one inside the dialog, with the class of 'inner-table') appears at the bottom of the page, right off the bat.
The HTML is created in Groovy, with the HTMLMarkupBuilder, and the resulting HTML looks like the following:
<html>
<head>
<title>NAS Execution Groovy Servlet</title>
<script type='text/javascript' src='js/jquery.js'></script>
<script type='text/javascript' src='js/jquery-ui-1.8.23.custom.min.js'></script>
<script type='text/javascript' src='js/jquery.dataTables.min.js'></script>
<script type='text/javascript' src='js/executions.js'></script>
<link rel='stylesheet' type='text/css' href='css/jquery.dataTables_themeroller.css'></link>
<link rel='stylesheet' type='text/css' href='css/jquery-ui.css'></link>
<link rel='stylesheet' type='text/css' href='css/nas.css'></link>
</head>
<body>
<div id='results' class='execution-results'>
<p id='rpt-header'>
<span class='rpt-header-txt'>Backup Schedule Report for </span>
<span class='rpt-header-asset'>ret2w089n1t1</span>
</p>
<table id='nas-table'>
<thead>
<tr class='table-header'>
<th class='hidden'>Backup ID</th>
<th>Schedule Name</th>
<th>Backup Product</th>
<th>Size Mb</th>
<th>Start Time</th>
<th>Duration</th>
<th>Expiration Date</th>
<th>Mon 17</th>
</tr>
</thead>
<tbody>
<tr class='row'>
<td class='hidden'>12345678</td>
<td class='row-data'>null</td>
<td class='row-data'>Product One</td>
<td id='size-mb' class='row-data'>601.31</td>
<td class='row-data'>00:09:03</td>
<td class='row-data'>158 secs</td>
<td class='row-data'>2012-10-01</td>
<td class='row-center'>
<img id='success-fail' src='img/success.gif'></img>
</td>
</tr>
<tr class='row'>
<td class='hidden'>23456789</td>
<td class='row-data'>PolicyName</td>
<td class='row-data'>Product Two</td>
<td id='size-mb' class='row-data'>995.92</td>
<td class='row-data'>20:09:00</td>
<td class='row-data'>191 secs</td>
<td class='row-data'>2012-10-01</td>
<td class='row-center'>
<img id='success-fail' src='img/success.gif'></img>
</td>
</tr>
<div id='dialog-23456789' class='details-dialog'>
<table class='inner-table'>
<thead>
<tr>
<th>JOB_TYPE_NAME</th>
<th>VENDOR_STATUS_NAME</th>
<th>KILOBYTES</th>
</tr>
</thead>
<tbody>
<tr>
<td>Incr Backup</td>
<td>Successful</td>
<td>1019821</td>
</tr>
</tbody>
</table>
</div>
</tbody>
</table>
</div>
</body>
</html>
The jQuery for this is pretty simple; it uses the id from the row clicked on, and pops up a dialog window. That works fine, but the table that is contained in that div is actually at the bottom of the screen, even before anything is clicked:
$(document).ready(function() {
$('#nas-table').dataTable( {
"bJQueryUI": true,
"aaSorting": [[4, 'asc']]
} );
$('.row').live("click", function(){
var target = $(this);
var backupId = $(this).children(":first").html();
var thisId = '#dialog-' + backupId;
$(thisId).dialog(
{
title: "Backup Job Detail",
width: 800,
height: 450
}
);
$(thisId).dialog("open");
$(thisId).dialog("widget").position({
my: 'left top',
at: 'left bottom',
of: target
});
});
} );
At first, I thought the Groovy HTMLMarkupBuilder was outputting the DOM before everything happened, but when I do a view source, copy it to a file, and open the file in my browser, I get the same result.
I would appreciate any help with this. I asked this question earlier, in case you want to complain about that, but I had to follow up some other potential issues in the Groovy code, which I resolved. This example is more complete, and represents exactly what my code will do.
Brian
The problem is that you have a div nested directly inside the table's <tbody>, outside of any <tr> and <td>. That is invalid HTML, so the browser moves the div out of the table while parsing, and the resulting DOM is not what you expect. If you adjust the HTML so that it resembles something like this, it will work:
<div id='results' class='execution-results'>
<p id='rpt-header'>
<span class='rpt-header-txt'>Backup Schedule Report for </span>
<span class='rpt-header-asset'>ret2w089n1t1</span>
</p>
<table id='nas-table'>
<thead>
<tr class='table-header'>
<th class='hidden'>Backup ID</th>
<th>Schedule Name</th>
<th>Backup Product</th>
<th>Size Mb</th>
<th>Start Time</th>
<th>Duration</th>
<th>Expiration Date</th>
<th>Mon 17</th>
</tr>
</thead>
<tbody>
<tr class='row'>
<td class='hidden'>12345678</td>
<td class='row-data'>null</td>
<td class='row-data'>Product One</td>
<td id='size-mb' class='row-data'>601.31</td>
<td class='row-data'>00:09:03</td>
<td class='row-data'>158 secs</td>
<td class='row-data'>2012-10-01</td>
<td class='row-center'>
<img id='success-fail' src='img/success.gif'></img>
</td>
</tr>
<tr class='row'>
<td class='hidden'>23456789</td>
<td class='row-data'>PolicyName</td>
<td class='row-data'>Product Two</td>
<td id='size-mb' class='row-data'>995.92</td>
<td class='row-data'>20:09:00</td>
<td class='row-data'>191 secs</td>
<td class='row-data'>2012-10-01</td>
<td class='row-center'>
<img id='success-fail' src='img/success.gif'></img>
</td>
</tr>
</tbody>
</table>
<div id='dialog-23456789' class='details-dialog' style="display:none;">
<table class='inner-table'>
<thead>
<tr>
<th>JOB_TYPE_NAME</th>
<th>VENDOR_STATUS_NAME</th>
<th>KILOBYTES</th>
</tr>
</thead>
<tbody>
<tr>
<td>Incr Backup</td>
<td>Successful</td>
<td>1019821</td>
</tr>
</tbody>
</table>
</div>
</div>
The trick is to move the div outside of the parent table. You also need to set display:none on the dialog div, or the details table will be shown when the page is rendered.

Resources