Why is my code selecting all text() nodes in the HtmlDocument? - C#

HtmlNode node = doc.DocumentNode.SelectNodes("//tr")[0];
foreach(HtmlTextNode n in node.SelectNodes("//text()"))
Console.WriteLine(n.Text);
HTML:
<table class="infobox" style="width: 17em; font-size: 100%;float: left;">
<tr>
<th style="text-align: center; background: #f08080;" colspan="3">خدیجہ مستور</th>
</tr>
<tr style="text-align: center;">
<td colspan="3"><img alt="خدیجہ مستور" src="//upload.wikimedia.org/wikipedia/ur/thumb/7/7b/Khatijamastoor.JPG/150px-Khatijamastoor.JPG" width="150" height="203" srcset="//upload.wikimedia.org/wikipedia/ur/thumb/7/7b/Khatijamastoor.JPG/225px-Khatijamastoor.JPG 1.5x, //upload.wikimedia.org/wikipedia/ur/thumb/7/7b/Khatijamastoor.JPG/300px-Khatijamastoor.JPG 2x"><br>
<div style="font-size: 90%">خدیجہ مستور</div>
</td>
</tr>
<tr>
<th style="background: #f08080;" colspan="3">ادیب</th>
</tr>
<tr>
<td><b>ولادت</b></td>
<td colspan="2">1930ء، لکھنؤ، برطانوی ہندوستان</td>
</tr>
<tr>
<td><b>اصناف ادب</b></td>
<td colspan="2">ناول</td>
</tr>
<tr>
<td><b>معروف تصانیف</b></td>
<td colspan="2">آنگن</td>
</tr>
</table>
Expected output:
خدیجہ مستور
But I got:
خدیجہ مستور
خدیجہ مستور
ادیب
ولادت
1930ء
،
لکھنؤ
،
برطانوی ہندوستان
اصناف ادب
ناول
معروف تصانیف
آنگن
Why is node.SelectNodes("//text()") selecting all text() nodes in the document rather than only the text() nodes inside the first tr tag?

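Because the XPath you pass to node.SelectNodes starts with two forward slashes (//text()). A leading // makes the expression absolute: it is evaluated from the document root and matches every text node in the document, not just descendants of the selected node.
Prefix the expression with a dot so that it is evaluated relative to the current node (the normalize-space() predicate skips whitespace-only text nodes):
foreach (HtmlTextNode n in node.SelectNodes(".//text()[normalize-space()]"))
Note that a plain "text()" would match only text nodes that are direct children of the tr, which in this markup are whitespace only.
Or fold both steps into a single XPath:
var node = doc.DocumentNode.SelectSingleNode("(//tr)[1]//text()[normalize-space()]");
Console.WriteLine(node.InnerText);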
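To see the difference between an absolute and a relative XPath in isolation, here is a minimal sketch. It uses System.Xml.Linq rather than HtmlAgilityPack so that it runs without extra packages; the markup, the class name XPathScopeDemo and the method CellTexts are made up for illustration. The same XPath rule should apply to HtmlAgilityPack, as the output in the question suggests.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class XPathScopeDemo
{
    // Hypothetical markup, loosely modelled on the table in the question.
    private const string Markup =
        "<table>" +
        "<tr><th>Name</th></tr>" +
        "<tr><th>Writer</th></tr>" +
        "<tr><td>Born</td><td>1930</td></tr>" +
        "</table>";

    // Evaluates the XPath with the first <tr> as the context node
    // and returns the text of every element it selects.
    public static List<string> CellTexts(string xpath)
    {
        XDocument doc = XDocument.Parse(Markup);
        XElement firstRow = doc.Descendants("tr").First();
        return firstRow.XPathSelectElements(xpath)
                       .Select(e => e.Value)
                       .ToList();
    }

    public static void Main()
    {
        // "//th" is absolute: it searches the whole document.
        Console.WriteLine(string.Join(", ", CellTexts("//th")));
        // ".//th" is relative: it searches only below the first row.
        Console.WriteLine(string.Join(", ", CellTexts(".//th")));
    }
}
```

The only change between the two calls is the leading dot, which is also the only change needed in the loop from the question.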

Related

Select Node with size of Its specific child Node In Linq, HtmlAgilityPack

I'm trying to get following data.
<html>
<body>
<tr class="udline">
<th rowspan="2" class="noln">시간</th>
<th rowspan="2">개인</th>
<th rowspan="2">외국인</th>
<th rowspan="2">기관계</th>
<th colspan="6" class="eb">기관</th>
<th rowspan="2">기타법인</th>
</tr>
<tr class="udline">
<th class="sub">금융투자</th>
<th class="sub">보험</th>
<th class="sub">투신<br>(사모)</th>
<th class="sub">은행</th>
<th class="sub">기타금융기관</th>
<th class="sub">연기금등</th>
</tr>
<tr>
<td colspan="11" class="blank_07"></td>
</tr>
<!-- following are data -->
<tr>
<td class="date2">18:01</td>
<td class="rate_up3">2,024</td>
<td class="rate_down3">-3,307</td>
<td class="rate_up3">1,116</td>
<td class="rate_up3">824</td>
<td class="rate_down3">-16</td>
<td class="rate_up3">764</td>
<td class="rate_down3">-43</td>
<td class="rate_down3">-5</td>
<td class="rate_down3">-408</td>
<td class="rate_up3">166</td>
</tr>
<tr>
<td class="date2">18:00</td>
<td class="rate_up3">2,022</td>
<td class="rate_down3">-3,305</td>
<td class="rate_up3">1,116</td>
<td class="rate_up3">824</td>
<td class="rate_down3">-16</td>
<td class="rate_up3">764</td>
<td class="rate_down3">-43</td>
<td class="rate_down3">-5</td>
<td class="rate_down3">-408</td>
<td class="rate_up3">166</td>
</tr>
...
</body></html>
I want to get the list of "tr" nodes that contain data, but I am having trouble selecting them.
I think it would be enough to get the set of "tr" elements that have 11 td tags,
so I wrote the following code.
result = await httpClient.GetStringAsync(new Uri(timeUrlAddress));
htmlDoc.LoadHtml(result);
var nodes =
htmlDoc.DocumentNode.SelectNodes("//tr")
.Where(i => i.ChildNodes.Any(j => j.Name.Equals("td")).Count>10); // <--- I have Problem.
foreach(var i in nodes) { ... } // <-- iterating list of <tr> tags.
It doesn't work.
I can get the list of tr tags with DocumentNode.SelectNodes("//tr"), and I appended .Where(i=>i.ChildNodes.Count >10 ) to narrow it down.
However, each tr also has several "text" child nodes, so I get unwanted nodes. The following picture shows what I got with .Where(i=>i.ChildNodes.Count>10).
I want the tr nodes that have td tags as child nodes, and exactly 11 of them.
How can I get those tr nodes with LINQ syntax?
If you want the tr nodes with exactly 11 td children, you can use the XPath below:
//tr[count(td) = 11]
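Since the question asks for LINQ syntax, the same filter can also be written as a predicate that counts only td children. The sketch below uses System.Xml.Linq as a stand-in so it runs without extra packages; RowFilterDemo, RowsWithCellCount and the shortened markup are made up for illustration. As far as I know, HtmlAgilityPack's HtmlNode has a similarly named Elements(name) method, so the predicate should carry over, but treat that as an assumption to check.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class RowFilterDemo
{
    // Returns the <tr> elements that have exactly cellCount direct <td> children.
    // Text nodes and <th> children are ignored because only "td" elements are counted.
    public static List<XElement> RowsWithCellCount(XContainer root, int cellCount)
    {
        return root.Descendants("tr")
                   .Where(tr => tr.Elements("td").Count() == cellCount)
                   .ToList();
    }

    public static void Main()
    {
        // Hypothetical, shortened markup: a header row, a spacer row and a data row.
        XElement table = XElement.Parse(
            "<table>" +
            "<tr><th>Time</th><th>A</th><th>B</th></tr>" +
            "<tr><td colspan=\"3\"></td></tr>" +
            "<tr><td>18:01</td><td>2,024</td><td>-3,307</td></tr>" +
            "</table>");

        foreach (XElement row in RowsWithCellCount(table, 3))
        {
            Console.WriteLine(string.Join(" ", row.Elements("td").Select(td => td.Value)));
        }
    }
}
```

For the page in the question, the count would be 11 instead of 3.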

Export to Excel fails on some machines

We are developing a web application, and some of our query results need to be exported to Excel. We are using the following C# code to export:
System.Web.HttpContext ctx = System.Web.HttpContext.Current;
CurrentPackingListModel.Voyage.ShipmentDataContext = ShipmentDataContext;
ctx.Response.Clear();
string filename = "ApprovalForm.xls";
ctx.Response.AddHeader("content-disposition", "attachment;filename=" + filename);
ctx.Response.ContentType = "application/vnd.ms-excel";
ctx.Response.ContentEncoding = System.Text.Encoding.GetEncoding("UTF-8");
ctx.Response.Charset = "UTF-8";
return View("../Packing/_ExportApprovalForm", CurrentPackingListModel);
The partial view I am returning is as follows:
<body id="body" onload="window.print();">
<table>
<tbody>
<tr>
<td class="table-header" colspan="8">
<div style="width: 100%">
<div class="lleft">
@* <img id="imgLogo" src="~/Images/myLogo.png" />*@
</div>
<div class="baslik">Approval Packing List Form</div>
<div style="float: right;">@DateTime.Now.ToString("MM.dd.yyyy")</div>
</div>
</td>
</tr>
<tr>
<td colspan="6"></td>
</tr>
<tr>
<td></td>
</tr>
<tr>
<td></td>
</tr>
<tr>
<td class="line-header">@Html.DisplayNameFor(x => x.ID)</td>
<td>: @Html.HiddenFor(x => x.ID)@Html.DisplayFor(x => x.ID)</td>
<td class="line-header" style="width: 165px;">@Html.DisplayNameFor(x => x.Voyage.StartDate)</td>
<td>: @Html.DisplayFor(x => x.Voyage.StartDate)</td>
<td class="line-header">@Html.DisplayNameFor(x => x.Voyage.VesselID)</td>
<td>: @Html.DisplayFor(x => x.Voyage.VesselText)
</td>
</tr>
<tr>
<td class="line-header">@Html.DisplayNameFor(x => x.Voyage.Id)</td>
<td>: @Html.DisplayFor(x => x.Voyage.Id)</td>
<td class="line-header" style="width: 165px;">@Html.DisplayNameFor(x => x.Voyage.EndDate)</td>
<td>: @Html.DisplayFor(x => x.Voyage.EndDate)</td>
<td></td>
<td></td>
</tr>
<tr>
<td colspan="6">
<hr />
</td>
</tr>
</tbody>
</table>
<table>
<tr>
<td class="line-header" style="width: 160px;">Approve Personel</td>
<td style="border: solid 1px; width: 180px;"></td>
<td class="line-header">Discharge Port</td>
<td style="border: solid 1px; width: 180px;"></td>
</tr>
<tr>
<td class="line-header">Approve Date</td>
<td style="border: solid 1px;"></td>
<td class="line-header">Terminal</td>
<td style="border: solid 1px;"></td>
</tr>
<tr>
<td class="line-header">Signiture</td>
<td style="border: solid 1px;"></td>
<td></td>
<td></td>
</tr>
</table>
@if (Request.QueryString["type"] == "HRC" && Model.HrcListPrint != null)
{
<table>
<tr>
<td colspan="10" style="height: 20px;">
<hr />
</td>
</tr>
<tr>
<td style="text-align: center; width: 210mm; font-weight: bold;" colspan="11">HRC LIST
</td>
</tr>
</table>
<table class="display dataTable no-footer">
<thead>
<tr>
<th>Customer Name</th>
<th>Customer PO No</th>
<th>Ord. ITem No</th>
<th>CM No</th>
<th>Product</th>
<th>Size (T x W inch)</th>
<th>Thickness Tolerance</th>
<th>Qty (tons)</th>
<th>Coil Weight (Lbs)</th>
<th>Destination Port</th>
<th>Barcode</th>
<th>Heat No</th>
<th>Status</th>
</tr>
</thead>
<tbody>
@foreach (MedTrade.Apollo.Shared.Models.Shipment.PackingListDetailModel item in Model.HrcListPrint)
{
<tr>
<td>@item.CustomerName</td>
<td>@item.CustomerPurchaseOrderNumber</td>
<td>@String.Format("'{0}'", item.OrderItemText)</td>
<td>@item.CMNO</td>
<td>@item.ProductStandartName</td>
<td>@item.ProductProperty</td>
<td>@item.ThicknessToleranceType</td>
<td>@((item.Quantity / 1000).ToString("N3"))</td>
<td>@item.CoilWeight.ToString("N0")</td>
<td>@item.DischargePortTanim</td>
<td>@item.BarcodeNo</td>
<td>@item.HeatNo</td>
<td>@item.StatusText</td>
</tr>
}
</tbody>
</table>
}
@if (Request.QueryString["type"] == "Rebar" && Model.RebarListPrint != null)
{
<table>
<tr>
<td colspan="10" style="height: 20px;">
<hr />
</td>
</tr>
<tr>
<td style="text-align: center; width: 210mm; font-weight: bold;" colspan="10">REBAR LIST
</td>
</tr>
</table>
<table class="display dataTable no-footer">
<thead>
<tr>
<th>Customer Name</th>
<th>Customer PO No</th>
<th>Ord. ITem No</th>
<th>CM No</th>
<th>Product</th>
<th>Size (D x L inch)</th>
@if (Model.SearchCriteria.ViewType == ViewType.Group)
{
<th>Qty (tons) / # of bundles</th>
}
else
{
<th>Quantity (Tons)</th>
}
<th>Bundle Weight (Lbs)</th>
<th>Destination Port</th>
@if (Model.SearchCriteria.ViewType == ViewType.Detail)
{
<th>Barcode</th>
}
<th>Heat No</th>
<th>Status</th>
</tr>
</thead>
<tbody>
@foreach (MedTrade.Apollo.Shared.Models.Shipment.PackingListDetailModel item in Model.RebarListPrint)
{
<tr>
<td>@item.CustomerName</td>
<td>@item.CustomerPurchaseOrderNumber</td>
<td>@String.Format("'{0}'", item.OrderItemText)</td>
<td>@item.CMNO</td>
<td>@item.ProductStandartName</td>
<td>@item.ProductProperty</td>
@if (Model.SearchCriteria.ViewType == ViewType.Group)
{
<td>@((item.Quantity / 1000).ToString("N3")) / @item.Count</td>
}
else
{
<td>@((item.Quantity / 1000).ToString("N3"))</td>
}
<td>@item.BundleWeight.ToString("N0")</td>
<td>@item.DischargePortTanim</td>
@if (Model.SearchCriteria.ViewType == ViewType.Detail)
{
<td>@item.BarcodeNo</td>
}
<td>@item.HeatNo</td>
<td>@item.StatusText</td>
</tr>
}
</tbody>
</table>
}
</body>
Export to Excel works on some machines but not on others. This started happening only recently.
Is there a way to fix this without rewriting the whole export functionality?
Apologies, this is not an answer, but I am unable to comment because I have low rep.
Is Excel installed on the machines where it is not working? From memory, Excel must be installed for you to export. I have used EPPlus to get around this in the past.
Are you able to add some exception handling and logging around this issue so you can get more details about the error? Even writing it to a text file on the machine would help.
Sorry, I don't have an actual answer.
The code below may be helpful:
public void ExportToExcel()
{
DataGrid dgGrid = new DataGrid();
dgGrid.DataSource = /*Give your data source here*/;
dgGrid.DataBind();
System.Web.HttpContext.Current.Response.ClearContent();
System.Web.HttpContext.Current.Response.Buffer = true;
System.Web.HttpContext.Current.Response.AddHeader("content-disposition", string.Format("attachment; filename={0}", "Data Report.xls"));
System.Web.HttpContext.Current.Response.ContentType = "application/vnd.ms-excel";
System.Web.HttpContext.Current.Response.Charset = "";
System.IO.StringWriter sw = new System.IO.StringWriter();
HtmlTextWriter htw = new HtmlTextWriter(sw);
dgGrid.RenderControl(htw);
System.Web.HttpContext.Current.Response.Output.Write(sw.ToString());
System.Web.HttpContext.Current.Response.Flush();
System.Web.HttpContext.Current.Response.End();
}

Add tbody XML element to table element in XDocument

I want to add a <tbody> element to <table> elements in an XDocument if it is missing.
<table class="newtable" id="item_559_Table1" cellpadding="0" cellspacing="0" data-its-style="width:11.4624em; border-spacing:0;">
<colgroup data-its-style="width:11.4624em; " />
<tr>
<td data-its-style="padding:0.2292em; vertical-align:top; ">
<p data-its-style="">My dad cooks up a pot of chicken soup, and</p>
</td>
</tr>
<tr>
<td data-its-style="padding:0.2292em; vertical-align:top; ">
<p data-its-style="font-weight:normal; ">This cold means I can’t taste a thing today!</p>
</td>
</tr>
</table>
Output should look like
<table class="newtable" id="item_559_Table1" cellpadding="0" cellspacing="0" data-its-style="width:11.4624em; border-spacing:0;">
<colgroup data-its-style="width:11.4624em; " />
<tbody>
<tr>
<td data-its-style="padding:0.2292em; vertical-align:top; ">
<p data-its-style="">My dad cooks up a pot of chicken soup, and</p>
</td>
</tr>
<tr>
<td data-its-style="padding:0.2292em; vertical-align:top; ">
<p data-its-style="font-weight:normal; ">This cold means I can’t taste a thing today!</p>
</td>
</tr>
</tbody>
</table>
Note: I am not looking for an XSLT solution.
One way to do it would be to grab the children of <table>, then add them back the way you want them.
var doc = XDocument.Load("file.xml");
var colgroup = doc.Root.Elements("colgroup");
var tr = doc.Root.Elements("tr");
// Add tr to tbody
var tbody = new XElement("tbody", tr);
// Replace the children of table with colgroup and tbody
doc.Root.ReplaceNodes(colgroup, tbody);
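The snippet above assumes that <table> is the document root and that <colgroup> and <tr> are its only children. If the tables can appear anywhere in the document, a more general sketch could look like the following. TbodyFixer and EnsureTbody are names made up for this example, and it assumes the elements are in no XML namespace (with XHTML you would need to combine an XNamespace with each name).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class TbodyFixer
{
    // Wraps the direct <tr> children of every <table> that has no <tbody>
    // in a new <tbody>, leaving other children such as <colgroup> in place.
    public static void EnsureTbody(XDocument doc)
    {
        // ToList() takes a snapshot so the tree can be modified while looping.
        foreach (XElement table in doc.Descendants("table").ToList())
        {
            if (table.Element("tbody") != null)
                continue;

            List<XElement> rows = table.Elements("tr").ToList();
            if (rows.Count == 0)
                continue;

            // Put the new <tbody> where the first row used to be.
            var tbody = new XElement("tbody");
            rows[0].AddBeforeSelf(tbody);

            // Move (not copy) each row into the <tbody>.
            foreach (XElement row in rows)
            {
                row.Remove();
                tbody.Add(row);
            }
        }
    }

    public static void Main()
    {
        XDocument doc = XDocument.Parse(
            "<table><colgroup /><tr><td>a</td></tr><tr><td>b</td></tr></table>");
        EnsureTbody(doc);
        Console.WriteLine(doc);
    }
}
```

Removing each row before adding it avoids the implicit clone that LINQ to XML makes when you add a node that already has a parent.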

How to read a <table> inside an onmouseover attribute with C# and HtmlAgilityPack

How can I read a <table> that is embedded in an onmouseover attribute with C# and HtmlAgilityPack?
Markup:
<a href="#" class="chan_live_not_free" onclick="return false;" onmouseover="return overlib('
<table>
<tr class=fieldRow>
<td class=posH_col width=40>
<strong>pos</strong>
</td>
<td class=rest_col width=90>
<strong>satellite</strong>
</td>
<td class=freqH_col width=50>
<strong>freq</strong>
</td>
<td class=rest_col width=90>
<strong>symbol</strong>
</td>
<td class=rest_col width=90>
<strong>encryption</strong>
</td>
</tr>
<tr>
<td class="pos_col">39.0°e</td>
<td class=rest_col>Hellas Sat 2</td>
<td class="freq_col">12.606 H</td>
<td class=rest_col>30000 - 2/3</td>
<td class=enc_not_live>MPEG-4 BulCrypt</td>
</tr>
</table>',CAPTION, 'Arena Sport 4 (serbia) – 19/10/14 - 11:30');" onmouseout="return nd();">
Arena Sport 4 (serbia)
</a>
I need to read the table inside the onmouseover attribute. How can I read it?
You could get the onmouseover attribute of the <a> tag with HTML Agility Pack and then use a regular expression to extract the <table> from the string, something like the following code:
var html = @"<a href='#' class='chan_live_not_free' onclick='return false;' onmouseover='return overlib(
<table>
<tr class=fieldRow>
<td class=posH_col width=40>
<strong>pos</strong>
</td>
<td class=rest_col width=90>
<strong>satellite</strong>
.
.
.
<tr>
<td class=""pos_col"">39.0°e</td>
<td class=rest_col>Hellas Sat 2</td>
<td class=""freq_col"">12.606 H</td>
<td class=rest_col>30000 - 2/3</td>
<td class=enc_not_live>MPEG-4 BulCrypt</td>
</tr>
</table>,CAPTION, 'Arena Sport 4 (serbia) – 19/10/14 - 11:30');' onmouseout='return nd();'>
Arena Sport 4 (serbia)
</a>";
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);
var value = doc.DocumentNode.SelectSingleNode("//a[@class='chan_live_not_free']").Attributes["onmouseover"].Value;
var text = Regex.Matches(value, @"<table>([^)]*)</table>")[0].Value;

Grabbing a timesheet with HtmlAgilityPack

I need to grab a timesheet from a website and add it to a DataTable in my C# application.
The structure of the data table looks like this:
| Day | Time  | Status |
| 1   | 7:00  | IN     |
| 1   | 9:45  | OUT    |
| 1   | 10:15 | IN     |
| 1   | 15:45 | OUT    |
| 1   | 8:45  | TOTAL  |
| 2   | ..    | ..     |
My C# code for the DataTable:
DataTable table = new DataTable("Worksheet");
table.Columns.Add("Day");
table.Columns.Add("Time");
table.Columns.Add("Status");
I tried different variants, but I always end up mangling the data.
For testing purposes I made a new WinForms form with a text box (for the file path) and a button (to start the process).
Then I want HtmlAgilityPack to get all the data. One example:
public string[] GREYsource;
public Form1()
{
InitializeComponent();
}
private void btnSubmit_Click(object sender, EventArgs e)
{
var doc = new HtmlAgilityPack.HtmlDocument();
var fileName = txtPath.Text; // I downloaded the HTML-File
doc.Load(fileName);
string strGREYInner;
foreach (HtmlNode td in doc.DocumentNode.SelectNodes("//tr[@class=\"tblDataGreyNH\"]"))
{
strGREYInner = td.InnerText.Trim();
string shorted = strGREYInner.Replace("\t", ""); string shorted2 = shorted.Replace("\n\n\n\n", "\n\n\n"); string shorted3 = shorted2.Replace("\n\n\n", "\n\n"); string shorted4 = shorted3.Replace("\n\n", "\n");
GREYsource = shorted4.Split(new Char[] { '\n', });
}
foreach (string str in GREYsource)
{
...
}
}
Problem: the result contains a lot of tabs (\t) and newlines (\n) that I need to trim.
Problem: this isn't a good way to do it, in my opinion, and it would only grab the total times.
It can be done better.
This is just one example I tried (my other attempts turned into a pile of junk).
I have attached the HTML structure below, in some depth:
<html>
<head>
</head>
<style type="text/css">
</style>
<body id="body" onload="handleMenuOverlapLogo();onload_column_expand();;firstElementFocus();">
<.. some (java)scripts> /* has to be ignored, not necessary */
<.. some other divs> /* has to be ignored, not necessary */
<div id="rowContent"> /* This <div> contains the content i need */
<div id="titleTab"> /* Title is not necessary */
</div>
<div id="rowContentInner"> /* Here the content starts */
<table class="tblList">
<tbody>
<tr> /* not necessary */
<tr class="tblHeader"> /* not necessary */
<tr class="tblHeader"> /* not necessary */
<tr class="tblDataWhiteNH"> /* IN : */
<td class="tblHeader" style="font-weight: bold; text-align: right"> In </td>
<td nowrap=""> /* "tblDataWhiteNH" always contains 7 "td nowrap"
<td nowrap="">
<td nowrap=""> /* Example: if it contains a value */
<table width="100%" border="0" align="center">
<tbody>
<tr>
<td width="25%" align="left"> </td>
<td nowrap="" width="50%" align="center"> 7:53 </td> /* value = 7:53 (THIS!) */
<td width="25%" align="right"> </td>
</tr>
</tbody>
</table>
</td>
<td nowrap="">
<td nowrap=""> /* Example: if it contains no value */
<table width="100%" border="0" align="center">
<tbody>
<tr>
<td width="25%" align="left"> </td>
<td nowrap="" width="50%" align="center"> /* no value = 0:00 (THIS!) */
<td width="25%" align="right"> </td>
</tr>
</tbody>
</table>
</td>
<td nowrap="">
<td nowrap="">
<tr class="tblDataWhiteNH"> /* OUT : */
<td class="tblHeader" style="font-weight: bold; text-align: right"> Out </td>
<td nowrap=""> /* "tblDataWhiteNH" always contains 7 "td nowrap".
<td nowrap="">
<td nowrap=""> /* Example: if it contains a value */
<table width="100%" border="0" align="center">
<tbody>
<tr>
<td width="25%" align="left"> </td>
<td nowrap="" width="50%" align="center"> 7:53 </td> /* value = 7:53 (THIS!) */
<td width="25%" align="right"> </td>
</tr>
</tbody>
</table>
</td>
<td nowrap="">
<td nowrap=""> /* Example: if it contains no value */
<table width="100%" border="0" align="center">
<tbody>
<tr>
<td width="25%" align="left"> </td>
<td nowrap="" width="50%" align="center"> /* no value = 0:00 (THIS!) */
<td width="25%" align="right"> </td>
</tr>
</tbody>
</table>
</td>
<td nowrap="">
<td nowrap="">
<tr class="tblDataGreyNH"> /* IN : */
<tr class="tblDataGreyNH"> /* OUT : */
... /* "tblDataGreyNH" is built the same way as "tblDataWhiteNH". */
... /* Sometimes there can be more "tblDataWhiteNH" and "tblDataGreyNH" rows. */
... /* Usually there are just the "tblDataWhiteNH" (IN/OUT) rows. */
<tr class="tblHeader"> /* not necessary */
/* It continues, for example, with "tblDataWhite" if the last row above the header was "tblDataGrey", */
/* and vice versa ("grey" if there was a "white" before). */
<tr class="tblDataWhiteNH"> /* Worked : */
<td class="tblHeader" style="font-weight: bold; text-align: right"> Total Time </td>
<td> 07:47 </td> /* value = 7:47 (THIS!) */
<td> 04:48 </td>
<td> 00:00 </td> /* no value = 0:00 (THIS!) */
<td> 00:00 </td>
<td> 07:42 </td>
<td> 00:00 </td>
<td> 00:00 </td>
</tr>
<tr class="tblDataGreyNH"> /* Total : */
<td class="tblHeader" style="font-weight: bold; text-align: right"> Regular Time </td>
<td> 07:47 </td> /* value = 7:47 (THIS!) */
<td> 04:48 </td>
<td> </td> /* no value = 0:00 (THIS!) */
<td> </td>
<td> 07:42 </td>
<td> </td>
<td> </td>
</tr>
<tr class="tblHeader"> /* not necessary */
<tr valign="top"> /* not necessary */
</tbody>
</table>
</div>
</div>
</body>
</html>
A copy of the original HTML: http://time.wnb.dk/123/
I hope someone can help me get this to work.
Okay, let me explain it with a picture: https://www.abload.de/img/eeeqnuwu.png
The picture shows the website and, below it, a table of how the result should look.
Declaring the DataTable isn't the problem.
The main problem is that I can't get HtmlAgilityPack to return the right results, and when it does, the output is buggy.
Some of the SelectNodes calls I tried produced garbled output after a while. So far I haven't been able to get all the data from the table on the website, only some values, and those are often wrong.
So I'm looking for someone who could take a look at this and help me find the right SelectNodes expressions.
I'm not sure I fully understand what you want to do, but here is some sample code that should help you get started. I strongly suggest you read up on XPath to understand it.
HtmlDocument doc = new HtmlDocument();
doc.Load(yourFile);
// get all TR with a specific class name, anywhere in the document (//)
foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//tr[@class='tblDataGreyNH' or @class='tblDataWhiteNH']"))
{
// get the first TD below the current node with a specific class name
HtmlNode inOrOut = node.SelectSingleNode("td[@class='tblHeader']");
if (inOrOut != null)
{
string io = inOrOut.InnerText.Trim();
Console.WriteLine(io.ToUpper());
if (io.Contains("Time"))
{
// normalize-space trims and collapses whitespace (\r, \n, etc.)
// text() selects the node's child text nodes
foreach (HtmlNode td in node.SelectNodes("td[normalize-space(@class)='' and normalize-space(text())!='' and normalize-space(text())!='00:00']"))
{
Console.WriteLine("value:" + td.InnerText.Trim());
}
}
}
// gets all TD below the current node that define the NOWRAP attribute
HtmlNodeCollection tdNoWraps = node.SelectNodes("td[@nowrap]");
if (tdNoWraps != null)
{
foreach (HtmlNode tdNoWrap in tdNoWraps)
{
string value = tdNoWrap.InnerText.Trim();
if (value == string.Empty)
continue;
Console.WriteLine("value:" + value);
}
}
}
It will output this from your sample page:
IN
value:7:47
value:7:46
value:7:45
value:7:51
OUT
value:15:35
value:15:33
value:12:38
value:8:59
IN
value:12:38
value:8:59
OUT
value:15:35
TOTAL TIME
value:07:48
value:07:47
value:07:50
value:01:08
REGULAR TIME
value:07:48
value:07:47
value:07:50
value:01:08
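On the tab and newline problem from the question: rather than chaining Replace calls, you can split on the whitespace characters and drop the empty entries in one step. This is only a sketch. CellTextSplitter and SplitValues are names made up here, and it assumes the values themselves never contain tabs or line breaks.

```csharp
using System;
using System.Linq;

public static class CellTextSplitter
{
    // Splits raw inner text on tabs and line breaks, trims each piece
    // and drops the pieces that are empty after trimming.
    public static string[] SplitValues(string innerText)
    {
        return innerText
            .Split(new[] { '\t', '\n', '\r' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(piece => piece.Trim())
            .Where(piece => piece.Length > 0)
            .ToArray();
    }

    public static void Main()
    {
        // Hypothetical inner text, similar in shape to what a <tr> might return.
        string raw = "\n\t\t Total Time \n\n\n\t 07:47 \n\n\t 04:48 \n";
        foreach (string value in SplitValues(raw))
        {
            Console.WriteLine(value);
        }
    }
}
```

Each resulting string could then go into the Time column of the DataTable, with Day and Status filled in from the position of the cell and the row header.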
