I have the following HTML structure. Each category and each sub-item sits in its own tr, so the rows are siblings rather than nested. When I parse it with XPath, each category should get only its own 2 sub-items, but the code below attaches all 4 sub-items to every category.
<table class="available">
<tbody>
<tr>
<td class="catname" colspan="2">
<span>Category 1</span>
</td>
</tr>
<tr>
<td rowspan="2" colspan="1" class="itemdetail">
<div class="subname">
SubItem1-1
</div>
</td>
<td class="precioseleccion desgloseth">
<div class="preprice">
<strong class="price">39.99 €</strong>
</div>
</td>
</tr>
<tr>
<td rowspan="2" colspan="1" class="itemdetail">
<div class="subname">
SubItem1-2
</div>
</td>
<td class="precioseleccion desgloseth">
<div class="preprice">
<strong class="price">49.99 €</strong>
</div>
</td>
</tr>
<tr>
<td class="catname" colspan="2">
<span>Category 2</span>
</td>
</tr>
<tr>
<td rowspan="2" colspan="1" class="itemdetail">
<div class="subname">
SubItem2-1
</div>
</td>
<td class="precioseleccion desgloseth">
<div class="preprice">
<strong class="price">59.99 €</strong>
</div>
</td>
</tr>
<tr>
<td rowspan="2" colspan="1" class="itemdetail">
<div class="subname">
SubItem2-2
</div>
</td>
<td class="precioseleccion desgloseth">
<div class="tooltip3">
<strong class="price">69.99 €</strong>
</div>
</td>
</tr>
</tbody>
</table>
var doc = new HtmlDocument(); // with HTML Agility pack
doc.LoadHtml(uricontent);
var rooms = doc.DocumentNode
.SelectNodes("//table[@class='available']//td[@class='catname']")
.Select(r => new
{
Type= r.InnerText.CleanInnerText(),
SubTypes= r.SelectNodes("../..//tr//td[@class='itemdetail']//div[@class='subname']")
.Select(s => new
{
SubType= s.InnerText.CleanInnerText(),
Price =
s.SelectSingleNode(".//parent::td/following-sibling::td[@class='allprice']//div[@class='preprice']//strong[@class='price']")
.InnerText.CleanInnerText()
}).ToArray()
}).ToArray();
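If I understand your question correctly, to select all the categories you want //tr[td[@class='catname']], and to select their sub-items you want following-sibling::tr/td[div[@class='subname']], evaluated relative to each category row. Note that following-sibling::tr matches every later row in the table, so the sub-items still have to be cut off at the next category row.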
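One way to keep each category's sub-items separate is to walk forward from the category row and stop at the next row that contains a catname cell. The following is a minimal sketch, assuming HtmlAgilityPack is referenced; the class name CategoryParser and the method Parse are hypothetical, and the price is looked up by the strong.price element rather than by the class of its td.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

public static class CategoryParser
{
    // Hypothetical helper: maps each category name to the (name, price) pairs
    // found in the rows between that category row and the next one.
    public static Dictionary<string, List<(string Name, string Price)>> Parse(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var result = new Dictionary<string, List<(string Name, string Price)>>();

        var categoryRows = doc.DocumentNode.SelectNodes(
            "//table[@class='available']//tr[td[@class='catname']]");
        if (categoryRows == null) return result; // SelectNodes returns null when nothing matches

        foreach (var categoryRow in categoryRows)
        {
            var items = new List<(string Name, string Price)>();

            // Walk the following siblings and stop at the next category row.
            for (var row = categoryRow.NextSibling; row != null; row = row.NextSibling)
            {
                if (row.Name != "tr") continue; // skip whitespace text nodes
                if (row.SelectSingleNode("td[@class='catname']") != null) break;

                var name = row.SelectSingleNode(".//div[@class='subname']");
                var price = row.SelectSingleNode(".//strong[@class='price']");
                if (name != null && price != null)
                    items.Add((name.InnerText.Trim(), price.InnerText.Trim()));
            }

            result[categoryRow.InnerText.Trim()] = items;
        }
        return result;
    }
}
```

Walking the siblings in C# avoids a more intricate XPath 1.0 expression for "rows up to the next category" and is easier to debug.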
I need to print some content so that it makes the best use of space and fills the whole page. To split the content into 2 columns, I divide a list in half on the server side. This worked when I applied it directly to the rows coming from the database, but it no longer works now that I group the content into classes to avoid repeating information: the grouped items have different heights, so the two columns look uneven on the HTML page.
AbastecimentosColuna1 = referenciasList.Take(referenciasList.Count() / 2).ToList();
AbastecimentosColuna2 = referenciasList.Skip(referenciasList.Count() / 2).ToList();
In other words, how can I keep the content evenly distributed over the whole page? Or is there another way to split the content into 2 columns without splitting the array on the server side?
content
<div class="row">
<div class="col-6 table-responsive">
<table class="table table-sm table-bordered border-dark text-center">
<thead>
<tr>
<th>Referência</th>
<th>Qtd. Abastecimento</th>
<th>Peças Por Caixa</th>
<th>Nº Caixas</th>
<th>Localização - Etiqueta FIFO</th>
</tr>
</thead>
<tbody>
@foreach (var item in Model.AbastecimentosColuna1)
{
<tr>
<td>
@Html.DisplayFor(modelItem => item.Referencia)
</td>
<td>
@Html.DisplayFor(modelItem => item.QtdAbastecimento)
</td>
<td>
@Html.DisplayFor(modelItem => item.QtdPecasPorCaixa)
</td>
<td>
@Html.DisplayFor(modelItem => item.QtdCaixas)
</td>
<td>
@foreach (var localizacao in item.Localizacoes)
{
<div class="row py-2">
<div class="col-6">
@localizacao.Localizacao
</div>
<div class="col-6">
@foreach (var etiqueta in localizacao.Etiquetas)
{
@etiqueta.Etiqueta
<br />
}
</div>
</div>
}
</td>
</tr>
}
</tbody>
</table>
</div>
<div class="col-6 table-responsive">
<table class="table table-sm table-bordered border-dark text-center">
<thead>
<tr>
<th>Referência</th>
<th>Qtd. Abastecimento</th>
<th>Peças Por Caixa</th>
<th>Nº Caixas</th>
<th>Localização - Etiqueta FIFO</th>
</tr>
</thead>
<tbody>
@foreach (var item in Model.AbastecimentosColuna2)
{
<tr>
<td>
@Html.DisplayFor(modelItem => item.Referencia)
</td>
<td>
@Html.DisplayFor(modelItem => item.QtdAbastecimento)
</td>
<td>
@Html.DisplayFor(modelItem => item.QtdPecasPorCaixa)
</td>
<td>
@Html.DisplayFor(modelItem => item.QtdCaixas)
</td>
<td>
@foreach (var localizacao in item.Localizacoes)
{
<div class="row py-2">
<div class="col-6">
@localizacao.Localizacao
</div>
<div class="col-6">
@foreach (var etiqueta in localizacao.Etiquetas)
{
@etiqueta.Etiqueta
<br />
}
</div>
</div>
}
</td>
</tr>
}
</tbody>
</table>
</div>
</div>
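Since the grouped rows differ in height, splitting by item count no longer splits by printed height. One possible server-side approach is to split where the accumulated estimated height is closest to half of the total. The following is a minimal sketch under that assumption; ColumnSplitter, Split, and the weight function are hypothetical names, and the weight is only an estimate of how many printed lines an item occupies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ColumnSplitter
{
    // Hypothetical helper: keeps the original order and picks the split point
    // where the two columns' summed weights differ the least.
    public static (List<T> Left, List<T> Right) Split<T>(
        IReadOnlyList<T> items, Func<T, int> weight)
    {
        int total = items.Sum(weight);
        int running = 0;
        int bestIndex = 0;
        int bestDiff = total; // difference when every item is in the right column

        for (int i = 0; i < items.Count; i++)
        {
            running += weight(items[i]);
            // |left - right| where left = running and right = total - running
            int diff = Math.Abs(total - 2 * running);
            if (diff < bestDiff)
            {
                bestDiff = diff;
                bestIndex = i + 1;
            }
        }

        return (items.Take(bestIndex).ToList(), items.Skip(bestIndex).ToList());
    }
}
```

For the model above, a plausible weight would be the number of Etiqueta lines an item renders (with a minimum of 1 per Localizacao), and the two resulting lists would be assigned to AbastecimentosColuna1 and AbastecimentosColuna2. The estimate ignores line wrapping, so the columns may still differ slightly on paper.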
I have an HTML string, and I want to get the id of every tag in it that has one.
The HTML string is read from a text file:
<tr>
<td class="X8">
</td>
<td colspan="6" class="X9"></td>
<td colspan="4" class="X12" id="closedate">
</td>
<td colspan="6" class="X9"></td>
<td colspan="4" class="X12" id="startdate">
</td>
<td class="X8">
</td>
<td class="X8" colspan="3">
</td>
<td class="X8">
</td>
<td colspan="9" class="X9"></td>
<td colspan="6" class="X15" id="totalpayment"></td>
<td class="X8">
</td>
<td class="X8">
</td>
</tr>
<tr>
<td class="X17">
</td>
<td class="X17" colspan="8">
</td>
<td class="X17" colspan="33">
</td>
<td class="X17">
</td>
</tr>
<tr>
<td class="X17">
</td>
<td class="X17" colspan="8">
<td class="X17" colspan="16">
</td>
<td class="X17">
</td>
<td colspan="9" class="X20"></td>
<td colspan="6" class="X23" id="approvaldate"></td>
<td class="X17">
</td>
<td class="X17">
</td>
</tr>
Expected result:
closedate, startdate, totalpayment, approvaldate.
Then I want to set the inner text of each tag by its id
(e.g. <td colspan="6" class="X23" id="approvaldate">2018/07/18</td>)
using C#. Please help. Thanks a lot.
If I understand your question correctly, you need the id of every element. Here is a simple example using ASP.NET Web Forms server controls:
<form id="form1" runat="server">
<input id="Name" type="text" name="Full Name" runat="server" />
<input id="Email" type="text" name="Email Address" runat="server" />
<input id="Phone" type="text" name="Phone Number" runat="server" />
</form>
foreach (var control in Page.Form.Controls)
{
if (control is HtmlInputControl)
{
var htmlInputControl = control as HtmlInputControl;
string controlName = htmlInputControl.Name;
string controlId = htmlInputControl.ID;
}
}
Another approach, using the WinForms WebBrowser control:
HtmlElement table = testWebBrowser.Document.GetElementById("TableID");
if (table != null)
{
foreach (HtmlElement row in table.GetElementsByTagName("TR"))
{
// ...
}
}
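Because the question starts from a plain HTML string rather than a page or browser control, an HTML parser may be a closer fit. The following is a minimal sketch, assuming HtmlAgilityPack is referenced; IdFiller, GetIds, and Fill are hypothetical names. It collects every id with the XPath //*[@id] and then writes text into the matching elements.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Net;
using HtmlAgilityPack;

public static class IdFiller
{
    // Returns the id attribute of every element that has one, in document order.
    public static List<string> GetIds(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var nodes = doc.DocumentNode.SelectNodes("//*[@id]");
        return nodes == null
            ? new List<string>() // SelectNodes returns null when nothing matches
            : nodes.Select(n => n.GetAttributeValue("id", "")).ToList();
    }

    // Sets the inner text of each element whose id appears in 'values'
    // and returns the modified HTML.
    public static string Fill(string html, IDictionary<string, string> values)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var nodes = doc.DocumentNode.SelectNodes("//*[@id]");
        if (nodes != null)
        {
            foreach (var node in nodes)
            {
                string id = node.GetAttributeValue("id", "");
                if (values.TryGetValue(id, out var text))
                    node.InnerHtml = WebUtility.HtmlEncode(text); // encode so the text is not parsed as markup
            }
        }
        return doc.DocumentNode.OuterHtml;
    }
}
```

Calling string.Join(", ", IdFiller.GetIds(html)) on the HTML from the question should list the ids in the order they appear in the markup.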
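I need to convert an HTML table to a DataTable in C#. I used HtmlAgilityPack, but the result is wrong because of the rowspans: a cell with rowspan="2" also occupies a column in the next row, so that row has fewer td elements and its values end up in the wrong columns.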
The code I am currently using is:
private static DataTable convertHtmlTableToDataTable()
{
WebClient webClient = new WebClient();
string urlContent = webClient.DownloadString("http://example.com");
string tableCode = getTableCode(urlContent);
string htmlCode = tableCode.Replace(" ", " ");
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(htmlCode);
var headers = doc.DocumentNode.SelectNodes("//tr/th");
DataTable table = new DataTable();
foreach (HtmlNode header in headers)
{
table.Columns.Add(header.InnerText);
}
foreach (var row in doc.DocumentNode.SelectNodes("//tr[td]"))
{
table.Rows.Add(row.SelectNodes("td").Select(td => td.InnerText).ToArray());
}
return table;
}
And this is a part of Html Table:
<table class="tabel" cellspacing="0" border="0">
<caption style="font-family:Verdana; font-size:20px;">SEMGRP</caption>
<tr>
<th class="celula" >Ora</th>
<th class="latime_celula celula">Luni</th>
<th class="latime_celula celula">Marti</th>
<th class="latime_celula celula">Miercuri</th>
<th class="latime_celula celula">Joi</th>
<th class="latime_celula celula">Vineri</th>
</tr>
<tr>
<td class="celula" nowrap="nowrap">8-9</td>
<td class="celula" rowspan="2">
<table border="0" align="center">
<tr>
<td nowrap="nowrap" align="center">
Curs
<br />
<a class="link_celula" href="afis_n0.php?id_tip=287&tip=p">Prof</a>
<br />
<a class="link_celula" href="afis_n0.php?id_tip=9&tip=s">Sala</a>
<br />
</td>
</tr>
</table>
</td>
<td class="celula" rowspan="2">
<table border="0" align="center">
<tr>
<td nowrap="nowrap" align="center">
Curs
<br />
<a class="link_celula" href="afis_n0.php?id_tip=287&tip=p">Prof</a>
<br />
<a class="link_celula" href="afis_n0.php?id_tip=12&tip=s">Sala</a>
<br />
</td>
</tr>
</table>
</td>
<td class="celula"> </td>
<td class="celula"> </td>
<td class="celula" rowspan="2">
<table border="0" align="center">
<tr>
<td nowrap="nowrap" align="center">
Curs
<br />
<a class="link_celula" href="afis_n0.php?id_tip=293&tip=p">Prof</a>
<br />
<a class="link_celula" href="afis_n0.php?id_tip=9&tip=s">Sala</a>
<br />
</td>
</tr>
</table>
</td>
</tr>
<tr>
<td class="celula" nowrap="nowrap">9-10</td>
<td class="celula"> </td>
<td class="celula"> </td>
</tr>
<tr>
<td class="celula" nowrap="nowrap">10-11</td>
<td class="celula" rowspan="2">
<table border="0" align="center">
<tr>
<td nowrap="nowrap" align="center"> Curs
<br /><a class="link_celula" href="afis_n0.php?id_tip=303&tip=p">Prof</a>
<br /><a class="link_celula" href="afis_n0.php?id_tip=9&tip=s">Sala</a>
<br />
</td>
</tr>
</table>
</td>
<td class="celula" rowspan="2">
<table border="0" align="center">
<tr>
<td nowrap="nowrap" align="center"> Curs
<br />
<a class="link_celula" href="afis_n0.php?id_tip=331&tip=p">Prof</a>
<br />
<a class="link_celula" href="afis_n0.php?id_tip=14&tip=s">Sala</a>
<br />
</td>
</tr>
</table>
</td>
<td class="celula" rowspan="2">
<table border="0" align="center">
<tr>
<td nowrap="nowrap" align="center"> Curs
<br /><a class="link_celula" href="afis_n0.php?id_tip=330&tip=p">Prof</a>
<br /><a class="link_celula" href="afis_n0.php?id_tip=9&tip=s">Sala</a>
<br />
</td>
</tr>
</table>
</td>
<td class="celula"> </td>
<td class="celula" rowspan="2">
<table border="0" align="center">
<tr>
<td nowrap="nowrap" align="center"> Curs
<br />
<a class="link_celula" href="afis_n0.php?id_tip=293&tip=p">Prof</a>
<br />
<a class="link_celula" href="afis_n0.php?id_tip=10&tip=s">Sala</a> <br />
</td>
</tr>
</table>
</td>
</tr>
<tr>
<td class="celula" nowrap="nowrap">11-12</td>
<td class="celula"> </td>
</tr>
<tr>
I tried several solutions, but none of them worked.
Thanks in advance for any help.
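One way to handle the rowspans is to remember, per column, any cell that still extends into the following rows and to reuse its text before consuming the row's own td elements. The following is a minimal sketch, assuming HtmlAgilityPack is referenced; RowspanTable, ToDataTable, and Clean are hypothetical names. It ignores colspan, assumes unique header texts, and reads only the direct rows of the outer table so that the nested tables inside the cells are treated as cell content.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Text.RegularExpressions;
using HtmlAgilityPack;

public static class RowspanTable
{
    public static DataTable ToDataTable(HtmlNode tableNode)
    {
        var table = new DataTable();
        // Direct rows only, so rows of nested tables are not picked up.
        var rows = tableNode.SelectNodes("./tr | ./tbody/tr | ./thead/tr");
        if (rows == null) return table;

        // pending[col] = (rows still to fill, text) for a cell spanning down into col
        var pending = new Dictionary<int, (int Left, string Text)>();

        foreach (var row in rows)
        {
            var headers = row.SelectNodes("./th");
            if (headers != null && table.Columns.Count == 0)
            {
                foreach (var th in headers) table.Columns.Add(Clean(th.InnerText));
                continue;
            }

            var cells = row.SelectNodes("./td");
            if (cells == null) continue;
            var queue = new Queue<HtmlNode>(cells);
            var values = new List<object>();

            for (int col = 0; col < table.Columns.Count; col++)
            {
                if (pending.TryGetValue(col, out var p))
                {
                    // This column is still covered by a cell from an earlier row.
                    values.Add(p.Text);
                    if (p.Left > 1) pending[col] = (p.Left - 1, p.Text);
                    else pending.Remove(col);
                    continue;
                }
                if (queue.Count == 0) { values.Add(""); continue; }

                var cell = queue.Dequeue();
                string text = Clean(cell.InnerText);
                int span = cell.GetAttributeValue("rowspan", 1);
                values.Add(text);
                if (span > 1) pending[col] = (span - 1, text);
            }
            table.Rows.Add(values.ToArray());
        }
        return table;
    }

    // Decodes entities and collapses runs of whitespace into single spaces.
    private static string Clean(string text) =>
        Regex.Replace(HtmlEntity.DeEntitize(text), @"\s+", " ").Trim();
}
```

A possible call would be RowspanTable.ToDataTable(doc.DocumentNode.SelectSingleNode("//table[@class='tabel']")), using the class name from the sample table.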
I am having trouble generating a proper PDF document. My code generates a PDF and the browser downloads it, but the downloaded file cannot be opened. This is the page I am trying to export to PDF.
Here is my code so far:
ASPX:
<asp:Button ID="btnDownload" CssClass="btn" runat="server" Text="Download Invoice" OnClick="btnDownload_Click" />
<asp:Panel ID="pnl" runat="server">
<div id="page-wrap">
<textarea id="header" style="height: 30px">PAYMENT DETAILS</textarea>
<div id="identity">
<textarea style="background-color: #F7F7F7;" readonly="readonly" id="address">My Name
My Street Address
Phone: 111-111-111</textarea>
<div id="logo">
<div id="logoctr">
</div>
<div id="logohelp">
<input id="imageloc" readonly="readonly" type="text" size="50" value="" /><br />
(max width: 540px, max height: 100px)
</div>
<img id="image" src="images/logo.png" alt="logo" />
</div>
</div>
<div style="clear: both"></div>
<div id="customer">
<textarea id="tbCustomer" readonly="readonly" runat="server"></textarea>
<table id="meta">
<tr>
<td class="meta-head">Payment ID</td>
<td>
<textarea readonly="readonly" runat="server" id="tbPID"></textarea></td>
</tr>
<tr>
<td class="meta-head">Date</td>
<td>
<textarea id="date" readonly="readonly" runat="server"></textarea></td>
</tr>
<tr>
<td class="meta-head">Amount Due</td>
<td>
<div class="due">
$
<asp:Label ID="lblTotal" runat="server" Text="Total"></asp:Label>
</div>
</td>
</tr>
</table>
</div>
<table id="items">
<tr>
<th>Property Title</th>
<th>Description</th>
<th>Status</th>
<th>Invoiced By</th>
<th>Total Payment</th>
</tr>
<tr class="item-row">
<td class="item-name">
<div class="delete-wpr">
<textarea readonly="readonly" id="tbTitle" runat="server"></textarea>
</div>
</td>
<td class="description">
<div contenteditable="true" id="tbDetail" class="blank" runat="server">
</div>
</td>
<td>
<textarea id="tbStatus" runat="server" readonly="readonly">PAID</textarea></td>
<td>
<textarea class="qty" readonly="readonly" id="tbInvoicedBy" runat="server"></textarea></td>
<td><span class="price">$
<asp:Label ID="tbTotal1" runat="server" Text="Total"></asp:Label></span></td>
</tr>
<tr>
<td colspan="2" class="blank"></td>
<td colspan="2" class="total-line">Vaccant</td>
<td class="total-value">
<div id="subtotal">$<asp:Label ID="lblVaccant" runat="server" Text=""></asp:Label></div>
</td>
</tr>
<tr>
<td colspan="2" class="blank"></td>
<td colspan="2" class="total-line">Maintainance</td>
<td class="total-value">
<div id="total">$<asp:Label ID="lblMaintainance" runat="server" Text=""></asp:Label></div>
</td>
</tr>
<tr>
<td colspan="2" class="blank"></td>
<td colspan="2" class="total-line">Property Insurance</td>
<td class="total-value">
<div id="Insurance">$<asp:Label ID="lblInsurance" runat="server" Text=""></asp:Label></div>
</td>
</tr>
<tr>
<td colspan="2" class="blank"></td>
<td colspan="2" class="total-line">Dewa Bill</td>
<td class="total-value">
<div id="dewa">$<asp:Label ID="lblDewa" runat="server" Text=""></asp:Label></div>
</td>
</tr>
<tr>
<td colspan="2" class="blank"></td>
<td colspan="2" class="total-line">Furnishing Cost</td>
<td class="total-value">
<div id="FurnishingCost">$<asp:Label ID="lblFurnishing" runat="server" Text=""></asp:Label></div>
</td>
</tr>
<tr>
<td colspan="2" class="blank"></td>
<td colspan="2" class="total-line">Cleaning Fees</td>
<td class="total-value">
<div id="CleaningFees">$<asp:Label ID="lblCleaning" runat="server" Text=""></asp:Label></div>
</td>
</tr>
<tr>
<td colspan="2" class="blank"></td>
<td colspan="2" class="total-line">House Keeping</td>
<td class="total-value">
<div id="HouseKeeping">$<asp:Label ID="lblHouseKeeping" runat="server" Text=""></asp:Label></div>
</td>
</tr>
<tr>
<td colspan="2" class="blank"></td>
<td colspan="2" class="total-line">Next Rent Due</td>
<td class="total-value">
<div id="paid">$<asp:Label ID="lblNextRent" runat="server" Text=""></asp:Label></div>
</td>
</tr>
<tr>
<td colspan="2" class="blank"></td>
<td colspan="2" class="total-line">Rental Comission</td>
<td class="total-value">
<div id="RentalComission">$<asp:Label ID="lblRentalComission" runat="server" Text=""></asp:Label></div>
</td>
</tr>
<tr>
<td colspan="2" class="blank"></td>
<td colspan="2" class="total-line">Credit Card Fees</td>
<td class="total-value">
<div id="CreditCardFees">$<asp:Label ID="lblCreditCardFees" runat="server" Text=""></asp:Label></div>
</td>
</tr>
<tr>
<td colspan="2" class="blank"></td>
<td colspan="2" class="total-line">Pest Control</td>
<td class="total-value">
<div id="PestControl">$<asp:Label ID="lblPestControl" runat="server" Text=""></asp:Label></div>
</td>
</tr>
<tr>
<td colspan="2" class="blank"></td>
<td colspan="2" class="total-line">Chillar Utilities</td>
<td class="total-value">
<div id="ChillarUtilities">$<asp:Label ID="lblChillarUtilities" runat="server" Text=""></asp:Label></div>
</td>
</tr>
<tr>
<td colspan="2" class="blank"></td>
<td colspan="2" class="total-line">Du, etisilate wifi</td>
<td class="total-value">
<div id="DuEtisilatewifi">$<asp:Label ID="lblDuEtisilateWifi" runat="server" Text=""></asp:Label></div>
</td>
</tr>
<tr>
<td colspan="2" class="blank"></td>
<td colspan="2" class="total-line balance">Total Payment</td>
<td class="total-value balance">
<div class="due">$<asp:Label ID="lblTotal2" runat="server" Text=""></asp:Label></div>
</td>
</tr>
</table>
<div id="terms">
<h5>Terms</h5>
<textarea readonly="readonly">These payment details are final and non negotiable.</textarea>
</div>
</div>
</asp:Panel>
ASPX.CS
public void ExportToPDF()
{
Response.ContentType = "application/pdf";
Response.AddHeader("content-disposition", "attachment;filename=Panel.pdf");
Response.Cache.SetCacheability(HttpCacheability.NoCache);
StringWriter sw = new StringWriter();
HtmlTextWriter hw = new HtmlTextWriter(sw);
pnl.RenderControl(hw);
StringReader sr = new StringReader(sw.ToString());
Document pdfDoc = new Document(PageSize.A4, 10f, 10f, 100f, 0f);
HTMLWorker htmlparser = new HTMLWorker(pdfDoc);
PdfWriter.GetInstance(pdfDoc, Response.OutputStream);
pdfDoc.Open();
htmlparser.Parse(sr);
pdfDoc.Close();
sw.Close();
htmlparser.Close();
Response.Write(pdfDoc);
Response.End();
}
The compiler also warns that the HTMLWorker class is obsolete.
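Two things in the code above look suspicious. Response.Write(pdfDoc) writes the Document object's string representation into the response after the PDF has already been written, which may well be what corrupts the file. HTMLWorker was superseded in iTextSharp 5 by the separate XML Worker package. The following is a minimal sketch, assuming the iTextSharp 5 and itextsharp.xmlworker packages are referenced and that the rendered panel is well-formed XHTML; PanelPdf and Render are hypothetical names.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

public static class PanelPdf
{
    // Converts an XHTML string to PDF bytes entirely in memory,
    // so that nothing but the finished PDF is sent to the client.
    public static byte[] Render(string xhtml)
    {
        using (var output = new MemoryStream())
        {
            var pdfDoc = new Document(PageSize.A4, 10f, 10f, 100f, 10f);
            PdfWriter writer = PdfWriter.GetInstance(pdfDoc, output);
            pdfDoc.Open();

            using (var reader = new StringReader(xhtml))
            {
                // XML Worker replaces the obsolete HTMLWorker; it needs well-formed markup.
                XMLWorkerHelper.GetInstance().ParseXHtml(writer, pdfDoc, reader);
            }

            pdfDoc.Close(); // finishes the PDF before the bytes are read
            return output.ToArray();
        }
    }
}
```

In the page, the bytes returned by PanelPdf.Render(sw.ToString()) could then be sent with Response.BinaryWrite followed by Response.End, in place of Response.Write(pdfDoc). Form elements such as textarea may not render as expected, so replacing them with plain text before conversion may be necessary.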
I'm working with some html tables and trying to dig through them with htmlagilitypack. The source html is found here: https://www.ultimate-guitar.com/search.php?title=breaking+benjamin+polyamorous&type%5B1%5D=200&rating%5B0%5D=4&rating%5B1%5D=5
Sample table:
<table cellspacing="1" class="tresults">
<tbody>
<tr>
<th width="175">Artist :</th>
<th>Song :</th>
<th width="115">Rating :</th>
<th width="80">Type :</th>
</tr>
<tr>
<td>
<a href="/tabs/breaking_benjamin_tabs.htm" class="song search_art">
<b>Breaking</b> <b>Benjamin</b>
</a>
</td>
<td>
<a target="_blank" href="http://plus.ultimate-guitar.com/tp/?artist=Breaking+Benjamin&song=Polyamorous" class="song js-tp_link"><b>Polyamorous</b></a>
<a target="_blank" class="js-tp_link" href="http://plus.ultimate-guitar.com/tp/?artist=Breaking+Benjamin&song=Polyamorous"><b
class="play_tab_list"title="Playback"></b></a>
</td>
<td class="gray4"></td>
<td><strong>tab pro</strong>
</td>
</tr>
<tr class="stripe">
<td> </td>
<td>
<b>Polyamorous</b> (ver 2)
</td>
<td class="gray4"><span class="rating"><span class="r_4"></span></span> <span>[ <b class="ratdig">5</b> ]</span>
</td>
<td><strong>tab</strong>
</td>
</tr>
<tr>
<td> </td>
<td>
<b>Polyamorous</b> (ver 4)
</td>
<td class="gray4"><span class="rating"><span class="r_4"></span></span> <span>[ <b class="ratdig">30</b> ]</span>
</td>
<td><strong>tab</strong>
</td>
</tr>
<tr class="stripe">
<td> </td>
<td>
<b>Polyamorous</b> (ver 5)
</td>
<td class="gray4"><span class="rating"><span class="r_4"></span></span> <span>[ <b class="ratdig">12</b> ]</span>
</td>
<td><strong>tab</strong>
</td>
</tr>
<tr>
<td> </td>
<td>
<b>Polyamorous</b> (ver 6)
<span rel="#info_333408" class="tabinfo">info</span>
<div class="dn" id="info_333408">
<font style="font-family:trebuchet ms;font-size:12px;font-weight:bold;line-height:120%"><b><font color="#DDDDCC">+</font> Difficulty:</b> <font color="#DDDDCC">novice</font>
<br>
</font>
</div>
</td>
<td class="gray4"><span class="rating"><span class="r_4"></span></span> <span>[ <b class="ratdig">20</b> ]</span>
</td>
<td><strong>tab</strong>
</td>
</tr>
<tr class="stripe">
<td> </td>
<td>
<b>Polyamorous</b> (ver 7)
</td>
<td class="gray4"><span class="rating"><span class="r_4"></span></span> <span>[ <b class="ratdig">5</b> ]</span>
</td>
<td><strong>tab</strong>
</td>
</tr>
<tr>
<td> </td>
<td>
<b>Polyamorous</b> (ver 8)
<span rel="#info_952279" class="tabinfo">info</span>
<div class="dn" id="info_952279">
<font style="font-family:trebuchet ms;font-size:12px;font-weight:bold;line-height:120%"><b><font color="#DDDDCC">+</font> Difficulty:</b> <font color="#DDDDCC">novice</font>
<br>
</font>
<p style="margin-top:3px"><font style="font-family:trebuchet ms;font-size:12px;font-weight:bold;line-height:120%"><b><font color="#DDDDCC">+</font> Tuning:</b> <font color="#DDDDCC">Drop C#</font></font>
</p>
</div>
</td>
<td class="gray4"><span class="rating"><span class="r_5"></span></span> <span>[ <b class="ratdig">6</b> ]</span>
</td>
<td><strong>tab</strong>
</td>
</tr>
<tr class="stripe">
<td> </td>
<td>
<b>Polyamorous</b> Acoustic
<span rel="#info_258880" class="tabinfo">info</span>
<div class="dn" id="info_258880">
<font style="font-family:trebuchet ms;font-size:12px;font-weight:bold;line-height:120%"><b><font color="#DDDDCC">+</font> Difficulty:</b> <font color="#DDDDCC">novice</font>
<br>
</font>
</div>
</td>
<td class="gray4"><span class="rating"><span class="r_5"></span></span> <span>[ <b class="ratdig">9</b> ]</span>
</td>
<td><strong>tab</strong>
</td>
</tr>
</tbody>
</table>
In order to grab this table from the full html doc, here is a snippet of my C# code:
string source_code = web.DownloadString("https://www.ultimate-guitar.com/search.php?title="+ songArtist + songTitle + "&type%5B1%5D=200&rating%5B0%5D=4&rating%5B1%5D=5");
doc.LoadHtml(source_code);
HtmlNode resultsTable = doc.DocumentNode.SelectSingleNode("//table[@class='tresults']");
foreach(var cell in resultsTable.Descendants())
{
Console.WriteLine(cell.InnerHtml);
}
I am expecting to have the full contents of the table returned, except it stops at the line: <b class="play_tab_list" title="Playback"></b>
My ultimate goal is to return all of the links in the table, but I cannot even get as far as to see the full table.
This code prints the text and URL of every link whose class attribute contains "link":
var doc = new HtmlDocument();
var web = new WebClient();
string source_code = web.DownloadString("https://www.ultimate-guitar.com/search.php?title=breaking+benjamin+polyamorous&type[1]=200&rating[0]=4&rating[1]=5");
doc.LoadHtml(source_code);
HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[contains(@class,'link')]");
foreach (var link in links)
{
Console.WriteLine("{0} {1}", link.InnerText, link.Attributes["href"].Value);
}
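The query above matches links anywhere in the page. To restrict it to the results table, the XPath can start from the table itself. The following is a minimal sketch, assuming HtmlAgilityPack is referenced; TableLinks and GetHrefs are hypothetical names. It also guards against SelectNodes returning null when the table or its links are missing.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class TableLinks
{
    // Returns the distinct href values of all anchors inside table.tresults.
    public static List<string> GetHrefs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var anchors = doc.DocumentNode.SelectNodes("//table[@class='tresults']//a[@href]");
        if (anchors == null) return new List<string>(); // no table or no links found

        return anchors
            .Select(a => HtmlEntity.DeEntitize(a.GetAttributeValue("href", ""))) // decode &amp; in URLs
            .Distinct()
            .ToList();
    }
}
```

Querying by XPath instead of iterating Descendants() and printing InnerHtml avoids the repeated, nested output that makes it hard to tell where the table ends.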