I have this piece of html code. I want to get the text inside the <div> tag using WatiN. The C# code is below, but I'm pretty sure it could be done way better than my solution. Any suggestions?
HTML:
<table id="someId" cellspacing="0" border="1" style="border-collapse:collapse;" rules="all">
<tbody>
<tr>
<th scope="col"> </th>
</tr>
<tr>
<td>
<div>Some text</div>
</td>
</tr>
</tbody>
</table>
C#
// Get the table ElementContainer
IElementContainer diagnosisElementContainer = (IElementContainer)_control.GetElementById("someId");
// Get the tbody element
IElementContainer tbodyElementContainer = (IElementContainer)diagnosisElementContainer.ChildrenWithTag("tbody");
// Get the <tr> children
ElementCollection trElementContainer = tbodyElementContainer.ChildrenWithTag("tr");
// Get the <td> child of the last <tr>
IElementContainer tdElementContainer = (IElementContainer)trElementContainer.ElementAt<Element>(trElementContainer.Count - 1);
// Get the <div> element inside the <td>
Element divElement = tdElementContainer.Divs[0];
Based on the HTML given, this is roughly how I'd do it for IE:
IE myIE = new IE();
myIE.GoTo("[theurl]");
string theText = myIE.Table("someId").Divs[0].Text;
The above is working on WatiN 2.1, Win7, IE9.
I have some html and want to scrape some data from it.
The HTML is structured in the following way
<div class="someClass"><span class="someOtherClass">Text</span></div>
<table>
<tbody>
<tr>
<td>label</td>
<td>data</td>
</tr>
<tr>
<td>label</td>
<td>data</td>
</tr>
<tr>
<td>label</td>
<td>data</td>
</tr>
</tbody>
</table>
<div class="someClass"><span class="someOtherClass">Text</span></div>
<table>
<tbody>
<tr>
<td>label</td>
<td>data</td>
</tr>
<tr>
<td>label</td>
<td>data</td>
</tr>
<tr>
<td>label</td>
<td>data</td>
</tr>
</tbody>
</table>
<div class="someClass"><span class="someOtherClass">Text</span></div>
I need to be able to scrape the Text value located in the span where class="someOtherClass" (I've already implemented this portion)
I then need to be able to scrape the table directly below the div. Since the "parent" div doesn't actually contain the table, I'm having some issues implementing this.
I need to be able to scrape the Text value located in the span
You don't need regex. An Xpath query is enough.
var text = doc.DocumentNode
.SelectNodes("//span[@class='someOtherClass']")
.Select(x => x.InnerText)
.ToList();
I then need to be able to scrape the table directly below the div.
Use a similar XPath:
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(htmlstring);
var tables = doc.DocumentNode
.SelectNodes("//span[@class='someOtherClass']/following::table").ToList();
foreach (var table in tables)
{
var list = table.Descendants("tr")
.Select(tr => tr.Descendants("td")
.Select(td => td.InnerText).ToList())
.ToList();
}
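Note that following::table matches every table after a span, not only the nearest one. If each label should be paired with the table directly below it, adding a [1] predicate is one option. The sketch below is only an illustration under assumptions: it uses the framework's System.Xml.XPath on a well-formed excerpt instead of HtmlAgilityPack (so it runs without extra packages, but would fail on real-world, non-well-formed HTML), and the names SpanTablePairs and Extract are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class SpanTablePairs
{
    // Hypothetical helper: pairs each span's text with the rows of the
    // first table that follows it in document order.
    public static List<(string Label, List<List<string>> Rows)> Extract(string wellFormedMarkup)
    {
        // The <root> wrapper exists only because XDocument needs a single root element.
        var doc = XDocument.Parse("<root>" + wellFormedMarkup + "</root>");
        var result = new List<(string Label, List<List<string>> Rows)>();

        foreach (var span in doc.XPathSelectElements("//span[@class='someOtherClass']"))
        {
            // [1] keeps only the nearest following table instead of every later one.
            var table = span.XPathSelectElement("following::table[1]");
            if (table == null)
                continue; // e.g. a trailing div with no table after it

            var rows = table.Descendants("tr")
                .Select(tr => tr.Elements("td").Select(td => td.Value).ToList())
                .ToList();
            result.Add((span.Value, rows));
        }
        return result;
    }

    public static void Main()
    {
        const string markup =
            "<div class=\"someClass\"><span class=\"someOtherClass\">First</span></div>" +
            "<table><tbody><tr><td>label</td><td>data</td></tr></tbody></table>" +
            "<div class=\"someClass\"><span class=\"someOtherClass\">Last</span></div>";

        foreach (var pair in Extract(markup))
            foreach (var row in pair.Rows)
                Console.WriteLine(pair.Label + ": " + string.Join(",", row));
    }
}
```

With HtmlAgilityPack the same relative expression would be passed to SelectSingleNode on each span node; a null check is still needed for a span that has no table after it.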
I'm not a professional in C# and ASP.NET, so please be patient with me.
I have the following problem.
I'm using ASP.NET Web Forms with C# to create a dashboard.
I have a generic HTML table (built from a SQL query) that is displayed on the page. Now I want to implement a feature: when the user clicks on a cell, for example in the ID column, they should get a details view, which is a Bootstrap modal.
For that I need the ID value in that cell. How can I get this value?
With that value I will run a new SQL query and show more specific information.
Here is my .aspx structure:
<table id="MyTable" class="table table-striped table-bordered table-condensed table-responsive">
<thead>
<tr>
<th>ID</th>
<th>Name</th>
<th>Typ</th>
<th>Something else</th>
<th>Date</th>
</tr>
</thead>
<tbody>
<%=Tabelle.GetTable.dataTable_all%>
</tbody>
</table>
<script type="text/javascript">
$(document).ready(function () {
$('#MyTable').DataTable();
});
</script>
The variable dataTable_all is a string that contains the table rows as HTML.
The resulting <tbody> has 366 rows; here is an extract:
<tr>
<td>154789</td>
<td>Testproject X</td>
<td>Good</td>
<td>greencolored</td>
<td>01.01.2015</td>
</tr>
<tr>
<td>189365</td>
<td>Testproject B</td>
<td>Good</td>
<td>redcolored</td>
<td>08.01.2015</td>
</tr>
<tr>
<td>136471</td>
<td>Testproject Y</td>
<td>Bad</td>
<td>pinkcolored</td>
<td>15.04.2015</td>
</tr>
So when I click on, for example, ID 136471, how can I pass that value to a variable in my C# code?
Change to:
<tr data-id="154789">
<td>154789</td>
<td>Testproject X</td>
<td>Good</td>
<td>greencolored</td>
<td>01.01.2015</td>
</tr>
<tr data-id="189365">
<td>189365</td>
<td>Testproject B</td>
<td>Good</td>
<td>redcolored</td>
<td>08.01.2015</td>
</tr>
<tr data-id="136471">
<td>136471</td>
<td>Testproject Y</td>
<td>Bad</td>
<td>pinkcolored</td>
<td>15.04.2015</td>
</tr>
Then use:
$('tbody tr').click(function() {
alert($(this).data('id'));
});
Working demo
https://jsfiddle.net/jknysneo/
I have the following HTML:
<tbody>
<tr>
<td class="metadata_name">Headquarters</td>
<td class="metadata_content">Princeton New Jersey, United States</td>
</tr>
<tr>
<td class="metadata_name">Industry</td>
<td class="metadata_content"><ul><li>Engineering Software</li><li>Software Development & Design</li><li>Software</li><li>Custom Software & Technical Consulting</li></ul></td>
</tr>
<tr>
<td class="metadata_name">Revenue</td>
<td class="metadata_content">$17.5 Million</td>
</tr>
<tr>
<td class="metadata_name">Employees</td>
<td class="metadata_content">201 to 500</td>
</tr>
<tr>
<td class="metadata_name">Links</td>
<td class="metadata_content"><ul><li>Company website</li></ul></td>
</tr>
</tbody>
I want to load the metadata_content value (e.g. "$17.5 Million") into a variable where the metadata_name equals a given value (e.g. "Revenue").
I have tried combinations of code like this for a few hours...
orgHtml.DocumentNode.SelectNodes("//td[@class='metadata_name']")[0].InnerHtml;
But I can't get the right combination. If you have a SelectNodes syntax that solves this, I would appreciate it.
It seems what you're looking for is this:
var found = orgHtml.DocumentNode.SelectSingleNode(
"//tr[td[@class = 'metadata_name'] = 'Revenue']/td[@class = 'metadata_content']");
if (found != null)
{
string html = found.InnerHtml;
// use html
}
Note that to get the text of an element, you should use found.InnerText, not found.InnerHtml, unless you specifically need its HTML content.
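To look up other labels with the same expression, the name can be passed in as a parameter. The following is a minimal sketch under assumptions: MetadataLookup and Find are hypothetical names, and it uses the framework's System.Xml.XPath on a well-formed excerpt of the table rather than HtmlAgilityPack, so the XPath can be tried without extra packages. XElement.Value plays the role of InnerText here.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

public static class MetadataLookup
{
    // Hypothetical helper: returns the text of the metadata_content cell in the
    // row whose metadata_name cell equals `name`, or null if no row matches.
    // `name` is placed inside a quoted XPath literal, so it must not contain
    // a single quote.
    public static string Find(string wellFormedTable, string name)
    {
        var doc = XDocument.Parse(wellFormedTable);
        var xpath = "//tr[td[@class='metadata_name'] = '" + name + "']" +
                    "/td[@class='metadata_content']";
        var cell = doc.XPathSelectElement(xpath);
        return cell == null ? null : cell.Value;
    }

    public static void Main()
    {
        const string table =
            "<tbody>" +
            "<tr><td class=\"metadata_name\">Headquarters</td>" +
            "<td class=\"metadata_content\">Princeton New Jersey, United States</td></tr>" +
            "<tr><td class=\"metadata_name\">Revenue</td>" +
            "<td class=\"metadata_content\">$17.5 Million</td></tr>" +
            "<tr><td class=\"metadata_name\">Employees</td>" +
            "<td class=\"metadata_content\">201 to 500</td></tr>" +
            "</tbody>";

        Console.WriteLine(Find(table, "Revenue"));
        Console.WriteLine(Find(table, "Employees"));
    }
}
```

A real page is rarely well-formed XML (the Industry row in the question contains a bare &, for example), so for actual scraping the expression string would go to HtmlAgilityPack's SelectSingleNode as in the answer.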
I have the following HTML table; its rows are generated by a foreach loop.
Can I use Razor to get the value of each row into a List or an array in C#?
My Razor C# code (what I have tried so far):
@{
var l = new List<string>();
l.Add(#<input id="updated_value2" data-bind="value:value,visible:isEditing()" />);
}
Here's my table
<table class="table table-hover">
<tbody data-bind="foreach: $root.mapJsons(parameters())">
<tr class="data-hover">
<td>
<strong>
<span data-bind="text:key" />
</strong>
</td>
<td>
@*display label and input for dictionary<value> false DIS true APP*@
<input id="updated_value" data-bind="value:value,visible:isEditing()" />
<label id="display_value" data-bind="text:value,visible:!isEditing()" />
</td>
</tr>
</tbody>
<thead>
<tr>
<th style="width: 30%">
Name
</th>
<th style="width: 30%">
Value
</th>
<th></th>
</tr>
</thead>
</table>
Your foreach loop is being executed client-side (looks like a KnockoutJS binding?) rather than server-side, so any Razor code you embed in the table is only going to be called once as it's rendered by the server. So the answer is no, you cannot populate a server-side list with this particular foreach loop.
This is a newbie question so please provide working code.
How do I count the tables in an html file using C# and the html-agility-pack?
(I will need to get values from specific tables in an html file based on the count of tables. I will then perform some math on the values retrieved.)
Here is a sample file with three tables for your convenience:
<html>
<head>
<title>Tables</title>
</head>
<body>
<table border="1">
<tr>
<th>Name</th>
<th>Phone</th>
<th>City</th>
<th>Number</th>
</tr>
<tr>
<td>Scott</td>
<td>555-2345</td>
<td>Chicago</td>
<td>42</td>
</tr>
<tr>
<td>Bill</td>
<td>555-1243</td>
<td>Detroit</td>
<td>23</td>
</tr>
<tr>
<td>Ted</td>
<td>555-3567</td>
<td>Columbus</td>
<td>9</td>
</tr>
</table>
<p></p>
<table border="1">
<tr>
<th>Name</th>
<th>Year</th>
</tr>
<tr>
<td>Abraham</td>
<td>1865</td>
</tr>
<tr>
<td>Martin</td>
<td>1968</td>
</tr>
<tr>
<td>John</td>
<td>1963</td>
</tr>
</table>
<p></p>
<table border="1">
<tr>
<th>Animal</th>
<th>Location</th>
<th>Number</th>
</tr>
<tr>
<td>Tiger</td>
<td>Jungle</td>
<td>8</td>
</tr>
<tr>
<td>Hippo</td>
<td>River</td>
<td>4</td>
</tr>
<tr>
<td>Camel</td>
<td>Desert</td>
<td>3</td>
</tr>
</table>
</body>
</html>
If you would, please SHOW how to send the results to a new text file.
Thanks!
I think this can be a starting point
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);
var tables = doc.DocumentNode.Descendants("table");
int tablesCount = tables.Count();
foreach (var table in tables)
{
var rows = table.Descendants("tr")
.Select(tr => tr.Descendants("td").Select(td => td.InnerText).ToList())
.ToList();
foreach(var row in rows)
Console.WriteLine(String.Join(",", row));
}
Something like this:
HtmlDocument doc = new HtmlDocument();
doc.Load(myTestFile);
// get all TABLE elements recursively
int count = doc.DocumentNode.SelectNodes("//table").Count;
// output to a text file
File.WriteAllText("output.txt", count.ToString());
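To send the row values to the text file as well, and not only the count, the rows collected by the loop in the first answer can be written with File.WriteAllLines. This is a sketch under assumptions: TableReport and Write are hypothetical names, the input is the per-table list of rows in the shape that loop produces, and the sample data in Main is typed in by hand from the question's first table rather than parsed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class TableReport
{
    // Hypothetical helper: writes the table count, then each table's rows
    // as comma-separated lines, to the file at `path`.
    public static void Write(string path, List<List<List<string>>> tables)
    {
        var lines = new List<string> { "Tables: " + tables.Count };
        for (int i = 0; i < tables.Count; i++)
        {
            lines.Add("Table " + (i + 1));
            // Header rows hold only <th> cells, so their <td> list is empty; skip them.
            lines.AddRange(tables[i]
                .Where(row => row.Count > 0)
                .Select(row => string.Join(",", row)));
        }
        File.WriteAllLines(path, lines);
    }

    public static void Main()
    {
        var tables = new List<List<List<string>>>
        {
            new List<List<string>>
            {
                new List<string>(), // header row: no <td> cells
                new List<string> { "Scott", "555-2345", "Chicago", "42" },
                new List<string> { "Bill", "555-1243", "Detroit", "23" }
            }
        };
        Write("output.txt", tables);
        Console.WriteLine(File.ReadAllText("output.txt"));
    }
}
```

Since the question mentions doing math on the values, cells such as "42" would still need int.Parse or int.TryParse before any arithmetic.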