From this XHTML source:
<div class = "page">
<h1>UNIQUE NAME</h1>
<table>
<tbody>
<tr>
<td>DATA TO EXTRACT 1</td>
</tr>
<tr>
<td />
<td />
<td />
<td />
<td />
<td>DATA TO EXTRACT 2</td>
</tr>
</tbody>
</table>
etc...
There are multiple instances of UNIQUE NAME with a similar set of child elements.
I need to locate the UNIQUE NAME element and extract all values (DATA TO EXTRACT) within each of the child element tags. In addition, I need to keep a count of where each value is located. For example DATA TO EXTRACT 1 would be at tr 1, td 1. DATA TO EXTRACT 2 would be at tr 2, td 6.
I am new to LINQ to XML and was wondering whether someone could point me in the right direction regarding a strategy. I have managed to figure out how to get to the UNIQUE NAME element with the following code:
var choice1 = (from category in _data.Descendants("div")
where category.Element("h1").Value == "UNIQUE NAME"
select category).DescendantNodes();
This returns a set of the values, which I could loop through, but I'm sure there must be a more elegant way of achieving this goal.
Many thanks!
Here’s one way of doing it using LINQ:
var choice1 =
    from category in _data.Descendants("div")
    // the (string) cast yields null instead of throwing when a div has no h1 child
    where (string)category.Element("h1") == "UNIQUE NAME"
    from row in category.Descendants("tr").Select((element, index) => new { element, index })
    from col in row.element.Elements("td").Select((element, index) => new { element, index })
    where !string.IsNullOrEmpty(col.element.Value)
    select new
    {
        RowIndex = row.index + 1, // one-based index
        ColIndex = col.index + 1,
        Value = col.element.Value,
    };
An example of how to use your results:
foreach (var v in choice1)
Console.WriteLine(string.Format(
"RowIndex = {0}, ColIndex = {1}, Value = \"{2}\".",
v.RowIndex, v.ColIndex, v.Value));
…which would output:
RowIndex = 1, ColIndex = 1, Value = "DATA TO EXTRACT 1".
RowIndex = 2, ColIndex = 6, Value = "DATA TO EXTRACT 2".
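For comparison, the same row/column bookkeeping can be sketched in Python with the standard library's xml.etree.ElementTree. This is only an illustration under assumptions: the markup is well-formed and has no XML namespace, the sample is wrapped in a made-up root element, and the helper name extract_cells is hypothetical. Here enumerate(..., start=1) plays the role of the indexed Select overload.

```python
import xml.etree.ElementTree as ET

# Sample markup from the question, wrapped in a hypothetical <root> element.
SAMPLE = """
<root>
  <div class="page">
    <h1>UNIQUE NAME</h1>
    <table>
      <tbody>
        <tr><td>DATA TO EXTRACT 1</td></tr>
        <tr><td /><td /><td /><td /><td /><td>DATA TO EXTRACT 2</td></tr>
      </tbody>
    </table>
  </div>
</root>
"""

def extract_cells(xml_text, heading):
    """Return one-based (row, col, value) tuples for the non-empty cells."""
    root = ET.fromstring(xml_text)
    results = []
    for div in root.iter("div"):
        h1 = div.find("h1")
        if h1 is None or h1.text != heading:
            continue  # skip divs without the wanted heading
        for row_index, tr in enumerate(div.iter("tr"), start=1):
            for col_index, td in enumerate(tr.findall("td"), start=1):
                value = (td.text or "").strip()
                if value:
                    results.append((row_index, col_index, value))
    return results

cells = extract_cells(SAMPLE, "UNIQUE NAME")
for row, col, value in cells:
    print(f'RowIndex = {row}, ColIndex = {col}, Value = "{value}".')
```

The row counter restarts for every matching div, which mirrors the per-category indexing in the LINQ query.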
Let's say, I've this table:
+--------+----------+----
| Device | Serial # |
+--------+----------+----
| Cam1 | AB123 |
+--------+----------+----
Since I don't know in advance which columns will be displayed, I construct the table by sending just a key/value pair for each cell.
This is how I'm getting my data in C# code.
List<List<KeyValue>> myTable = deviceRepository.GetKeyValues(facilityId);
Once sent to the client side, the data in myTable will have the following structure:
myTable = [
[ { key: "DeviceName", value: "Device"}, { key: "SerialNumber", value: "Serial #"}, ..],
[ { key: "DeviceName", value: "Cam1"}, { key: "SerialNumber", value: "AB123"}, ..],
...
]
In Razor, I'd just have to loop through the list.
@foreach(var row in Model)
{
    <tr>
        @foreach(var cell in row)
        {
            <td>@cell.Value</td>
        }
    </tr>
}
In Angular, I don't see how to do that with directives.
<tr *ngFor="let myInnerList of myTable">
    <!-- I'd like to loop through each inner list to build each table cell -->
</tr>
Thanks for helping
EDIT
Is it possible to get something like this? i.e if the column is the ID, display a checkbox so that the row can be selected.
@foreach(var cell in row)
{
    if(cell.Key == "Id")
    {
        <td><input type="checkbox" id="row_@(cell.Value)" /></td>
    }
    else
    {
        <td>@cell.Value</td>
    }
}
This way, the first cell for every row will display a checkbox.
I'm not sure exactly what you are trying to show. The template below works, but it depends on the cells being in the same order within every inner array. If that is not the case, you can either add code to enforce the order or create a filter.
This is the equivalent of the C# (Razor) code you have in your question.
<tr *ngFor="let row of myTable">
<td *ngFor="let col of row">
{{col.value}}
</td>
</tr>
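If the inner lists are not guaranteed to share the same key order, one option is to reorder each row against the header row before rendering. A minimal sketch of that idea follows, written in Python purely for illustration; the function name align_rows and the choice of an empty string for missing cells are assumptions, not part of the original code.

```python
def align_rows(table):
    """Reorder every row's cells to follow the key order of the first (header) row."""
    if not table:
        return []
    column_keys = [cell["key"] for cell in table[0]]
    aligned = []
    for row in table:
        by_key = {cell["key"]: cell["value"] for cell in row}
        # Missing keys become empty cells so the columns stay lined up.
        aligned.append([by_key.get(key, "") for key in column_keys])
    return aligned

my_table = [
    [{"key": "DeviceName", "value": "Device"}, {"key": "SerialNumber", "value": "Serial #"}],
    [{"key": "SerialNumber", "value": "AB123"}, {"key": "DeviceName", "value": "Cam1"}],
]
rows = align_rows(my_table)
print(rows)
```

The same normalisation could be done on the server before the data is sent, which would keep the template as simple as the one above.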
I have a table of fees that I am trying to parse to return data, but it returns a few blanks before it actually returns the string of data.
<table id="Fees">
<thead>
<tr>
<th>Rate Code</th>
<th>Description</th>
<th>Amount</th>
</tr>
</thead>
<tbody>
<tr>
<td class="code">A1</td>
<td>Charge Type 1</td>
<td class="amount">$11.20</td>
</tr>
<tr>
<td class="code">C2</td>
<td>Charge Type 2</td>
<td class="amount">$36.00</td>
</tr>
<tr>
<td class="code">CMI</td>
<td>Cuba Medical Insurance</td>
<td class="amount">$25.00</td>
</tr>
</tbody>
<tfoot>
<tr>
<td colspan="2">Total:</td>
<td class="amount">$145.16</td>
</tr>
</tfoot>
</table>
I locate the rows by XPath:
private By lst_Fee
{
    get { return By.XPath("//*[@id=\"Fees\"]/tbody/tr"); }
}
Selenium code:
IList<IWebElement> fees = GetNativeElements(lst_Fee, 5);
List<string> actual = new List<string>();
foreach (IWebElement elem in fees)
{
actual.Add(GetText(elem, ControlType.Label));
}
Questions
Is ControlType.Label correct for a table? I am getting a few blank elems before actually getting to the data.
If I wanted to separate each Rate, Description and Fee out in each item to make sure the cost adds up to Total correctly, how can I do that?
I would do something like the below. I created a class Fee that holds the parts of a fee: the code, description, and amount. For each table row, you extract these three values and store them in an instance of the Fee class. The GetFees() method returns a collection of Fee instances. To get the sum of the fees, call GetFees() and then iterate through the Fee instances, summing each amount into the final total.
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;

public class Fee
{
    private String code;
    private String desc;
    private BigDecimal amount;

    // public so that GetFees(), which lives outside this class, can create instances
    public Fee(String _code, String _desc, BigDecimal _amount)
    {
        this.code = _code;
        this.desc = _desc;
        this.amount = _amount;
    }

    // needed to sum the amounts later
    public BigDecimal getAmount()
    {
        return amount;
    }
}

// parse() throws the checked ParseException, so GetFees() has to declare it
public List<Fee> GetFees() throws ParseException
{
    List<Fee> fees = new ArrayList<Fee>();
    List<WebElement> rows = driver.findElements(By.cssSelector("#Fees > tbody > tr"));
    for (WebElement row : rows)
    {
        List<WebElement> cells = row.findElements(By.cssSelector("td"));
        fees.add(new Fee(cells.get(0).getText(), cells.get(1).getText(), parse(cells.get(2).getText(), Locale.US)));
    }
    return fees;
}

// borrowed from http://stackoverflow.com/a/23991368/2386774
public BigDecimal parse(final String amount, final Locale locale) throws ParseException
{
    final NumberFormat format = NumberFormat.getNumberInstance(locale);
    if (format instanceof DecimalFormat)
    {
        ((DecimalFormat) format).setParseBigDecimal(true);
    }
    // strip the currency symbol and anything else that is not a digit or separator
    return (BigDecimal) format.parse(amount.replaceAll("[^\\d.,]", ""));
}
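The summing step described in this answer is not shown in its code. A rough sketch of it, in Python for brevity: amounts are parsed into Decimal to avoid floating-point rounding, and the rows are hard-coded from the sample table rather than read through Selenium. The names parse_amount and total_matches are hypothetical, and the parsing assumes US-style amounts where commas are thousands separators.

```python
from decimal import Decimal
import re

def parse_amount(text):
    """Turn a string such as '$1,145.16' into a Decimal, dropping '$' and ','."""
    cleaned = re.sub(r"[^\d.]", "", text)
    return Decimal(cleaned)

def total_matches(rows, total_text):
    """Return (sum of the row amounts, whether it equals the displayed total)."""
    fee_sum = sum((parse_amount(amount) for _, _, amount in rows), Decimal("0"))
    return fee_sum, fee_sum == parse_amount(total_text)

# (code, description, amount) triples copied from the sample table body.
rows = [
    ("A1", "Charge Type 1", "$11.20"),
    ("C2", "Charge Type 2", "$36.00"),
    ("CMI", "Cuba Medical Insurance", "$25.00"),
]
fee_sum, ok = total_matches(rows, "$145.16")
print(fee_sum, ok)
```

With the three rows shown in the question the sum is 72.20, which does not equal the 145.16 in the footer, presumably because the sample table is abbreviated. That is exactly the kind of mismatch such a check is meant to flag.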
You can grab all the column headers, as well as the row data, with the code below. Happy coding!
//// Grab the table
IWebElement grid;
grid = _browserInstance.Driver.FindElement(By.Id("tblVoucherLookUp"));
IWebElement headercolumns = grid.FindElement(By.Id("tblVoucherLookUp"));
_browserInstance.Driver.Manage().Timeouts().ImplicitlyWait(TimeSpan.FromSeconds(75));
_browserInstance.ScreenCapture("Voucher LookUp Grid");
//// get the column headers
char[] character = "\r\n".ToCharArray();
string[] Split = headercolumns.Text.Split(character);
for (int i = 0; i < Split.Length; i++)
{
if (Split[i] != "")
{
// log the individual header text; concatenating the array itself would only print its type name
_log.LogEntry("INFO", "Voucher data", true,
"Header text: " + Split[i]);
}
}
I get data from a table, but the values come back in a form that is not easy to manipulate.
My HTML structure looks like this:
<table>
<tbody>
<tr>
<td>
<span>1</span>
<span>0</span>
<br>
<span>
<span>Good Luck</span>
<img src="/App_Themes/Resources/img/icon_tick.gif" width="3" height="7">
</span>
</td>
</tr>
<tr>
<td>
<b>Nowaday<br></b>
<p>hook<br>zp</p>
</td>
</tr>
</tbody>
</table>
But when I try to get the data, it comes back like this:
10Good LuckNowadayhookzp
I was using this code:
ReadOnlyCollection<IWebElement> lstTable = browser.FindElements(By.XPath("table/tbody/tr"));
foreach (IWebElement val in lstTable)
{
ReadOnlyCollection<IWebElement> lstTDElement = val.FindElements(By.XPath("td"));
ReadOnlyCollection<IWebElement> lstSpecialEle =
val.FindElements(By.XPath("//td/span | //td/b | //td/p"));
}
This produces many rows (within a <tr> tag I found about 6000 elements), and I don't know how to arrange them into the correct columns, because the data in each column can be null or contain many values.
Currently, lstTDElement contains two columns (in the real page: 10 columns), and lstSpecialEle contains all the necessary data.
I filter to get only: [//td/span | //td/b | //td/p].
How can I integrate lstSpecialEle into lstTDElement with the right columns? By using foreach with a condition?
Edited:
Typically, what I receive from lstTDElement is: 10Good LuckNowadayhookzp.
lstSpecialEle produces many rows containing all the values I need.
The problem is that I don't know how to arrange all the rows from lstSpecialEle into a table.
My table has two <tr> tags; this means it has two columns. How can I organise all the values in lstSpecialEle correctly into these columns?
It should be like:
Num Time
1 0 Good Luck Nowaday hookzp
As mentioned, the data is dynamic: the first or second <tr> may lack a tag such as <span> or <b> (the tag simply does not appear; no new tags are added).
You are searching from the document root: an XPath that starts with // searches the whole page, whereas you need to search only within the specific row element. Use .// instead, which restricts the search to the context of the current element. The code below should give you only the desired elements rather than a very large list:
ReadOnlyCollection<IWebElement> lstTable = browser.FindElements(By.XPath("//table/tbody/tr"));
foreach (IWebElement val in lstTable)
{
ReadOnlyCollection<IWebElement> lstSpecialEle = val.FindElements(By.XPath(".//td/span | .//td/b | .//td/p"));
}
Edited1: If the list you get includes elements with empty text, you can filter them out and keep only the elements that actually contain text, as below:
var FinalList = lstSpecialEle.Where(x => !string.IsNullOrEmpty(x.Text)).ToList();
Edited2: If you want to merge the text of all columns into a single list of strings, try the following:
List<string> FinalList = new List<string>();
foreach (IWebElement val in lstTable)
{
    ReadOnlyCollection<IWebElement> lstSpecialEle = val.FindElements(By.XPath(".//td/span | .//td/b | .//td/p"));
    var AllTextList = lstSpecialEle.Where(x => !string.IsNullOrEmpty(x.Text)).Select(el => el.Text).ToList();
    string AllText = String.Join(" ", AllTextList);
    FinalList.Add(AllText);
}
// print one line per row; printing the list object itself would only show its type name
Console.WriteLine(String.Join(Environment.NewLine, FinalList));
Now FinalList will contain all the values, one entry per row.
Hope it helps...:)
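As an illustration of collecting values per row rather than per page, here is a small Python sketch using the standard library's xml.etree.ElementTree, where a search started on an element only ever looks inside that element. It assumes a well-formed variant of the question's markup (the br and img tags are closed), and the helper name row_texts is hypothetical. It is not a Selenium example, so the text it returns may differ in whitespace from what IWebElement.Text reports.

```python
import xml.etree.ElementTree as ET

# Well-formed variant of the table from the question (<br/> and <img/> are closed).
SAMPLE = """
<table><tbody>
  <tr><td>
    <span>1</span><span>0</span><br/>
    <span><span>Good Luck</span><img src="/App_Themes/Resources/img/icon_tick.gif"/></span>
  </td></tr>
  <tr><td>
    <b>Nowaday<br/></b>
    <p>hook<br/>zp</p>
  </td></tr>
</tbody></table>
"""

WANTED = {"span", "b", "p"}

def row_texts(xml_text):
    """Return one string per <tr>, built from the span/b/p children of its cells."""
    root = ET.fromstring(xml_text)
    result = []
    for tr in root.iter("tr"):
        parts = []
        for td in tr.findall("td"):   # only this row's cells
            for child in td:          # direct children, like td/span | td/b | td/p
                if child.tag in WANTED:
                    parts.extend(t.strip() for t in child.itertext() if t.strip())
        result.append(" ".join(parts))
    return result

rows = row_texts(SAMPLE)
print(rows)
```

Because every lookup starts from the current tr, each entry in the result belongs to exactly one row, which is the property the .// prefix gives you in the Selenium version.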
I have been trying to scrape some data off a website. The source distinguishes table headers from the actual contents by using different class names. Because I want to scrape all the table information, I put all the headers into one array and the contents into another. The problem is that when I write the arrays to a file, I can write a header, but the second array contains the contents of all the tables, and I cannot tell where the contents of the first table end.
Because HtmlAgilityPack scrapes all the tags of the specified nodes, I get all the contents. First, let me show the markup to make it clear:
<tr class=tableHeader>
<th width=16%>Caught</th>
<th width=16%><p>Normal Range</p></th>
</tr>
<TR class=content><TD><i>Bluegill</i></TD>
<TD>trap net</TD>
<TD align=CENTER>4.05</TD>
<TD align=CENTER> 7.9 - 37.7</TD>
<TD align=CENTER>0.26</TD>
<TD align=CENTER> 0.1 - 0.2</TD>
</TR>
<TR class=content><TD><i></i></TD>
<TD>Gill net</TD>
<TD align=CENTER>1.50</TD>
<TD align=CENTER>N/A</TD>
<TD align=CENTER>0.07</TD>
<TD align=CENTER>N/A</TD>
</TR>
<tr class=tableHeader>
<th>0-5</th>
<th>6-8</th>
<th>9-11</th>
<th>12-14</th>
<th>15-19</th>
<th>20-24</th>
<th>25-29</th>
<th>30+</th>
<th>Total</th>
</tr>
<TR class=content><TD><i>bluegill</i></TD>
<TD align=CENTER>19</TD>
<TD align=CENTER>65</TD>
<TD align=CENTER>0</TD>
<TD align=CENTER>0</TD>
<TD align=CENTER>0</TD>
<TD align=CENTER>0</TD>
<TD align=CENTER>0</TD>
<TD align=CENTER>0</TD>
<TD align=CENTER>84</TD>
</TR>
Below is my code to save the headers and contents into arrays and try to display them exactly as on the website.
int count =0;
foreach (var trTag4Pale in trTags4Pale)
{
string trText4Pale = trTag4Pale.InnerText;
paleLake[count] = trText4Pale;
if (trTags4Small != null)
{
int counter = 0;
foreach (var trTag4Small in trTags4Small)
{
string trText4Small = trTag4Small.InnerText;
smallText[counter] = trText4Small;
counter++;
}
}
File.AppendAllText(path, paleLake[count] + Environment.NewLine + smallText[count] + Environment.NewLine);
}
As you can see, when I append the contents of the arrays to a file, it writes the first header followed by the contents of all the tables. I only want the contents of the first table, and would then repeat the process for the second table, and so on.
If I could get only the contents between two tableHeader tr tags, the contents of each table would end up in its own array. I don't know how to do this.
This might not be the best approach, but I made it work. It might be a useful resource for somebody someday, so below is the code that worked for me. I append the data stored in the list to an Excel sheet. As I have all the data I need for each tr tag of each class, I can manipulate the data as I want:
// Attribute values are case-sensitive in XPath, so the class name must match the markup ("tableHeader")
var trTags4Header = document.DocumentNode.SelectNodes("//tr[@class='tableHeader']");
if (trTags4Header != null)
{
    // Create a list to store td values
    List<string> tableList1 = new List<string>();
    int row = 2;
    foreach (var item in trTags4Header)
    {
        // SelectNodes returns null when nothing matches, so guard before using LINQ
        var siblings = item.SelectNodes("following-sibling::*");
        if (siblings == null) continue;
        // Take only the next siblings whose class is "content"; this stops at the next header row
        var found = siblings.TakeWhile(tag => tag.Name == "tr" && tag.GetAttributeValue("class", "") == "content");
        foreach (var node in found)
        {
            // Get the individual td values within tr class='content'. Note .//td: this searches from the current node instead of the root.
            var tdValues = node.SelectNodes(".//td");
            if (tdValues == null) continue;
            int column = 1;
            // Store each td value in the list and in the Excel worksheet (ws1), so I control where the data goes
            foreach (var tdText in tdValues)
            {
                tableList1.Add(tdText.InnerText);
                ws1.Cells[row, column] = tdText.InnerText;
                column++;
            }
            row++;
        }
    }
    // Display the content in a listbox
    listBox1.DataSource = tableList1;
}
Please suggest a better solution if you know of one, or leave your feedback. Thanks.
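One possible alternative is a single pass over the rows in document order, starting a new group each time a header row appears. The sketch below shows only that grouping logic, in Python for brevity; the rows are hard-coded (class, cells) pairs taken from the sample markup rather than parsed with HtmlAgilityPack, and the name group_by_header is hypothetical.

```python
def group_by_header(rows):
    """Split (css_class, cells) rows into tables, one per 'tableHeader' row."""
    tables = []
    for css_class, cells in rows:
        if css_class == "tableHeader":
            # A header row opens a new table.
            tables.append({"header": cells, "content": []})
        elif css_class == "content" and tables:
            # A content row belongs to the most recently opened table.
            tables[-1]["content"].append(cells)
    return tables

rows = [
    ("tableHeader", ["Caught", "Normal Range"]),
    ("content", ["Bluegill", "trap net", "4.05", "7.9 - 37.7", "0.26", "0.1 - 0.2"]),
    ("content", ["", "Gill net", "1.50", "N/A", "0.07", "N/A"]),
    ("tableHeader", ["0-5", "6-8", "9-11", "12-14", "15-19", "20-24", "25-29", "30+", "Total"]),
    ("content", ["bluegill", "19", "65", "0", "0", "0", "0", "0", "0", "84"]),
]
tables = group_by_header(rows)
for table in tables:
    print(table["header"], len(table["content"]))
```

Each table then carries its own header and content rows, so writing them to a file or worksheet one table at a time becomes a simple loop.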
I wrote the code below to find records in a table grid.
$(function () {
grid = $('#tblsearchresult');
// handle search fields key up event
$('#search-term').keyup(function (e) {
text = $(this).val(); // grab search term
if (text.length > 1) {
// iterate through all grid rows
grid.find('tr').each(function (i) {
if ($(this).find('td:eq(1)').text().toUpperCase().match(text.toUpperCase()))
$(this).css({ background: "#A4D3EE" });
});
}
else {
grid.find('tr:has(td)').css({ background: "" });
grid.find('tr').show();
} // if no matching name is found, show all rows
});
});
<table id="tblsearchresult" class="tablesorter">
    <thead>
        <tr>
            <th>ApplicationName</th>
        </tr>
    </thead>
    <tbody>
        <% foreach (var request in Model.ApplicationRoles)
           { %>
        <tr>
            <td>
                <span id="appName_<%: request.Id%>">
                    <%: request.Application.Name%></span>
            </td>
        </tr>
        <% } %>
    </tbody>
</table>
EDIT Table Data
applicationame role
application1 appadministrator
app developer
application2 tester
If I give 'app' as the search text, only the second row should be highlighted. The first row is also being highlighted, because 'app' appears in the role of the first row. Only exact matches should be highlighted, in every row. Please advise.
Your code is behaving correctly. You just need to clear all previously highlighted rows first on "keyup" of the text input.
if (text.length > 1) {
grid.find('tr:has(td)').css({ background: "" });
grid.find('tr').show();
......rest of your code.......
You need to clear the highlight before you parse. Add this statement of yours:
grid.find('tr:has(td)').css({ background: "" });
before entering this loop:
// iterate through all grid rows
grid.find('tr').each(function (i) {
...
});
Check this fiddle: http://jsfiddle.net/F3jRj/1/
And this updated fiddle with 3 columns: http://jsfiddle.net/F3jRj/2/