SelectSingleNode and SelectNodes XPath syntax - c#

My question is very similar to this one: "XmlNode.SelectSingleNode syntax to search within a node in C#".
I'm trying to use HTML Agility Pack to pull price/condition/ship price... Here's the URL I am scraping: http://www.amazon.com/gp/offer-listing/0470108541/ref=dp_olp_used?ie=UTF8&condition=all
Here's a snippet of my code:
string results = "";
var w = new HtmlWeb();
var doc = w.Load(url);
var nodes = doc.DocumentNode.SelectNodes("//div[@class='a-row a-spacing-medium olpOffer']");
if (nodes != null)
{
foreach (HtmlNode item in nodes)
{
var price = item.SelectSingleNode(".//span[@class='a-size-large a-color-price olpOfferPrice a-text-bold']").InnerText;
var condition = item.SelectSingleNode(".//h3[@class='a-spacing-small olpCondition']").InnerText;
var price_shipping = item.SelectSingleNode("//span[@class='olpShippingPrice']").InnerText;
results += "price " + price + " condition " + condition + " ship " + price_shipping + "\r\n";
}
}
return results;
No matter what combination of .// and . and ./ and / I try, I cannot get what I want (I'm only just learning XPath). Currently it returns just the first item over and over, just like the original question I referenced earlier. I think I'm missing a fundamental understanding of how selecting nodes works and/or what is considered a node.
UPDATE
OK, I've changed the URL to point to a different book, and the first two items are working as expected. When I change the third item (price_shipping) to use ".//", no information is pulled at all. This must be because sometimes there is no shipping price and that span is omitted. How do I handle this? I tried if price_shipping != null.
UPDATE
Solved. I removed the ".InnerText" from the price_shipping lookup, which was causing the problem when the node was null. Then I did the null check, after which it was safe to use .InnerText.
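For anyone landing here later, here is a minimal sketch of the two points above: use ".//" to search inside the current node, and check the result of SelectSingleNode for null before reading InnerText. It uses System.Xml's XmlDocument on a small well-formed snippet rather than HTML Agility Pack and the real Amazon page, and the class values 'offer', 'price' and 'shipping', as well as the names OfferScraper and Summarize, are made up for the illustration.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class OfferScraper
{
    // Returns one summary line per <div class='offer'> element (hypothetical markup).
    public static List<string> Summarize(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        var lines = new List<string>();

        foreach (XmlNode offer in doc.SelectNodes("//div[@class='offer']"))
        {
            // ".//" searches below the current offer; a bare "//" would restart
            // from the document root and return the same first match every time.
            XmlNode price = offer.SelectSingleNode(".//span[@class='price']");
            XmlNode shipping = offer.SelectSingleNode(".//span[@class='shipping']");

            // SelectSingleNode returns null when nothing matches, so test the
            // node before reading InnerText.
            string priceText = price != null ? price.InnerText : "n/a";
            string shippingText = shipping != null ? shipping.InnerText : "n/a";
            lines.Add("price " + priceText + " ship " + shippingText);
        }
        return lines;
    }
}
```

An offer without a shipping span then produces "ship n/a" instead of a NullReferenceException.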

Related

Retrieving and updating data from InfoPath repeating tables

I've found this link helpful for getting data out of a field in InfoPath. Unfortunately, when I attempt to set the value back every time the user changes the field, it appears to go into an infinite loop and causes an error.
Here is my code:
XPathNavigator xNavigation = this.MainDataSource.CreateNavigator();
XPathNodeIterator xNodeIterator = xNavigation.Select("/my:myFields/my:group1/my:group2", this.NamespaceManager);
while (xNodeIterator.MoveNext()){
string mystring = xNodeIterator.Current.SelectSingleNode("my:County", this.NamespaceManager).Value;
xNodeIterator.Current.SelectSingleNode("my:County", this.NamespaceManager).SetValue("mystring"+ mystring);
}
What's the problem here? Please help me.
As I mentioned in the comments, you could use XPathNavigator to go through the list without using While loop:
EDITED, based on more info from comments:
foreach (XPathNavigator nav in xNavigation.Select("/my:myFields/my:group1/my:group2", this.NamespaceManager))
{
string mystring = myTextBox.Text + ", " + "GUID"; //replace "GUID" with the actual GUID which you need
nav.SelectSingleNode("my:field3", this.NamespaceManager).SetValue("mystring"+ mystring);
}
Basically, you should not read the value of an element/node and then write that same element/node's value back with something appended. If this code runs whenever the field changes, each write triggers it again, and the string keeps growing without bound.

How do you separate results by a character when looping through an XML file in C#?

I am storing some settings in a settings.xml file for my C# Windows Forms Application and in that XML file I am storing e-mail addresses.
I would ultimately like to achieve being able to loop through these e-mail addresses and send one e-mail to all of them.
What would be the best way of looping through them and adding them using the To.Add method of the MailMessage class in C#?
I already have the following code below to retrieve them from the XML file:
var doc = XDocument.Load(Application.StartupPath + "//settings.xml");
StringBuilder result = new StringBuilder();
foreach (XElement c in doc.Descendants("EmailAddresses"))
{
MessageBox.Show("Results: " + c.Value, "Test");
}
I have not been able to figure out how to split the results. The results in the MessageBox look like this: "email@domain.comemail@domain.comemail@domain.com" and so on. I'm also not sure whether this is even the best way to achieve what I want.
Your help is greatly appreciated!
Rather than evaluating the entire content below the "EmailAddresses" element, you should be enumerating its child nodes individually. Assuming the "Email<#>" elements are the only children, code similar to what commenter stribizhev offered should work fine:
// Text nodes are XText, not XElement, so iterate them as XText.
foreach (XText c in doc.Descendants("EmailAddresses")
.SelectMany(x => x.DescendantNodes()
.Where(n => n.NodeType == System.Xml.XmlNodeType.Text)
.OfType<XText>()))
{
MessageBox.Show("Results: " + c.Value, "Test");
}
Note that you can't actually call DescendantNodes() on the result of the call to Descendants(), as that return value is an instance of IEnumerable<XElement>, not a single XElement. But you can use the SelectMany() method to flatten the enumeration of descendants into an enumeration of their descendants.
Alternatively, you could check the node's name:
foreach (XElement c in doc.Descendants("EmailAddresses")
.SelectMany(x => x.Elements().Where(e => e.Name.LocalName.StartsWith("Email"))))
{
MessageBox.Show("Results: " + c.Value, "Test");
}
Based on the information you've provided so far, I would expect either of those to work fine.
The above just displays the values in the MessageBox, as in your original example. Obviously, you can just pass c.Value to the MailAddressCollection.Add() method instead, to add them as you wanted.
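As a rough sketch of that last step, the following builds a MailMessage with one recipient per child element of "EmailAddresses". The XML layout (children named Email1, Email2, and so on), the class name EmailSettings and the method BuildMessage are assumptions for the illustration, not something taken from your actual settings file.

```csharp
using System.Linq;
using System.Net.Mail;
using System.Xml.Linq;

public static class EmailSettings
{
    // Builds a message addressed to every child element of <EmailAddresses>.
    public static MailMessage BuildMessage(string settingsXml, string from)
    {
        XDocument doc = XDocument.Parse(settingsXml);
        var message = new MailMessage { From = new MailAddress(from) };

        // Enumerate the child elements one by one instead of reading the
        // concatenated Value of the parent element.
        var addresses = doc.Descendants("EmailAddresses")
                           .Elements()
                           .Select(e => e.Value.Trim())
                           .Where(v => v.Length > 0);

        foreach (string address in addresses)
        {
            message.To.Add(address); // MailAddressCollection.Add(string)
        }
        return message;
    }
}
```

In the real application you would load the document with XDocument.Load, as in your question, and then hand the message to an SmtpClient.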

How can I output a list of field names and values that were changed?

I'm still learning C# whilst building an MVC web app. Trying to find a way to create a list of values that were changed by a user during an edit operation.
Here's one way I have that would work:
public List<string> SaveVehicleTechnicalInformation(VehicleAssetTechnicalInformationViewModel editmodel)
{
// Create a list of fields that have changed
List<string> changes = new List<string>();
var record = db.VehicleAssetTechnicalInformations.Find((int)editmodel.RecordID);
if (editmodel.Make != null && editmodel.Make != record.Make)
{
changes.Add(" [Make changed from " + record.Make + " to " + editmodel.Make + "] ");
record.Make = editmodel.Make;
}
if (editmodel.Model != null && editmodel.Model != record.Model)
{
changes.Add(" [Model changed from " + record.Model + " to " + editmodel.Model + "] ");
record.Model = editmodel.Model;
}
return changes;
}
But... As you can tell, I am going to need to write an IF/ELSE statement for every single field in my database. There are about 200 fields in there. I'm also worried that it's going to take a long time to work through the list.
Is there some way to go through the list of properties for my object iteratively, comparing them to the database record, changing them if necessary and then outputting a list of what changed.
In pseudo code this is what I guess I am after:
foreach (var field in editmodel)
{
if (field != database.field)
{
// Update the value
// Write a string about what changed
// Add the string to the list of what changed
}
}
Because I'm still learning I would appreciate guidance/tips on what subject matter to read about or where I can independently research the answer. The gaps in my skill are currently stopping me from being able to even research a solution approach.
Thanks in advance.
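You can try to use Reflection for your purposes. Something like this:
// Note: GetFields() returns public fields only; if the model exposes
// properties (the usual case for view models), use GetProperties() instead.
var fields = editmodel.GetType().GetFields();
foreach (var item in fields)
{
// Pseudocode: database.field stands for the stored value of the same member.
if (!Equals(item.GetValue(editmodel), database.field))
{
// Update the value
// Write a string about what changed
// Add the string to the list of what changed
}
}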
I think I have found the hint I was looking for...
System.Reflection
More specifically, the FieldInfo.GetValue() method.
I was previously unaware of what System.Reflection was all about, so I'll research this area further to find my solution.
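To make the idea a bit more concrete, here is one possible sketch using properties rather than fields, since view models and entity classes usually expose properties. VehicleViewModel, VehicleRecord, ChangeTracker and ApplyChanges are hypothetical stand-ins, and matching members by name is an assumption about how the two classes line up.

```csharp
using System.Collections.Generic;
using System.Reflection;

// Hypothetical stand-ins for the view model and the database entity.
public class VehicleViewModel
{
    public int RecordID { get; set; }
    public string Make { get; set; }
    public string Model { get; set; }
}

public class VehicleRecord
{
    public string Make { get; set; }
    public string Model { get; set; }
}

public static class ChangeTracker
{
    // Copies non-null property values from 'edited' onto 'stored' (matched by
    // name) and returns a description of each property that changed.
    public static List<string> ApplyChanges(object edited, object stored)
    {
        var changes = new List<string>();
        foreach (PropertyInfo source in edited.GetType().GetProperties())
        {
            // Skip indexers and anything without a usable match on the record.
            if (!source.CanRead || source.GetIndexParameters().Length > 0) continue;
            PropertyInfo target = stored.GetType().GetProperty(source.Name);
            if (target == null || !target.CanRead || !target.CanWrite) continue;
            if (!target.PropertyType.IsAssignableFrom(source.PropertyType)) continue;

            object newValue = source.GetValue(edited);
            object oldValue = target.GetValue(stored);
            if (newValue == null || object.Equals(newValue, oldValue)) continue;

            changes.Add("[" + source.Name + " changed from " + oldValue + " to " + newValue + "]");
            target.SetValue(stored, newValue);
        }
        return changes;
    }
}
```

Reflection is slower than hand-written comparisons, but for roughly 200 properties on a single record per save the cost is unlikely to be noticeable.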

Iterate all 'select' elements and get all their values in Selenium

I have the following code in C# using selenium:
private void SelectElementFromList(string label)
{
var xpathcount = selenium.GetXpathCount("//select");
for (int i = 1; i <= xpathcount; ++i)
{
string[] options;
try
{
options = selenium.GetSelectOptions("//select["+i+"]");
}
catch
{
continue;
}
foreach (string option in options)
{
if (option == label)
{
selenium.Select("//select[" + i + "]", "label=" + label);
return;
}
}
}
}
The problem is the line:
options = selenium.GetSelectOptions("//select["+i+"]");
When i == 1 this works, but when i > 1 the method returns null ("ERROR: Element //select[2] not found"). It works only when i == 1.
I have also tried this code in JS:
var element = document.evaluate("//select[1]/option[1]/@value", document, null, XPathResult.ANY_TYPE, null);
alert(element.iterateNext());
var element = document.evaluate("//select[2]/option[1]/@value", document, null, XPathResult.ANY_TYPE, null);
alert(element.iterateNext());
Which print on the screen "[object Attr]" and then "null".
What am I doing wrong?
My goal is to iterate all "select" elements on the page and find the one with the specified label and select it.
This is the second most frequently asked question in XPath (the first being unprefixed names and the default namespace).
In your code:
options = selenium.GetSelectOptions("//select["+i+"]");
An expression of the type is evaluated:
//select[position() =$someIndex]
which is a synonym for:
//select[$someIndex]
when it is known that $someIndex has an integer value.
However, by definition of the // XPath pseudo-operator,
//select[$k]
when $k is integer, means:
"Select all select elements in the document that are the $k-th select child of their parent."
You wrote: "When i == 1 this works, but when i > 1 the method returns null ("ERROR: Element //select[2] not found"). It works only when i == 1."
This simply means that in the XML document there is no element that has more than one select child.
This is a rule to remember: The [] XPath operator has higher precedence (priority) than the // pseudo-operator.
The solution: As always when we need to override the default precedence of operators, we must use parentheses.
Change:
options = selenium.GetSelectOptions("//select["+i+"]");
to:
options = selenium.GetSelectOptions("(//select)["+i+"]");
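The difference is easy to reproduce outside Selenium. The sketch below evaluates both forms with System.Xml's XmlDocument against a small made-up document in which each select sits in its own parent; the class SelectIndexDemo, the method Ids and the id values are hypothetical and only serve the illustration.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class SelectIndexDemo
{
    // Returns the id attributes of all nodes matched by the XPath, comma-separated.
    public static string Ids(string xml, string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var ids = new List<string>();
        foreach (XmlNode node in doc.SelectNodes(xpath))
        {
            ids.Add(node.Attributes["id"].Value);
        }
        return string.Join(",", ids);
    }
}
```

With the document <form><p><select id='a'/></p><p><select id='b'/></p></form>, "//select[1]" matches both a and b (each is the first select child of its own parent), "//select[2]" matches nothing, and "(//select)[2]" matches only b.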
Finally I've found a solution.
I've just replaced these lines
options = selenium.GetSelectOptions("//select["+i+"]");
selenium.Select("//select["+i+"]", "label="+label);
with these
options = selenium.GetSelectOptions("//descendant::select[" + i + "]");
selenium.Select("//descendant::select[" + i + "]", "label=" + label);
The solution above, options = selenium.GetSelectOptions("(//select)["+i+"]");, didn't work for me, so I tried CSS selectors instead.
I wanted to get the username and password text boxes. css=input gave me the username text box, and css=input+input gave me the password text box.
You can combine these selectors in many other ways.
Here is the link where I read about this.
I think this will help you achieve your goal.
Regards.

Simple Linq to XML Query Doesn't Work

I nominate me for village idiot.
Why doesn't this work:
foreach (XElement clientField in _clientXml.Descendants("row").Descendants())
{
var newFieldName =
from sourceField in _sourceEntries.Descendants("Field")
where (string)sourceField.Attribute("n") == (string)clientField.Attribute("n")
select new
{
FieldName = ((string) sourceField.Attribute("n")),
AcordRef = ((string) sourceField.Attribute("m"))
};
foreach (var element in newFieldName)
{
Console.WriteLine("Field Name: {0}",
element.FieldName, element.AcordRef);
}
}
My source XML files are loaded with XElement.Load(myFileName). In debug, clientField has an attribute n="Policy Number". The first element of _sourceEntries.Descendants("Field") also has an attribute n="Policy Number". Indeed, each element in _clientXml.Descendants("row").Descendants() has a matching row in _sourceEntries.Descendants("Field"). And, I know just enough to know that the select is lazy, so in debug I look at the Console.WriteLine block. No matter what I've tried, newFieldName is an empty set.
Just in case, here's the first element of the client file:
<Column_0 n="Policy Number">ABC000123</Column_0>
And, here's the first element of the _sourceEntries collection:
<Field n="Policy Number" c="1" l="" s="" cd="" m="1805" f="" />
I know it's going to be something simple, but I just don't see what I'm doing wrong.
Thanks.
Randy
This accomplished what I ultimately needed to do:
foreach (var clientField in _clientXml.Descendants("row").Descendants())
{
foreach (var acordMapRef in
from sourceEntry in _clientTemplate.Descendants("SourceEntries").Descendants("Field")
where (string) clientField.Attribute("n") == (string) sourceEntry.Attribute("n")
from acordMapRef in _clientTemplate.Descendants("Acord").Descendants("Field")
where (string) sourceEntry.Attribute("m") == (string) acordMapRef.Attribute("id")
select acordMapRef)
{
clientField.Attribute("n").Value = (string) acordMapRef.Attribute("n");
}
}
But, it's surely a candidate for ugliest code of the month. One thing I noticed in fooling around is that elements in an XElement tree don't seem to match to XElements in an IEnumerable collection. You might notice in the original code, above, I had an object _sourceEntries. This was a collection derived from _clientTemplate.Descendants("SourcEntries").Descendants("Field"). I would have thought that the two forms were essentially equivalent for my purposes, but apparently not. I'd appreciate somebody commenting on this issue.
Thanks folks!
Try changing:
where (string)sourceField.Attribute("n") == (string)clientField.Attribute("n")
To:
where sourceField.Attribute("n").Value == clientField.Attribute("n").Value
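For comparison, here is a small sketch of the same match written as a LINQ join, run against XML shaped like the two samples in the question. It is not a diagnosis of the original problem; FieldMatcher, Match and the wrapper element names are made up. One thing that may be worth checking, though: Descendants() never includes the element it is called on, so if the XElement you loaded is itself the "row" (or "Field") element, Descendants("row") returns nothing.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class FieldMatcher
{
    // Pairs each client field with the source Field that has the same "n"
    // attribute and returns "name -> m" strings for the matches.
    public static List<string> Match(XElement client, XElement source)
    {
        var query =
            from clientField in client.Descendants("row").Descendants()
            join sourceField in source.Descendants("Field")
                on (string)clientField.Attribute("n") equals (string)sourceField.Attribute("n")
            select (string)sourceField.Attribute("n") + " -> " + (string)sourceField.Attribute("m");

        return query.ToList();
    }
}
```

The (string) cast returns null when the attribute is missing, whereas .Value throws a NullReferenceException in that case, so the cast is the more forgiving of the two forms.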
