I'm using an XmlReader to iterate through some XML. Some of the XML is actually HTML and I want to get the text content from the node.
Example XML:
<?xml version="1.0" encoding="UTF-8"?>
<data>
<p>Here is some <b>data</b></p>
</data>
Example code:
using (XmlReader reader = XmlReader.Create(myUrl)) // XmlReader is abstract, so it is created via the factory method
{
while (reader.Read())
{
if (reader.Name == "p")
{
// I want to get all the TEXT contents from this node
myVar = reader.Value;
}
}
}
This doesn't get me all the contents. How do I get all the contents from the node in that situation?
Use ReadInnerXml:
StringReader myUrl = new StringReader(@"<?xml version=""1.0"" encoding=""UTF-8""?>
<data>
<p>Here is some <b>data</b></p>
</data>");
using (XmlReader reader = XmlReader.Create(myUrl))
{
while (reader.Read())
{
if (reader.Name == "p")
{
// I want to get all the TEXT contents from this node
Console.WriteLine(reader.ReadInnerXml());
}
}
}
Or if you want to skip the <b> as well, you can use an aux reader for the subtree, and only read the text nodes:
StringReader myUrl = new StringReader(@"<?xml version=""1.0"" encoding=""UTF-8""?>
<data>
<p>Here is some <b>data</b></p>
</data>");
StringBuilder myVar = new StringBuilder();
using (XmlReader reader = XmlReader.Create(myUrl))
{
while (reader.Read())
{
if (reader.Name == "p")
{
XmlReader pReader = reader.ReadSubtree();
while (pReader.Read())
{
if (pReader.NodeType == XmlNodeType.Text)
{
myVar.Append(pReader.Value);
}
}
}
}
}
Console.WriteLine(myVar.ToString());
I can't upvote or comment on others' responses, so let me just say carlosfigueira hit the nail on the head: that's exactly how you read the text value of an element. His answer helped me immensely.
For the sake of illustration, here's my code:
while (reader.Read())
{
switch (reader.NodeType)
{
case XmlNodeType.Element:
{
if (reader.Name == "CharCode")
{
switch (reader.ReadInnerXml())
{
case "EUR":
{
reader.ReadToNextSibling("Value");
label4.Text = reader.ReadInnerXml();
}
break;
case "USD":
{
reader.ReadToNextSibling("Value");
label3.Text = reader.ReadInnerXml();
}
break;
case "RUB":
{
reader.ReadToNextSibling("Value");
label5.Text = reader.ReadInnerXml();
}
break;
case "RON":
{
reader.ReadToNextSibling("Value");
label6.Text = reader.ReadInnerXml();
}
break;
}
}
}
break;
}
}
the file I'm reading can be found here: http://www.bnm.md/md/official_exchange_rates?get_xml=1&date=
(you have to add a date in DD.MM.YYYY format to it to get the .XML)
I suggest you use HtmlAgilityPack, which is a mature and stable library for doing this sort of thing. It takes care of fetching the HTML, converting it to XML, and allows you to select the nodes you'd like with XPath.
In your case it would be as simple as executing
HtmlDocument doc = new HtmlWeb().Load(myUrl);
string text = doc.DocumentNode.SelectSingleNode("/data/p").InnerText;
I am trying to retrieve all elements from an XML file, but I can only reach one. Is there any way I can retrieve all of them?
HttpWebResponse objResponse = (HttpWebResponse)objRequest.GetResponse();
using (XmlReader reader = XmlReader.Create(new StreamReader(objResponse.GetResponseStream())))
{
while (reader.Read())
{
#region Get Credit Score
//if (reader.ReadToDescendant("results"))
if (reader.ReadToDescendant("ssnMatchIndicator"))
{
string ssnMatchIndicator = reader.Value;
}
if (reader.ReadToDescendant("fileHitIndicator"))
{
reader.Read();//this moves reader to next node which is text
result = reader.Value; // this might give the value then
Res.Response = true;
Res.SocialSecurityScore = result.ToString();
//break;
}
else
{
Res.Response = false;
Res.SocialSecurityScore = "Your credit score might not be available. Please contact support";
}
#endregion
#region Get fileHitIndicator
if (reader.ReadToDescendant("fileHitIndicator"))
{
reader.Read();
Res.fileHitIndicator = reader.Value;
//break;
}
#endregion
}
}
Can somebody help me out with this issue?
I am also using objResponse.GetResponseStream() because the XML comes from a response from server.
Thanks a lot in advance.
Try this :
XmlDocument xmldoc = new XmlDocument(); // XmlDataDocument is obsolete; XmlDocument is enough here
XmlNodeList xmlnode ;
int i = 0;
string str = null;
FileStream fs = new FileStream("product.xml", FileMode.Open, FileAccess.Read);
xmldoc.Load(fs);
xmlnode = xmldoc.GetElementsByTagName("Product");
for (i = 0; i <= xmlnode.Count - 1; i++)
{
xmlnode[i].ChildNodes.Item(0).InnerText.Trim();
str = xmlnode[i].ChildNodes.Item(0).InnerText.Trim() + " " + xmlnode[i].ChildNodes.Item(1).InnerText.Trim() + " " + xmlnode[i].ChildNodes.Item(2).InnerText.Trim();
MessageBox.Show (str);
}
I don't know why what you're doing is not working, but I wouldn't use that method. I've found the following to work well. Even if you're getting the XML from a stream, just read it into a string and load that:
StreamReader reader = new StreamReader(sourcepath);
string xml = reader.ReadToEnd();
reader.Close();
XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);
XmlNodeList list = doc.GetElementsByTagName("*");
foreach (XmlNode nd in list)
{
switch (nd.Name)
{
case "ContactID":
var ContactIdent = nd.InnerText;
break;
case "ContactName":
var ContactName = nd.InnerText;
break;
}
}
To capture what is between the Xml tags, if there are no child Xml tags, use the InnerText property, e.g. XmlNode.InnerText. To capture what is between the quotes in the nodes' attributes, use XmlAttribute.Value.
As for iterating through the attributes, if one of your nodes has attributes, such as "Name", "SpectralType" and "Orbit" in the Xml here:
<System>
<Star Name="Epsilon Eridani" SpectralType="K2v">
<Planets>
<Planet Orbit="1">Bill</Planet>
<Planet Orbit="2">Moira</Planet>
</Planets>
</Star>
</System>
Detect them using the Attributes property, and iterate through them as shown:
if (nd.Attributes.Count > 0)
{
XmlAttributeCollection coll = nd.Attributes;
foreach (XmlAttribute cn in coll)
{
switch (cn.Name)
{
case "Name":
thisStar.Name = cn.Value;
break;
case "SpectralType":
thisStar.SpectralClass = cn.Value;
break;
}
}
}
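As a minimal, self-contained sketch of the two properties described above (InnerText for element text, XmlAttribute.Value for attribute text), here is the sample XML loaded from a string. The class name StarExample is only for illustration and is not part of the original answer.

```csharp
using System;
using System.Xml;

class StarExample
{
    static void Main()
    {
        // Same sample document as above, embedded as a verbatim string.
        const string xml = @"<System>
  <Star Name=""Epsilon Eridani"" SpectralType=""K2v"">
    <Planets>
      <Planet Orbit=""1"">Bill</Planet>
      <Planet Orbit=""2"">Moira</Planet>
    </Planets>
  </Star>
</System>";

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        foreach (XmlNode planet in doc.GetElementsByTagName("Planet"))
        {
            // Attribute text comes from XmlAttribute.Value.
            string orbit = planet.Attributes["Orbit"].Value;
            // Text between the tags comes from InnerText.
            string name = planet.InnerText;
            Console.WriteLine(orbit + ": " + name);
        }
    }
}
```

This prints "1: Bill" and "2: Moira", one per line.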
I have some XML that looks like this:
<PackageConstruct900 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<ID>{5209724e-1c5a-4d84-962e-371271c3836c}</ID>
<ParentID />
<Name />
<Type>Package</Type>
<Tasks>
<anyType xsi:type="Task">
<ID>{4c97132c-ba7d-4fba-9b01-333976e9ad22}</ID>
<ParentID>{E893A7FD-2758-4315-9181-93F8728332E5}</ParentID>
<Name>ProcessAgility</Name>
<Type>Task</Type>
<StartedOn>1900-01-01T00:00:00</StartedOn>
<EndedOn>1900-01-01T00:00:00</EndedOn>
</anyType>
</Tasks>
</PackageConstruct900>
I'm trying to capture the Value of the second "Name" node ("ProcessAgility").
But (XmlReader) reader.Value returns an empty string when I arrive at this node. How do I capture the TEXT that falls between <nodeName>TEXT</nodeName>?
Here's my code so far:
XmlReader reader = XmlReader.Create(pathToFile, settings);
reader.MoveToContent();
while (reader.Read())
{
switch (reader.NodeType)
{
case XmlNodeType.Text:
break;
case XmlNodeType.Element:
switch (reader.Name)
{
case "anyType":
newJob = true;
break;
case "AML":
string ss = string.Empty;
ss = reader.ReadInnerXml();
ss = System.Net.WebUtility.HtmlDecode(ss);
rs = XmlReader.Create(ss, settings);
break;
case "Name":
if (newJob && reader.HasValue)
{
jobName = reader.Value;
}
if (!string.IsNullOrWhiteSpace(jobName))
{
if (!jobsAdded.Contains(jobName))
{
jobsAdded.Add(jobName);
}
}
break;
case "Tasks":
m_ConvertingTask = true;
break;
case "TRIGGERS":
break;
}
break;
}
}
Try using XmlDocument; then you can use XPath navigation like this: /PackageConstruct900/Tasks/anyType/Name
XmlDocument doc = new XmlDocument();
doc.LoadXml("<PackageConstruct900 xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">\r\n <ID>{5209724e-1c5a-4d84-962e-371271c3836c}</ID>\r\n <ParentID />\r\n <Name />\r\n <Type>Package</Type>\r\n <Tasks>\r\n <anyType xsi:type=\"Task\">\r\n <ID>{4c97132c-ba7d-4fba-9b01-333976e9ad22}</ID>\r\n <ParentID>{E893A7FD-2758-4315-9181-93F8728332E5}</ParentID>\r\n <Name>ProcessAgility</Name>\r\n <Type>Task</Type>\r\n <StartedOn>1900-01-01T00:00:00</StartedOn>\r\n <EndedOn>1900-01-01T00:00:00</EndedOn>\r\n </anyType>\r\n </Tasks>\r\n</PackageConstruct900>");
XmlNode root = doc.DocumentElement;
XmlNode node = root.SelectSingleNode(
"/PackageConstruct900/Tasks/anyType/Name");
Console.WriteLine(node.InnerXml);
This will give you the first matching node. If you want a list of nodes to iterate over, you can use:
XmlNodeList nodes = root.SelectNodes("/PackageConstruct900/Tasks");
foreach (XmlNode node in nodes)
{
var typename = node.SelectSingleNode("anyType/Name");
Console.WriteLine(typename.InnerXml);
}
A pretty simple (and very specific) System.Xml.Linq.XElement solution would be:
string processAgility = XElement.Parse(File.ReadAllText(pathToFile))
.Element("Tasks")
.Element("anyType")
.Element("Name")
.Value;
Or, if you include a using System.Xml.XPath directive, you could use XPath navigation with the XPathSelectElement extension method:
string processAgility = XElement.Parse(File.ReadAllText(pathToFile))
.XPathSelectElement("Tasks/anyType/Name")
.Value;
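If you would rather stay with XmlReader, the reason reader.Value is empty is that on an Element node Value is always an empty string; the text lives in a child Text node. A minimal sketch using ReadElementContentAsString follows. The trimmed-down XML and the insideTask flag are assumptions made for illustration, not the asker's actual file or variables.

```csharp
using System;
using System.IO;
using System.Xml;

class ReaderExample
{
    static void Main()
    {
        // Reduced version of the document from the question.
        const string xml = @"<PackageConstruct900>
  <Name />
  <Tasks>
    <anyType>
      <Name>ProcessAgility</Name>
    </anyType>
  </Tasks>
</PackageConstruct900>";

        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            bool insideTask = false; // hypothetical flag: set once an <anyType> has been seen
            while (reader.Read())
            {
                if (reader.NodeType != XmlNodeType.Element)
                    continue;

                if (reader.Name == "anyType")
                {
                    insideTask = true;
                }
                else if (reader.Name == "Name" && insideTask)
                {
                    // Reads the text content and moves the reader past </Name>.
                    // Note that the next Read() in the loop then skips whatever
                    // node directly follows (here, whitespace).
                    Console.WriteLine(reader.ReadElementContentAsString());
                }
            }
        }
    }
}
```

This prints "ProcessAgility"; the empty top-level <Name /> is skipped because no <anyType> has been seen yet.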
I have this XML file and I want to extract the author name and accession number. What I have is a very naive implementation in C# where I am using XmlReader and reading node by node, but I am looking for an implementation that can read the author name and accession number efficiently. I am new to C# and I have been told that LINQ should be used, but looking at the documentation and this file I can't work out how to use XDocument. Any help will be appreciated.
<xml>
<records>
<record>
<database name="CP_EndnoteLibrary_2012-2015-1.enl" path="C:\Users\Downloads\file.enl">file.enl</database>
<source-app name="EndNote" version="17.4">EndNote</source-app>
<rec-number>24</rec-number>
<contributors>
<authors>
<author>
<style face="normal" font="default" size="100%">ABCD, X.</style>
</author>
<author>
<style face="normal" font="default" size="100%">EFGH, I.</style>
</author>
</authors>
</contributors>
<accession-num>
<style face="normal" font="default" size="100%">12345678</style>
</accession-num>
</record>
<record>...</record>
</records>
Following a document, I was able to write this code to figure out author name.
{
class Program
{
static void Main(string[] args)
{
XmlReader reader = XmlReader.Create("C:\\Users\\ile_xml.xml");
while(reader.Read())
{
if((reader.NodeType == XmlNodeType.Element) && (reader.Name == "author"))
{
reader.Read();
reader.Read();
if((reader.NodeType == XmlNodeType.Element) && (reader.Name == "style") && reader.HasAttributes)
{
var val = reader.ReadInnerXml();
Console.WriteLine("Display:" + reader.GetAttribute("author"));
}
}
}
}
}
}
The above code seems to be very inefficient and I am looking for ways to improve this or do it in a better way.
This will give you the correct result:-
XDocument xdoc = XDocument.Load(@"YourXMLfilePath");
var result = xdoc.Descendants("record") // the root element is <xml>, so <record> is not a direct child of Root
.Select(x => new
{
Name = (string)x.Element("database").Attribute("name"),
Number = (string)x.Element("rec-number")
});
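The question asks for author names and the accession number rather than the database name and record number, so here is a hedged sketch of the same LINQ to XML approach aimed at those elements. The inline XML is a shortened copy of the sample from the question, and the class name AuthorsExample and the anonymous-type property names are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class AuthorsExample
{
    static void Main()
    {
        // Shortened copy of the sample record from the question.
        const string xml = @"<xml>
  <records>
    <record>
      <rec-number>24</rec-number>
      <contributors>
        <authors>
          <author><style face=""normal"">ABCD, X.</style></author>
          <author><style face=""normal"">EFGH, I.</style></author>
        </authors>
      </contributors>
      <accession-num><style face=""normal"">12345678</style></accession-num>
    </record>
  </records>
</xml>";

        XDocument doc = XDocument.Parse(xml);

        var records = doc.Descendants("record").Select(r => new
        {
            // XElement.Value concatenates the text of all descendants,
            // so the <style> wrapper does not need special handling.
            Authors = r.Descendants("author").Select(a => a.Value.Trim()).ToList(),
            // The cast yields null when the element is missing, hence the fallback.
            AccessionNumber = ((string)r.Element("accession-num") ?? "").Trim()
        });

        foreach (var rec in records)
        {
            Console.WriteLine(rec.AccessionNumber + ": " + string.Join("; ", rec.Authors));
        }
    }
}
```

For a real file, XDocument.Load(path) would replace XDocument.Parse(xml); the rest of the query stays the same.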
//Helpful namespaces:
using System.Xml.Linq;
using System.Xml.XPath;
using System.Xml.Serialization;
static void Main(string[] args)
{
//Your snippet, which didn't work on my machine:
XmlReader reader = XmlReader.Create("C:\\Users\\Public\\ile_xml.xml");
while (reader.Read())
{
if ((reader.NodeType == XmlNodeType.Element) && (reader.Name == "author"))
{
reader.Read();
reader.Read();
if ((reader.NodeType == XmlNodeType.Element) && (reader.Name == "style") && reader.HasAttributes)
{
var val = reader.ReadInnerXml();
Console.WriteLine("Display:" + reader.GetAttribute("author"));
}
}
}
//Should produce the results you are looking for:
XmlNodeList xmlNodeList;
XmlDocument xDoc = new XmlDocument();
XmlReaderSettings xrs = new XmlReaderSettings();
xrs.DtdProcessing = DtdProcessing.Parse;
//Get Authors from XML Source
using (XmlReader reader2 = XmlReader.Create("C:\\Users\\Public\\ile_xml.xml"))
{
xDoc.Load(reader2);
xmlNodeList = xDoc.SelectNodes("/xml/records/record/contributors/authors/author"); // path starts at the <xml> root element of the sample
}
foreach (XmlNode node in xmlNodeList)
{
Console.WriteLine(node.InnerText); // use .InnerXml instead to include the style tags
}
}
XPath will help you find the information you need. Hopefully the above will get you closer with xDoc.
Another pattern I have recently adopted is to deserialize the XML into a C# class (or in this case a List) and then use LINQ to manipulate it as desired.
This was helpful to me: Deserializing XML to Objects in C#
I am trying to read the Windows Update package.xml file, which is roughly 65 MB in size. I am trying to grab just the Url attribute using XPath, but for some odd reason my node list always comes back empty. Here is my code:
doc.Load(@".\package.xml");
string xpath= "/OfflineSyncPackage/FileLocations/FileLocation/@Url";
XmlNodeList nodeList2 = doc.SelectNodes(xpath);
I have also tried using XmlReader which is also not working for me:
string packXML = @".\package.xml";
using (XmlReader xr = XmlReader.Create(packXML))
{
while (xr.Read())
{
switch (xr.NodeType)
{
case XmlNodeType.Element:
if (xr.Name == "OfflineSyncPackage")
{
xr.ReadStartElement("FileLocations");
if (xr.Name == "FileLocations")
{
if (xr.Name == "FileLocation")
{
}
}
}
break;
}
}
}
The package.xml file can be found in package.cab which is in this file: http://download.windowsupdate.com/microsoftupdate/v6/wsusscan/wsusscn2.cab
What is the best way to do this? I do not want to load the whole file into memory because of its size.
Any advice is appreciated! Thank you.
I figured this out finally!
public void ParseXML(string XMLPath)
{
XmlReader xmlReader = XmlReader.Create(XMLPath);
while (xmlReader.Read())
{
if (xmlReader.Name.Equals("FileLocation") && (xmlReader.NodeType == XmlNodeType.Element))
{
string url = xmlReader.GetAttribute("Url");
}
}
}
I want to know how to get the attribute value "text" via C#.
Example xml:
<?xml version="1.0" encoding="utf-8" ?>
<Model Name="modelname">
<Mode Name="text">
<Class>Class1</Class>
</Mode>
</Model>
I tried to parse this XML using XmlReader (example from MSDN):
while (reader.Read())
{
switch (reader.NodeType)
{
case XmlNodeType.Element:
Console.Write("<" + reader.Name+"");
Console.WriteLine(str);
if (reader.Name =="Mode")
{
namemode = true;
}
if (namemode)
{
if (reader.Name == element)
{
elementExists = true;
}
}
// Console.WriteLine(">");
break;
case XmlNodeType.Text:
Console.WriteLine(reader.Value);
if (elementExists)
{
values.Add(reader.Value);
elementExists = false;
}
break;
}
}
Maybe I should use XmlDocument to do this?
Thanks.
You could use XDocument and LINQ
You'll need to include the System.Xml.Linq namespace, which contains XDocument.
Then you could do something like:
XDocument document = XDocument.Load(filePath);
var modes = (from mode in document.Root.Descendants("Mode") // the range variable must not reuse the name of the local "modes"
select mode.Attribute("Name").Value).ToList();
Like this:
const string xml = @"<?xml version=""1.0"" encoding=""utf-8"" ?>
<Model Name=""modelname"">
<Mode Name=""text"">
<Class>Class1</Class>
</Mode>
</Model>";
XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);
Console.WriteLine(doc.DocumentElement["Mode"].Attributes["Name"].Value);