In C#, if I have this:
doc = new XmlDocument();
doc.LoadXml("<.....>");
doc.LoadXml("<.....>");
Does that 2nd one do a replace or append to the doc object?
It replaces. This is pretty obvious if you think about it and consider that valid XML documents must have a single root node.
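A quick way to convince yourself is to load two different documents into the same XmlDocument and check the root afterwards. This is only a minimal sketch; the class name, method name and the sample XML strings are made up for illustration.

```csharp
using System;
using System.Xml;

public static class LoadXmlDemo
{
    // Loads two documents in a row and returns the name of the root element
    // that is left in the XmlDocument afterwards.
    public static string RootAfterSecondLoad()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<first><child /></first>");
        doc.LoadXml("<second />"); // discards the previous content
        return doc.DocumentElement.Name;
    }

    public static void Main()
    {
        Console.WriteLine(RootAfterSecondLoad()); // prints "second"
    }
}
```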
More importantly, use XDocument, it is just better.
using System.Xml.Linq;
...
var document = XDocument.Parse(...
Is there a simple method of parsing XML files in C#? If so, what?
It's very simple. I know these are standard methods, but you can create your own library to deal with that much better.
Here are some examples:
XmlDocument xmlDoc= new XmlDocument(); // Create an XML document object
xmlDoc.Load("yourXMLFile.xml"); // Load the XML document from the specified file
// Get elements
XmlNodeList girlAddress = xmlDoc.GetElementsByTagName("gAddress");
XmlNodeList girlAge = xmlDoc.GetElementsByTagName("gAge");
XmlNodeList girlCellPhoneNumber = xmlDoc.GetElementsByTagName("gPhone");
// Display the results
Console.WriteLine("Address: " + girlAddress[0].InnerText);
Console.WriteLine("Age: " + girlAge[0].InnerText);
Console.WriteLine("Phone Number: " + girlCellPhoneNumber[0].InnerText);
Also, there are some other methods to work with. For example, here. And I think there is no one best method to do this; you always need to choose it by yourself, what is most suitable for you.
I'd use LINQ to XML if you're in .NET 3.5 or higher.
Use a good XSD schema to create a set of classes with xsd.exe, and use an XmlSerializer to create an object tree out of your XML and vice versa. If you have few restrictions on your model, you could even try to create a direct mapping between your model classes and the XML with the Xml*Attributes.
There is an introductory article about XML Serialisation on MSDN.
Performance tip: Constructing an XmlSerializer is expensive. Keep a reference to your XmlSerializer instance if you intend to parse/write multiple XML files.
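As a rough sketch of that tip, the serializer can be held in a static field and reused for every read and write. The Person class and the PersonXml helper below are hypothetical and only exist for this example; real model classes would normally come from xsd.exe or your own code.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical model class, not taken from the answer above.
public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class PersonXml
{
    // Built once and reused for every call.
    private static readonly XmlSerializer Serializer = new XmlSerializer(typeof(Person));

    // Turns an XML string into a Person object.
    public static Person Parse(string xml)
    {
        using (var reader = new StringReader(xml))
        {
            return (Person)Serializer.Deserialize(reader);
        }
    }

    // Turns a Person object back into an XML string.
    public static string Write(Person person)
    {
        using (var writer = new StringWriter())
        {
            Serializer.Serialize(writer, person);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        Person p = Parse("<Person><Name>Ann</Name><Age>30</Age></Person>");
        Console.WriteLine(p.Name + ", " + p.Age);
    }
}
```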
If you're processing a large amount of data (many megabytes) then you want to be using XmlReader to stream parse the XML.
Anything else (XPathNavigator, XElement, XmlDocument and even XmlSerializer if you keep the full generated object graph) will result in high memory usage and also a very slow load time.
Of course, if you need all the data in memory anyway, then you may not have much choice.
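To illustrate the streaming approach, here is a small sketch that walks a document with XmlReader and counts elements by name without ever building a tree in memory. The helper name and the sample element names are assumptions for the example.

```csharp
using System;
using System.IO;
using System.Xml;

public static class StreamingCount
{
    // Counts elements with the given local name using a forward-only reader,
    // so only the current node is held in memory.
    public static int CountElements(TextReader input, string localName)
    {
        int count = 0;
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == localName)
                {
                    count++;
                }
            }
        }
        return count;
    }

    public static void Main()
    {
        var xml = new StringReader("<items><item /><item /><other /></items>");
        Console.WriteLine(CountElements(xml, "item")); // prints 2
    }
}
```

For a real file you would pass a StreamReader (or call XmlReader.Create with a path) instead of a StringReader.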
Use XmlTextReader, XmlReader, XmlNodeReader and the System.Xml.XPath namespace (XPathNavigator, XPathDocument, XPathExpression, XPathNodeIterator).
Usually XPath makes reading XML easier, which is what you might be looking for.
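A minimal sketch of the XPath route, using XPathDocument and XPathNavigator, might look like this. The helper name, the sample XML and the query are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

public static class XPathValues
{
    // Runs an XPath query over the XML and returns the string value of each match.
    public static List<string> Select(string xml, string xpath)
    {
        var doc = new XPathDocument(new StringReader(xml));
        XPathNavigator navigator = doc.CreateNavigator();
        var results = new List<string>();
        XPathNodeIterator iterator = navigator.Select(xpath);
        while (iterator.MoveNext())
        {
            results.Add(iterator.Current.Value);
        }
        return results;
    }

    public static void Main()
    {
        string xml = "<library><book title='A' /><book title='B' /></library>";
        foreach (string title in Select(xml, "//book/@title"))
        {
            Console.WriteLine(title);
        }
    }
}
```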
I have just recently been required to work on an application which involved the parsing of an XML document and I agree with Jon Galloway that the LINQ to XML based approach is, in my opinion, the best. I did however have to dig a little to find usable examples, so without further ado, here are a few!
Any comments welcome as this code works but may not be perfect and I would like to learn more about parsing XML for this project!
public void ParseXML(string filePath)
{
// create document instance using XML file path
XDocument doc = XDocument.Load(filePath);
// get the default namespace declared in the XML (xmlns="...")
XElement root = doc.Root;
XNamespace ns = root.GetDefaultNamespace();
// obtain a list of elements with specific tag
IEnumerable<XElement> elements = from c in doc.Descendants(ns + "exampleTagName") select c;
// obtain a single element with specific tag (first instance), useful if only expecting one instance of the tag in the target doc
XElement element = (from c in doc.Descendants(ns + "exampleTagName") select c).First();
// obtain an element from within an element, same as from doc
XElement embeddedElement = (from c in element.Descendants(ns + "exampleEmbeddedTagName") select c).First();
// obtain an attribute from an element
XAttribute attribute = element.Attribute("exampleAttributeName");
}
With these functions I was able to parse any element and any attribute from an XML file no problem at all!
In addition, you can use an XPath selector in the following way (an easy way to select specific nodes):
XmlDocument doc = new XmlDocument();
doc.Load("test.xml");
var nodeList = doc.DocumentElement.SelectNodes("//book[@title='Barry Poter']"); // select all book elements in the whole DOM whose title attribute has the value 'Barry Poter'
// Retrieve your data here or change XML here:
foreach (XmlNode book in nodeList)
{
book.InnerText="The story began as it was...";
}
Console.WriteLine("Display XML:");
doc.Save(Console.Out);
the documentation
If you're using .NET 2.0, try XmlReader and its subclasses XmlTextReader, and XmlValidatingReader. They provide a fast, lightweight (memory usage, etc.), forward-only way to parse an XML file.
If you need XPath capabilities, try the XPathNavigator. If you need the entire document in memory try XmlDocument.
I'm not sure whether "best practice for parsing XML" exists. There are numerous technologies suited for different situations. Which way to use depends on the concrete scenario.
You can go with LINQ to XML, XmlReader, XPathNavigator or even regular expressions. If you elaborate your needs, I can try to give some suggestions.
You can parse the XML using the System.Xml.Linq library. Below is the sample code I used to parse an XML file:
public CatSubCatList GenerateCategoryListFromProductFeedXML()
{
string path = System.Web.HttpContext.Current.Server.MapPath(_xmlFilePath);
XDocument xDoc = XDocument.Load(path);
XElement xElement = XElement.Parse(xDoc.ToString());
List<Category> lstCategory = xElement.Elements("Product").Select(d => new Category
{
Code = Convert.ToString(d.Element("CategoryCode").Value),
CategoryPath = d.Element("CategoryPath").Value,
Name = GetCateOrSubCategory(d.Element("CategoryPath").Value, 0), // Category
SubCategoryName = GetCateOrSubCategory(d.Element("CategoryPath").Value, 1) // Sub Category
}).GroupBy(x => new { x.Code, x.SubCategoryName }).Select(x => x.First()).ToList();
CatSubCatList catSubCatList = GetFinalCategoryListFromXML(lstCategory);
return catSubCatList;
}
You can use ExtendedXmlSerializer to serialize and deserialize.
Installation
You can install ExtendedXmlSerializer from nuget or run the following command:
Install-Package ExtendedXmlSerializer
Serialization:
ExtendedXmlSerializer serializer = new ExtendedXmlSerializer();
var obj = new Message();
var xml = serializer.Serialize(obj);
Deserialization
var obj2 = serializer.Deserialize<Message>(xml);
Standard XML Serializer in .NET is very limited.
Does not support serialization of class with circular reference or class with interface property,
Does not support Dictionaries,
There is no mechanism for reading the old version of XML,
If you want to create a custom serializer, your class must implement IXmlSerializable. This means that your class will not be a POCO class,
Does not support IoC.
ExtendedXmlSerializer can do this and much more.
ExtendedXmlSerializer supports .NET 4.5 or higher and .NET Core. You can integrate it with WebApi and AspCore.
You can use XmlDocument, and for manipulating or retrieving data from attributes you can use the LINQ to XML classes.
My task sounds simple but is proving very difficult. I have an XML file like the one below:
<?xml version="1.0" encoding="UTF-8"?>
<ref:ReferralDocument xmlns:ref="http://ref.com" xmlns:gen="http://ref.com"
xmlns:h="http://www.w3.org/1999/xhtml" xmlns:pbr="ref.com"
xmlns:xsi="ref.com" schemaVersion="2.9">
<ref:MessageDetails>
<gen:TestID>
<gen:IdValue>2412665651</gen:IdValue>
<gen:IdScheme>Test</gen:IdScheme>
<gen:IdType>Person</gen:IdType>
</gen:TestID>
</ref:MessageDetails>
<gen:Name>
<gen:StructuredName>
<gen:GivenName>Test</gen:GivenName>
<gen:FamilyName>Test</gen:FamilyName>
</gen:StructuredName>
<gen:NameType>Current Name</gen:NameType>
</gen:Name>
</ref:ReferralDocument>
I want to read every XML element into an XML element object, and store them in a list or array of XML elements.
//Create new document
var document = new XmlDocument();
document.Load("C:\\Users\\liam.mccann\\Documents\\test.xml");
After loading it in, I haven't had any success in getting every element; I only seem to be able to get the first.
So, simply: Load XML Document >> Validate XML >> Get Schema Version from first Element >> Parse To XML Element List.
After parsing, I hope it would be easy to restructure the document and save it again, with the document staying the same.
Hopefully someone is able to help!
I want to read every XML element in to a XML Element object. As store
them in a List or Array of XML Elements.
You could do the following:
var array = document.SelectNodes("/descendant-or-self::*")
.OfType<XmlElement>()
.ToArray();
Having said that, if all you're intending is:
After parsing i hope it would be easy to restructure the document and
save it again.
then storing the elements in an array is probably not the optimal approach. You can use the XmlDocument class itself (or better yet - the new LINQ approach: System.Xml.Linq.XDocument).
Here's the MSDN document that hopefully can get you started: http://msdn.microsoft.com/en-us/library/bb387084%28v=vs.110%29.aspx.
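For comparison, a minimal sketch of the XDocument route could look like the following. It assumes the schemaVersion attribute sits on the root element without a namespace, as in the question's sample; the class name, method names and the shortened sample XML are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ElementList
{
    // Returns every element in the document (root included) in document order.
    public static List<XElement> AllElements(XDocument doc)
    {
        return doc.Descendants().ToList();
    }

    // Reads the un-namespaced schemaVersion attribute from the root element.
    public static string SchemaVersion(XDocument doc)
    {
        return (string)doc.Root.Attribute("schemaVersion");
    }

    public static void Main()
    {
        XDocument doc = XDocument.Parse(
            "<ref:Doc xmlns:ref='http://ref.com' schemaVersion='2.9'>" +
            "<ref:A><ref:B>1</ref:B></ref:A></ref:Doc>");
        Console.WriteLine(SchemaVersion(doc));
        foreach (XElement element in AllElements(doc))
        {
            Console.WriteLine(element.Name.LocalName);
        }
    }
}
```

Since the elements in the list are the same objects that live in the document, changing them and then calling doc.Save writes the restructured document back out.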
I am using a C# console app to get an XML document. Once the XmlDocument is loaded, I want to search for a specific href tag:
href="/abc/def
inside the xml document.
Once that node is found, I want to strip the tag completely and just show Hello.
Hello
I think I can simply get the tag using regex. But can anyone please tell me how I can remove the href tag completely using regex?
XML and HTML are the same in this respect: both are tagged content. XML is just stricter in its formatting.
For this use case I would use transformations and XPath queries to rebuild the document. As @Yahia stated, regex on tagged documents is typically a bad idea; the regex needed for parsing is far too complex to be effective as a generic solution.
The most popular technology for similar tasks is called XPath. (It is also a key component of XQuery and XSLT.) Would the following perhaps solve your task, too?
root.SelectSingleNode("//a[@href='/abc/def']").InnerText = "Hello";
You could try
string x = @"<?xml version='1.0'?>
<EXAMPLE>
<a href='/abc/def'>Hello</a>
</EXAMPLE>";
System.Xml.XmlDocument doc = new XmlDocument();
doc.LoadXml(x);
XmlNode n = doc.SelectSingleNode("//a[@href='/abc/def']");
XmlNode p = n.ParentNode;
p.RemoveChild(n);
System.Xml.XmlNode newNode = doc.CreateNode("element", "a", "");
newNode.InnerXml = "Hello";
p.AppendChild(newNode);
Not really sure if this is what you are trying to do but it should be enough to get you headed in right direction.
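If the goal is to drop the tag entirely and leave only its text behind, one possible variation is to swap the element for a text node. This is a sketch only; the helper name and its parameters are made up, and it assumes exactly one matching element exists.

```csharp
using System;
using System.Xml;

public static class StripLink
{
    // Replaces the first <a> element with the given href by a plain text node
    // holding the element's text, then returns the resulting XML.
    public static string ReplaceWithText(string xml, string href)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNode link = doc.SelectSingleNode("//a[@href='" + href + "']");
        if (link != null)
        {
            XmlText text = doc.CreateTextNode(link.InnerText);
            link.ParentNode.ReplaceChild(text, link);
        }
        return doc.OuterXml;
    }

    public static void Main()
    {
        string xml = "<EXAMPLE><a href='/abc/def'>Hello</a></EXAMPLE>";
        Console.WriteLine(ReplaceWithText(xml, "/abc/def")); // prints <EXAMPLE>Hello</EXAMPLE>
    }
}
```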
I have an XML document that would contain empty nodes that looked like the following:
<metadata territory="USA"></metadata>
After simply opening, then saving using XmlDocument, this line looks like:
<metadata territory="USA">
</metadata>
When I set PreserveWhitespace to true, it converted the entire XML to 1 line, so this won't work.
These XML files need to keep the current formatting as much as possible. I know, technically, it doesn't matter which way they are written, they will be read the same way but I still need to keep the same formatting. I can't figure out a way to keep the nodes with no values to 1 line. Is there a way to do this?
The ONLY method that keeps the document in its original formatting is if the XML file contained 'xml:space="preserve"' in the header, but I am to leave the header as is.
The only thing I want to change is the addition of values. As I said, simply loading and saving a document adds this, so if you want to test, just try...
XmlDocument doc = new XmlDocument();
doc.Load(@"C:\Temp\test.xml");
doc.Save(@"C:\Temp\test_02.xml");
Just did the test and this works using both XDocument and XmlDocument by setting the PreserveWhitespace property.
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.PreserveWhitespace = true;
xmlDoc.Load("test.xml");
xmlDoc.Save("testOut.xml");
..
XDocument xdoc = XDocument.Load("test.xml", LoadOptions.PreserveWhitespace);
xdoc.Save(@"testOut.xml");
Input:
<foo>
<metadata territory="USA"></metadata>
<bar></bar>
<baz>
</baz>
</foo>
Output:
<foo>
<metadata territory="USA"></metadata>
<bar></bar>
<baz>
</baz>
</foo>
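The same idea can be checked in memory without touching the file system. The sketch below parses a string with LoadOptions.PreserveWhitespace and writes it back with SaveOptions.DisableFormatting so that no extra indentation is added; the helper name and sample XML are made up, and line-ending handling on save may still differ by platform.

```csharp
using System;
using System.Xml.Linq;

public static class WhitespaceRoundTrip
{
    // Parses the XML keeping all whitespace, then serializes it again
    // without letting the writer re-indent anything.
    public static string RoundTrip(string xml)
    {
        XDocument doc = XDocument.Parse(xml, LoadOptions.PreserveWhitespace);
        return doc.ToString(SaveOptions.DisableFormatting);
    }

    public static void Main()
    {
        string xml = "<foo> <metadata territory=\"USA\"></metadata> <bar></bar> </foo>";
        Console.WriteLine(RoundTrip(xml));
    }
}
```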
I'm with Richard Schneider: I don't believe it's possible. One possible solution is to take the output XML file and run it through an XML formatting program that normalizes the format of the XML file (you can probably write one with the unmanaged XML dom if one can't be found).
Since the file is always normalized, it won't change that much hopefully.
If you're using XmlDocument, I would recommend using XDocument instead (.NET Framework 3.5+).
PreserveWhitespace will add a
<whatever> <...>
**</whatever>**
to each line while None will just close it like <... />.
I spent 5 minutes looking for how to preserve that whitespace, but couldn't find it. Something is omitting char(13) during the deserialization/reserialization.
XDocument doc;
using (FileStream fs = new FileStream(file, FileMode.Open, FileAccess.Read))
{
//Alternative with .None
doc = XDocument.Load(fs, LoadOptions.PreserveWhitespace);
}
and importantly..
doc.Save("lala.xml", SaveOptions.None);
I don't think this is possible. When you load the XML document you lose formatting information; so there is no way that Save can give the same results.
Why not save the file as a different format, then rename it back to XML after it's been saved. I would be surprised if it still gets formatted incorrectly. Not pretty but pretty easy.