Deserialize XML containing an embedded xhtml element - c#

I am receiving xml that contains embedded xhtml.
I would like to return the DocBody Element as a string that contains all of the xhtml, but do not know how since the xhtml is contained within the xml. I have not seen many examples of how to do this.
I am receiving this result via an API so the XML cannot be modified.
<Documents xmlns="http://mycompany.com/api/v2" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<Document>
<Content i:type="CreditAnalystNote">
<DocBody>
<xhtml>
<p>
Paragrah 1 html.
</p>
<p>
Paragrah 2 html.
</p>
<p>
Paragrah 3 html.
</p>
<p>
Paragrah 4 html.
</p>
</xhtml>
</DocBody>
</Content>
</Document>
</Documents>
The class I am using to deserialize the XML with is given as follows:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Xml.Serialization;
namespace Corporate
{
[Serializable]
[XmlRoot(ElementName = "Documents", Namespace = "http://cms.mycompany.com/api/v2")]
public class RPSShareclassDocuments
{
[XmlElement(ElementName = "Document")]
public List<ShareClassDocumentData> DocumentData { get; set; }
}
[Serializable]
public class ShareClassDocumentData
{
[XmlElement("Content")]
public DocumentShareClassData CreditNote { get; set; }
}
[XmlType("CreditAnalystNote")]
public class DocumentShareClassData
{
[XmlElement]
public string DocBody { get; set; }
}
}
I am attempting to deserialize the text using the following:
XDocument xDocResult = new XDocument();
xDocResult = ... (gets content into an XDocument)
IXmlUtils xmlUtility = new XmlUtils();
RPSShareclassDocuments document = xmlUtility.Deserialize<RPSShareclassDocuments>(xDocResult);
and the Deserialization method is:
public T Deserialize<T>(XDocument doc)
{
string szNamespace = doc.Root.Name.Namespace.NamespaceName;
XmlSerializer serializer = new XmlSerializer(typeof(T), szNamespace);
T result;
using (System.Xml.XmlReader reader = doc.CreateReader())
{
result = (T)serializer.Deserialize(reader);
}
return result;
}
EDIT 1
The only way that I could figure out how to extract the XHTML from the XML was to use the following code, though this is not what I was hoping for.
xDocResult = restClient.RestGET(szCurrentSecurityAPI);
XNamespace ns = xDocResult.Root.Name.Namespace.NamespaceName;
XElement xdb = xDocResult.Root.Element(ns + "Document").Element(ns + "Content").Element(ns + "DocBody").Element(ns + "xhtml");
If there is a better option such as extracting the information from the XDocument directly or XSLT, I would also be interested.
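One possible way to get the contents of DocBody as a single string straight from the XDocument is sketched below. It is only a sketch under assumptions: the class and method names (DocBodyReader, GetDocBodyXhtml) are made up for illustration, and it takes the first DocBody it finds. Because the p elements inherit the document's default namespace, each serialized node will carry an xmlns declaration, which may need stripping depending on how the markup is used.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the original code.
public static class DocBodyReader
{
    // Returns the markup inside <xhtml> as one string, or null if the element is missing.
    public static string GetDocBodyXhtml(XDocument doc)
    {
        // Reuse whatever namespace the root element declares.
        XNamespace ns = doc.Root.Name.Namespace;

        XElement xhtml = doc.Descendants(ns + "DocBody")
                            .Elements(ns + "xhtml")
                            .FirstOrDefault();
        if (xhtml == null)
        {
            return null;
        }

        // Concatenate the child nodes so the <xhtml> wrapper itself is left out.
        return string.Concat(
            xhtml.Nodes().Select(n => n.ToString(SaveOptions.DisableFormatting)));
    }
}
```

The returned string could then be assigned to a plain string property after deserializing the rest of the document.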

Related

C#/XML: XPathNavigator.SelectSingleNode() always returns null

I'm trying to integrate a WebDAV client into some bigger tool suite to be able to create events/notifications from my software in the users existing calendar. My project is a WPF application written in c#.
I have set up a calendar with a WebDAV interface/api available and now I try to read the ctag property of the calendar. When sending the PROPFIND http request
<?xml version="1.0" encoding="utf-8"?>
<d:propfind xmlns:d="DAV:" xmlns:cs="http://calendarserver.org/ns/">
<d:prop>
<d:displayname/>
<cs:getctag/>
</d:prop>
</d:propfind>
I receive a http response with the following content:
<?xml version="1.0" encoding="utf-8"?>
<d:multistatus xmlns:d="DAV:" xmlns:nmm="http://all-inkl.com/ns" xmlns:cal="urn:ietf:params:xml:ns:caldav" xmlns:cs="http://calendarserver.org/ns/">
<d:response>
<d:href>/calendars/cal0015dc8/1/</d:href>
<d:propstat>
<d:prop>
<d:displayname>My Calendar Name</d:displayname>
<cs:getctag>0</cs:getctag>
</d:prop>
<d:status>HTTP/1.1 200 OK</d:status>
</d:propstat>
</d:response>
</d:multistatus>
I know that the namespaces might look a little suspicious, some with and some without a trailing slash /, namespace d even with a trailing colon :, but this is exactly what I get from the server. If I for example change the namespace xmlns:d="DAV:" in my request to xmlns:d="DAV", I get a response status 500: InternalServerError, so I took the namespace declarations exactly as they are in the response.
Now, I want to get the value from the cs:getctag node. Problem is, everything I tried always returns null when navigating through the xml structure.
For clarification: response.Content.ReadAsStringAsync().Result returns the aforementioned response xml string.
First try: Load response in a XmlDocument and access the subnodes by namespace/name combination:
using System.Xml;
XmlDocument doc = new XmlDocument();
XmlNamespaceManager xmlNamespaceManager = new XmlNamespaceManager(doc.NameTable);
xmlNamespaceManager.AddNamespace("d", "DAV:");
xmlNamespaceManager.AddNamespace("nmm", "http://all-inkl.com/ns");
xmlNamespaceManager.AddNamespace("cal", "urn:ietf:params:xml:ns:caldav");
xmlNamespaceManager.AddNamespace("cs", "http://calendarserver.org/ns/");
doc.LoadXml(response.Content.ReadAsStringAsync().Result);
XmlNode root = doc.DocumentElement;
XmlNode ctagNode = root["response", "d"]["propstat", "d"]["prop", "d"]["getctag", "cs"];
ctag = Convert.ToInt64(ctagNode.InnerText);
The node root is correctly set to element <d:multistatus>, but in the next line, where ctagNode should get selected, the code throws an exception:
System.NullReferenceException: Object reference not set to an instance of an object.
Second Try: Get the node with a XPath selection
using System.IO;
using System.Xml;
using System.Xml.Linq;
using System.Xml.XPath;
XmlReader xmlReader = XmlReader.Create(new StringReader(response.Content.ReadAsStringAsync().Result));
XmlNamespaceManager nsManager = new XmlNamespaceManager(xmlReader.NameTable);
nsManager.AddNamespace("d", "DAV:");
nsManager.AddNamespace("nmm", "http://all-inkl.com/ns");
nsManager.AddNamespace("cal", "urn:ietf:params:xml:ns:caldav");
nsManager.AddNamespace("cs", "http://calendarserver.org/ns/");
XDocument myXDocument = XDocument.Load(xmlReader);
XPathNavigator myNavigator = myXDocument.CreateNavigator();
string query = "//d:multistatus/d:response/d:propstat/d:prop/cs:getctag";
XPathNavigator ctagElement = myNavigator.SelectSingleNode(query, nsManager);
ctag = ctagElement.ValueAsLong;
After the execution of XPathNavigator ctagElement = myNavigator.SelectSingleNode(query, nsManager);, the object ctagElement is still null.
Can someone point out what I'm doing wrong in either case (1-Bare xml, 2-XPath) and how to do it right?
I would appreciate answers that help me solve this problem and that generally help me understand how to correctly navigate in xml data. You're welcome to also link to a comprehensive documentation or tutorial.
As @GSerg pointed out in his comment on my question, I was indeed not using the XmlNamespaceManager I had created in my First Try solution.
As it turns out, there was just one small mistake in my code example:
using System.Xml;
XmlDocument doc = new XmlDocument();
XmlNamespaceManager xmlNamespaceManager = new XmlNamespaceManager(doc.NameTable);
xmlNamespaceManager.AddNamespace("d", "DAV:");
xmlNamespaceManager.AddNamespace("nmm", "http://all-inkl.com/ns");
xmlNamespaceManager.AddNamespace("cal", "urn:ietf:params:xml:ns:caldav");
xmlNamespaceManager.AddNamespace("cs", "http://calendarserver.org/ns/");
doc.LoadXml(response.Content.ReadAsStringAsync().Result);
XmlNode root = doc.DocumentElement;
// THIS LINE WAS WRONG (it passes the prefix where the namespace URI is expected):
// XmlNode ctagNode = root["response", "d"]["propstat", "d"]["prop", "d"]["getctag", "cs"];
// IT SHOULD LOOK LIKE THIS:
XmlNode ctagNode = root["response", xmlNamespaceManager.LookupNamespace("d")]
["propstat", xmlNamespaceManager.LookupNamespace("d")]
["prop", xmlNamespaceManager.LookupNamespace("d")]
["getctag", xmlNamespaceManager.LookupNamespace("cs")];
ctag = Convert.ToInt64(ctagNode.InnerText);
Looks like the syntax
XmlNode childNode = parentNode["nameOfChildNode", "namespaceOfChildNode"]
requires the full namespace URI, not the namespace prefix.
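A compact way to see this, as a minimal sketch: the helper below (CtagReader and ReadCtag are names invented for illustration) passes the namespace URIs directly to the indexer and uses null-conditional access, so a missing element produces a clear error instead of a NullReferenceException.

```csharp
using System;
using System.Xml;

// Hypothetical helper, not part of the original code.
public static class CtagReader
{
    private const string Dav = "DAV:";
    private const string CalendarServer = "http://calendarserver.org/ns/";

    public static long ReadCtag(string responseXml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(responseXml);

        // The indexer takes (localName, namespaceURI); a prefix such as "d" never matches.
        XmlElement ctag = doc.DocumentElement
            ?["response", Dav]
            ?["propstat", Dav]
            ?["prop", Dav]
            ?["getctag", CalendarServer];

        if (ctag == null)
        {
            throw new InvalidOperationException("cs:getctag element not found in response.");
        }
        return Convert.ToInt64(ctag.InnerText);
    }
}
```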
As for my Second Try, I already used the namespace manager there, and the code worked after a Visual Studio restart and a solution rebuild. No code change required.
Thank you @GSerg :-)
Try the following:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Xml;
using System.Xml.Serialization;
namespace ConsoleApplication186
{
class Program
{
const string FILENAME = @"c:\temp\test.xml";
static void Main(string[] args)
{
string xml = File.ReadAllText(FILENAME);
StringReader sReader = new StringReader(xml);
XmlReader xReader = XmlReader.Create(sReader);
XmlSerializer serializer = new XmlSerializer(typeof(MultiStatus));
MultiStatus multiStatus = (MultiStatus)serializer.Deserialize(xReader);
}
}
[XmlRoot(ElementName = "multistatus", Namespace = "DAV:")]
public class MultiStatus
{
[XmlElement(Namespace = "DAV:")]
public Response response { get; set; }
}
public class Response
{
[XmlElement(Namespace = "DAV:")]
public string href { get; set; }
[XmlElement(ElementName = "propstat", Namespace = "DAV:")]
public Propstat propstat { get; set; }
}
public class Propstat
{
[XmlElement(ElementName = "prop", Namespace = "DAV:")]
public Prop prop { get; set; }
[XmlElement(ElementName = "status", Namespace = "DAV:")]
public string status { get; set; }
}
public class Prop
{
[XmlElement(Namespace = "DAV:")]
public string displayname { get; set; }
[XmlElement(Namespace = "http://calendarserver.org/ns/")]
public string getctag { get; set; }
}
}

How to parse xml with multiple xmlns attribute in c#?

I have an XML document that begins like this:
<Invoice
xsi:schemaLocation="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2../xsdrt/maindoc/UBL-Invoice-2.1.xsd"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xades="http://uri.etsi.org/01903/v1.3.2#"
xmlns:ext="urn:oasis:names:specification:ubl:schema:xsd:CommonExtensionComponents-2"
xmlns:ds="http://www.w3.org/2000/09/xmldsig#"
xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2">
<cbc:UBLVersionID>2.1</cbc:UBLVersionID>
<cbc:CustomizationID>TR1.2</cbc:CustomizationID>
<cbc:ProfileID>TEMELFATURA</cbc:ProfileID>
<cbc:ID>ALP2018000007216</cbc:ID>
<!-- ... -->
and I try to parse it with a method like this:
public static T FromXml<T>(string xmlString)
{
StringReader xmlReader = new StringReader(xmlString);
XmlSerializer deserializer = new XmlSerializer(typeof(T));
return (T)deserializer.Deserialize(xmlReader);
}
and my XML model looks like this:
[Serializable]
[XmlRoot(
Namespace = "urn:oasis:names:specification:ubl:schema:xsd:Invoice-2",
ElementName = "Invoice",
DataType = "string",
IsNullable = true)]
public class Invoice
{
public string CustomizationID { get; set; }
// ...
}
However, I cannot parse the XML document: all values come back null. I think it is because of the multiple xmlns attributes on the Invoice tag. I couldn't solve the problem.
The default namespace of the document is urn:oasis:names:specification:ubl:schema:xsd:Invoice-2 which you have correctly put in the XmlRoot, but the child elements such as UBLVersionID are prefixed with cbc, which is a different namespace. You have to put that namespace against the property to let the serializer know that's what it is.
For example:
[Serializable]
[XmlRoot(
Namespace = "urn:oasis:names:specification:ubl:schema:xsd:Invoice-2",
ElementName = "Invoice",
DataType = "string",
IsNullable = true)]
public class Invoice
{
[XmlElement(Namespace = "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2")]
public string CustomizationID { get; set; }
// ...
}
In Visual Studio, you can use Edit > Paste Special > Paste Xml As Classes to see how to decorate a class to match your XML if you're in doubt.
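To make the point concrete, here is a minimal, self-contained sketch of the same idea. The class names (UblNamespaces, InvoiceSketch, InvoiceParser) are invented for illustration, and only two of the cbc elements are mapped; elements without a matching property are simply skipped by the serializer.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical names, used only for this sketch.
public static class UblNamespaces
{
    public const string Invoice = "urn:oasis:names:specification:ubl:schema:xsd:Invoice-2";
    public const string Cbc = "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2";
}

[XmlRoot(ElementName = "Invoice", Namespace = UblNamespaces.Invoice)]
public class InvoiceSketch
{
    // Each child element lives in the cbc namespace, so say so on the property.
    [XmlElement(Namespace = UblNamespaces.Cbc)]
    public string CustomizationID { get; set; }

    [XmlElement(Namespace = UblNamespaces.Cbc)]
    public string ID { get; set; }
}

public static class InvoiceParser
{
    public static InvoiceSketch FromXml(string xmlString)
    {
        var serializer = new XmlSerializer(typeof(InvoiceSketch));
        using (var reader = new StringReader(xmlString))
        {
            return (InvoiceSketch)serializer.Deserialize(reader);
        }
    }
}
```

Without the Namespace on each XmlElement, the serializer would look for CustomizationID in the root's namespace and leave the property null.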

How to remove empty namespace attribute on manually added xml string when serializing object?

I am using XmlSerializer to output my object model to XML. Everything works very well but now I need to add several lines of pre-built XML to the object without building classes for each line. After lots of searching, I found that I can convert the xml string to an XmlElement using XmlDocument's LoadXml and DocumentElement calls. I get the XML I want except that the string section has an empty namespace. How can I eliminate the empty namespace attribute? Is there a better way to add an xml string to the object and have it be serialized properly?
Note: I am only creating output, so I don't need to deserialize the generated XML. I am fairly new to the C#/.NET world and, hence, to XmlSerializer.
Here is my code:
public class Book
{
public string Title { get; set; }
public string Author { get; set; }
public XmlElement Extension { get; set; }
public Book()
{
}
public void AddExtension()
{
string xmlString = "<AdditionalInfo>" +
"<SpecialHandling>Some Value</SpecialHandling>" +
"</AdditionalInfo>";
this.Extension = GetElement(xmlString);
}
public static XmlElement GetElement(string xml)
{
XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);
return doc.DocumentElement;
}
}
static void Main(string[] args)
{
TestSerialization p = new TestSerialization();
Book bookOne = new Book();
bookOne.Title = "How to Fix Code";
bookOne.Author = "Dee Bugger";
bookOne.AddExtension();
System.Xml.Serialization.XmlSerializer serializer = new XmlSerializer(typeof(Book), "http://www.somenamespace.com");
using (var writer = new StreamWriter("C:\\BookReport.xml"))
using (var xmlWriter = XmlWriter.Create(writer, new XmlWriterSettings { Indent = true }))
{
serializer.Serialize(xmlWriter, bookOne);
}
}
Here is my output:
<?xml version="1.0" encoding="utf-8"?>
<Book xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://www.somenamespace.com">
<Title>How to Fix Code</Title>
<Author>Dee Bugger</Author>
<Extension>
<AdditionalInfo xmlns="">
<SpecialHandling>Some Value</SpecialHandling>
</AdditionalInfo>
</Extension>
</Book>
It is the xmlns="" on AdditionalInfo that I want to eliminate. I believe this is coming out because there is no association between the XmlDocument I created and the root serialized object, so the XmlDocument creates its own namespace. How can I tell the XmlDocument (and really, the generated XmlElement) that it belongs to the same namespace as the serialized object?
This is added because the parent elements have a namespace and your AdditionalInfo element does not. The xmlns="" attribute changes the default namespace for that element and its children.
If you want to get rid of it, then presumably you want the AdditionalInfo element to have the same namespace as its parent. In which case, you need to change your XML to this:
string xmlString = "<AdditionalInfo xmlns=\"http://www.somenamespace.com\">" +
"<SpecialHandling>Some Value</SpecialHandling>" +
"</AdditionalInfo>";

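A minimal sketch of the namespace fix from the answer above, serializing to a string instead of a file so it can be tried in isolation. BookSerializationSketch and SerializeSample are names invented for illustration; the Book class is a trimmed copy of the one in the question.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

public class Book
{
    public string Title { get; set; }
    public string Author { get; set; }
    public XmlElement Extension { get; set; }
}

// Hypothetical helper, not part of the original code.
public static class BookSerializationSketch
{
    public const string BookNamespace = "http://www.somenamespace.com";

    public static string SerializeSample()
    {
        // Declare the parent's namespace on the fragment so it is not in the empty namespace.
        var fragment = new XmlDocument();
        fragment.LoadXml(
            "<AdditionalInfo xmlns=\"" + BookNamespace + "\">" +
            "<SpecialHandling>Some Value</SpecialHandling>" +
            "</AdditionalInfo>");

        var book = new Book
        {
            Title = "How to Fix Code",
            Author = "Dee Bugger",
            Extension = fragment.DocumentElement
        };

        var serializer = new XmlSerializer(typeof(Book), BookNamespace);
        using (var sw = new StringWriter())
        {
            using (var xw = XmlWriter.Create(sw, new XmlWriterSettings { Indent = true }))
            {
                serializer.Serialize(xw, book);
            }
            return sw.ToString();
        }
    }
}
```

Since AdditionalInfo now shares the namespace of Book, the writer has no reason to reset the default namespace with xmlns="".
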
Reading large XML files with C#

I would like to know how I can read an XML file from my desktop and put it into a string.
Here is my XML:
<smallusers>
<user id="1">
<name>John</name>
<motto>I am john, who are you?</motto>
</user>
<user id="2">
<name>Peter</name>
<motto>Hello everyone!</motto>
</user>
</smallusers>
<bigusers>
<user id="3">
<name>Barry</name>
<motto>Earth is awesome</motto>
</user>
</bigusers>
I want to store each user, but still detect if they're small or big. Is there a way to do this?
Before you downrate this, you might want to check google because I did research, but found nothing.
"Before you downrate this, you might want to check google because I
did research, but found nothing"
You found nothing because you don't know what you are searching for. Also, your XML is invalid: you need to enclose it in a single root element. Then the first thing you need to do is read this file from the desktop, if it exists.
You can check the size at that point and decide whether the file is "too large", even though it rarely matters. I highly doubt your XML file will be 5+ GB in size. If it is, you need an alternative: by default no single object in a .NET program may be over 2 GB, and the best you could do for a single string is roughly 1,073,741,823 characters, even on a 64-bit machine.
For very large XML files (anything above 1.0 GB), combine XmlReader and LINQ to XML, as stated by Jon Skeet:
If your document is particularly huge, you can combine XmlReader and
LINQ to XML by creating an XElement from an XmlReader for each of your
"outer" elements in a streaming manner: this lets you do most of the
conversion work in LINQ to XML, but still only need a small portion of
the document in memory at any one time.
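The streaming approach described in that quote could look roughly like the sketch below. It is an illustration under assumptions, not a tuned implementation: UserStreamer and StreamUsers are invented names, and the element names (smallusers, bigusers, user) come from the XML in the question. Only one user element is materialized at a time, and the enclosing group name is tracked so small and big users can still be told apart.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper, not part of the original code.
public static class UserStreamer
{
    public static IEnumerable<(string Group, XElement User)> StreamUsers(TextReader input)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            string group = null;
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "user")
                {
                    // ReadFrom consumes the whole <user> element and leaves the
                    // reader on the node that follows it, so no extra Read() here.
                    yield return (group, (XElement)XNode.ReadFrom(reader));
                }
                else
                {
                    if (reader.NodeType == XmlNodeType.Element &&
                        (reader.LocalName == "smallusers" || reader.LocalName == "bigusers"))
                    {
                        // Remember which group the following users belong to.
                        group = reader.LocalName;
                    }
                    reader.Read();
                }
            }
        }
    }
}
```

For a file on disk, a StreamReader over the file can be passed in place of the StringReader used for testing.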
For small XML files (anything 1.0 GB or lower), stick to the DOM as shown below.
With that said, what you need is to learn what serialization and deserialization mean.
Serialize: convert an object instance to an XML document.
Deserialize: convert an XML document into an object instance.
Instead of XML you can also use JSON, binary, etc.
In your case this is what can be done to Deserialize this XML document back into an Object in order for you to use in your code.
First fix up the XML and give it a Root.
<?xml version="1.0" encoding="UTF-8"?>
<DataRoot>
<smallusers>
<user id="1">
<name>John</name>
<motto>I am john, who are you?</motto>
</user>
<user id="2">
<name>Peter</name>
<motto>Hello everyone!</motto>
</user>
</smallusers>
<bigusers>
<user id="3">
<name>Barry</name>
<motto>Earth is awesome</motto>
</user>
</bigusers>
</DataRoot>
Then create the root class in C#, you may generate this directly in Visual Studio 2012+ by copying your XML and going to Edit - Paste Special, but I like to use: XML to C# Class Generator
Here is what your code would look like after you generate the C# Root Class for your XML, hope it helps you understand it better.
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;
namespace ConsoleApplication1
{
public class Program
{
[XmlRoot(ElementName = "user")]
public class User
{
[XmlElement(ElementName = "name")]
public string Name { get; set; }
[XmlElement(ElementName = "motto")]
public string Motto { get; set; }
[XmlAttribute(AttributeName = "id")]
public string Id { get; set; }
}
[XmlRoot(ElementName = "smallusers")]
public class Smallusers
{
[XmlElement(ElementName = "user")]
public List<User> User { get; set; }
}
[XmlRoot(ElementName = "bigusers")]
public class Bigusers
{
// A list, so that more than one <user> under <bigusers> is kept.
[XmlElement(ElementName = "user")]
public List<User> User { get; set; }
}
[XmlRoot(ElementName = "DataRoot")]
public class DataRoot
{
[XmlElement(ElementName = "smallusers")]
public Smallusers Smallusers { get; set; }
[XmlElement(ElementName = "bigusers")]
public Bigusers Bigusers { get; set; }
}
static void Main(string[] args)
{
string testXMLData = @"<DataRoot><smallusers><user id=""1""><name>John</name><motto>I am john, who are you?</motto></user><user id=""2""><name>Peter</name><motto>Hello everyone!</motto></user></smallusers><bigusers><user id=""3""><name>Barry</name><motto>Earth is awesome</motto></user></bigusers></DataRoot>";
var fileXmlData = File.ReadAllText(@"C:\XMLFile.xml");
var deserializedObject = DeserializeFromXML(fileXmlData);
var serializedToXML = SerializeToXml(deserializedObject);
//I want to store each user, but still detect if they're small or big, is there a way to do this?
foreach (var smallUser in deserializedObject.Smallusers.User)
{
//Iterating your collection of Small users?
//Do what you need here with `smalluser`.
var name = smallUser.Name; //Example...
}
Console.WriteLine(serializedToXML);
Console.ReadKey();
}
public static string SerializeToXml(DataRoot DataObject)
{
var xsSubmit = new XmlSerializer(typeof(DataRoot));
using (var sw = new StringWriter())
{
using (var writer = XmlWriter.Create(sw))
{
xsSubmit.Serialize(writer, DataObject);
}
// Disposing the XmlWriter flushes any buffered output into the StringWriter,
// so explicit Flush/Close calls are not needed.
return sw.ToString();
}
}
public static DataRoot DeserializeFromXML(string xml)
{
var xsExpirations = new XmlSerializer(typeof(DataRoot));
DataRoot rootDataObj = null;
using (TextReader reader = new StringReader(xml))
{
rootDataObj = (DataRoot)xsExpirations.Deserialize(reader);
reader.Close();
}
return rootDataObj;
}
}
}

XML namespace in ASP.net MVC, C#

I'm trying to get an XML file generated using a namespace as such:
<namespace:Example1>
<namespace:Part1>Value1</namespace:Part1>
</namespace:Example1>
I've tried using
[XmlAttribute(Namespace = "namespace")]
public string Namespace { get; set; }
but I'm clearly missing something. The structure I've used is
[XmlRoot("Example1")]
public class Blah
{
[XmlAttribute(Namespace = "namespace")]
public string Namespace { get; set; }
but all I get is
<Example1>
<Part1>Value1</Part1>
</Example1>
Any help would be greatly appreciated.
Edit:
[XmlRoot(ElementName="Chart2", Namespace="vc")]
doesn't work.
You can use the XmlSerializerNamespaces class to add the prefix for a given namespace in the xml.
I hope the code below will help you.
[XmlRoot(ElementName = "Example1")]
public class Blah
{
public string Part1 { get; set; }
}
Blah bl = new Blah();
bl.Part1 = "MyPart1";
// Serialization
/* Create an XmlSerializerNamespaces object and add one prefix-namespace pair. */
XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
ns.Add("namespace", "test");
XmlSerializer s = new XmlSerializer(typeof(Blah),"test");
TextWriter w = new StreamWriter(@"c:\list.xml");
s.Serialize(w, bl,ns);
w.Close();
/* Output */
<?xml version="1.0" encoding="utf-8"?>
<namespace:Example1 xmlns:namespace="test">
<namespace:Part1>MyPart1</namespace:Part1>
</namespace:Example1>
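To try the same approach without writing to c:\list.xml, a minimal in-memory sketch might look like this. PrefixSketch and SerializeWithPrefix are names invented for illustration; the prefix "namespace" and the namespace "test" are taken from the answer above.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

[XmlRoot(ElementName = "Example1")]
public class Blah
{
    public string Part1 { get; set; }
}

// Hypothetical helper, not part of the original code.
public static class PrefixSketch
{
    public static string SerializeWithPrefix(Blah value)
    {
        // Map the prefix "namespace" to the namespace "test".
        var ns = new XmlSerializerNamespaces();
        ns.Add("namespace", "test");

        // "test" becomes the default namespace for the serialized type.
        var serializer = new XmlSerializer(typeof(Blah), "test");
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, value, ns);
            return writer.ToString();
        }
    }
}
```

Because the output goes to a StringWriter, the XML declaration will report utf-16 rather than utf-8; the element structure is what matters here.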
Can you try this on your Model.cs:
Copy the whole XML, then on the Model.cs:
Edit > Paste Special > Paste XML as Classes.
Might help you. ;)
