Query an XmlDocument without getting a 'Namespace prefix is not defined' problem - c#

I've got an XML document that both defines and references some namespaces. I load it into an XmlDocument object and, to the best of my knowledge, create an XmlNamespaceManager object to run XPath queries against it. The problem is that I'm getting XPath exceptions saying the namespace "my" is not defined. How do I get the namespace manager to see that the namespaces I am referencing are already defined? Or rather, how do I get the namespace definitions from the document into the namespace manager?
Furthermore, it strikes me as strange that you have to provide a namespace manager to the document when you create it from the document's NameTable in the first place. Even if you need to hardcode namespaces manually, why can't you add them directly to the document? Why do you always have to pass this namespace manager with every single query? Why can't XmlDocument just know?
Code:
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load(programFiles + @"Common Files\Microsoft Shared\web server extensions\12\TEMPLATE\FEATURES\HfscBookingWorkflow\template.xml");
XmlNamespaceManager ns = new XmlNamespaceManager(xmlDoc.NameTable);
XmlNode referenceNode = xmlDoc.SelectSingleNode("/my:myFields/my:ReferenceNumber", ns);
referenceNode.InnerXml = this.bookingData.ReferenceNumber;
XmlNode titleNode = xmlDoc.SelectSingleNode("/my:myFields/my:Title", ns);
titleNode.InnerXml = this.bookingData.FamilyName;
Xml:
<?xml version="1.0" encoding="UTF-8" ?>
<?mso-infoPathSolution name="urn:schemas-microsoft-com:office:infopath:Inspection:-myXSD-2010-01-15T18-21-55" solutionVersion="1.0.0.104" productVersion="12.0.0" PIVersion="1.0.0.0" ?>
<?mso-application progid="InfoPath.Document" versionProgid="InfoPath.Document.2"?>
<my:myFields xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:my="http://schemas.microsoft.com/office/infopath/2003/myXSD/2010-01-15T18:21:55" xmlns:xd="http://schemas.microsoft.com/office/infopath/2003">
<my:DateRequested xsi:nil="true" />
<my:DateVisited xsi:nil="true" />
<my:ReferenceNumber />
<my:FireCall>false</my:FireCall>
Update:
ns.AddNamespace("xsi", "http://www.w3.org/2001/XMLSchema-instance");
ns.AddNamespace("xhtml", "http://www.w3.org/1999/xhtml");
ns.AddNamespace("xd", "http://schemas.microsoft.com/office/infopath/2003");
ns.AddNamespace("my", "http://schemas.microsoft.com/office/infopath/2003/myXSD/2010-01-15T18:21:55");
This does the job, but it means I have to hardcode against this particular XML schema. The schema represents an InfoPath form template. In particular, the my namespace URL will be different for every form template, so I really don't want to hardcode it. It would be nice to find a clean way to get this namespace from the XML without resorting to regex.
I was hoping that the XmlNamespaceManager would just pick up the namespace definitions from the NameTable. I mean, they're in the XML, but I still have to define them.

Here is the answer to the "Why can't XmlDocument just know?" question.
NameTable is just an optimization for storing names. It actually has nothing to do with namespaces.
And even if XmlNamespaceManager could infer all namespaces and prefixes from the XML document, that would not help in the general case, because the same prefix can be bound to different URIs in different parts of a document. For example, what should XmlNamespaceManager map the "my" prefix to in this case:
<root>
<foo xmlns:my="blah"/>
<foo xmlns:my="balh-blah-blah"/>
</root>
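That said, when every prefix you query is declared on the root element (as in the InfoPath template from the question), one workaround is to copy the declarations in scope at the root into the manager instead of hardcoding them. This is only a minimal sketch under that assumption; NamespaceHelper and CreateManagerFromRoot are hypothetical names, not framework members:

```csharp
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;

public static class NamespaceHelper
{
    // Hypothetical helper: builds a manager from the prefixes that are in
    // scope at the document element. Prefixes declared only on deeper
    // elements are NOT picked up, which is the limitation described above.
    public static XmlNamespaceManager CreateManagerFromRoot(XmlDocument doc)
    {
        var manager = new XmlNamespaceManager(doc.NameTable);
        XPathNavigator nav = doc.DocumentElement.CreateNavigator();

        foreach (KeyValuePair<string, string> pair in
                 nav.GetNamespacesInScope(XmlNamespaceScope.ExcludeXml))
        {
            // pair.Key is the prefix ("" for a default namespace),
            // pair.Value is the namespace URI.
            manager.AddNamespace(pair.Key, pair.Value);
        }

        return manager;
    }
}
```

With this in place, the query from the question could be written as xmlDoc.SelectSingleNode("/my:myFields/my:ReferenceNumber", NamespaceHelper.CreateManagerFromRoot(xmlDoc)). Note that XPath 1.0 never applies a default namespace to unprefixed names, so elements in a default namespace still need an explicit prefix in the query.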

Have you defined "my" in the namespace-manager?
ns.AddNamespace("my", "http://schemas.microsoft.com/office/infopath/2003/myXSD/2010-01-15T18:21:55");
Or better - choose something that is unlikely to conflict. It does seem odd that it didn't pick it up from the name-table, though.

For me with InfoPath 2007 this solved the problem
public static XmlNamespaceManager GetNameSpaceManager(this XmlDocument document)
{
    XmlNamespaceManager xmlNamespaceManager = new XmlNamespaceManager(document.NameTable);
    xmlNamespaceManager.AddNamespace("xsi", "http://www.w3.org/2001/XMLSchema-instance");
    xmlNamespaceManager.AddNamespace("dfs", "http://schemas.microsoft.com/office/infopath/2003/dataFormSolution");
    xmlNamespaceManager.AddNamespace("d", "http://schemas.microsoft.com/office/infopath/2003/ado/dataFields");
    xmlNamespaceManager.AddNamespace("my", "http://schemas.microsoft.com/office/infopath/2003/myXSD/2012-03-29T06:28:28");
    xmlNamespaceManager.AddNamespace("xd", "http://schemas.microsoft.com/office/infopath/2003");
    return xmlNamespaceManager;
}

Related

Namespace of specific XML Node in c#

I have the following XML structure:
<?xml version="1.0" encoding="utf-16"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<StoreResponse xmlns="http://www.some-site.com">
<StoreResult>
<Message />
<Code>OK</Code>
</StoreResult>
</StoreResponse>
</soap:Body>
</soap:Envelope>
I need to get the InnerText of Code from this document, and I need help with the appropriate XPath statement.
I'm really confused by XML namespaces. While working on a previous namespace problem in another XML document, I learned that even if there's nothing in front of Code (e.g. ns:Code), it is still part of a namespace defined by the xmlns attribute in its parent node. Now, there are multiple xmlns attributes defined in the parents of Code. What is the namespace that I need to specify in an XPath statement? Is there such a thing as a "primary namespace"? Do child nodes inherit the (primary) namespace of their parents?
The namespace of the <Code> element is http://www.some-site.com. xmlns:xxx means that names prefixed by xxx: (like soap:Body) have that namespace. xmlns by itself means that this is the default namespace for names without any prefix.
An example of using an XDocument (Linq) approach:
XNamespace ns = "http://www.some-site.com";
var document = XDocument.Parse("your-xml-string");
var elements = document.Descendants(ns + "StoreResult");
Unprefixed descendant elements inherit the default namespace declared by their nearest ancestor. In your example you will need to create two namespaces, one for the SOAP envelope and a second for "some-site".
Here's an option I found in this question: Weirdness with XDocument, XPath and namespaces
var xml = "<your xml>";
var doc = XDocument.Parse(xml); // Could use .Load() here too
var code = doc.XPathSelectElement("//*[local-name()='Code']");
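For comparison, a namespace-aware LINQ to XML version can walk down to Code using the two namespaces mentioned above. This is a rough sketch, assuming the response always has the Envelope/Body/StoreResponse/StoreResult shape shown in the question; SoapCodeReader and ReadCode are made-up names for illustration:

```csharp
using System.Xml.Linq;

public static class SoapCodeReader
{
    // Returns the text of StoreResult/Code, or null if any step of the
    // expected path is missing.
    public static string ReadCode(string xml)
    {
        // Namespace bound to the "soap" prefix in the question's document.
        XNamespace soap = "http://www.w3.org/2003/05/soap-envelope";
        // Default namespace declared on StoreResponse and inherited by its children.
        XNamespace site = "http://www.some-site.com";

        XDocument doc = XDocument.Parse(xml);
        XElement code = doc.Root
            ?.Element(soap + "Body")
            ?.Element(site + "StoreResponse")
            ?.Element(site + "StoreResult")
            ?.Element(site + "Code");

        return code?.Value;
    }
}
```

Unlike the local-name() query, this version will not accidentally match a Code element that belongs to some other namespace.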

Why does having a xmlns cause my C# program not to read XML?

I have a C# program that attempts to read the following xml, but can't read any elements:
<?xml version="1.0" encoding="UTF-8"?>
<!-- Comments Here -->
<FileFeed
xmlns="http://www.mycompany.com/schemas/xxx/FileFeed/V1"
xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.somecompany.com/schemas/xxx/FileFeed/V1
FileFeed.xsd"
RecordCount = "1">
<Object>
<ID>PAMSMOKE110113xxx</ID>
<CorpID>12509</CorpID>
<AnotherID>201654702345</AnotherID>
<TimeStamp>2013-09-03</TimeStamp>
<Type>Some Type</Type>
<SIM_ID>89011704258012600767</SIM_ID>
<Code>ZZZ</Code>
<Year>2013</Year>
</Object>
</FileFeed>
With the above XML, my C# program is unable to read any elements. For instance, the ID element is always null.
Now if I simply remove the first xmlns from the above XML, my program can read all the elements without any issues. The problem is I have to process the XML file in the format that's given to me, and can't change the file format. My program reads the below XML just fine: Note the line xmlns="http://www.mycompany.com/schemas/xxx/FileFeed/V1" is removed.
<?xml version="1.0" encoding="UTF-8"?>
<!-- Comments Here -->
<FileFeed
xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.somecompany.com/schemas/xxx/FileFeed/V1
FileFeed.xsd"
RecordCount = "1">
<Object>
<ID>PAMSMOKE110113xxx</ID>
<CorpID>12509</CorpID>
<AnotherID>201654702345</AnotherID>
<TimeStamp>2013-09-03</TimeStamp>
<Type>Some Type</Type>
<SomeNumber>89011704258012600767</SomeNumber>
<Code>ZZZ</Code>
<Year>2013</Year>
</Object>
</FileFeed>
I realize I'm not posting any code, but just wondering what possible issue could I be having, where simply removing the xmlns line resolves everything??
Your problem is with XML namespaces.
Using LINQ to XML:
XNamespace ns = "http://www.mycompany.com/schemas/xxx/FileFeed/V1";
var xDoc = XDocument.Load(fname);
var id = xDoc.Root.Element(ns + "Object").Element(ns + "ID").Value;
Your root element FileFeed declares a default namespace (the xmlns attribute). This means that each unprefixed element inside it also belongs to that namespace.
The Element method takes an XName as its argument. Usually you use a string which gets implicitly converted into an XName.
If you want to include a namespace you create an XNamespace and add the string. Since XNamespace overloads the + operator this will also result in an XName.
XDocument doc = XDocument.Load("Test.xml");
// this will be null
XElement objectElementWithoutNS = doc.Root.Element("Object");
XNamespace ns = doc.Root.GetDefaultNamespace();
XElement objectElementWithNS = doc.Root.Element(ns + "Object");
XML namespaces are more or less like C# namespaces. Would you expect to reach a class the same way whether or not its namespace is set?
namespace My.Company.Schemas {
    public class FileFeed { }
}
vs
public class FileFeed { }
They are two DISTINCT classes! The same applies to XML: by setting a namespace you make it possible to have documents with similar or even identical internal structure that nevertheless represent two distinct, non-interchangeable documents. This is really convenient.
If you'd like help with why your actual reading method doesn't take the namespace into account, you have to show the C# code. The general rule, though, is that any reading API makes it possible to specify the namespace when reading.
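Since the question does not show its reading code, here is one possible illustration of that rule, assuming the file is read with XmlDocument and XPath. The prefix "ff" is an arbitrary choice, FileFeedReader and ReadFirstId are hypothetical names, and the sketch assumes the root element carries a default namespace as in the sample:

```csharp
using System.Xml;

public static class FileFeedReader
{
    // Returns the text of the first Object/ID element, or null when absent.
    public static string ReadFirstId(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var ns = new XmlNamespaceManager(doc.NameTable);
        // Bind an arbitrary prefix to whatever namespace the root element
        // uses, so the URI does not have to be hardcoded.
        ns.AddNamespace("ff", doc.DocumentElement.NamespaceURI);

        // Every step needs the prefix: XPath treats unprefixed names as
        // belonging to no namespace at all.
        XmlNode id = doc.SelectSingleNode("/ff:FileFeed/ff:Object/ff:ID", ns);
        return id?.InnerText;
    }
}
```

The same query without the ff: prefixes matches nothing against the namespaced file, which is consistent with the null ID described in the question.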

Selecting XML Node with XPath

I have an XML document from which I want to select a node. Here is the XML:
<?xml version="1.0" encoding="utf-8" ?>
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<InResponse xmlns="https://ww.ggg.com">
<InResult>Error </InResult>
</InResponse>
</soap:Body>
</soap:Envelope>
I am loading it using XmlDocument's LoadXml and trying to get the InResult node, but I get null. See below:
xml.SelectSingleNode("//InResult").InnerText;
You have a namespace declaration, so you should either add it to your XPath or use a namespace-agnostic XPath. Try the following code as a namespace-agnostic solution:
xml.SelectSingleNode("//*[local-name()='InResult']").InnerText;
This returns Error as the result.
From http://www.w3schools.com/ site:
local-name() - Returns the name of the current node or the first node
in the specified node set - without the namespace prefix
You can get more information about XPath functions here.
A namespace-aware solution is given below:
var namespaceManager = new XmlNamespaceManager(xml.NameTable);
namespaceManager.AddNamespace("defaultNS", "https://ww.ggg.com");
var result = xml.SelectSingleNode("//defaultNS:InResult", namespaceManager).InnerText;
Console.WriteLine (result); //prints Error
Brief XML notes:
This part in the root node, xmlns:soap="http://www.w3.org/2003/05/soap-envelope", is an XML namespace declaration. It is used to identify nodes in your XML structure. As a rule, you need to specify namespaces to access the nodes that belong to them, but there are namespace-agnostic solutions in XPath and in LINQ to XML. If you see a node name such as <soap:Body>, this means that the node belongs to that namespace.
This seems to be a namespace issue.
You can use an XmlNamespaceManager before you call SelectSingleNode():
XmlNamespaceManager ns = new XmlNamespaceManager(xml.NameTable);
ns.AddNamespace("ggg", "https://ww.ggg.com");
xml.SelectSingleNode("//ggg:InResult", ns).InnerText;
Attention: Not tested.

Process namespaces using XmlReader

I have a complex XML file with structure as follows:
<?xml version="1.0" encoding="UTF-8"?>
<Document xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="xxx:xxx:xxx:xxx:xxxxx:xxx:xsd:xxxx.xxx.xxx.xx">
<Element1>
<Element2>
<Element2A>xxxxxx</Element2A>
<Element2B>2012-08-29T00:00:00</Element2B>
</Element2>
</Element1>
</Document>
Now I am using XmlReader to read this XML document and process information as follows
XmlReader xr = XmlReader.Create(filename);
while (xr.Read())
{
xr.MoveToElement();
XElement node = (XElement)XElement.ReadFrom(xr);
Console.WriteLine(node.Name);
}
xr.Close();
The problem I am facing is in the output the namespace is prefixed to the ElementName. E.g output
{xxx:xxx:xxx:xxx:xxxxx:xxx:xsd:xxxx.xxx.xxx.xx}Element1
Is there any way I can remove/ handle this as I need to do further filtering using Element names and Child names.
XElement.Name is not (as you might expect) a String, but rather an XName which has a LocalName property, thus:
Console.WriteLine(node.Name.LocalName);
You may want to remove the namespace. One way to do that is to write C# code; another is to use an XSLT transformation, as suggested in "Remove Namespace".
-Milind
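As one possible reading of the "write C# code" option, the following sketch rebuilds an XElement tree with local names only. NamespaceStripper and Strip are hypothetical names, and the approach assumes no two elements or attributes differ only by namespace, since they would collide after stripping:

```csharp
using System.Linq;
using System.Xml.Linq;

public static class NamespaceStripper
{
    // Returns a copy of the tree in which every element and attribute name
    // is reduced to its local name; xmlns declarations are dropped.
    public static XElement Strip(XElement source)
    {
        return new XElement(
            source.Name.LocalName,
            // Copy ordinary attributes, skipping xmlns / xmlns:prefix declarations.
            source.Attributes()
                  .Where(a => !a.IsNamespaceDeclaration)
                  .Select(a => new XAttribute(a.Name.LocalName, a.Value)),
            // Recurse into child elements; text and comments are copied as they are.
            source.Nodes().Select(n => n is XElement child ? Strip(child) : n));
    }
}
```

After stripping, node.Name.ToString() prints just the element name, so later filtering can compare against plain strings. If only the printed name matters, node.Name.LocalName is the lighter option.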

Parse an xml document when namespace is no-longer available

I have a number of rather large, complex XML documents that I need to loop through. An xmlns is defined at the top of each document; however, the URL it points to is no longer available.
What's the best way to parse the file to get the important data from it using C#?
I tried to load it into a DataSet but would occasionally receive these errors:
The table (endpoint) cannot be the child table to itself in nested relations.
or
Cannot add a SimpleContent column to a table containing element columns or nested relations.
XPath was my next port of call but I had problems because of the lack of namespace.
I suspect this is seriously limiting my options, but does anyone have any suggestions?
Snippet of the XML document:
<?xml version="1.0" encoding="UTF-8"?>
<cdr:cdr_set xmlns:cdr="http://www.naturalconvergence.com/schema/cdr/v3/cdr">
<!-- Copyright (c) 2001-2009, all rights reserved -->
<cdr:cdr xmlns:cdr="http://www.naturalconvergence.com/schema/cdr/v3/cdr">
<cdr:call_id>2040-1247062136726-5485131</cdr:call_id>
<cdr:cdr_id>1</cdr:cdr_id>
<cdr:status>Normal</cdr:status>
<cdr:responsibility>
<cdr:tenant id="17">
<cdr:name>SpiriTel plc</cdr:name>
</cdr:tenant>
<cdr:site id="45">
<cdr:name>KWS</cdr:name>
<cdr:time_zone>GB</cdr:time_zone>
</cdr:site>
</cdr:responsibility>
<cdr:originator type="sipGateway">
<cdr:sipGateway id="3">
<cdr:name>Audiocodes-91</cdr:name>
</cdr:sipGateway>
</cdr:originator>
<cdr:terminator type="group">
<cdr:group>
<cdr:tenant id="17">
<cdr:name>SpiriTel plc</cdr:name>
</cdr:tenant>
<cdr:type>Broadcast</cdr:type>
<cdr:extension>6024</cdr:extension>
<cdr:name>OLD PMS DDIS DO NOT USE</cdr:name>
</cdr:group>
</cdr:terminator>
<cdr:initiation>Dialed</cdr:initiation>
<cdr:calling_number>02087893850</cdr:calling_number>
<cdr:dialed_number>01942760142</cdr:dialed_number>
<cdr:target>6024</cdr:target>
<cdr:direction>Inbound</cdr:direction>
<cdr:disposition>No Answer</cdr:disposition>
<cdr:timezone>GB</cdr:timezone>
<cdr:origination_timestamp>2009-07-08T15:08:56.727+01:00</cdr:origination_timestamp>
<cdr:release_timestamp>2009-07-08T15:09:26.493+01:00</cdr:release_timestamp>
<cdr:release_cause>Normal Clearing</cdr:release_cause>
<cdr:call_duration>PT29S</cdr:call_duration>
<cdr:redirected>false</cdr:redirected>
<cdr:conference>false</cdr:conference>
<cdr:transferred>false</cdr:transferred>
<cdr:estimated>false</cdr:estimated>
<cdr:interim>false</cdr:interim>
<cdr:segments>
<cdr:segment>
<cdr:originationTimestamp>2009-07-08T15:08:56.727+01:00</cdr:originationTimestamp>
<cdr:initiation>Dialed</cdr:initiation>
<cdr:call_id>2040-1247062136726-5485131</cdr:call_id>
<cdr:originator type="sipGateway">
<cdr:sipGateway id="3">
<cdr:name>Audiocodes-91</cdr:name>
</cdr:sipGateway>
</cdr:originator>
<cdr:termination_attempt>
<cdr:termination_timestamp>2009-07-08T15:08:56.728+01:00</cdr:termination_timestamp>
<cdr:terminator type="group">
<cdr:group>
<cdr:tenant id="17">
<cdr:name>SpiriTel plc</cdr:name>
</cdr:tenant>
<cdr:type>Broadcast</cdr:type>
<cdr:extension>6024</cdr:extension>
<cdr:name>OLD PMS DDIS DO NOT USE</cdr:name>
</cdr:group>
</cdr:terminator>
<cdr:provided_address>01942760142</cdr:provided_address>
<cdr:direction>Inbound</cdr:direction>
<cdr:disposition>No Answer</cdr:disposition>
</cdr:termination_attempt>
</cdr:segment>
</cdr:segments>
</cdr:cdr>
...
</cdr:cdr_set>
Each entry is essentially the same but there are sometimes differences such as some of the fields may be missing, if they aren't required.
These values in an xml file are identifiers, not locators. Unless you are expecting to download a schema, it is not needed at all, and can be "flibble" if needed. I expect the best thing would be to just load it into XmlDocument / XDocument and try to access the data.
For example:
XmlDocument doc = new XmlDocument();
doc.Load("cdr.xml");
XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable);
ns.AddNamespace("cdr", "http://www.naturalconvergence.com/schema/cdr/v3/cdr");
XmlElement el = (XmlElement)doc.SelectSingleNode(
"cdr:cdr_set/cdr:cdr/cdr:originator", ns);
Console.WriteLine(el.GetAttribute("type"));
or to loop over the cdr elements:
foreach (XmlElement cdr in doc.SelectNodes("/cdr:cdr_set/cdr:cdr", ns))
{
Console.WriteLine(cdr.SelectSingleNode("cdr:call_id", ns).InnerText);
}
Note that the aliases used in the document are largely unrelated to the aliases used in the XmlNamespaceManager, hence you need to re-declare it. I could have used x as my alias in the C# just as easily.
Of course, if you prefer to work with an object model; run it through xsd (where cdr.xml is your example file):
xsd cdr.xml
xsd cdr.xsd /classes
Now you can load it with XmlSerializer.
Alternatively, load it into an XDocument and use LINQ to XML, although you might just get the same error.
I don't know what data you want, so it's hard to suggest a query.
I personally prefer XDocument to XmlDocument in most cases now.
The only problem with automatically generating an XSD is that it can get your data types pretty badly wrong unless you use a good-sized chunk of sample data.
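Since the question does not say which fields are needed, here is only an illustrative LINQ to XML sketch that lists every call_id, assuming the cdr_set/cdr layout from the snippet. CdrReader and ReadCallIds are hypothetical names:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class CdrReader
{
    // The namespace URI is only an identifier; nothing is downloaded from it.
    private static readonly XNamespace Cdr =
        "http://www.naturalconvergence.com/schema/cdr/v3/cdr";

    // Returns the call_id of each cdr element directly under the root,
    // skipping records where that optional field is missing.
    public static IEnumerable<string> ReadCallIds(XDocument doc)
    {
        return doc.Root
            .Elements(Cdr + "cdr")
            // Casting a missing (null) element to string yields null
            // instead of throwing, which suits optional fields.
            .Select(cdr => (string)cdr.Element(Cdr + "call_id"))
            .Where(id => id != null);
    }
}
```

A call such as CdrReader.ReadCallIds(XDocument.Load("cdr.xml")) would then enumerate the identifiers; the same pattern of Cdr + "name" applies to any other field.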
