Union of an unknown number of XML files in C#

I have a set of XML files (between 2 and 6) that need to be processed: traversed and checked for certain data and the relations within it. The files contain some "recursive data".
Here is a simple example using test data, with 2 files:
File1.xml:
<some root------standard header not entered for the example----->
<parent>
<ID>AB-1234</ID>
<Description>Good book</Description>
<Date_Created>08-10-2011</Date_Created>
<child>
<ID>BC-0001</ID>
<Description>Nice</Description>
</child>
</parent>
<parent>
<ID>BC-0001</ID>
<Description>Work Together</Description>
<Date_Created>08-10-2011</Date_Created>
<child>
<ID>DC-0011</ID>
<Description>Happy</Description>
</child>
</parent>
File2.xml:
<some root------standard header not entered for the example----->
<parent>
<ID>DC-0011</ID>
<Description> book</Description>
<Date_Created>08-10-2011</Date_Created>
<child>
<ID>EF-0001</ID>
<Description>Nice</Description>
</child>
</parent>
<parent>
<ID>EF-0001</ID>
<Description>Work Together</Description>
<Date_Created>08-10-2011</Date_Created>
<child>
<ID>PQ-0011</ID>
<Description>Happy</Description>
</child>
</parent>
The code I am using involves: 1) loading both XML files and combining them
XDocument test1doc = XDocument.Load(@"d:\File1.xml");
XDocument test2doc = XDocument.Load(@"d:\File2.xml");
IEnumerable<XElement> testElist1 = test1doc.Descendants("parent");
IEnumerable<XElement> testElist2 = test2doc.Descendants("parent");
IEnumerable<XElement> testElistcombo = testElist1.Union(testElist2);
2) using testElistcombo to navigate the elements with two nested foreach loops (one for the parent and one for the child)
3) while traversing, using an if condition to check whether a parent ID and a child ID are equal.
I am able to build the hierarchy; no problem with that.
I was able to print the hierarchy along with the level of each entry by including a counter in each of the foreach loops.
My output looks like:
AB-1234[level-0]
>>BC-0001[level-1]
>>DC-0011[level-3]
..... and so on.
As I said, no problem with that.
The following is the area where I would like some help:
1) When the number of files increases from 2 to a maximum of 6, I am using Union in the following manner:
XDocument test1doc = XDocument.Load(@"d:\File1.xml");
XDocument test2doc = XDocument.Load(@"d:\File2.xml");
XDocument test3doc = XDocument.Load(@"d:\File3.xml");
XDocument test4doc = XDocument.Load(@"d:\File4.xml");
XDocument test5doc = XDocument.Load(@"d:\File5.xml");
XDocument test6doc = XDocument.Load(@"d:\File6.xml");
IEnumerable<XElement> testElist1 = test1doc.Descendants("parent");
IEnumerable<XElement> testElist2 = test2doc.Descendants("parent");
IEnumerable<XElement> testElist3 = test3doc.Descendants("parent");
IEnumerable<XElement> testElist4 = test4doc.Descendants("parent");
IEnumerable<XElement> testElist5 = test5doc.Descendants("parent");
IEnumerable<XElement> testElist6 = test6doc.Descendants("parent");
IEnumerable<XElement> testElistcombo1 = testElist1.Union(testElist2);
IEnumerable<XElement> testElistcombo2 = testElistcombo1.Union(testElist3);
IEnumerable<XElement> testElistcombo3 = testElistcombo2.Union(testElist4);
IEnumerable<XElement> testElistcombo4 = testElistcombo3.Union(testElist5);
IEnumerable<XElement> testElistcombo5 = testElistcombo4.Union(testElist6);
and then use testElistcombo5 for processing.
Help required: an alternative way to load and combine the XML files for processing.
2) The process is resource intensive and takes a fair bit of time to finish building the hierarchy.
Help required: is there an alternative way to process the XML files when building a hierarchy from recursive data?

Question 1: you can do this with the Enumerable.Aggregate function, which folds the elements of each document into one set of elements:
IEnumerable<string> filenames = new[] { "filename1.xml", "filename2.xml" };
IEnumerable<XDocument> documents = filenames.Select(filename => XDocument.Load(filename));
IEnumerable<IEnumerable<XElement>> documentsElements = documents.Select(document => document.Descendants("parent"));
IEnumerable<XElement> elements = documentsElements.Aggregate((working, next) => working.Union(next));
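Question 2 is not covered above. One possible approach, offered only as a sketch and not as part of the original answer, is to index every parent by its ID once, so that resolving a child becomes a dictionary lookup instead of a nested loop over all elements. The names HierarchyBuilder, CollectParents, IndexChildren and Describe are hypothetical, the output format is illustrative, and the code assumes that each parent ID occurs only once across all files.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class HierarchyBuilder
{
    // Flattens the <parent> elements of every document into one sequence.
    public static IEnumerable<XElement> CollectParents(IEnumerable<XDocument> documents)
    {
        return documents.SelectMany(d => d.Descendants("parent"));
    }

    // Maps each parent ID to the IDs of its children.
    // Assumption: parent IDs are unique across files (ToDictionary throws otherwise).
    public static Dictionary<string, List<string>> IndexChildren(IEnumerable<XElement> parents)
    {
        return parents.ToDictionary(
            p => (string)p.Element("ID"),
            p => p.Elements("child").Select(c => (string)c.Element("ID")).ToList());
    }

    // Walks the index depth-first and returns one line per node,
    // indented with ">>" per level, e.g. ">>BC-0001[level-1]".
    public static List<string> Describe(
        Dictionary<string, List<string>> index, string id,
        int level = 0, HashSet<string> path = null)
    {
        path = path ?? new HashSet<string>();
        var prefix = string.Concat(Enumerable.Repeat(">>", level));
        var lines = new List<string> { prefix + id + "[level-" + level + "]" };

        // Stop if this ID is already on the current path (cycle guard).
        if (!path.Add(id)) return lines;

        if (index.TryGetValue(id, out var children))
        {
            foreach (var child in children)
                lines.AddRange(Describe(index, child, level + 1, path));
        }
        path.Remove(id);
        return lines;
    }
}
```

With the documents loaded as in the answer above, a call might look like HierarchyBuilder.Describe(HierarchyBuilder.IndexChildren(HierarchyBuilder.CollectParents(documents)), "AB-1234"). Building the index touches each element once, which is where the saving over repeated nested foreach loops would come from.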

Related

Extracting the current XPath of the node from XPathIterator C#

I am writing a C# function where I need to fetch the value of a node using XPath. If the fetched value matches a given string, I need to pass the XPath to an API that replaces the current value of the node with a value stored in the database.
The problem is that there are multiple matches for the given XPath. I filter those with the string-matching criterion and can identify the node, but I cannot work out how to capture the exact XPath of the matching node so that I can pass it to the API.
Let's take this XML as an example:
<GrandParent>
<Parent>
<Child1>John
</Child1>
<Child2>Emily
</Child2>
</Parent>
<Parent>
<Child1>Frank
</Child1>
<Child2>Niki
</Child2>
</Parent>
<Parent>
<Child1>Mia
</Child1>
<Child2>Noah
</Child2>
</Parent>
</GrandParent>
Now I have to fetch the node with XPath /GrandParent/Parent/Child1 whose value is John.
I am doing that in C# using XPathNavigator and XPathNodeIterator:
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml(requestXML);
XPathNavigator nav;
nav = xmlDoc.CreateNavigator();
XPathNodeIterator allMatchingNodes = nav.Select(SourceXPath);
int countNodes = 1;
foreach (XPathNavigator node in allMatchingNodes)
{
if(node.Value.Equals("John"))
{
Xpath = SourceXPath + "[" + countNodes + "]";
break;
}
countNodes++; // advance the position so the predicate refers to the current match
}
However, this is an incorrect approach, because it creates the XPath /GrandParent/Parent/Child1[1], and the subsequent API call then replaces the wrong node.
I want the XPath to be /GrandParent/Parent[1]/Child1. Is there some way of doing that without using multiple foreach loops?
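No answer is recorded for this question. As a hedged sketch of one way to get the desired path, the helper below walks from the matched node up to the document root and adds a positional predicate only where an element has same-named siblings. XPathBuilder and GetXPath are hypothetical names, and the code assumes the document uses no namespaces (node.Name is used as the step name unchanged).

```csharp
using System;
using System.Linq;
using System.Xml;

public static class XPathBuilder
{
    // Builds an absolute XPath for an element node. A [n] predicate is added
    // only when the parent has more than one child element with the same name.
    public static string GetXPath(XmlNode node)
    {
        // Recursion ends at the XmlDocument node (or at null).
        if (node == null || node.NodeType != XmlNodeType.Element)
            return string.Empty;

        XmlNode parent = node.ParentNode;
        string step = "/" + node.Name;

        if (parent != null)
        {
            var sameName = parent.ChildNodes
                .OfType<XmlElement>()
                .Where(e => e.Name == node.Name)
                .ToList();

            if (sameName.Count > 1)
                step += "[" + (sameName.IndexOf((XmlElement)node) + 1) + "]";
        }

        return GetXPath(parent) + step;
    }
}
```

A possible use: iterate xmlDoc.SelectNodes(SourceXPath), compare node.InnerText.Trim() with "John" (the sample values carry a trailing line break, so an untrimmed comparison would not match), and pass XPathBuilder.GetXPath(node) to the API. When working with XPathNavigator instead, the underlying XmlNode is available through its UnderlyingObject property for navigators created from an XmlDocument.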

XPath 1.0 select siblings with namespaces

I have the following xml file
<root xmlns="http://mynamespace">
<parent>
<first>text</first>
<second>more</second>
</parent>
<parent>
<first>2</first>
<second>3</second>
</parent>
<parent>
<first>aa</first>
<second>bb</second>
</parent>
</root>
I'm trying to get first and second children of parent.
C# seems to have problems with the following code (the error is on the last line):
var rawXml = @"<root xmlns=""http://mynamespace"">
<parent>
<first>text</first>
<second>more</second>
<third>hello</third>
</parent>
<parent>
<first>2</first>
<second>3</second>
<parent>
<first>a</first>
<second>b</second>
</parent>
</parent>
<parent>
<first>aa</first>
<second>bb</second>
</parent>
</root>";
var xmlDoc = new XmlDocument();
xmlDoc.LoadXml(rawXml);
var ns = new XmlNamespaceManager(xmlDoc.NameTable);
ns.AddNamespace("m", "http://mynamespace");
var nav = xmlDoc.CreateNavigator();
var parents = nav.Select("//m:parent", ns);
Console.Write($"Got {parents.Count} parents.");
// this does not work
// error: Expression must evaluate to a node-set.
//var siblings = nav.Select("//m:parent/(m:first|m:second)", ns);
// but this does
var siblings = nav.Select("//m:parent/m:first|//m:parent/m:second", ns);
Console.Write($"Got {siblings.Count} children.");
Am I missing something? Is the first XPath expression wrong?
Is the first XPath expression wrong?
Yes, it's not valid XPath 1.0 syntax. You can't have a ( after a / in XPath 1.0.
You can achieve what you're trying to do, without repeating any node names, by using this path:
/m:root/m:parent/*[self::m:first or self::m:second]
Side note: avoid using // unless you have a specific reason to use it. It's bad for performance.
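To make the suggested path concrete, here is a minimal sketch that runs it through the same XPathNavigator setup as the question. SiblingQuery and CountFirstAndSecond are hypothetical names. Note that this path starts at the root and therefore matches only the top-level parent elements, so a parent nested inside another parent is not included, unlike with //m:parent.

```csharp
using System;
using System.Xml;
using System.Xml.XPath;

public static class SiblingQuery
{
    // Counts the <first> and <second> children of top-level <parent> elements
    // in the http://mynamespace namespace, using XPath 1.0 syntax only.
    public static int CountFirstAndSecond(string rawXml)
    {
        var xmlDoc = new XmlDocument();
        xmlDoc.LoadXml(rawXml);

        var ns = new XmlNamespaceManager(xmlDoc.NameTable);
        ns.AddNamespace("m", "http://mynamespace");

        XPathNavigator nav = xmlDoc.CreateNavigator();
        XPathNodeIterator siblings = nav.Select(
            "/m:root/m:parent/*[self::m:first or self::m:second]", ns);

        return siblings.Count;
    }
}
```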

I have an XML document inside another XML document and want to test whether a condition is met inside the inner one. C# solution required

<?xml version="1.0"?>
<TextType IsKey="false" Name="XMLReport"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Providers
xmlns="Reporting"/>
<Sales
xmlns="Reporting"/>
<Value
xmlns="Reporting">
<?xml version="1.0" encoding="utf-8"?>
<TestReport>
<StudyUid>
<![CDATA[123]]>
</StudyUid>
<Modality>
<![CDATA[XYZ]]>
</Modality>
<StudyDate format="DICOM">123456</StudyDate>
<StudyTime format="DICOM">6789</StudyTime>
<AccessionNumber>
<![CDATA[123]]>
</AccessionNumber>
<StudyDescription>
<![CDATA[abc def]]>
</StudyDescription>
<OperatorName format="xyz">
<![CDATA[abc]]>
</OperatorName>
<PhysicianReadingStudy format="xyz">
<![CDATA[^^^^]]>
</PhysicianReadingStudy>
<InstitutionName>
<![CDATA[xyz]]>
</InstitutionName>
<HospitalName>
<![CDATA[Hospital Name]]>
</HospitalName>
<ReportSet>
<MyReport ID="1">
<ReportStatus>
<![CDATA[Done]]>
</ReportStatus>
</MyReport>
<MyReport ID="2">
<ReportStatus>
<![CDATA[Done]]>
</ReportStatus>
</MyReport>
<MyReport ID="3">
<ReportStatus>
<![CDATA[Initial]]>
</ReportStatus>
</MyReport>
</ReportSet>
<ReportImageSet />
<FetusSet />
</TestReport>
</Value>
<WhoSetMe xmlns="Reporting">NotSpecified
</WhoSetMe>
</TextType>
I want to parse the XML above in C# and check whether "ReportStatus" is "Done" for every ReportStatus under ReportSet/MyReport. One more twist is that the XML contains another XML document starting at the "Value" tag, as in the example above. It may contain many ReportStatus tags under the ReportSet tag. Can someone please help me?
// Can you try this? I tried to do it with LINQ to XML.
// I assume you have multiple <TestReport /> elements in the <Value /> tag
// and that the variable xelement holds your XML.
// <Value> declares xmlns="Reporting", so every element below it is in that namespace.
XNamespace ns = "Reporting";
// First we get all TestReport elements.
// An XName cannot contain "/", so Descendants is used instead of a path string.
IEnumerable<XElement> allReports =
from el in xelement.Descendants(ns + "TestReport")
select el;
// From allReports we get all MyReport elements
IEnumerable<XElement> allMyReports =
from el in allReports.Elements(ns + "ReportSet").Elements(ns + "MyReport")
select el;
// From allMyReports we get the MyReport elements whose ReportStatus equals "Done"
// (trimmed, because the CDATA sections are surrounded by line breaks)
IEnumerable<XElement> allDoneMyReports =
from el in allMyReports
where ((string)el.Element(ns + "ReportStatus") ?? "").Trim() == "Done"
select el;
// Now we compare allMyReports with allDoneMyReports
if (allMyReports.Count() == allDoneMyReports.Count())
{
// Do something
}
Your XML document is invalid. You need to fix it before trying to parse it. The issue is that a document can only have one top-level element; you have two, <TextType> and <Providers>.
Most of your elements are in the namespace Reporting. You need to use it when referencing an element.
XNamespace ns = "Reporting";
var value = doc.Element(ns + "Value");
Update
Just use the namespace for each element:
XNamespace ns = "Reporting";
var value = xelement.Elements(ns + "Value");
Another Update
The XML document is considered invalid because it has multiple XML declarations; there is no way to disable this. I suggest you pre-process the document to remove the extra declarations. Here's an example (https://dotnetfiddle.net/UnuAF6)
var xml = "<?xml version='1.0'?><a> <?xml version='1.0'?><b id='b' /></a>";
var doc = XDocument.Parse(xml.Replace(" <?xml version='1.0'?>", " "));
var bs = doc.Descendants("b");
Console.WriteLine("{0} 'b' elements", bs.Count());
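Putting the two answers together, a minimal sketch of the whole check might look like the following. ReportChecker and AllReportsDone are hypothetical names. The sketch assumes that dropping every XML declaration is acceptable (XDocument.Parse does not need one), that the MyReport elements are in the Reporting namespace as in the sample, and that an input with no MyReport elements at all should count as not done.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class ReportChecker
{
    // Returns true only if at least one MyReport exists and every
    // MyReport/ReportStatus value is "Done" after trimming whitespace.
    public static bool AllReportsDone(string rawXml)
    {
        // Remove all XML declarations, including the embedded one that
        // makes the document unparseable. Other processing instructions stay.
        string cleaned = Regex.Replace(rawXml, @"<\?xml\s[^?]*\?>", string.Empty);

        XDocument doc = XDocument.Parse(cleaned);
        XNamespace ns = "Reporting";

        var statuses = doc.Descendants(ns + "MyReport")
            .Select(r => ((string)r.Element(ns + "ReportStatus") ?? string.Empty).Trim())
            .ToList();

        return statuses.Count > 0 && statuses.All(s => s == "Done");
    }
}
```

For the sample document in the question this would return false, because the report with ID 3 has the status Initial.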

XDocument AncestorAndSelf

I currently have an XML Structure that looks something like this
<Parent>
<Info>
<Info-Data></Info-Data>
<Info-Data2></Info-Data2>
</Info>
<Message>
<Foo></Foo>
<Bar></Bar>
</Message>
<Message>
<Foo/>
<Bar/>
</Message>
</Parent>
What I'm trying to accomplish is to split each Message into its own unique XDocument. I want each one to be
<Parent>
<Info />
<Message />
</Parent>
I tried to do the following.
XDocument xDoc = XDocument.Parse(myXMLString);
IEnumerable<XElement> elements = xDoc.Descendants(xDoc.Root.Name.Namespace + "Message");
foreach(XElement element in elements)
{
XDocument newDoc = XDocument.Parse(element.ToString());
}
Obviously this only gets me everything from Message and below. I tried using Ancestors and AncestorsAndSelf but they always include BOTH Messages. Is there a different call I should be making?
If your format is fixed like this, it's not so bad:
foreach(XElement element in elements)
{
XDocument newDoc = new XDocument
(new XElement(xDoc.Root.Name,
xDoc.Root.Element("Info"),
element));
// ...
}
It's not great, but it's not horrendous. An alternative is to clone the original document, remove all the Message elements, then repeatedly clone the "gutted" version and add one element at a time to the new clone:
XDocument gutted = new XDocument(xDoc);
gutted.Descendants(xDoc.Root.Name.Namespace + "Message").Remove();
foreach(XElement element in elements)
{
XDocument newDoc = new XDocument(gutted);
newDoc.Root.Add(element);
// ...
}
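As a self-contained illustration of the second approach, the sketch below wraps it in a method that returns one document per Message. MessageSplitter and Split are hypothetical names, and the code assumes the Message elements are direct children of the root and share the root's namespace.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class MessageSplitter
{
    // Returns one XDocument per <Message>, each containing everything
    // else from the source (such as <Info>) plus that single message.
    public static List<XDocument> Split(XDocument source)
    {
        XNamespace ns = source.Root.Name.Namespace;

        // Deep copy of the source with every Message removed;
        // the source document itself is left untouched.
        var gutted = new XDocument(source);
        gutted.Root.Elements(ns + "Message").Remove();

        return source.Root.Elements(ns + "Message")
            .Select(message =>
            {
                var doc = new XDocument(gutted);
                // Add copies the element because it already has a parent.
                doc.Root.Add(message);
                return doc;
            })
            .ToList();
    }
}
```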

Inserting and saving xml using Linq to XML

If I have an XML file settings.xml like the one below
<Root>
<First>
</First>
</Root>
I load the XML first using XDocument settings = XDocument.Load("settings.xml")
How should I insert an XML node inside the node First and save it using LINQ to XML?
First you need to find the First element. Then you can add other elements and attributes to it.
There is more than one way to find an element in the XML: Elements, Descendants, XPathSelectElement, etc.
var firstElement = settings.Descendants("First").Single();
firstElement.Add(new XElement("NewElement"));
settings.Save(fileName);
// or
var newXml = settings.ToString();
Output:
<Root>
<First>
<NewElement />
</First>
</Root>
Or element with attribute:
firstElement.Add(
new XElement("NewElement", new XAttribute("NewAttribute", "TestValue")));
Output:
<Root>
<First>
<NewElement NewAttribute="TestValue" />
</First>
</Root>
[Edit] The answer to the bonus question. What to do if the first element does not exist and I want to create it:
var root = settings.Element("Root");
var firstElement = root.Element("First");
if (firstElement == null)
{
firstElement = new XElement("First");
root.Add(firstElement);
}
firstElement.Add(new XElement("NewElement"));
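If this find-or-create pattern is needed in several places, it could be factored into a small helper. This is only a sketch; XmlSettings and GetOrAdd are hypothetical names, not part of LINQ to XML.

```csharp
using System.Xml.Linq;

public static class XmlSettings
{
    // Returns the named child of parent, creating and appending it
    // first if it does not exist yet.
    public static XElement GetOrAdd(XElement parent, XName name)
    {
        XElement existing = parent.Element(name);
        if (existing == null)
        {
            existing = new XElement(name);
            parent.Add(existing);
        }
        return existing;
    }
}
```

A call such as XmlSettings.GetOrAdd(settings.Root, "First").Add(new XElement("NewElement")) followed by settings.Save(fileName) would then cover both cases.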
