XML node parsing using C# linq - c#

I have an XML document like this:
<?xml version="1.0" encoding="utf-8" ?>
<demographics>
<country id="1" value="USA">
<state id ="1" value="California">
<city>Long Beach</city>
<city>Los Angeles</city>
<city>San Diego</city>
</state>
<state id ="2" value="Arizona">
<city>Tucson</city>
<city>Phoenix</city>
<city>Tempe</city>
</state>
</country>
<country id="2" value="Mexico">
<state id ="1" value="Baja California">
<city>Tijuana</city>
<city>Rosarito</city>
</state>
</country>
</demographics>
How can I select everything starting from the demographics node using LINQ to XML queries?
Something like this:
var node = from c in xmldocument.Descendants("demographics") ??

XDocument xDoc = XDocument.Parse(xml);
var demographics = xDoc
.Descendants("country")
.Select(c => new
{
Country = c.Attribute("value").Value,
Id = c.Attribute("id").Value,
States = c.Descendants("state")
.Select(s => new
{
State = s.Attribute("value").Value,
Id = s.Attribute("id").Value,
Cities = s.Descendants("city").Select(x => x.Value).ToList()
})
.ToList()
})
.ToList();
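A minimal sketch of how the nested result above might be consumed. The XML is a trimmed-down copy of the sample from the question, the attribute reads use string casts (which return null instead of throwing when an attribute is missing), and the variable name lines is only illustrative. It is written with top-level statements, so it assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Trimmed-down copy of the XML from the question.
var xml = @"<demographics>
  <country id='1' value='USA'>
    <state id='1' value='California'>
      <city>Long Beach</city>
      <city>Los Angeles</city>
    </state>
  </country>
  <country id='2' value='Mexico'>
    <state id='1' value='Baja California'>
      <city>Tijuana</city>
    </state>
  </country>
</demographics>";

var xDoc = XDocument.Parse(xml);

// Same shape as the query above: countries, each with states, each with cities.
var demographics = xDoc
    .Descendants("country")
    .Select(c => new
    {
        Country = (string)c.Attribute("value"),
        States = c.Elements("state")
            .Select(s => new
            {
                State = (string)s.Attribute("value"),
                Cities = s.Elements("city").Select(x => x.Value).ToList()
            })
            .ToList()
    })
    .ToList();

// Walk the nested result and build one text line per state.
var lines = new List<string>();
foreach (var country in demographics)
{
    foreach (var state in country.States)
    {
        lines.Add(country.Country + " / " + state.State + ": " + string.Join(", ", state.Cities));
    }
}

foreach (var line in lines)
{
    Console.WriteLine(line);
}
```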

Related

Select in XML to a list

I just want to select the content of user list="default" or user list="otherListName", depending on a variable.
For example, when my variable is equal to default, I want to select the content of user list="default". By content I mean:
<list nom="Nom" description="Description" image="no_image.png"/>
And I want this content to be parsed into a list:
<list nom="" description="" image=""/>
<list nom="" description="" image=""/>
<?xml version="1.0" encoding="utf-8"?>
<database>
<user list="default">
<list nom="Nom" description="Description" image="no_image.png"/>
</user>
<user list="otherListName">
<list nom="" description="" image=""/>
<list nom="" description="" image=""/>
</user>
</database>
I hope that my question is understandable.
You can use LINQ to XML. For example, assuming that doc is an XDocument variable containing the original XML:
var listName = "default";
var result = doc.Root
.Elements("user")
.Where(o => (string)o.Attribute("list") == listName)
.Elements("list");
Complete example (originally a live demo on dotnetfiddle):
var raw = @"<?xml version='1.0' encoding='utf-8'?>
<database>
<user list='default'>
<list nom='Nom' description='Description' image='no_image.png'/>
</user>
<user list='otherListName'>
<list nom='' description='' image=''/>
<list nom='' description='' image=''/>
</user>
</database>";
var doc = XDocument.Parse(raw);
var listName = "default";
var result = doc.Root
.Elements("user")
.Where(o => (string)o.Attribute("list") == listName)
.Elements("list");
foreach(var r in result)
{
Console.WriteLine(r.ToString());
}
Output (for listName = "default"):
<list nom="Nom" description="Description" image="no_image.png" />
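Since the question asks for the content to end up in a list, here is a hedged sketch that goes one step further and projects each matching list element into a value tuple. The tuple field names (Nom, Description, Image) simply mirror the attribute names and are an assumption, as is the trimmed XML. It assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

var raw = @"<database>
  <user list='default'>
    <list nom='Nom' description='Description' image='no_image.png'/>
  </user>
  <user list='otherListName'>
    <list nom='' description='' image=''/>
    <list nom='' description='' image=''/>
  </user>
</database>";

var doc = XDocument.Parse(raw);
var listName = "default";

// Select the <user> whose list attribute matches, then turn each of its
// <list> children into a tuple of the three attribute values.
List<(string Nom, string Description, string Image)> items = doc.Root
    .Elements("user")
    .Where(u => (string)u.Attribute("list") == listName)
    .Elements("list")
    .Select(l => (
        Nom: (string)l.Attribute("nom"),
        Description: (string)l.Attribute("description"),
        Image: (string)l.Attribute("image")))
    .ToList();

foreach (var item in items)
{
    Console.WriteLine(item.Nom + " | " + item.Description + " | " + item.Image);
}
```

The string casts return null for a missing attribute, so an incomplete list element does not throw.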

Descendants not found even if they exist in XDocument in C#

I am having problems getting descendants with a specific name. I have a huge XML file that basically consists of many elements like this:
<?xml version="1.0" encoding="utf-8"?>
<Search_Results xmlns="https://support.bridgerinsight.lexisnexis.com/downloads/xsd/4.5/OutputFile.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="https://support.bridgerinsight.lexisnexis.com/downloads/xsd/4.5/OutputFile.xsd https://support.bridgerinsight.lexisnexis.com/downloads/xsd/4.5/OutputFile.xsd">
<Entity Record="28" ResultID="12460985">
<GeneralInfo>
<EntityType>Individual</EntityType>
<Name>Jón Jónsson</Name>
<DOB>01/01/0001</DOB>
<DOBParsed />
<AccountID>ABS-ASSOC-10-109</AccountID>
<IDLabel>Account ID</IDLabel>
<IDNumber>ABS-ASSOC-10-109</IDNumber>
<AddressType>Current</AddressType>
<PostalCode>Somalia</PostalCode>
</GeneralInfo>
<RecordDetailInfo>
<EntityType>Individual</EntityType>
<SearchDate>2016-05-13 09:53:50Z</SearchDate>
<Origin>Automatic Batch</Origin>
<FirstName>Jón</FirstName>
<LastName>Jónsson</LastName>
<FullName>Jón Jónsson</FullName>
<AdditionalInfo>
<Type>Date of Birth</Type>
<Information>01/01/0001</Information>
</AdditionalInfo>
<Addresses>
<Type>Current</Type>
<PostalCode>Somalia</PostalCode>
</Addresses>
<Identifications>
<Type>Account ID</Type>
<Number>ABS10-109</Number>
</Identifications>
</RecordDetailInfo>
<WatchList>
<Match ID="1">
<EntityName>Jonsson</EntityName>
<EntityScore>96</EntityScore>
<BestName>Jonsson, Jon Orn</BestName>
<BestNameScore>96</BestNameScore>
<FileName>WorldCompliance - Full.BDF</FileName>
<SourceDate>2016-05-11 05:01:00Z</SourceDate>
<DistributionDate>2016-05-12 14:59:39Z</DistributionDate>
<ResultDate>2016-05-13 09:53:50Z</ResultDate>
<EntityUniqueID>WX0003219444</EntityUniqueID>
<MatchDetails>
<Entity Type="2">
<Number>3219444</Number>
<Date>9/3/2012</Date>
<Reason>International</Reason>
<CheckSum>69185</CheckSum>
<Gender>Male</Gender>
<Name>
<First>Jon Orn</First>
<Last>Jonsson</Last>
<Full>Jon Orn Jonsson</Full>
</Name>
<Notes>Source.</Notes>
<Addresses>
<Address ID="1" Type="4">
<Country>Iceland</Country>
</Address>
</Addresses>
<IDs>
<ID ID="1" Type="27">
<Number>3219444</Number>
</ID>
</IDs>
<Descriptions>
<Description ID="1" Type="10">
<Value>Honorary Consul of Iceland in Saskatchewan, Canada</Value>
<Notes>Starting 2002 Ending 2014</Notes>
</Description>
<Description ID="2" Type="22">
<Value>Link to WorldCompliance Online Database</Value>
<Notes>Jonsson, Jon Orn | https://members.worldcompliance.com/metawatch2.aspx?id=e0399c29-7c5e-4674-874c-f36fdb19052e</Notes>
</Description>
<Description ID="3" Type="22">
<Value>Sources of Record Information</Value>
<Notes>http://brunnur.mfa.is/interpro/utanr/HBvefur.nsf/Pages/IslSendiradIsl?OpenDocument&amp | CountryNr=1(Canada)&amp | Lang=44') | http://www.international.gc.ca/protocol-protocole/assets/pdfs/Diplomatic_List.pdf | http://www.inlofna.org/Elfros/newsletter%20January%202010.pdf | http://publications.gc.ca/collections/Collection/E12-3-2002E.pdf | http://www.onlygolfnews.com/golf-canada-saskatchewan/saskatchewan-golf-first-fort-lacrosse-ted-brandon-over-new-last-snow.htm | http://www.ops.gov.sk.ca/Consular-Officers</Notes>
</Description>
</Descriptions>
</Entity>
</MatchDetails>
</Match>
</WatchList>
</Entity>
</Search_Results>
I am trying to reach all elements named Entity, and later I want to go through all of them and get the values of their descendants named "Reason".
But none of the Entity elements is found with this line:
var entityList = xmlDoc.Descendants(nameSpace + "Entity").ToList();
This is the whole method I am using:
public static void GetIBANAndBicValuesFromXML(XDocument xmlDoc)
{
var reasons = new List<string>();
XNamespace nameSpace =
"https://support.bridgerinsight.lexisnexis.com/downloads/xsd/4.5/";
var entityList = xmlDoc.Descendants(nameSpace + "Entity").ToList();
if (entityList != null)
{
foreach (var reason in entityList.Select(entity => entity.Elements(nameSpace + "Reason"))
.Where(reasonsList => reasonsList != null).SelectMany(reasonsList => reasonsList))
{
string reasonValue = reason.Value;
reasons.Add(reasonValue);
}
}
}
And this is a call to this method:
private static void Main(string[] args)
{
var xmlFile = "C:\\temp\\indi2.xml";
var x = XDocument.Load("C:\\temp\\Individuals.xml");
XMLParse.GetIBANAndBicValuesFromXML(x);
}
I have tried the namespace like this as well:
"https://support.bridgerinsight.lexisnexis.com/downloads/xsd/4.5/OutputFile.xsd"
But no success.
Does anybody see what I am doing wrong?
You can use LINQ to filter on LocalName, which ignores the namespace:
string fileName = "1.txt";
var xDoc = XDocument.Load(fileName);
var neededElements = xDoc.Descendants().Where(x => x.Name.LocalName == "Entity");
Console.WriteLine("Found {0} Entitys", neededElements.Count());
foreach(var el in neededElements)
{
Console.WriteLine(el);
}
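As an alternative to matching on LocalName, here is a sketch that keeps the namespace in the query. It rests on two assumptions: the default namespace is read from the root element rather than typed by hand (so it always matches the xmlns value exactly, including the OutputFile.xsd suffix), and Reason is looked up with Descendants, because in the sample it sits several levels below the outer Entity while Elements only inspects direct children. The XML is a heavily trimmed version of the sample from the question, and top-level statements assume C# 9 or later.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Heavily trimmed version of the document from the question.
var xml = @"<Search_Results xmlns='https://support.bridgerinsight.lexisnexis.com/downloads/xsd/4.5/OutputFile.xsd'>
  <Entity Record='28' ResultID='12460985'>
    <WatchList>
      <Match ID='1'>
        <MatchDetails>
          <Entity Type='2'>
            <Reason>International</Reason>
          </Entity>
        </MatchDetails>
      </Match>
    </WatchList>
  </Entity>
</Search_Results>";

var doc = XDocument.Parse(xml);

// Take the default namespace from the root element instead of hard-coding it.
XNamespace ns = doc.Root.GetDefaultNamespace();

// Both the outer and the nested Entity elements live in that namespace.
var entityCount = doc.Descendants(ns + "Entity").Count();

// Reason is not a direct child of the outer Entity, so search all descendants.
var reasons = doc.Descendants(ns + "Reason")
    .Select(r => r.Value)
    .ToList();

Console.WriteLine("Entities: " + entityCount);
foreach (var reason in reasons)
{
    Console.WriteLine(reason);
}
```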

multi Elements to multi rows

I have tried XmlReader, XPath... and now LINQ, but I can't find a way to solve this.
I have to extract the information for each Order into one row. Each row should contain the information from the top-level elements, the Item, and the Order, as well as the status of the Order.
Is there a way to extract all this information into one row within a single LINQ query? Or do I have to do this in steps?
(Visual Studio 2010/2013, C#, .NET 4)
<Account>
<Name>Name1</Name>
<InId>100</InId>
<CustomID>100000087</CustomID>
<ZipCode>zipcode</ZipCode>
<Items>
<Item>
<ItemID>700</ItemID>
<ItemName>Itemname1</ItemName>
<Orders>
<Order>
<IDIndex>1000</IDIndex>
<IDParam>T1</IDParam>
<Themes>
<Theme>
<Status>Alert</Status>
<Lastget>01.01.2015</Lastget>
</Theme>
</Themes>
</Order>
</Orders>
</Item>
<Item>
<ItemID>800</ItemID>
<ItemName>Itemname2</ItemName>
<Orders>
<Order>
<IDIndex>5001</IDIndex>
<IDParam>T1</IDParam>
<Themes>
<Theme>
<Status>Alert1</Status>
<Lastget>01.01.2015</Lastget>
</Theme>
</Themes>
</Order>
<Order>
<IDIndex>5002</IDIndex>
<IDParam>T1</IDParam>
<Themes>
<Theme>
<Status>Alert1</Status>
<Lastget>01.01.2015</Lastget>
</Theme>
</Themes>
</Order>
<Order>
<IDIndex>5003</IDIndex>
<IDParam>T1</IDParam>
<Themes>
<Theme>
<Status>Alert2</Status>
<Lastget>01.01.2015</Lastget>
</Theme>
</Themes>
</Order>
</Orders>
</Item>
</Items>
</Account>
The following query will give you the required data:
var result = xdoc.Root.Descendants("Item")
.Select(x => new
{
Name = (string)x.Document.Root.Element("Name"),
InId = (string)x.Document.Root.Element("InId"),
CustomID = (string)x.Document.Root.Element("CustomID"),
ItemID = (string)x.Element("ItemID"),
ItemName = (string)x.Element("ItemName"),
OrdersList = x.Descendants("Order")
.Select(y => new
{
IDIndex = (string)y.Element("IDIndex"),
IDParam = (string)y.Element("IDParam"),
ThemesList = y.Descendants("Theme")
.Select(z => new
{
Status = (string)z.Element("Status"),
Lastget = (string)z.Element("Lastget")
}).ToList()
}).ToList()
});
Note that the result contains one entry per Item (two here); each entry holds a list of its orders, and each order holds a list of its themes.
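If one flat row per Order is what is needed, as the question describes, a sketch along the following lines may help. It uses multiple from clauses to flatten Item, Order, and Theme into a single anonymous type. The XML is a shortened, well-formed version of the sample, and the property names simply mirror the element names. Note that an Order without any Theme would produce no row with this approach. Top-level statements assume C# 9 or later; the query itself also works on older compilers.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Shortened, well-formed version of the XML from the question.
var xml = @"<Account>
  <Name>Name1</Name>
  <InId>100</InId>
  <CustomID>100000087</CustomID>
  <Items>
    <Item>
      <ItemID>700</ItemID>
      <ItemName>Itemname1</ItemName>
      <Orders>
        <Order>
          <IDIndex>1000</IDIndex>
          <IDParam>T1</IDParam>
          <Themes><Theme><Status>Alert</Status><Lastget>01.01.2015</Lastget></Theme></Themes>
        </Order>
      </Orders>
    </Item>
    <Item>
      <ItemID>800</ItemID>
      <ItemName>Itemname2</ItemName>
      <Orders>
        <Order>
          <IDIndex>5001</IDIndex>
          <IDParam>T1</IDParam>
          <Themes><Theme><Status>Alert1</Status><Lastget>01.01.2015</Lastget></Theme></Themes>
        </Order>
        <Order>
          <IDIndex>5002</IDIndex>
          <IDParam>T1</IDParam>
          <Themes><Theme><Status>Alert2</Status><Lastget>01.01.2015</Lastget></Theme></Themes>
        </Order>
      </Orders>
    </Item>
  </Items>
</Account>";

var xdoc = XDocument.Parse(xml);
var account = xdoc.Root;

// Each combination of Item, Order, and Theme becomes one flat row.
var rows = (
    from item in account.Descendants("Item")
    from order in item.Descendants("Order")
    from theme in order.Descendants("Theme")
    select new
    {
        Name = (string)account.Element("Name"),
        InId = (string)account.Element("InId"),
        CustomID = (string)account.Element("CustomID"),
        ItemID = (string)item.Element("ItemID"),
        ItemName = (string)item.Element("ItemName"),
        IDIndex = (string)order.Element("IDIndex"),
        IDParam = (string)order.Element("IDParam"),
        Status = (string)theme.Element("Status"),
        Lastget = (string)theme.Element("Lastget")
    }).ToList();

foreach (var row in rows)
{
    Console.WriteLine(row);
}
```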

Accessing xml elements using LINQ to XML

I have an XML document like this and I need to access the "employees" and "employee" elements. I am trying to use LINQ to XML's XDocument class to get the employee elements, but it always returns an empty value.
Sample xml:
<organization>
<metadata>
</metadata>
<main>
<otherInfo>
</otherInfo>
<employeeInfo>
<employees>
<employee>
<id>1</id>
<name>ABC</name>
</employee>
<employee>
<id>2</id>
<name>ASE</name>
</employee>
<employee>
<id>3</id>
<name>XYZ</name>
</employee>
</employees>
</employeeInfo>
</main>
</organization>
C# code:
XDocument xDoc = XDocument.Parse(xmlString);
var allEmployees = from d in xDoc.Descendants("employeeInfo")
from ms in d.Elements("employees")
from m in ms.Elements("employee")
select m;
It kind of depends on what information you need. Your select returns an IEnumerable<XElement>.
This code will print out each employee:
string xmlString = @"<organization>
<metadata>
</metadata>
<main>
<otherInfo>
</otherInfo>
<employeeInfo>
<employees>
<employee>
<id>1</id>
<name>ABC</name>
</employee>
<employee>
<id>2</id>
<name>ASE</name>
</employee>
<employee>
<id>3</id>
<name>XYZ</name>
</employee>
</employees>
</employeeInfo>
</main>
</organization>";
XDocument xDoc = XDocument.Parse(xmlString);
var allEmployees = from d in xDoc.Descendants("employeeInfo")
from ms in d.Elements("employees")
from m in ms.Elements("employee")
select m;
foreach (var emp in allEmployees) {
Console.WriteLine(emp);
}
Console.Read();
XDocument xDoc = XDocument.Parse(xmlString);
var allEmployees = (from r in xDoc.Descendants("employee")
select new
{
Id = r.Element("id").Value,
Name = r.Element("name").Value
}).ToList();
foreach (var r in allEmployees)
{
Console.WriteLine(r.Id + " " + r.Name);
}
Just use Descendants("employee") (element names are case-sensitive):
XDocument xDoc = XDocument.Parse(xmlString);
var allEmployees = xDoc.Descendants("employee").ToList();

Xml simplification/extraction of distinct values - possible LINQ

Sorry for this long post... but I have a headache from this task.
I have a mile-long XML document from which I need to extract a list, use distinct values, and pass it on for transformation to the web.
I have completed the task using XSLT and keys, but the effort is forcing the server to its knees.
Description:
Hundreds of products in XML, all with a number of named and ID'ed categories, all categories with at least one subcategory with name and ID.
The categories are unique by ID; all subcategories are unique WITHIN that category.
Simplified example from the huge file (I left out tons of info irrelevant to the task):
<?xml version="1.0" encoding="utf-8"?>
<root>
<productlist>
<product id="1">
<name>Some Product</name>
<categorylist>
<category id="1">
<name>cat1</name>
<subcategories>
<subcat id="1">
<name>subcat1</name>
</subcat>
<subcat id="2">
<name>subcat1</name>
</subcat>
</subcategories>
</category>
<category id="2">
<name>cat1</name>
<subcategories>
<subcat id="1">
<name>subcat1</name>
</subcat>
</subcategories>
</category>
<category id="3">
<name>cat1</name>
<subcategories>
<subcat id="1">
<name>subcat1</name>
</subcat>
</subcategories>
</category>
</categorylist>
</product>
<product id="2">
<name>Some Product</name>
<categorylist>
<category id="1">
<name>cat1</name>
<subcategories>
<subcat id="2">
<name>subcat2</name>
</subcat>
<subcat id="4">
<name>subcat4</name>
</subcat>
</subcategories>
</category>
<category id="2">
<name>cat2</name>
<subcategories>
<subcat id="1">
<name>subcat1</name>
</subcat>
</subcategories>
</category>
<category id="3">
<name>cat3</name>
<subcategories>
<subcat id="1">
<name>subcat1</name>
</subcat>
</subcategories>
</category>
</categorylist>
</product>
</productlist>
</root>
DESIRED RESULT:
<?xml version="1.0" encoding="utf-8"?>
<root>
<maincat id="1">
<name>cat1</name>
<subcat id="1"><name>subcat1</name></subcat>
<subcat id="2"><name>subcat2</name></subcat>
<subcat id="3"><name>subcat3</name></subcat>
</maincat>
<maincat id="2">
<name>cat2</name>
<subcat id="1"><name>differentsubcat1</name></subcat>
<subcat id="2"><name>differentsubcat2</name></subcat>
<subcat id="3"><name>differentsubcat3</name></subcat>
</maincat>
<maincat id="2">
<name>cat2</name>
<subcat id="1"><name>differentsubcat1</name></subcat>
<subcat id="2"><name>differentsubcat2</name></subcat>
<subcat id="3"><name>differentsubcat3</name></subcat>
</maincat>
</root>
(From 2000 products, the original will produce 10 categories with 5 to 15 subcategories each.)
Things tried:
XSLT with keys - works fine, but poor performance
Played around with LINQ:
IEnumerable<XElement> mainCats =
from Category1 in doc.Descendants("product").Descendants("category") select Category1;
var cDoc = new XDocument(new XDeclaration("1.0", "utf-8", null), new XElement("root"));
cDoc.Root.Add(mainCats);
cachedCategoryDoc = cDoc.ToString();
The result was "categories only" (not distinct values of categories or subcategories).
I applied the same XSLT to that and got somewhat better performance... but still far from usable.
Can I apply some sort of magic with the LINQ statement to get the desired output?
A truckload of good karma goes out to anyone who can point me in the right direction.
//Steen
NOTE:
I am not stuck on using LINQ/XDocument if anyone has better options.
Currently on .NET 3.5, can switch to 4 if needed.
If I understood your question correctly, here's a LINQ attempt.
The query below parses your XML data and creates a custom type which represents a category and contains the subcategories of that element.
After parsing, the data is grouped by category Id to get distinct subcategories for each category.
var doc = XElement.Load("path to the file");
var results = doc.Descendants("category")
.Select(cat => new
{
Id = cat.Attribute("id").Value,
Name = cat.Descendants("name").First().Value,
Subcategories = cat.Descendants("subcat")
.Select(subcat => new
{
Id = subcat.Attribute("id").Value,
Name = subcat.Descendants("name").First().Value
})
})
.GroupBy(x=>x.Id)
.Select(g=>new
{
Id = g.Key,
Name = g.First().Name,
Subcategories = g.SelectMany(x=>x.Subcategories).Distinct()
});
From the results above you can create your document using the code below:
var cdoc = new XDocument(new XDeclaration("1.0", "utf-8", null), new XElement("root"));
cdoc.Root.Add(
results.Select(x=>
{
var element = new XElement("maincat", new XAttribute("id", x.Id));
element.Add(new XElement("name", x.Name));
element.Add(x.Subcategories.Select(c=>
{
var subcat = new XElement("subcat", new XAttribute("id", c.Id));
subcat.Add(new XElement("name", c.Name));
return subcat;
}).ToArray());
return element;
}));
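A variation worth sketching: Distinct() on the anonymous type above compares both Id and Name, so two subcategories with the same id but different names would both survive. If subcategories should be unique by id alone, one option (an assumption about the intended semantics, not something the question confirms) is to group the subcat elements by id and keep the first of each group. The XML below is a small made-up sample in the same shape as the question's, and top-level statements assume C# 9 or later.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Small sample in the same shape as the XML from the question.
var xml = @"<root><productlist>
  <product id='1'><categorylist>
    <category id='1'><name>cat1</name><subcategories>
      <subcat id='1'><name>subcat1</name></subcat>
    </subcategories></category>
    <category id='2'><name>cat2</name><subcategories>
      <subcat id='1'><name>othersub1</name></subcat>
    </subcategories></category>
  </categorylist></product>
  <product id='2'><categorylist>
    <category id='1'><name>cat1</name><subcategories>
      <subcat id='1'><name>subcat1</name></subcat>
      <subcat id='2'><name>subcat2</name></subcat>
    </subcategories></category>
  </categorylist></product>
</productlist></root>";

var doc = XDocument.Parse(xml);

// One <maincat> per category id, each holding one <subcat> per subcat id.
var maincats = doc.Descendants("category")
    .GroupBy(c => (string)c.Attribute("id"))
    .Select(catGroup =>
    {
        // Collect subcats from every occurrence of this category, one per id.
        var subcats = catGroup
            .SelectMany(c => c.Descendants("subcat"))
            .GroupBy(s => (string)s.Attribute("id"))
            .Select(subGroup => new XElement("subcat",
                new XAttribute("id", subGroup.Key),
                new XElement("name", (string)subGroup.First().Element("name"))));

        return new XElement("maincat",
            new XAttribute("id", catGroup.Key),
            new XElement("name", (string)catGroup.First().Element("name")),
            subcats);
    });

var result = new XDocument(new XElement("root", maincats));
Console.WriteLine(result);
```

This makes a single pass over the category elements and builds the output tree directly, without an intermediate anonymous type.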
Try this; I have put something together for it. The attributes are missing; you can add them using the XElement constructor.
var doc = XDocument.Load(reader);
IEnumerable<XElement> mainCats =
doc.Descendants("product").Descendants("category").Select(r =>
new XElement("maincat", new XElement("name", r.Element("name").Value),
r.Descendants("subcat").Select(s => new XElement("subcat", new XElement("name", s.Element("name").Value)))));
var cDoc = new XDocument(new XDeclaration("1.0", "utf-8", null), new XElement("root"));
cDoc.Root.Add(mainCats);
var cachedCategoryDoc = cDoc.ToString();
Regards.
This will parse your xml into a dictionary of categories with all the distinct subcategory names. It uses XPath from this library: https://github.com/ChuckSavage/XmlLib/
XElement root = XElement.Load(file);
string[] cats = root.XGet("//category/name", string.Empty).Distinct().ToArray();
Dictionary<string, string[]> dict = new Dictionary<string, string[]>();
foreach (string cat in cats)
{
// Get all the categories by name and their subcat names
string[] subs = root
.XGet("//category[name={0}]/subcategories/subcat/name", string.Empty, cat)
.Distinct().ToArray();
dict.Add(cat, subs);
}
Or the parsing as one statement:
Dictionary<string, string[]> dict = root
.XGet("//category/name", string.Empty)
.Distinct()
.ToDictionary(cat => cat, cat => root
.XGet("//category[name={0}]/subcategories/subcat/name", string.Empty, cat)
.Distinct().ToArray());
I leave the task of assembling your resulting XML from the dictionary to you.
