Using LINQ to aggregate multiple nested elements in XDocument - c#

I have the following XML (parsed into an XDocument):
XDocument nvdXML = XDocument.Parse(@"<entry id='CVE-2016-1926'>
<vulnerable-configuration id='http://www.nist.gov/'>
<logical-test operator='OR' negate='false'>
<fact-ref name='A'/>
<fact-ref name='B'/>
<fact-ref name='C'/>
</logical-test>
</vulnerable-configuration>
<vulnerable-configuration id='http://www.nist.gov/'>
<logical-test operator='OR' negate='false'>
<fact-ref name='X'/>
<fact-ref name='Y'/>
<fact-ref name='Z'/>
</logical-test>
</vulnerable-configuration></entry>");
I want to get a single collection/list of every "name" attribute for each entry (in this case there is only one entry, whose name list would consist of ['A','B','C','X','Y','Z'])
Here is the code I have:
var entries = from entryNodes in nvdXML.Descendants("entry")
select new CVE
{
//VulnerableConfigurations = (from vulnCfgs in entryNodes.Descendants(vulnNS + "vulnerable-configuration").Descendants(cpeNS + "logical-test")
// select new VulnerableConfiguration
// {
// Name = vulnCfgs.Element(cpeNS + "fact-ref").Attribute("name").Value
// }).ToList()
VulnerableConfigurations = (from vulnCfgs in entryNodes.Descendants("vulnerable-configuration")
from logicalTest in vulnCfgs.Descendants("logical-test")
select new VulnerableConfiguration
{
Name = logicalTest.Element("fact-ref").Attribute("name").Value
}).ToList()
};
Unfortunately, this (both commented and uncommented) query only results in VulnerableConfigurations ['A','X'], instead of ['A','B','C','X','Y','Z']
How do I modify my query such that every element of every list is selected (assuming there could be 1+ nested lists)?
Note: I did search for duplicates. Although there are similar questions, most are very specific and ask for grouping, summing, or manipulation, or are not related to XML parsing.
Final working code (thanks to accepted answer):
var entries = from entryNodes in nvdXML.Descendants("entry")
select new CVE
{
VulnerableConfigurations = (from vulnCfgs in entryNodes.Descendants("fact-ref")
select new VulnerableConfiguration
{
Name = vulnCfgs.Attribute("name").Value
}).ToList()
};

You can try this if you have only one entry:
var entries =(from fact in nvdXML.Descendants("fact-ref")
select new VulnerableConfiguration
{
Name = fact.Attribute("name").Value
}).ToList();
The Descendants method returns all descendant elements with that name, in document order.
And if you have more than one entry and you want to return a list for each entry, you can do the following:
var entries =(from entry in nvdXML.Descendants("entry")
select entry.Descendants("fact-ref").Select(f=>f.Attribute("name").Value).ToList()
).ToList();
In this case you are going to get a list of lists (List<List<string>>)
Update
The problem was that your query flattens over the logical-test elements, of which your XML has two, and then the select calls the Element method, which returns only the first matching child. That's why you get A and X: they are the first fact-ref elements inside each logical-test element.
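To make the difference concrete, here is a minimal sketch that runs both forms against a trimmed copy of the question's XML. The class name ElementVsDescendants and its method names are illustrative only and do not come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ElementVsDescendants
{
    // Trimmed copy of the XML from the question (no namespaces).
    public const string Xml = @"<entry id='CVE-2016-1926'>
  <vulnerable-configuration>
    <logical-test operator='OR' negate='false'>
      <fact-ref name='A'/><fact-ref name='B'/><fact-ref name='C'/>
    </logical-test>
  </vulnerable-configuration>
  <vulnerable-configuration>
    <logical-test operator='OR' negate='false'>
      <fact-ref name='X'/><fact-ref name='Y'/><fact-ref name='Z'/>
    </logical-test>
  </vulnerable-configuration>
</entry>";

    // Element() yields only the first matching child of each logical-test.
    public static List<string> FirstPerTest(XDocument doc) =>
        doc.Descendants("logical-test")
           .Select(t => t.Element("fact-ref").Attribute("name").Value)
           .ToList();

    // Elements() yields every matching child; SelectMany flattens the groups.
    public static List<string> AllNames(XDocument doc) =>
        doc.Descendants("logical-test")
           .SelectMany(t => t.Elements("fact-ref"))
           .Select(f => f.Attribute("name").Value)
           .ToList();

    public static void Main()
    {
        var doc = XDocument.Parse(Xml);
        Console.WriteLine(string.Join(",", FirstPerTest(doc))); // A,X
        Console.WriteLine(string.Join(",", AllNames(doc)));     // A,B,C,X,Y,Z
    }
}
```

Swapping Element for Elements (or going straight to Descendants("fact-ref")) is the only change needed to pick up every name.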

Related

Trying to get a list of a single field from all the documents in my Mongo database

I'm using the latest driver. My documents are of the form
{
"ItemID": 292823,
....
}
First problem: I'm attempting to get a list of all the ItemIDs, and then sort them. However, my search is just pulling back all the _id, and none of the ItemIDs. What am I doing wrong?
var f = Builders<BsonDocument>.Filter.Empty;
var p = Builders<BsonDocument>.Projection.Include(x => x["ItemID"]);
var found= collection.Find(f).Project<BsonDocument>(p).ToList().ToArray();
When I attempt to query the output, I get the following.
found[0].ToJson()
"{ \"_id\" : ObjectId(\"56fc4bd9ea834d0e2c23a4f7\") }"
It's missing ItemID, and just has the mongo id.
Solution: I messed up the case. It's itemID, not ItemID. I'm still having trouble with the sorting.
Second problem: I tried changing the second line to have x["ItemID"].AsInt32, but then I got an InvalidOperationException with the error
Rewriting child expression from type 'System.Int32' to type
'MongoDB.Bson.BsonValue' is not allowed, because it would change the
meaning of the operation. If this is intentional, override
'VisitUnary' and change it to allow this rewrite.
I want them as ints so that I can add a sort to the query. My sort was the following:
var s = Builders<BsonDocument>.Sort.Ascending(x => x);
var found= collection.Find(f).Project<BsonDocument>(p).Sort(s).ToList().ToArray();
Would this be the correct way to sort it?
Found the solution.
//Get all documents
var f = Builders<BsonDocument>.Filter.Empty;
//Just pull itemID
var p = Builders<BsonDocument>.Projection.Include(x => x["itemID"]);
//Sort ascending by itemID
var s = Builders<BsonDocument>.Sort.Ascending("itemID");
//Apply the builders, and then use the Select method to pull up the itemID's as ints
var found = collection.Find(f)
.Project<BsonDocument>(p)
.Sort(s)
.ToList()
.Select(x=>x["itemID"].AsInt32)
.ToArray();

How should I organize two objects in order to be able to join them on a key?

So basically, I am reading in two XML docs. The first has two values that need to be stored: Name and Value. The second has four values: Name, DefaultValue, Type, and Limit. When reading in the docs, I want to store each into some object. I need to be able to then combine the two objects into one that has 5 values stored in it. The XML docs are different lengths, but the second will always be AT LEAST the size of the first.
EXAMPLE:
<XML1>
<Item1>
<Name>Cust_No</Name>
<Value>10001</Value>
</Item1>
<Item4>
ITEM4 NAME AND VALUE
</Item4>
<Item7>
ITEM 7 NAME AND VALUE
</Item7>
</XML1>
<XML2>
<Item1>
<Name>Cust_No</Name>
<DefaultValue></DefaultValue>
<Type>varchar</Type>
<Limit>15</Limit>
</Item1>
6 MORE TIMES ITEMS 2-7
</XML2>
I already have code looping through the XML. I really just need thoughts on the best way to store the data. Ultimately, I want to be able to join the two objects on the Name key. I tried string[] and ArrayList, but I ran into difficulty combining them. I also read up on Dictionary, but had trouble implementing that, too (I've never used a Dictionary before).
Here is a LINQ to XML query that joins the two XDocuments and selects an anonymous object for each joined item. Each object will have five properties:
var query =
from i1 in xdoc1.Root.Elements()
join i2 in xdoc2.Root.Elements()
on (string)i1.Element("Name") equals (string)i2.Element("Name") into g
let j = g.SingleOrDefault() // get joined element from second file, if any
select new {
Name = (string)i1.Element("Name"), // g is a plain sequence after "join ... into", so it has no Key
Value = (int)i1.Element("Value"),
DefaultValue = (j == null) ? null : (string)j.Element("DefaultValue"),
Type = (j == null) ? null : (string)j.Element("Type"),
Limit = (j == null) ? null : (string)j.Element("Limit")
};
XDocuments created like this:
var xdoc1 = XDocument.Load(path_to_xml1);
var xdoc2 = XDocument.Load(path_to_xml2);
Usage of query:
foreach(var item in query)
{
// use string item.Name
// integer item.Value
// string item.DefaultValue
// string item.Type
// string item.Limit
}
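For a version that can be run end to end, the following sketch applies the same group join to two small in-memory documents. The sample data, the JoinByName class, and the pipe-separated output format are assumptions made for illustration. It also uses DefaultIfEmpty rather than SingleOrDefault for the left join, which is a common alternative and not the exact form shown above.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class JoinByName
{
    // Hypothetical sample data shaped like the question's two files.
    public const string Xml1 = @"<XML1>
  <Item1><Name>Cust_No</Name><Value>10001</Value></Item1>
</XML1>";

    public const string Xml2 = @"<XML2>
  <Item1><Name>Cust_No</Name><DefaultValue></DefaultValue><Type>varchar</Type><Limit>15</Limit></Item1>
  <Item2><Name>Cust_Name</Name><DefaultValue>n/a</DefaultValue><Type>varchar</Type><Limit>40</Limit></Item2>
</XML2>";

    // Returns one "Name|Value|DefaultValue|Type|Limit" line per item of the first document.
    public static string[] Join(XElement root1, XElement root2)
    {
        var query =
            from i1 in root1.Elements()
            join i2 in root2.Elements()
                on (string)i1.Element("Name") equals (string)i2.Element("Name") into g
            from j in g.DefaultIfEmpty() // j is null when the second file has no match
            select string.Join("|",
                (string)i1.Element("Name"),
                (string)i1.Element("Value"),
                (string)j?.Element("DefaultValue"),
                (string)j?.Element("Type"),
                (string)j?.Element("Limit"));
        return query.ToArray();
    }

    public static void Main()
    {
        var rows = Join(XElement.Parse(Xml1), XElement.Parse(Xml2));
        foreach (var row in rows)
            Console.WriteLine(row); // Cust_No|10001||varchar|15
    }
}
```

Items that exist only in the second document (Cust_Name here) are dropped, because the join is driven by the first document.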

Why does this work but the other one fails (LINQ to XML)

I tried this simple query in LINQPad and scratched my head as to why the second query works but the first one does not alter the list (descen).
How do I make it work? I use this code to clone and modify XML attributes and values.
Fails
var doc = XElement.Parse("<Root><Descendants>Welcome</Descendants><Descendants>Stack</Descendants><Descendants>Overflow</Descendants></Root>");
var descen = (from des in doc.Descendants("Descendants") select new XElement(des));
foreach (var desc in descen)
{
desc.Value += DateTime.UtcNow;
}
descen.Dump();
doc.Dump();
Works
var doc = XElement.Parse("<Root><Descendants>Welcome</Descendants><Descendants>Stack</Descendants><Descendants>Overflow</Descendants></Root>");
var descen = (from des in doc.Descendants("Descendants") select new XElement(des));
foreach (var desc in doc.Descendants("Descendants"))
{
desc.Value += DateTime.UtcNow;
}
descen.Dump();
doc.Dump();
Stalls my PC?? WTH
var doc = XElement.Parse("<Root><Descendants>Welcome</Descendants><Descendants>Stack</Descendants><Descendants>Overflow</Descendants></Root>");
var descen = (from des in doc.Descendants("Descendants") select des);
foreach (var desc in descen)
{
desc.Value += DateTime.UtcNow;
}
var instance = from t in descen select new XElement(t);
doc.Elements("Descendants").LastOrDefault().AddAfterSelf(instance);
descen.Dump();
In the first query, descen is a projection from existing elements to new elements. It's lazily evaluated - each time you iterate over it, it will create new elements.
In the first case, you iterate over descen, and modify the new elements as you go. However, those modified elements are effectively thrown away very soon after they're created - they're not part of the original document.
In the second case, you modify the elements in the original document, so when you iterate over descen, it will create copies of the modified elements and display those.
EDIT: If you want to change descen but not the original document, you can just add ToList() at the end of the initialization part of descen. For example:
var descen = (from des in doc.Descendants("Descendants")
select new XElement(des)).ToList();
Or, more readably IMO:
var descen = doc.Descendants("Descendants")
.Select(x => new XElement(x))
.ToList();
The reason your final code hangs is that as AddAfterSelf iterates over instance, it's adding elements... but those elements are then part of the document, which means they're part of the descen query, which means they're part of the instance query... so iterating over instance will never complete.
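The behaviour described above can be checked with a small sketch. The DeferredCopies class and its method names are hypothetical; the XML is a shortened form of the question's document. SafeAppendCount shows one way to avoid the self-feeding query, by snapshotting the copies with ToList() before adding them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class DeferredCopies
{
    public static XElement MakeDoc() =>
        XElement.Parse("<Root><Descendants>Welcome</Descendants><Descendants>Stack</Descendants></Root>");

    // Lazy: every enumeration builds fresh copies, so edits made during one pass are lost.
    public static string LazyResult()
    {
        var doc = MakeDoc();
        IEnumerable<XElement> descen = doc.Descendants("Descendants").Select(x => new XElement(x));
        foreach (var d in descen) d.Value += "!";
        return string.Concat(descen.Select(d => d.Value));
    }

    // Materialized: ToList() creates the copies once, so the edits stick.
    public static string ListResult()
    {
        var doc = MakeDoc();
        List<XElement> descen = doc.Descendants("Descendants").Select(x => new XElement(x)).ToList();
        foreach (var d in descen) d.Value += "!";
        return string.Concat(descen.Select(d => d.Value));
    }

    // Snapshot the copies first, then add them, so the source query is not
    // enumerated while the document is growing.
    public static int SafeAppendCount()
    {
        var doc = MakeDoc();
        var copies = doc.Elements("Descendants").Select(x => new XElement(x)).ToList();
        doc.Elements("Descendants").Last().AddAfterSelf(copies);
        return doc.Elements("Descendants").Count();
    }

    public static void Main()
    {
        Console.WriteLine(LazyResult());      // WelcomeStack
        Console.WriteLine(ListResult());      // Welcome!Stack!
        Console.WriteLine(SafeAppendCount()); // 4
    }
}
```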

LINQ - Select * from XML elements with a certain tag

I've been looking at an example of LINQ from the following link; I've posted the code below the link.
Is it possible to modify this example so that the items returned in var contain all sub elements found in the items matching the doc.Descendants("person") filter? I basically want this XML query to act like a SQL select * so I don't have to explicitly specify field names like they've done with drink, moneySpent, and zipCode.
http://broadcast.oreilly.com/2010/10/understanding-c-simple-linq-to.html#example_1
static void QueryTheData(XDocument doc)
{
// Do a simple query and print the results to the console
var data = from item in doc.Descendants("person")
select new
{
drink = item.Element("favoriteDrink").Value,
moneySpent = item.Element("moneySpent").Value,
zipCode = item.Element("personalInfo").Element("zip").Value
};
foreach (var p in data)
Console.WriteLine(p.ToString());
}
The OP said he liked the answer posted, so I'll just resubmit it for science :)
var data = from item in doc.Descendants("person")
select item;
The only problem with this is that data is an IEnumerable<XElement>, and you'll have to query the fields by string names.
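If a name-to-value lookup is more convenient than raw XElement objects, one possible approach is to project each person's child elements into a dictionary, so no field names are hard-coded. This is only a sketch: the sample XML and the SelectAllFields class are made up for illustration, it assumes child element names are unique within a person (ToDictionary throws on duplicates), and nested elements such as personalInfo would come back as their concatenated text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SelectAllFields
{
    // Hypothetical sample data; the linked article's XML is not reproduced here.
    public const string Xml = @"<people>
  <person><favoriteDrink>tea</favoriteDrink><moneySpent>12</moneySpent></person>
  <person><favoriteDrink>coffee</favoriteDrink><moneySpent>7</moneySpent></person>
</people>";

    // One dictionary per person: child element name -> text value.
    public static List<Dictionary<string, string>> Rows(XDocument doc) =>
        doc.Descendants("person")
           .Select(p => p.Elements().ToDictionary(e => e.Name.LocalName, e => e.Value))
           .ToList();

    public static void Main()
    {
        foreach (var row in Rows(XDocument.Parse(Xml)))
            Console.WriteLine(string.Join(", ", row.Select(kv => kv.Key + "=" + kv.Value)));
    }
}
```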

return all xml items with same name in linq

How can I query an XML file where I have multiple items with the same name, so that I get all the items back? Currently I only get the first result back.
I managed to get it to work with the following code, but this returns all items where the specific search criterion is met.
What I want as output is to get two results back where the location is Dublin for example.
The question is: how can I achieve this with LINQ to XML?
Cheers Chris,
Here is the code
string location = "Oslo";
var training = (from item in doc.Descendants("item")
where item.Value.Contains(location)
select new
{
event = item.Element("event").Value,
event_location = item.Element("location").Value
}).ToList();
The xml file looks like this
<training>
<item>
<event>C# Training</event>
<location>Prague</location>
<location>Oslo</location>
<location>Amsterdam</location>
<location>Athens</location>
<location>Dublin</location>
<location>Helsinki</location>
</item>
<item>
<event>LINQ Training</event>
<location>Bucharest</location>
<location>Oslo</location>
<location>Amsterdam</location>
<location>Helsinki</location>
<location>Brussels</location>
<location>Dublin</location>
</item>
</training>
You're using item.Element("location") which returns the first location element under the item. That's not necessarily the location you were looking for!
I suspect you actually want something more like:
string location = "Oslo";
var training = from loc in doc.Descendants("location")
where loc.Value == location
select new
{
// "event" is a C# keyword, so the property name needs the @ prefix
@event = loc.Parent.Element("event").Value,
event_location = loc.Value
};
But then again, what value does event_location then provide, given that it's always going to be the location you've passed into the query?
If this isn't what you want, please give more details - your question is slightly hard to understand at the moment. Details of what your current code gives and what you want it to give would be helpful - as well as what you mean by "name" (in that it looks like you actually mean "value").
EDIT: Okay, so it sounds like you want:
string location = "Oslo";
var training = from loc in doc.Descendants("location")
where loc.Value == location
select new
{
// "event" is a C# keyword, so the property name needs the @ prefix
@event = loc.Parent.Element("event").Value,
event_locations = loc.Parent.Elements("location")
.Select(e => e.Value)
};
event_locations will now be a sequence of strings. You can get the output you want with:
foreach (var entry in training)
{
Console.WriteLine("Event: {0}; Locations: {1}",
entry.@event,
string.Join(", ", entry.event_locations.ToArray()));
}
Give that a try and see if it's what you want...
This might not be the most efficient way of doing it, but this query works:
var training = (from item in root.Descendants("item")
where item.Value.Contains(location)
select new
{
name = item.Element("event").Value,
location = (from node in item.Descendants("location")
where node.Value.Equals(location)
select node.Value).FirstOrDefault(),
}).ToList();
(Note that the code wouldn't compile if the property name was event, so I changed it to name.)
I believe the problem with your code was that the location node retrieved when creating the anonymous type didn't search for the node with the desired value.
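Another way to express the lookup, sketched here under the assumption that the goal is "every event held at a given location", is to test each item's location children for an exact match instead of calling Contains on the item's concatenated text. The EventsByLocation class and EventsAt method are illustrative names; the XML is the question's sample data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class EventsByLocation
{
    // Sample data copied from the question.
    public const string Xml = @"<training>
  <item>
    <event>C# Training</event>
    <location>Prague</location><location>Oslo</location><location>Amsterdam</location>
    <location>Athens</location><location>Dublin</location><location>Helsinki</location>
  </item>
  <item>
    <event>LINQ Training</event>
    <location>Bucharest</location><location>Oslo</location><location>Amsterdam</location>
    <location>Helsinki</location><location>Brussels</location><location>Dublin</location>
  </item>
</training>";

    // Names of all events that list the given location (exact match on a location element).
    public static List<string> EventsAt(XDocument doc, string location) =>
        doc.Descendants("item")
           .Where(i => i.Elements("location").Any(l => l.Value == location))
           .Select(i => (string)i.Element("event"))
           .ToList();

    public static void Main()
    {
        var doc = XDocument.Parse(Xml);
        Console.WriteLine(string.Join("; ", EventsAt(doc, "Dublin"))); // C# Training; LINQ Training
        Console.WriteLine(string.Join("; ", EventsAt(doc, "Prague"))); // C# Training
    }
}
```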
