LINQ to XML: Collapse multiple levels to single list - c#

I'm currently working on a Silverlight app and need to convert XML data into appropriate objects to data bind to. The basic class definition for this discussion is:
public class TabularEntry
{
public string Tag { get; set; }
public string Description { get; set; }
public string Code { get; set; }
public string UseNote { get; set; }
public List<string> Excludes { get; set; }
public List<string> Includes { get; set; }
public List<string> Synonyms { get; set; }
public string Flags { get; set; }
public List<TabularEntry> SubEntries { get; set; }
}
An example of the XML that might come in to feed this object follows:
<I4 Ref="1">222.2
<DX>Prostate</DX>
<EX>
<I>adenomatous hyperplasia of prostate (600.20-600.21)</I>
<I>prostatic:
<I>adenoma (600.20-600.21)</I>
<I>enlargement (600.00-600.01)</I>
<I>hypertrophy (600.00-600.01)</I>
</I>
</EX>
<FL>M</FL>
</I4>
So, various nodes map to specific properties. The key ones for this question are the <EX> and <I> nodes. An <EX> node contains a collection of one or more <I> nodes and, in this example, maps to the 'Excludes' property in the above class definition.
Here comes the challenge (for me). I don't have control over the web service that emits this XML, so changing it isn't an option. You'll notice that in this example one <I> node also contains another collection of one or more <I> nodes. I'm hoping to use a LINQ to XML query that consolidates both levels into a single collection and marks the lower-level items with a delimiter character, so that in this example, when the LINQ query returns a TabularEntry object, it contains a collection of Exclude items that appears as follows:
adenomatous hyperplasia of prostate (600.20-600.21)
prostatic:
*adenoma (600.20-600.21)
*enlargement (600.00-600.01)
*hypertrophy (600.00-600.01)
So, in the XML the last 3 entries are actually child objects of the second entry, but in the object's Excludes property, they are all part of the same collection, with the former child objects containing an identifier character/string.
I have the beginnings of the LINQ query I'm using below, but I can't quite figure out the bit that will consolidate the child objects for me. The code as it exists right now is:
List<TabularEntry> GetTabularEntries(XElement source)
{
List<TabularEntry> result;
result = (from tabularentry in source.Elements()
select new TabularEntry()
{
Tag = tabularentry.Name.ToString(),
Description = tabularentry.Element("DX").ToString(),
Code = tabularentry.FirstNode.ToString(),
UseNote = tabularentry.Element("UN") == null ? null : tabularentry.Element("UN").Value,
Excludes = (from i in tabularentry.Element("EX").Elements("I")
select i.Value).ToList()
}).ToList();
return result;
}
I'm thinking that I need to nest a FROM statement inside the
Excludes = (from i...)
statement to gather up the child nodes, but can't quite work it through. Of course, that may be because I'm off in the weeds a bit on my logic.
If you need more info to answer, feel free to ask.
Thanks in advance,
Steve

Try this:
List<TabularEntry> GetTabularEntries(XElement source)
{
List<TabularEntry> result;
result = (from tabularentry in source.Elements()
select new TabularEntry()
{
Tag = tabularentry.Name.ToString(),
Description = tabularentry.Element("DX").ToString(),
Code = tabularentry.FirstNode.ToString(),
UseNote = tabularentry.Element("UN") == null ? null : tabularentry.Element("UN").Value,
Excludes = (from i in tabularentry.Element("EX").Descendants("I")
select (i.Parent.Name == "I" ? "*" + i.Value : i.Value)).ToList()
}).ToList();
return result;
}
(edit)
If you need the current nested level of "I" you could do something like:
List<TabularEntry> GetTabularEntries(XElement source)
{
List<TabularEntry> result;
result = (from tabularentry in source.Elements()
select new TabularEntry()
{
Tag = tabularentry.Name.ToString(),
Description = tabularentry.Element("DX").ToString(),
Code = tabularentry.FirstNode.ToString(),
UseNote = tabularentry.Element("UN") == null ? null : tabularentry.Element("UN").Value,
Excludes = (from i in tabularentry.Element("EX").Descendants("I")
select (ElementWithPrefix(i, '*'))).ToList()
}).ToList();
return result;
}
string ElementWithPrefix(XElement element, char c)
{
string prefix = "";
for (XElement e = element.Parent; e != null && e.Name == "I"; e = e.Parent)
{
prefix += c;
}
return prefix + ExtractTextValue(element);
}
string ExtractTextValue(XElement element)
{
if (element.HasElements)
{
return element.Value.Split(new[] { '\n' })[0].Trim();
}
else
return element.Value.Trim();
}
Input:
<EX>
<I>adenomatous hyperplasia of prostate (600.20-600.21)</I>
<I>prostatic:
<I>adenoma (600.20-600.21)</I>
<I>enlargement (600.00-600.01)</I>
<I>hypertrophy (600.00-600.01)
<I>Bla1</I>
<I>Bla2
<I>BlaBla1</I>
</I>
<I>Bla3</I>
</I>
</I>
</EX>
Result:
adenomatous hyperplasia of prostate (600.20-600.21)
prostatic:
*adenoma (600.20-600.21)
*enlargement (600.00-600.01)
*hypertrophy (600.00-600.01)
**Bla1
**Bla2
***BlaBla1
**Bla3

Descendants will get you all of the I children. FirstNode will help separate the value of prostatic: from the values of its children. There's a return character in the value of prostatic:, which I removed with Trim.
XElement x = XElement.Parse(@"
<EX>
<I>adenomatous hyperplasia of prostate (600.20-600.21)</I>
<I>prostatic:
<I>adenoma (600.20-600.21)</I>
<I>enlargement (600.00-600.01)</I>
<I>hypertrophy (600.00-600.01)</I>
</I>
</EX>");
//
List<string> result = x
.Descendants(@"I")
.Select(i => i.FirstNode.ToString().Trim())
.ToList();
Here's a hacky way to get those asterisks in. I don't have time to improve it.
List<string> result2 = x
.Descendants(@"I")
.Select(i =>
new string(Enumerable.Repeat('*', i.Ancestors(@"I").Count()).ToArray())
+ i.FirstNode.ToString().Trim())
.ToList();
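The two ideas above (Descendants for every level, plus an ancestor count for the marker) can be combined into one helper. The following is only a sketch: `FlattenItems` and its `marker` parameter are hypothetical names that do not appear in the original answers, and it reads each element's own text nodes rather than FirstNode so that it does not depend on the text coming first. It is written as a top-level program, which assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: flattens nested <I> elements in document order and
// prefixes each one with a marker per <I> ancestor.
static List<string> FlattenItems(XElement container, char marker = '*')
{
    return container.Descendants("I")
        .Select(i =>
            // One marker character for each enclosing <I> element.
            new string(marker, i.Ancestors("I").Count())
            // Only the element's own text nodes, not the text of nested <I> children.
            + string.Concat(i.Nodes().OfType<XText>().Select(t => t.Value)).Trim())
        .ToList();
}

// Sample input taken from the question.
XElement ex = XElement.Parse(@"<EX>
<I>adenomatous hyperplasia of prostate (600.20-600.21)</I>
<I>prostatic:
<I>adenoma (600.20-600.21)</I>
<I>enlargement (600.00-600.01)</I>
<I>hypertrophy (600.00-600.01)</I>
</I>
</EX>");

List<string> excludes = FlattenItems(ex);
foreach (string line in excludes)
{
    Console.WriteLine(line);
}
```

In the original query this would slot in as `Excludes = FlattenItems(tabularentry.Element("EX"))`, assuming every entry actually has an <EX> child.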

Related

How do I read Child Nodes in Xml file using MVC

## XML FILE ##
<Title month="1" year="2016">
<film day="1">
<morning>
Fight Club
</morning>
<night>
Inceptıon
</night>
</film>
<film day="2">
<morning>
xyzasda
</morning>
<night>
czxsadasas
</night>
</film>
</Title>
MY CLASS
public class FilmController : Controller
{
public ActionResult DisplayXML()
{
var data = new List<Films>();
data = ReturnData();
return View(data);
}
private List<Films> ReturnData(){
string xmldata = "myxmldata.xml";
DataSet ds = new DataSet();
ds.ReadXml(xmldata);
var filmlist= new List<Films>();
filmlist= (from rows in ds.Tables[0].AsEnumerable()
select new Films
{
month= Convert.ToInt32(rows[0].ToString()),
year= rows[1].ToString(),
film= rows[2].ToString(),
morning= rows[3].ToString(),
night= rows[4].ToString(),
}).ToList();
return filmlist;
}
}
Model
public class Films
{
public int month { get; set; }
public string year { get; set; }
public string day { get; set; }
public string morning { get; set; }
public string night { get; set; }
}
How do I read the child nodes? I want to keep this data in a list and then build a table from it.
Edit: I now get this error:
Error: Additional information: Cannot find column 3.
Where is the error? I want to read the XML file.
You can parse your XML with the following:
var xml = XDocument.Load(xmlFile);
var films = xml.Descendants("film").Select(d => new Films()
{
month = Convert.ToInt32(d.Parent.Attribute("month").Value),
year = d.Parent.Attribute("year").Value,
day = d.Attribute("day").Value,
morning = d.Element("morning").Value,
night = d.Element("night").Value
});
You can retrieve Films collection using Linq-To-XML easily like this:
XDocument xdoc = XDocument.Load(xmldata);
List<Films> result = xdoc.Descendants("film")
.Select(x =>
{
var film = x;
var title = x.Parent;
return new Films
{
month = (int)title.Attribute("month"),
year = (string)title.Attribute("year"),
day = (string)film.Attribute("day"),
morning = (string)film.Element("morning"),
night = (string)film.Element("night")
};
}
).ToList();
This will return two films, and each will have month & year based on Title node.
Code Explanation:
First we find all the film nodes, then project each one using Select. Inside the Select lambda we keep a variable for each film (you can think of film as the loop variable in a foreach loop). We also store the parent Title node in the title variable. After that, all we need to do is read the elements and attributes.
If understanding Method syntax is difficult, then here is the equivalent query syntax:
List<Films> result2 = (from x in xdoc.Descendants("film")
let film = x
let title = x.Parent
select new Films
{
month = (int)title.Attribute("month"),
year = (string)title.Attribute("year"),
day = (string)film.Attribute("day"),
morning = (string)film.Element("morning"),
night = (string)film.Element("night")
}).ToList();
When I read your XML into a DataSet I get 2 tables:
Table 1 "Title" with one row and 3 columns
Table 2 "film" with two rows and 4 columns
You read from Tables[0] (3 columns) and try to read a 4th column (index 3) that does not exist there.
You need to change your loop, since you need to load data from two tables.

How do I nest this LINQ query?

I have an XML document from a web service that I am trying to query. However, I am not sure how to query the XML when it has elements nested inside other elements.
Here is a section of the XML file (I haven't included all of it because it's a long file):
<response>
<display_location>
<full>London, United Kingdom</full>
<city>London</city>
<state/>
<state_name>United Kingdom</state_name>
<country>UK</country>
<country_iso3166>GB</country_iso3166>
<zip>00000</zip>
<magic>553</magic>
<wmo>03772</wmo>
<latitude>51.47999954</latitude>
<longitude>-0.44999999</longitude>
<elevation>24.00000000</elevation>
</display_location>
<observation_location>
<full>London,</full>
<city>London</city>
<state/>
<country>UK</country>
<country_iso3166>GB</country_iso3166>
<latitude>51.47750092</latitude>
<longitude>-0.46138901</longitude>
<elevation>79 ft</elevation>
</observation_location>
I can query "one section at a time" but I'm constructing an object from the LINQ. For example:
var data = from i in weatherResponse.Descendants("display_location")
select new Forecast
{
DisplayFullName = i.Element("full").Value
};
var data = from i in weatherResponse.Descendants("observation_location")
select new Forecast
{
ObservationFullName = i.Element("full").Value
};
And my "Forecast" class is basically just full of properties like this:
class Forecast
{
public string DisplayFullName { get; set; }
public string ObservationFullName { get; set; }
//Lots of other properties that will be set from the XML
}
However, I need to "combine" all of the LINQ together so that I can set all the properties of the object. I have read about nested LINQ but I do not know how to apply it to this particular case.
Question: How do I go about "nesting/combining" the LINQ so that I can read the XML and then set the appropriate properties with said XML?
One possible way :
var data = from i in weatherResponse.Descendants("response")
select new Forecast
{
DisplayFullName = (string)i.Element("display_location").Element("full"),
ObservationFullName = (string)i.Element("observation_location").Element("full")
};
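As a self-contained illustration of reading both sections in one projection, here is a minimal sketch. It makes some assumptions that are not in the original: it uses a named tuple in place of the Forecast class so the snippet runs as a top-level program (C# 9 or later), it uses DescendantsAndSelf in case <response> is the element you already hold, and it adds `?.` so a missing section yields null instead of throwing.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Trimmed-down sample based on the XML in the question.
XElement weatherResponse = XElement.Parse(@"<response>
  <display_location><full>London, United Kingdom</full></display_location>
  <observation_location><full>London,</full></observation_location>
</response>");

// One projection reads both child sections of each <response> element.
var forecasts = weatherResponse.DescendantsAndSelf("response")
    .Select(r => (
        // Casting an XElement to string returns null when the element is missing.
        DisplayFullName: (string)r.Element("display_location")?.Element("full"),
        ObservationFullName: (string)r.Element("observation_location")?.Element("full")))
    .ToList();

foreach (var f in forecasts)
{
    Console.WriteLine($"{f.DisplayFullName} / {f.ObservationFullName}");
}
```

The same shape works with `select new Forecast { ... }`; the tuple is only there to keep the sketch short.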
Another way ... I prefer using the Linq extension methods in fluent style
var results = weatherResponse.Descendants()
.SelectMany(d => d.Elements())
.Where(e => e.Name == "display_location" || e.Name == "observation_location")
.Select(e =>
{
if(e.Name == "display_location")
{
return new Forecast{ DisplayFullName = e.Element("full").Value };
}
else if(e.Name == "observation_location")
{
return new Forecast{ ObservationFullName = e.Element("full").Value };
}
else
{
return null;
}
});

Updating entire node with mutating cypher in Neo4jclient

I need to update all the properties of a given node, using mutating cypher. I want to move away from Node and NodeReference because I understand they are deprecated, so can't use IGraphClient.Update. I'm very new to mutating cypher. I'm writing in C#, using Neo4jclient as the interface to Neo4j.
I did the following code which updates the "Name" property of a "resunit" where property "UniqueId" equals 2. This works fine. However,
* my resunit object has many properties
* I don't know which properties have changed
* I'm trying to write code that will work with different types of objects (with different properties)
It was possible with IGraphClient.Update to pass in an entire object and it would take care of creating cypher that sets all properties.
Can I somehow pass in my object with mutating cypher as well?
The only alternative I can see is to reflect over the object to find all properties and generate .Set for each, which I'd like to avoid. Please tell me if I'm on the wrong track here.
string newName = "A welcoming home";
var query2 = agencyDataAccessor
.GetAgencyByKey(requestingUser.AgencyKey)
.Match("(agency)-[:HAS_RESUNIT_NODE]->(categoryResUnitNode)-[:THE_UNIT_NODE]->(resunit)")
.Where("resunit.UniqueId = {uniqueId}")
.WithParams(new { uniqueId = 2 })
.With("resunit")
.Set("resunit.Name = {residentialUnitName}")
.WithParams(new { residentialUnitName = newName });
query2.ExecuteWithoutResults();
It is indeed possible to pass an entire object! Below I have an object called Thing defined as such:
public class Thing
{
public int Id { get; set; }
public string Value { get; set; }
public DateTimeOffset Date { get; set; }
public int AnInt { get; set; }
}
Then the following code creates a new Thing and inserts it into the DB, then gets it back and updates it using just one Set command:
Thing thing = new Thing{AnInt = 12, Date = new DateTimeOffset(DateTime.Now), Value = "Foo", Id = 1};
gc.Cypher
.Create("(n:Test {thingParam})")
.WithParam("thingParam", thing)
.ExecuteWithoutResults();
var thingRes = gc.Cypher.Match("(n:Test)").Where((Thing n) => n.Id == 1).Return(n => n.As<Thing>()).Results.Single();
Console.WriteLine("Found: {0},{1},{2},{3}", thingRes.Id, thingRes.Value, thingRes.AnInt, thingRes.Date);
thingRes.AnInt += 100;
thingRes.Value = "Bar";
thingRes.Date = thingRes.Date.AddMonths(1);
gc.Cypher
.Match("(n:Test)")
.Where((Thing n) => n.Id == 1)
.Set("n = {thingParam}")
.WithParam("thingParam", thingRes)
.ExecuteWithoutResults();
var thingRes2 = gc.Cypher.Match("(n:Test)").Where((Thing n) => n.Id == 1).Return(n => n.As<Thing>()).Results.Single();
Console.WriteLine("Found: {0},{1},{2},{3}", thingRes2.Id, thingRes2.Value, thingRes2.AnInt, thingRes2.Date);
Which gives:
Found: 1,Foo,12,2014-03-27 15:37:49 +00:00
Found: 1,Bar,112,2014-04-27 15:37:49 +00:00
All properties nicely updated!

List<object> Self-Filter

I have a list like
List<VoieData> listVoieData = new List<VoieData>();
and in VoieData Class I have :
public class VoieData
{
public int Depart { set; get; }
public int Arrive { set; get; }
public int DistanceDepart { set; get; }
public int DistanceArrive { set; get; }
}
Since I have a massive number of values, I only want to consider my Depart numbers. I would like to filter listVoieData so that it keeps only the entries whose Arrive matches one of the Depart values.
for example I have
listVoieData.Select(p=>p.Depart).ToList()= List<int>{1,2,3};
listVoieData.Select(p=>p.Arrive).ToList()= List<int>{1,2,3,4,5};
I need to throw away every VoieData that has 4 or 5 as its Arrive.
Right now my solution looks like this, but it's not correct:
List<VoieData> listVoieDataFilter = listVoieData .Join(listVoieData , o1 => o1.Arrive, o2 => o2.Depart, (o1, o2) => o1).ToList();
Sorry for the confusing question.
I want to remove every entry whose Arrive differs from all the Depart values in the list, and return the new list.
The comparison is not within a single VoieData (Arrive != Depart), but against the whole list.
Thanks
I think you want to remove all objects where Arrive is not in any of the Depart from any object. In that case, first get all Depart and then filter by Arrive:
HashSet<int> allDepart = new HashSet<int>(listVoieData.Select(x => x.Depart));
var result = listVoieData.Where(v => allDepart.Contains(v.Arrive));
We use a HashSet<int> for efficiency.
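A small runnable sketch of this approach follows, keeping the entries whose Arrive appears somewhere among the Depart values, which is what the question asks for. The sample data is made up for illustration, and tuples stand in for the VoieData class so the snippet works as a top-level program (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Made-up sample data: Depart values are {1, 2, 3}, Arrive values are {1, 2, 3, 4, 5}.
var listVoieData = new List<(int Depart, int Arrive)>
{
    (1, 2), (2, 4), (3, 1), (1, 5), (2, 3)
};

// Collect every Depart value once so each lookup is a hash probe, not a list scan.
var allDepart = new HashSet<int>(listVoieData.Select(v => v.Depart));

// Keep only the entries whose Arrive is also some entry's Depart.
var kept = listVoieData.Where(v => allDepart.Contains(v.Arrive)).ToList();

foreach (var v in kept)
{
    Console.WriteLine($"{v.Depart} -> {v.Arrive}");
}
```

With this data the entries arriving at 4 and 5 are dropped and the other three remain.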
Use LINQ Where:
var records = listVoieData.Where(x => x.Arrive == x.Depart);
This will return results where both Arrive and Depart are the same.
That would be a typical case to use linq.
something like:
var res = from data in listVoieData
where data.Depart == data.Arrive
select data;
and then optionally just use res.ToArray() to run the query and get the array.
Since you've stated that you want:
I want to remove Arrive which is different from all the Depart
This can be re-phrased as, "The set of all arrivals except those in the set of departures", which translates very nicely into the following LINQ query:
var arrivalsWithNoDepartures = listVoieData.Select(p=>p.Arrive)
.Except(listVoieData.Select(p=>p.Depart));

Using Contains() list method to evaluate list contents

I have a list that contains 3 items, two of type_1, and one of type_2. I want to return a second list that contains the type and number of that type that exists. When stepping through the breakpoints set at the foreach loop, the IF statement is never true. I assume there is something wrong with my attempt to use Contains() method.
The output should be something like:
type_1 2
type_2 1
Instead, it evaluates as:
type_1 1
type_1 1
type_2 1
Is my use of Contains() not correct?
public List<item_count> QueryGraphListingsNewAccountReport()
{
List<item> result = new List<item>();
var type_item1 = new item { account_type = "Type_1" };
var type_item2 = new item { account_type = "Type_1" };
var type_item3 = new item { account_type = "Type_2" };
result.Add(type_item1);
result.Add(type_item2);
result.Add(type_item3);
//Create a empty list that will hold the account_type AND a count of how many of that type exists:
List<item_count> result_count = new List<item_count>();
foreach (var item in result)
{
if (result_count.Contains(new item_count { account_type = item.account_type, count = 1 } ) == true)
{
var result_item = result_count.Find(x => x.account_type == item.account_type);
result_item.count += 1;
result_count.Add(result_item);
}
else
{
var result_item = new item_count { account_type = item.account_type, count = 1 };
result_count.Add(result_item);
}
}
return result_count;
}
public class item
{
public string account_type { get; set; }
}
public class item_count
{
public int count {get; set;}
public string account_type { get; set; }
}
I think your problem is that you don't want to use contains at all. You are creating a new object in your contains statement and, obviously, it isn't contained in your list already because you only just created it. The comparison is comparing references, not values.
Why not just use the find statement that you do in the next line instead? If it returns null, then you know there isn't an item already with that type.
So you could do something like this:
var result_item = result_count.Find(x => x.account_type == item.account_type);
if (result_item != null)
{
result_item.count++;
// note here you don't need to add it back to the list!
}
else
{
// create your new result_item here and add it to your list.
}
Note: Find is O(n), so this might not scale well if you have a really large set of types. In that case, you might be better off with Saeed's suggestion of grouping.
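If the linear Find becomes a concern, one alternative (not mentioned in the original answers) is to count into a dictionary keyed by account type. This is a sketch only: it works on plain strings rather than the item and item_count classes, and it is written as a top-level program (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;

// The three account types from the question.
var accountTypes = new List<string> { "Type_1", "Type_1", "Type_2" };

// Dictionary lookups are O(1) on average, unlike List.Find.
var counts = new Dictionary<string, int>();
foreach (string type in accountTypes)
{
    // TryGetValue leaves 'current' at 0 when the key is not present yet.
    counts.TryGetValue(type, out int current);
    counts[type] = current + 1;
}

foreach (var pair in counts)
{
    Console.WriteLine($"{pair.Key} {pair.Value}");
}
```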
You can do:
myList.GroupBy(x=>x.type).Select(x=>new {x.Key, Count = x.Count()});
If you want to use a loop instead, it is better to use the LINQ Count method to get each total. If you want to keep using Contains, you need to implement equality on your class to match the way you use it.
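Applied to the data from the question, the grouping approach might look like the sketch below. It groups plain strings and projects into named tuples instead of item_count objects, purely to keep the snippet short and runnable as a top-level program (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The three account types from the question.
var accountTypes = new List<string> { "Type_1", "Type_1", "Type_2" };

// Group equal values together, then project each group to its key and size.
var typeCounts = accountTypes
    .GroupBy(t => t)
    .Select(g => (AccountType: g.Key, Count: g.Count()))
    .ToList();

foreach (var entry in typeCounts)
{
    Console.WriteLine($"{entry.AccountType} {entry.Count}");
}
```

With a `List<item>` the key selector would be `x => x.account_type`, and the projection could build `new item_count { account_type = g.Key, count = g.Count() }` instead of a tuple.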
