Complex/Nested XML Reading in C# - c#

I have been trying to read this XML file, but it is considerably more complex/nested than the examples I have seen online. I have tried using LINQ and XmlReader with no luck.
LINQ will read each OrderScreen; however, when it comes to the Cells of each OrderScreen, it loads all possible Cells into each OrderScreen even if the Cell does not belong to that OrderScreen. I understand why it does it, but I am fairly new to LINQ and most of the examples I see are not this complex and do not really cover this.
XmlReader works pretty well, but it does not continue reading the next Cell after it has finished reading one OrderScreen; it just reads the first Cell of the next OrderScreen and then assumes it is at the end of the document. I did not include that code because in all the searches I have done, people use LINQ rather than XmlReader.
The XML is below first, with my most recent LINQ code after that.
Any help is greatly appreciated!
<Screens>
<DeleteScreens></DeleteScreens>
<NewScreens>
<OrderScreen>
<ScreenNumber></ScreenNumber>
<Title></Title>
<NumberOfColumns></NumberOfColumns>
<OptionScreen></OptionScreen>
<ShowQuantityButtons></ShowQuantityButtons>
<PrepSequenceScreen></PrepSequenceScreen>
<Cell>
<CellNumber></CellNumber>
<CellName></CellName>
<InventoryNumber></InventoryNumber>
...more Cell elements..
<OptionGroup>
<Type></Type>
<ScreenNumber></ScreenNumber>
<Cells></Cells>
</OptionGroup>
...more OptionGroups...
</Cell>
...more Cells...
</OrderScreen>
...more OrderScreens...
</NewScreens>
<UpdateMenus>
<Menu>
<MenuNumber></MenuNumber>
<MenuTitle></MenuTitle>
...more Menu elements...
</Menu>
...more Menus...
</UpdateMenus>
</Screens>
XDocument xdoc;
xdoc = XDocument.Load(@"C:\Users\Kwagstaff\Desktop\PMM_3.0\PMM_3.0\XML\Screens.xml");
var ORDERSCREENS = from a in xdoc.Descendants("OrderScreen")
select new
{
ScreenNumber = a.Element("ScreenNumber").Value,
Title = a.Element("Title").Value,
NumberOfColumns = a.Element("NumberOfColumns").Value,
OptionScreen = a.Element("OptionScreen").Value,
ShowQuantityButtons = a.Element("ShowQuantityButtons").Value,
PrepSequenceScreen = a.Element("PrepSequenceScreen").Value,
Cell = from b in xdoc.Descendants("Cell")
select new
{
CellNumber = b.Element("CellNumber"),
}
};
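The inner query above starts from xdoc, so every OrderScreen receives every Cell in the document. A minimal sketch of scoping the inner query to the current OrderScreen element instead, assuming the structure shown in the question; the sample values and the OrderScreenDemo class name are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class OrderScreenDemo
{
    // Hypothetical sample data following the structure in the question.
    public const string SampleXml = @"<Screens>
  <NewScreens>
    <OrderScreen>
      <ScreenNumber>1</ScreenNumber>
      <Cell><CellNumber>10</CellNumber></Cell>
      <Cell><CellNumber>11</CellNumber></Cell>
    </OrderScreen>
    <OrderScreen>
      <ScreenNumber>2</ScreenNumber>
      <Cell><CellNumber>20</CellNumber></Cell>
    </OrderScreen>
  </NewScreens>
</Screens>";

    // Maps each screen number to the cell numbers that belong to that screen only.
    public static Dictionary<string, List<string>> CellsByScreen(string xml)
    {
        XDocument xdoc = XDocument.Parse(xml);
        return xdoc.Descendants("OrderScreen").ToDictionary(
            screen => (string)screen.Element("ScreenNumber"),
            // screen.Elements("Cell") looks only at direct children of this
            // OrderScreen, not at every Cell in the whole document.
            screen => screen.Elements("Cell")
                            .Select(cell => (string)cell.Element("CellNumber"))
                            .ToList());
    }

    public static void Main()
    {
        foreach (var pair in CellsByScreen(SampleXml))
            Console.WriteLine(pair.Key + ": " + string.Join(", ", pair.Value));
    }
}
```

The same idea applies one level down: OptionGroup elements would be read with cell.Elements("OptionGroup") rather than from the document root.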

In my opinion, the proper way to do that is with entity classes decorated with serialization attributes. You will need to do some research, but as an example,
for something like
<MyComplexXML>
....
<xalAddress>...</xalAddress>
<multiPoint>
<MultiPoint>...</MultiPoint>
</multiPoint>
...
</MyComplexXML>
First, you create your classes like this
using System.Xml.Serialization;
namespace MyComplexXML_Model
{
/// <summary>
/// Address field for MyComplexXML
/// </summary>
public class Address
{
/// <summary>
/// XalAddress
/// </summary>
[XmlElement("xalAddress")]
public XalAddress XalAddress;
[XmlElement("multiPoint")]
public MultiPointAddress MultiPointAddress;
}
}
and

using System.Xml.Serialization;
namespace MyComplexXML_Model
{
public class MultiPointAddress
{
[XmlElement("MultiPoint", Namespace = "http://www.sample.net/sample")]
public MultiPoint Multipoint;
}
}
and when your complete hierarchies are in place you can call your root element like this
var ns = new XmlSerializerNamespaces();
ns.Add("sample", "http://www.sample.net/sample");
...
var ms = new MemoryStream();
var sw = new StreamWriter(ms);
//Deserialize from file
var sr = new StreamReader(@"myfile.xml");
var city = (MyComplexXML)new XmlSerializer(typeof(MyComplexXML)).Deserialize(sr);
Hope this points you in the right direction.

Related

How can I make XmlSerializer Deserialize tell me about typos on tag names

I know that most of the time, the Deserialize method of XmlSerializer will complain if there's something wrong (for example, if there is a typo). However, I've found an example where it doesn't complain, when I would have expected it to; and I'd like to know if there's a way of being told about the problem.
The example code below contains 3 things: a good example which works as expected, an example which would complain (commented out), and an example which does not complain. The last one is the case where I want to be told that something is wrong.
Note: I appreciate that one possible route would be XSD validation; but that really feels like a sledgehammer to crack what seems like a simpler problem. For example, if I was writing a deserializer which had unexpected data that it didn't know what to do with, I'd make my code complain about it.
I've used NUnit (NuGet package) for assertions; but you don't really need it, just comment out the Assert lines - you can see what I'm expecting.
using System.IO;
using System.Linq;
using System.Text;
using System.Xml.Serialization;
using NUnit.Framework;
public static class Program
{
public static void Main()
{
string goodExampleXml = @"<?xml version=""1.0"" encoding=""utf-8""?><Example><Weathers><Weather>Sunny</Weather></Weathers></Example>";
var goodExample = Load(goodExampleXml);
Assert.That(goodExample, Is.Not.Null);
Assert.That(goodExample.Weathers, Is.Not.Null);
Assert.That(goodExample.Weathers, Has.Length.EqualTo(1));
Assert.That(goodExample.Weathers.First(), Is.EqualTo(Weather.Sunny));
string badExampleXmlWhichWillComplainXml = @"<?xml version=""1.0"" encoding=""utf-8""?><Example><Weathers><Weather>Suny</Weather></Weathers></Example>";
// var badExampleWhichWillComplain = Load(badExampleXmlWhichWillComplainXml); // this would complain, quite rightly, so I've commented it out
string badExampleXmlWhichWillNotComplain = @"<?xml version=""1.0"" encoding=""utf-8""?><Example><Weathers><Weathe>Sunny</Weathe></Weathers></Example>";
var badExample = Load(badExampleXmlWhichWillNotComplain);
Assert.That(badExample, Is.Not.Null);
Assert.That(badExample.Weathers, Is.Not.Null);
// clearly, the following two assertions will fail because I mis-typed the tag name; but I want to know there has been a problem before this point.
Assert.That(badExample.Weathers, Has.Length.EqualTo(1));
Assert.That(badExample.Weathers.First(), Is.EqualTo(Weather.Sunny));
}
private static Example Load(string serialized)
{
byte[] byteArray = Encoding.UTF8.GetBytes(serialized);
var xmlSerializer = new XmlSerializer(typeof(Example));
using var stream = new MemoryStream(byteArray, false);
return (Example)xmlSerializer.Deserialize(stream);
}
}
public enum Weather
{
Sunny,
Cloudy,
Rainy,
Windy,
Stormy,
Snowy,
}
public class Example
{
[System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Performance", "CA1819:PropertiesShouldNotReturnArrays", Justification = "Serialized XML")]
[XmlArray("Weathers")]
[XmlArrayItem("Weather")]
public Weather[] Weathers { get; set; }
}
Having looked at Microsoft's published source code for XmlSerializer, it became apparent that there are events that you can subscribe to (which is what I was hoping for). The approach I used is to fill in an XmlDeserializationEvents struct with the handlers and pass it to the Deserialize method.
So I've been able to modify the code from the question to have an event handler which gets called when an unknown node is encountered (which is exactly what I was after). You need one extra using, over the ones given in the question...
using System.Xml;
and then here is the modified "Load" method...
private static Example Load(string serialized)
{
XmlDeserializationEvents events = new XmlDeserializationEvents();
events.OnUnknownNode = (sender, e) => System.Diagnostics.Debug.WriteLine("Unknown Node: " + e.Name);
var xmlSerializer = new XmlSerializer(typeof(Example));
using var reader = XmlReader.Create(new StringReader(serialized));
return (Example)xmlSerializer.Deserialize(reader, events);
}
So now I just need to do something more valuable than just write a line to the Debug output.
Note that more events are available, as described on the XmlDeserializationEvents page, and I'll probably pay attention to each of them.
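XmlSerializer also has UnknownNode, UnknownElement and UnknownAttribute events that can be subscribed to directly. A minimal sketch of "doing something more valuable", which collects the names of unknown nodes and throws once deserialization has finished. The StrictLoader name and the choice of InvalidDataException are assumptions for illustration, and the Example and Weather types are trimmed copies of those in the question:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public enum Weather { Sunny, Cloudy, Rainy }

public class Example
{
    [XmlArray("Weathers")]
    [XmlArrayItem("Weather")]
    public Weather[] Weathers { get; set; }
}

public static class StrictLoader
{
    // Deserializes an Example and fails loudly if any unknown node was seen.
    public static Example Load(string serialized)
    {
        var unknown = new List<string>();
        var serializer = new XmlSerializer(typeof(Example));
        // Record every node the serializer does not know how to map.
        serializer.UnknownNode += (sender, e) => unknown.Add(e.Name);

        using var reader = new StringReader(serialized);
        var result = (Example)serializer.Deserialize(reader);

        if (unknown.Count > 0)
            throw new InvalidDataException("Unknown XML nodes: " + string.Join(", ", unknown));
        return result;
    }
}
```

Throwing after the fact, rather than inside the handler, means the message can list every unexpected node in one go.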
I tested the following and it works
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Serialization;
using System.IO;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string xml = @"<?xml version=""1.0"" encoding=""utf-8"" ?>
<Example>
<Weathers>Sunny</Weathers>
<Weathers>Cloudy</Weathers>
<Weathers>Rainy</Weathers>
<Weathers>Windy</Weathers>
<Weathers>Stormy</Weathers>
<Weathers>Snowy</Weathers>
</Example>";
StringReader sReader = new StringReader(xml);
XmlReader reader = XmlReader.Create(sReader);
XmlSerializer serializer = new XmlSerializer(typeof(Example));
Example example = (Example)serializer.Deserialize(reader);
}
}
public enum Weather
{
Sunny,
Cloudy,
Rainy,
Windy,
Stormy,
Snowy,
}
public class Example
{
[XmlElement("Weathers")]
public Weather[] Weathers { get; set; }
}
}

XML dialogue tree in Unity not parsing information correctly

I have tried to set up a dialogue tree within Unity using XML (I have not used XML much before, so I am unsure whether the way I am going about it is correct at all).
I am trying to get the first text element from this dialogue tree, but when I load the XML file and select that element I get everything stored in that branch.
Am I using the correct namespace for this? I have seen people say to use System.Xml.Linq or System.Xml.Serialization rather than just System.Xml; is that correct for my case?
Code:
using UnityEngine;
using System.Collections;
using System.IO;
using System.Xml;
using UnityEngine.UI;
using System.Collections.Generic;
public class DialogTree
{
public string text;
public List<string> dialogText;
public List<DialogTree> nodes;
public void parseXML(string xmlData)
{
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load(new StringReader(xmlData));
XmlNode node = xmlDoc.SelectSingleNode("dialoguetree/dialoguebranch");
text = node.InnerXml;
XmlNodeList myNodeList = xmlDoc.SelectNodes("dialoguebranch/dialoguebranch");
foreach (XmlNode node1 in myNodeList)
{
if (node1.InnerXml.Length > 0)
{
DialogTree dialogtreenode = new DialogTree();
dialogtreenode.parseXML(node1.InnerXml);
nodes.Add(dialogtreenode);
}
}
}
}
And here is the XML.
I am trying to grab the first text element; later on, depending on the response, it will go to branch 1 or 2.
<?xml version='1.0'?>
<dialoguetree>
<dialoguebranch>
<text>Testing if the test prints</text>
<dialoguebranch>
<text>Branch 1</text>
<dialoguebranch>
<text>Branch 1a</text>
</dialoguebranch>
<dialoguebranch>
<text>Branch 1b</text>
</dialoguebranch>
</dialoguebranch>
<dialoguebranch>
<text>Branch 2</text>
</dialoguebranch>
</dialoguebranch>
</dialoguetree>
You're getting everything in that branch because XmlNode.InnerXml returns the markup of everything inside that node. See the documentation for more information on that.
You should use the branch as the base for only looking at its children, instead of starting at xmlDoc every time. Also, you need an entry point to get inside of the first dialoguetree element and then ignore that. Finally, I would only create one XmlDocument and just pass around nodes in your recursion.
Altogether, this might look like this:
public class DialogTree
{
public string text;
public List<DialogTree> nodes = new List<DialogTree>();
public static DialogTree ParseXMLStart(string xmlData)
{
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load(new StringReader(xmlData));
XmlNode rootNode = xmlDoc.SelectSingleNode("dialoguetree/dialoguebranch");
DialogTree dialogTree = new DialogTree();
dialogTree.ParseXML(rootNode);
return dialogTree;
}
public void ParseXML(XmlNode parentNode)
{
XmlNode textNode = parentNode.SelectSingleNode("text");
text = textNode.InnerText;
XmlNodeList myNodeList = parentNode.SelectNodes("dialoguebranch");
foreach (XmlNode curNode in myNodeList)
{
if (curNode.InnerXml.Length > 0)
{
DialogTree dialogTree = new DialogTree();
dialogTree.ParseXML(curNode);
nodes.Add(dialogTree);
}
}
}
}
And you could use it like so:
string xmlStringFromFile;
DialogTree dialogue = DialogTree.ParseXMLStart(xmlStringFromFile);
All of this code is untested but I hope the general idea is clear. Let me know if you find any errors in the comments below and I will try to fix them.

FlatFile library, delimited layout, wrong parsing when multiple fields are empty at the end of the row

In some of our applications we use the FlatFile library (https://github.com/forcewake/FlatFile) to parse files delimited with a separator (";"), and have done so for a long time without problems.
Yesterday we ran into a problem with files that have multiple empty fields at the end of a row.
I replicated the problem with a short console application so that you can verify it easily:
using FlatFile.Delimited;
using FlatFile.Delimited.Attributes;
using FlatFile.Delimited.Implementation;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
namespace FlatFileTester
{
class Program
{
static void Main(string[] args)
{
var layout = GetLayout();
var factory = new DelimitedFileEngineFactory();
using (MemoryStream ms = new MemoryStream())
using (FileStream file = new FileStream(@"D:\shared\dotnet\FlatFileTester\test.csv", FileMode.Open, FileAccess.Read))
{
byte[] bytes = new byte[file.Length];
file.Read(bytes, 0, (int)file.Length);
ms.Write(bytes, 0, (int)file.Length);
var flatFile = factory.GetEngine(layout);
ms.Position = 0;
List<TestObject> records = flatFile.Read<TestObject>(ms).ToList();
foreach(var record in records)
{
Console.WriteLine(string.Format("Id=\"{0}\" - DescriptionA=\"{1}\" - DescriptionB=\"{2}\" - DescriptionC=\"{3}\"", record.Id, record.DescriptionA, record.DescriptionB, record.DescriptionC));
}
}
Console.ReadLine();
}
public static IDelimitedLayout<TestObject> GetLayout()
{
IDelimitedLayout<TestObject> layout = new DelimitedLayout<TestObject>()
.WithDelimiter(";")
.WithQuote("\"")
.WithMember(x => x.Id)
.WithMember(x => x.DescriptionA)
.WithMember(x => x.DescriptionB)
.WithMember(x => x.DescriptionC)
;
return layout;
}
}
[DelimitedFile(Delimiter = ";", Quotes = "\"")]
public class TestObject
{
[DelimitedField(1)]
public int Id { get; set; }
[DelimitedField(2)]
public string DescriptionA { get; set; }
[DelimitedField(3)]
public string DescriptionB { get; set; }
[DelimitedField(4)]
public string DescriptionC { get; set; }
}
}
This is an example of file:
1;desc1;desc1;desc1
2;desc2;desc2;desc2
3;desc3;;desc3
4;desc4;desc4;
5;desc5;;
So the first 4 rows are parsed as expected:
All fields with values in the first and second row
empty string for the third field of the third row
empty string for the fourth field of the fourth row
In the fifth row we expect an empty string for the third and fourth fields, like this:
Id=5
DescriptionA="desc5"
DescriptionB=""
DescriptionC=""
instead we receive this:
Id=5
DescriptionA="desc5"
DescriptionB=";" // --> THE SEPARATOR!!!
DescriptionC=""
We can't tell whether this is a configuration problem, a bug in the library, or some other problem in the code...
Has anyone had a similar experience with this library, or can anyone spot a problem in the code above that is unrelated to the library but causes the error?
I took a look at and debugged the source code of the open source library: https://github.com/forcewake/FlatFile.
It seems there is a problem in this particular case, in which there are 2 empty fields at the end of a row: the bug affects the second-to-last field of the row.
I opened an issue for this library, hoping a contributor can invest some time to investigate and, if it is indeed a bug, fix it: https://github.com/forcewake/FlatFile/issues/80
For now we decided to fix the wrong values in the list, something like:
string separator = ";";
//...
//...
//...
records.ForEach(x => {
x.DescriptionB = x.DescriptionB.Replace(separator, "");
});
In our case, anyway, it makes no sense to have the separator character as the value of that field...
...even though it would be better to have the bug fixed in the library.
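As a cross-check of the expected values, a minimal sketch that splits the sample rows with plain string.Split. This is not the FlatFile API and it ignores quoting, so it only illustrates what the four fields of each row should be; the SplitCheck name is made up for illustration:

```csharp
using System;

public static class SplitCheck
{
    // Splits one row on ';' and keeps empty fields, including trailing ones.
    public static string[] Fields(string line) => line.Split(';');

    public static void Main()
    {
        foreach (string line in new[] { "3;desc3;;desc3", "4;desc4;desc4;", "5;desc5;;" })
        {
            string[] f = Fields(line);
            Console.WriteLine("Id=\"{0}\" - DescriptionA=\"{1}\" - DescriptionB=\"{2}\" - DescriptionC=\"{3}\"",
                f[0], f[1], f[2], f[3]);
        }
    }
}
```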

How to write the Child Elements using XDocument

I have revised the question and included the code I have written so far. Below is an example of what the output must look like to be compatible with the VeriFone MX915 payment terminal. In this specific part, I am trying to send the POS register items to the display.
<TRANSACTION>
<FUNCTION_TYPE>LINE_ITEM</FUNCTION_TYPE>
<COMMAND>ADD</COMMAND>
<COUNTER>1</COUNTER>
<MAC> … </MAC>
<MAC_LABEL>REG2</MAC_LABEL>
<RUNNING_SUB_TOTAL>7.00</RUNNING_SUB_TOTAL>
<RUNNING_TRANS_AMOUNT>7.42</RUNNING_TRANS_AMOUNT>
<RUNNING_TAX_AMOUNT>0.42</RUNNING_TAX_AMOUNT>
<LINE_ITEMS>
<MERCHANDISE>
<UNIT_PRICE>5.00</UNIT_PRICE>
<DESCRIPTION>#1 Combo Meal</DESCRIPTION>
<LINE_ITEM_ID>1695155651</LINE_ITEM_ID>
<EXTENDED_PRICE>5.00</EXTENDED_PRICE>
<QUANTITY>1</QUANTITY>
</MERCHANDISE>
</LINE_ITEMS>
</TRANSACTION>
The SDK supplied by VeriFone has made some of the methods needed to communicate with the device. So the following code has method calls and class level variables that are written but not included in the following example:
/// <summary>
/// Send 1 or more items to the Verifone display. Include subtotal, tax and total
/// </summary>
/// <param name="nSubTotal">Subtotal of transaction</param>
/// <param name="nTax">Current Tax of transaction</param>
/// <param name="nTotal">Total of transaction</param>
/// <param name="UPC">Item Code</param>
/// <param name="ShortDescription">Small description</param>
/// <param name="nItemAmount">Item Amt</param>
/// <param name="nQty">Quantity</param>
/// <param name="nExtendedAmount">Quantity X Item Amt</param>
/// <returns></returns>
public bool AddLineItem(double nSubTotal, double nTax, double nTotal, Int32 nItemID, string UPC, string ShortDescription, double nItemAmount, Int32 nQty, double nExtendedAmount)
{
// get counter and calculate Mac
var nextCounter = (++counter).ToString();
var mac = PrintMacAsBase64(macKey, nextCounter);
// build request
var request = new XDocument();
using (var writer = request.CreateWriter())
{
//populate the elements
writer.WriteStartDocument();
writer.WriteStartElement("TRANSACTION");
writer.WriteElementString("FUNCTION_TYPE", "LINE_ITEM");
writer.WriteElementString("COMMAND", "ADD");
writer.WriteElementString("MAC_LABEL", macLabel);
writer.WriteElementString("COUNTER", nextCounter);
writer.WriteElementString("MAC", mac);
writer.WriteElementString("RUNNING_SUB_TOTAL",nSubTotal.ToString("c"));
writer.WriteElementString("RUNNING_TAX_AMOUNT",nTax.ToString("c"));
writer.WriteElementString("RUNNING_TRANS_AMOUNT",nTotal.ToString("c"));
//HERE IS WHERE I NEED TO WRITE THE CHILD ELEMENT(s):
//example below of what they or it would look like
//(collection of merchandise elements under Line_items)
//I know this example will have only one item because of parameters
/*
<LINE_ITEMS>
<MERCHANDISE>
<LINE_ITEM_ID>10</LINE_ITEM_ID>
<DESCRIPTION>This is a dummy</DESCRIPTION>
<UNIT_PRICE>1.00</UNIT_PRICE>
<QUANTITY>1</QUANTITY>
<EXTENDED_PRICE>1.00</EXTENDED_PRICE>
</MERCHANDISE>
</LINE_ITEMS>
*/
writer.WriteEndElement();
writer.WriteEndDocument();
}
// transmit to Point Solution and interrogate the response
var responseXml = Send(address, port, request);
//DO SOMETHING HERE WITH THE RESPONSE
// validate that the RESULT_CODE came back a SUCCESS
if ("-1" != responseXml.Element("RESPONSE").Element("RESULT_CODE").Value)
{
throw new Exception(responseXml.Element("RESPONSE").Element("RESULT_TEXT").Value ?? "unknown error");
}
return true;
}
If anyone can help me understand how to populate the child elements where I put the comments in the code, I will be very thankful.
Thanks to the moderators for requesting more information on this.
I think using XmlSerializer is a good idea; maybe the code below could help you.
XmlSerializer is more intuitive than XmlWriter for generating XML. : )
using System;
using System.Collections.Generic;
using System.Xml; // needed for XmlWriter and XmlWriterSettings
using System.Xml.Serialization;
class Program
{
static void Main(string[] args)
{
Transaction transaction = new Transaction();
transaction.Function_Type = "LINE_ITEM";
transaction.LineItems = new List<Merchandise>();
transaction.LineItems.Add(new Merchandise() { UnitPrice = "5.00" });
//Create our own namespaces for the output
XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
//Add an empty namespace and empty value
ns.Add("", "");
using (XmlWriter writer = XmlWriter.Create(Console.Out, new XmlWriterSettings { OmitXmlDeclaration = true }))
{
new XmlSerializer(typeof(Transaction)).Serialize(writer, transaction, ns);
}
Console.Read();
}
}
[XmlRoot("TRANSACTION")]
public class Transaction
{
[XmlElement("FUNCTION_TYPE")]
public string Function_Type { get; set; }
[XmlArray("LINE_ITEMS")]
[XmlArrayItem("MERCHANDISE")]
public List<Merchandise> LineItems { get; set; }
}
[XmlRoot("MERCHANDISE")]
public class Merchandise
{
[XmlElement("UNIT_PRICE")]
public string UnitPrice { get; set; }
}
Results:
You could use XmlDocument
XmlDocument xmlDoc = new XmlDocument();
XmlNode rootNode = xmlDoc.CreateElement("RootNode");
xmlDoc.AppendChild(rootNode);
foreach (Class objItem in classArray)
{
XmlNode firstNode= xmlDoc.CreateElement("First");
XmlNode second= xmlDoc.CreateElement("Second");
second.InnerText = objItem.Text;
firstNode.AppendChild(second);
rootNode.AppendChild(firstNode);
}
StringWriter stringWriter = new StringWriter();
XmlTextWriter textWriter = new XmlTextWriter(stringWriter);
xmlDoc.WriteTo(textWriter);
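To stay with the XDocument writer used in the question, nested elements can be written by pairing WriteStartElement and WriteEndElement calls. A minimal sketch, assuming only the element names from the sample XML; the LineItemDemo class, the BuildRequest method and its parameters are hypothetical, and the MAC, counter and running totals are left out:

```csharp
using System;
using System.Xml.Linq;

public static class LineItemDemo
{
    // Builds a TRANSACTION document with one MERCHANDISE child under LINE_ITEMS.
    public static XDocument BuildRequest(int itemId, string description,
        string unitPrice, int quantity, string extendedPrice)
    {
        var request = new XDocument();
        using (var writer = request.CreateWriter())
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("TRANSACTION");
            writer.WriteElementString("FUNCTION_TYPE", "LINE_ITEM");
            writer.WriteElementString("COMMAND", "ADD");

            // Each WriteStartElement opens a container; elements written
            // before the matching WriteEndElement become its children.
            writer.WriteStartElement("LINE_ITEMS");
            writer.WriteStartElement("MERCHANDISE");
            writer.WriteElementString("LINE_ITEM_ID", itemId.ToString());
            writer.WriteElementString("DESCRIPTION", description);
            writer.WriteElementString("UNIT_PRICE", unitPrice);
            writer.WriteElementString("QUANTITY", quantity.ToString());
            writer.WriteElementString("EXTENDED_PRICE", extendedPrice);
            writer.WriteEndElement(); // MERCHANDISE
            writer.WriteEndElement(); // LINE_ITEMS

            writer.WriteEndElement(); // TRANSACTION
            writer.WriteEndDocument();
        }
        return request;
    }

    public static void Main()
    {
        Console.WriteLine(BuildRequest(10, "This is a dummy", "1.00", 1, "1.00"));
    }
}
```

For several items, the MERCHANDISE block would go inside a loop between the LINE_ITEMS start and end calls.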

Troubles with HtmlAgilityPack

I can't figure out what is going wrong. I just created a project to test HtmlAgilityPack, and this is what I've got.
using System;
using System.Collections.Generic;
using System.Text;
using HtmlAgilityPack;
namespace parseHabra
{
class Program
{
static void Main(string[] args)
{
HTTP net = new HTTP(); //some http wraper
string result = net.MakeRequest("http://stackoverflow.com/", null);
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(result);
//Get all summary blocks
HtmlNodeCollection news = doc.DocumentNode.SelectNodes("//div[@class=\"summary\"]");
foreach (HtmlNode item in news)
{
string title = String.Empty;
//trouble is here for each element item i get the same value
//all the time
title = item.SelectSingleNode("//a[@class=\"question-hyperlink\"]").InnerText.Trim();
Console.WriteLine(title);
}
Console.ReadLine();
}
}
}
It looks like my XPath is evaluated against the whole document rather than against each node I've selected. Any suggestions why this is so? Thanks in advance.
I have not tried your code, but from a quick look I suspect the problem is that // searches from the root of the entire document and not from the current element, as I guess you are expecting.
Try putting a . before the //
".//a[@class=\"question-hyperlink\"]"
I'd rewrite your xpath as a single query to find all the question titles, rather than finding the summaries then the titles. Chris' answer points out the problem which could have easily been avoided.
var web = new HtmlWeb();
var doc = web.Load("http://stackoverflow.com");
var xpath = "//div[starts-with(@id,'question-summary-')]//a[@class='question-hyperlink']";
var questionTitles = doc.DocumentNode
.SelectNodes(xpath)
.Select(a => a.InnerText.Trim());
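The difference between // and .// can be reproduced without HtmlAgilityPack. A minimal sketch using System.Xml's XmlDocument in its place, on a made-up two-item document; the XPathContextDemo name and the sample markup are assumptions for illustration:

```csharp
using System;
using System.Xml;

public static class XPathContextDemo
{
    // Hypothetical markup with two summary blocks.
    public const string Markup = @"<root>
  <div class='summary'><a class='question-hyperlink'>First</a></div>
  <div class='summary'><a class='question-hyperlink'>Second</a></div>
</root>";

    // Runs the given XPath from each summary node and returns the link texts.
    public static string[] Titles(string xpathFromItem)
    {
        var doc = new XmlDocument();
        doc.LoadXml(Markup);
        XmlNodeList items = doc.SelectNodes("//div[@class='summary']");
        var titles = new string[items.Count];
        for (int i = 0; i < items.Count; i++)
            titles[i] = items[i].SelectSingleNode(xpathFromItem).InnerText.Trim();
        return titles;
    }

    public static void Main()
    {
        // "//" restarts at the document root, so every item finds the first link.
        Console.WriteLine(string.Join(", ", Titles("//a[@class='question-hyperlink']")));  // First, First
        // ".//" searches below the current node only.
        Console.WriteLine(string.Join(", ", Titles(".//a[@class='question-hyperlink']"))); // First, Second
    }
}
```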
