Loop through large XML file using XDocument

I have to copy nodes from an existing XML file to a newly created XML file.
I'm using an XDocument instance to access the existing XML file. The problem is that the XML file can be quite large (let's say 500K lines; OpenStreetMap data).
What would be the best way to loop through large XML files without causing memory errors?
I currently just use XDocument.Load(path) and loop through doc.Descendants(), but this causes the program to freeze until the loop is done. So I think I have to loop asynchronously, but I don't know the best way to achieve this.

You can use an XmlReader and an IEnumerable<XElement> iterator to yield the elements you need.
This approach isn't asynchronous, but it saves memory because you don't need to load the whole file into memory to process it, only the elements you select to copy.
public IEnumerable<XElement> ReadFile(string pathToTheFile)
{
    using (XmlReader reader = XmlReader.Create(pathToTheFile))
    {
        reader.MoveToContent();
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Name == "yourElementName")
            {
                // ReadFrom consumes the element and leaves the reader on the node after it,
                // so Read() must not be called again here or an adjacent sibling would be skipped.
                yield return (XElement)XNode.ReadFrom(reader);
            }
            else
            {
                reader.Read();
            }
        }
    }
}
You can also read files asynchronously:
public async Task<IEnumerable<XElement>> ReadFileAsync(string pathToTheFile)
{
    var elements = new List<XElement>();
    var xmlSettings = new XmlReaderSettings { Async = true };
    using (XmlReader reader = XmlReader.Create(pathToTheFile, xmlSettings))
    {
        await reader.MoveToContentAsync();
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Name == "yourElementName")
            {
                // ReadFrom is synchronous and leaves the reader on the node after the element
                elements.Add((XElement)XNode.ReadFrom(reader));
            }
            else
            {
                await reader.ReadAsync();
            }
        }
    }
    return elements;
}
Then you can read several files concurrently and await the results:
var fileTask1 = ReadFileAsync(filePath1);
var fileTask2 = ReadFileAsync(filePath2);
var fileTask3 = ReadFileAsync(filePath3);
await Task.WhenAll(new Task[] { fileTask1, fileTask2, fileTask3} );
// use results
var elementsFromFile1 = fileTask1.Result;
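Since the goal in the question is to copy nodes into a new file, the streaming iterator can be paired with an XmlWriter so that neither the source nor the target document is held in memory as a whole. The following is a minimal sketch under that assumption; the class name XmlNodeCopier, the method names, and the rootName parameter are hypothetical and not part of the answer above.

```csharp
using System.Collections.Generic;
using System.Xml;
using System.Xml.Linq;

public static class XmlNodeCopier
{
    // Streams every element called elementName out of the reader, one at a time.
    public static IEnumerable<XElement> StreamElements(XmlReader reader, string elementName)
    {
        reader.MoveToContent();
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
            {
                // ReadFrom leaves the reader on the node after the element,
                // so no extra Read() is needed (it would skip an adjacent sibling).
                yield return (XElement)XNode.ReadFrom(reader);
            }
            else
            {
                reader.Read();
            }
        }
    }

    // Writes the matching elements under a new root (rootName is an assumption
    // made for this sketch) and returns how many elements were copied.
    public static int CopyElements(XmlReader source, XmlWriter target, string elementName, string rootName)
    {
        int copied = 0;
        target.WriteStartDocument();
        target.WriteStartElement(rootName);
        foreach (XElement element in StreamElements(source, elementName))
        {
            element.WriteTo(target);
            copied++;
        }
        target.WriteEndElement();
        target.WriteEndDocument();
        target.Flush();
        return copied;
    }
}
```

Called with XmlReader.Create(sourcePath) and XmlWriter.Create(targetPath), this handles one element at a time. If the freeze happens in a UI application, wrapping the call in Task.Run is one way to keep the UI thread responsive.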

Related

twice reading xml file with linq

I am trying to read an XML file and extract some useful data from it to draw graphs. I get the desired output, but my program reads the XML file twice to extract the data, which takes extra time. Is there a way to read the file only once? Thanks.
<?xml version="1.0" encoding="UTF-8"?>
<CanConformanceTesterLog Version="4.1">
<TestProperties>
<Item name="IUT Name" value="Reference"/>
<Item name="PG Clock Period" value="1000 ns"/>
</TestProperties>
<SignalData SamplingPeriod="1000.000 ns" DataWidth="16 bit">
<Signal>
<Id>IUT_RX</Id>
<InitState>1</InitState>
<![CDATA[HQFPAVkBiwGVAZ8BqQHHAdEBAwINAjUCPwJxAnsCrQK3AsEC1QLzAv0CEQMbAzkDTQNrA3UDfwOJA7sDxQPtA/cDKQQzBEcEUQSDBI0EtQTJBN0E5wTxBAUFDwUZBS0FNwVBBUsFVQWHFZEVmxWlFa8VuRXDFc0V1xXhFesV9RX/FTEWOxZFFk8WgRaLFpUWnxapFscW0RbbFuUW7xYDFyEXPxdJF1MXGhgkGC4YTBhWGHQYfhiwGLoY2BjiGBQZHhkoGTIZUBlaGXgZghmgGaoZvhnbGeUZ9RwTHR0dTx1ZHYsdlR29Hccd+R0DHg0eFx5JHlMeZx6ZHsEe6R4lH5Qfsh+8H+4f+B8qIDQgXCBmIJggoiCsILYg6CDyIAYhOCFgIYghxCEzIlEiWyKNIpciySLTIvsiBSM3I0EjSyNVI4cjmyOlI9cj/yMTJB0kpyaxJrsmxSbPJgEnCyc9J0cnZSdvJ6EnqyfdJ+cnGSgjKC0oQShLKF8ocyiHKJsopSivKLkowyjWKOAo8jQkNS41YDVqNZw1pjXENc41ADYKNhQ2HjZGNlA2WjZkNm42eDaWNqo2tDbHNtE2uDd=]]>
</Signal>
<Signal>
<Id>IUT_TX</Id>
<InitState>1</InitState>
<![CDATA[SwVVBYcVkRWbFaUVrxW5FcMVzRXXFeEV6xX1Ff8VMRY7FkUWTxaBFosWlRafFqkWxxbRFtsW5RbvFgMXIRc/FxoYJBguGEwYVhh0GH4YsBi6GNgY4hgUGR4ZKBkyGVAZWhl4GYIZoBmqGb4Z6B4kH4ghxCETJB0kpyaxJrsmxSbPJgEnCyc9J0cnZSdvJ6EnqyfdJ+cnGSgjKC0oQShLKF8ocyiHKJsopSivKLkowyjyNCQ1LjVgNWo1nDWmNcQ1zjUANgo2FDYeNkY2UDZaNmQ2bjZ4NpY2qja0Nrg3]]>
</Signal>
</SignalData>
</CanConformanceTesterLog>
I have a function that reads the data of the "SignalData" tag. After reading this data, it calls another function and passes it the name of the XML file, the dataWidth, and the samplingPeriod.
The second function then reads the "Signal" tags and extracts the data from every "Signal". Finally, when everything is done, a function is called to draw the graphs.
private bool SignalDataInfo(string fileName)
{
var xdoc = XDocument.Load(fileName);
if (xdoc != null)
{
var signalData = xdoc.Descendants("SignalData");
foreach (var signal in signalData)
{
var width = signal.Attribute("DataWidth").Value;
string dataWidth = width.Substring(0, width.IndexOf(" "));
var period = signal.Attribute("SamplingPeriod").Value;
string samplingPeriod = period.Substring(0, period.IndexOf(" "));
SignalData(fileName,dataWidth, samplingPeriod);
}
return true;
}
else
return false;
}
public bool SignalData(string fileName, string width, string period)
{
var xdoc = XDocument.Load(fileName);
if (xdoc != null)
{
var signalData = xdoc.Descendants("Signal");
foreach (var signal in signalData)
{
// extract data from every signal
}
return true;
}
else
return false;
}

Here is a sample console app that extracts data from both SignalData and Signal.
I think you are looking for code like the following.
You can use result wherever your program needs to read the data inside the XML.
This way you don't need to write two different methods and load the XML each time one of them is called.
class Program
{
static void Main(string[] args)
{
XDocument doc = XDocument.Load(@"Your xml document path");
var result = (from o in doc.Descendants("SignalData")
from i in o.Descendants("Signal")
select new
{
dataWidth = o.Attribute("DataWidth").Value.Substring(0, o.Attribute("DataWidth").Value.IndexOf(" ")),
period = o.Attribute("SamplingPeriod").Value.Substring(0, o.Attribute("SamplingPeriod").Value.IndexOf(" ")),
Id = i.Elements("Id").Select(item => (string)item).FirstOrDefault(),
InitState = i.Elements("InitState").Select(item => (string)item).FirstOrDefault(),
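// i.Value would also include the Id and InitState text, so pick the CDATA node only
cdata = i.Nodes().OfType<XCData>().Select(c => c.Value).FirstOrDefault()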
}).ToList();
foreach (var item in result)
{
Console.WriteLine($"dataWidth: { item.dataWidth}, \t period: {item.period}, \t Id: {item.Id}, \t InitState: {item.InitState}");
}
Console.ReadLine();
}
}
Output: cdata excluded from output.

You can load the XML file just once by saving the loaded XDocument in a class variable (say private XDocument xDoc;) instead of creating a new instance in each method. It also helps to retrieve the XML data separately, in this case in the loadData() method, which you can call once to initialize the data. This also gives your code some separation of concerns. See the code below:
private XDocument xDoc;
private void loadData(string fileName)
{
xDoc = XDocument.Load(fileName);
}
private bool SignalDataInfo()
{
if (xDoc != null)
{
var signalData = xDoc.Descendants("SignalData");
foreach (var signal in signalData)
{
var width = signal.Attribute("DataWidth").Value;
string dataWidth = width.Substring(0, width.IndexOf(" "));
var period = signal.Attribute("SamplingPeriod").Value;
string samplingPeriod = period.Substring(0, period.IndexOf(" "));
SignalData(dataWidth, samplingPeriod);
}
return true;
}
else
return false;
}
public bool SignalData(string width, string period)
{
if (xDoc != null)
{
var signalData = xDoc.Descendants("Signal");
foreach (var signal in signalData)
{
// extract data from every signal
}
return true;
}
else
return false;
}
Hope this helps!

Fastest way to parse byte array?

I'm currently trying to parse an XML string to get various data points. My code below works, but it uses a lot of CPU, so I want to optimize it in any way possible.
public static List<Purchase> ParsePurchases(Profile profile, byte[] data)
{
// Parse the profile XML and extract purchases
using (var ms = new MemoryStream(data))
{
using (var reader = new StreamReader(ms, Encoding.UTF8))
{
// read the data into a string
var xmlString = reader.ReadToEnd();
// create the DOM over it
XmlDocument doc = new XmlDocument();
doc.LoadXml(xmlString);
var purchaseElements = doc.GetElementsByTagName("purchase");
List<Purchase> purchases = new List<Purchase>();
for(var e = 0; e < purchaseElements.Count; e++)
{
var ele = (XmlElement)purchaseElements[e];
purchases.Add(
new Purchase(
profile,
Int32.Parse(ele.GetAttribute("id")),
Int32.Parse(((XmlElement)ele.GetElementsByTagName("price")[0]).InnerText),
Int32.Parse(((XmlElement)ele.GetElementsByTagName("quantity")[0]).InnerText),
((XmlElement)ele.GetElementsByTagName("description")[0]).InnerText
));
}
return purchases;
}
}
}
My LoadXml call is eating up the most CPU usage, around 44%, and my ReadToEnd call is eating up another 22%. Any ideas of how to optimize this?
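One direction worth measuring, given that ReadToEnd and LoadXml dominate the profile, is to skip the intermediate string and the full DOM and let an XmlReader pull purchase elements straight from the byte array. The sketch below assumes each purchase has price, quantity and description as direct child elements; PurchaseRow is a hypothetical stand-in for the Purchase class (whose constructor isn't shown here), and the profile argument is left out. Whether this is actually faster for your data is something only a profiler run can confirm.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Hypothetical stand-in for the question's Purchase type.
public sealed class PurchaseRow
{
    public int Id;
    public int Price;
    public int Quantity;
    public string Description;
}

public static class PurchaseParser
{
    public static List<PurchaseRow> Parse(byte[] data)
    {
        var purchases = new List<PurchaseRow>();
        // The reader works on the bytes directly and detects the encoding itself,
        // so no StreamReader.ReadToEnd() and no XmlDocument are involved.
        using (var ms = new MemoryStream(data))
        using (XmlReader reader = XmlReader.Create(ms))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "purchase")
                {
                    // Materialize only this one <purchase> element.
                    var ele = (XElement)XNode.ReadFrom(reader);
                    purchases.Add(new PurchaseRow
                    {
                        Id = (int)ele.Attribute("id"),
                        Price = (int)ele.Element("price"),
                        Quantity = (int)ele.Element("quantity"),
                        Description = (string)ele.Element("description")
                    });
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return purchases;
    }
}
```

Each PurchaseRow can then be mapped to the real Purchase constructor together with the profile.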

How to create a CSV file from a XML file

I am very new to C#. In my project I need to create a CSV file that gets its data from an XML file. I can already read the XML and print some particular attributes to the logger, but I am not sure how to store the data for those attributes in a CSV file.
Here is the XML file from which I need to create the CSV file.
<?xml version="1.0" encoding="utf-8"?>
<tlp:WorkUnits xmlns:tlp="http://www.timelog.com/XML/Schema/tlp/v4_4"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.timelog.com/XML/Schema/tlp/v4_4 http://www.timelog.com/api/xsd/WorkUnitsRaw.xsd">
<tlp:WorkUnit ID="130">
<tlp:EmployeeID>3</tlp:EmployeeID>
<tlp:AllocationID>114</tlp:AllocationID>
<tlp:TaskID>239</tlp:TaskID>
<tlp:ProjectID>26</tlp:ProjectID>
<tlp:ProjectName>LIK Template</tlp:ProjectName>
<tlp:CustomerId>343</tlp:CustomerId>
<tlp:CustomerName>Lekt Corp Inc.</tlp:CustomerName>
<tlp:IsBillable>1</tlp:IsBillable>
<tlp:ApprovedStatus>0</tlp:ApprovedStatus>
<tlp:LastModifiedBy>AL</tlp:LastModifiedBy>
</tlp:WorkUnit>
</tlp:WorkUnits>
And here is my code, where I get these values in the logger. I am not sure how to create a CSV file that stores the values in order.
Edited
namespace TimeLog.ApiConsoleApp
{
/// <summary>
/// Template class for consuming the reporting API
/// </summary>
public class ConsumeReportingApi
{
private static readonly ILog Logger = LogManager.GetLogger(typeof(ConsumeReportingApi));
public static void Consume()
{
if (ServiceHandler.Instance.TryAuthenticate())
{
if (Logger.IsInfoEnabled)
{
Logger.Info("Successfully authenticated on reporting API");
}
var customersRaw = ServiceHandler.Instance.Client.GetWorkUnitsRaw(ServiceHandler.Instance.SiteCode,
ServiceHandler.Instance.ApiId,
ServiceHandler.Instance.ApiPassword,
WorkUnit.All,
Employee.All,
Allocation.All,
Task.All,
Project.All,
Department.All,
DateTime.Now.AddDays(-5).ToString(),
DateTime.Now.ToString()
);
if (customersRaw.OwnerDocument != null)
{
var namespaceManager = new XmlNamespaceManager(customersRaw.OwnerDocument.NameTable);
namespaceManager.AddNamespace("tlp", "http://www.timelog.com/XML/Schema/tlp/v4_4");
var workUnit = customersRaw.SelectNodes("tlp:WorkUnit", namespaceManager);
var output = new StringBuilder();
output.AppendLine("AllocationID,ApprovedStatus,CustomerId,CustomerName,EmployeeID");
if (workUnit != null)
{
foreach (XmlNode customer in workUnit)
{
var unit = new WorkUnit();
var childNodes = customer.SelectNodes("./*");
if (childNodes != null)
{
foreach (XmlNode childNode in childNodes)
{
if (childNode.Name == "tlp:EmployeeID")
{
unit.EmployeeID = Int32.Parse(childNode.InnerText);
}
if (childNode.Name == "tlp:EmployeeFirstName")
{
unit.EmployeeFirstName = childNode.InnerText;
}
if (childNode.Name == "tlp:EmployeeLastName")
{
unit.EmployeeLastName = childNode.InnerText;
}
if (childNode.Name == "tlp:AllocationID")
{
unit.AllocationID = Int32.Parse(childNode.InnerText);
}
if (childNode.Name == "tlp:TaskName")
{
unit.TaskName = childNode.InnerText;
}
}
}
output.AppendLine($"{unit.EmployeeID},{unit.EmployeeFirstName},{unit.EmployeeLastName},{unit.AllocationID},{unit.TaskName}");
//Console.WriteLine("---");
}
Console.WriteLine(output.ToString());
File.WriteAllText("c:\\...\\WorkUnits.csv", output.ToString());
}
}
}
else
{
if (Logger.IsWarnEnabled)
{
Logger.Warn("Failed to authenticate to reporting API");
}
}
}
}
}

You want to write the columns in the correct order to the CSV (of course), so you need to process them in the correct order. Two options:
intermediate class
Create a new class (let's call it WorkUnit) with properties for each of the columns that you want to write to the CSV. Create a new instance for every <tlp:WorkUnit> node in your XML and fill the properties when you encounter the correct subnodes. When you have processed the entire WorkUnit node, write out the properties in the correct order.
var output = new StringBuilder();
foreach (XmlNode customer in workUnit)
{
// fresh instance of the class that holds all columns (so all properties are cleared)
var unit = new WorkUnit();
var childNodes = customer.SelectNodes("./*");
if (childNodes != null)
{
foreach (XmlNode childNode in childNodes)
{
if(childNode.Name== "tlp:EmployeeID")
{
// employeeID node found, now write to the corresponding property:
unit.EmployeeId = childNode.InnerText;
}
// etc for the other XML nodes you are interested in
}
}
// all nodes have been processed for this one WorkUnit node
// so write a line to the CSV
output.AppendLine($"{unit.EmployeeId},{unit.AllocationId}, etc");
}
read in correct order
Instead of using foreach to loop through all subnodes in whatever order they appear, search for specific subnodes in the order you want. Then you can write out the CSV in the same order. Note that even when you don't find some subnode, you still need to write out the separator.
var output = new StringBuilder();
foreach (XmlNode customer in workUnit)
{
// search for value for first column (EmployeeID)
var node = customer.SelectSingleNode("tlp:EmployeeID", namespaceManager);
if (node != null)
{
output.Append(node.InnerText).Append(',');
}
else
{
output.Append(','); // no content, but we still need a separator
}
// etc for the other columns, then end the row with output.AppendLine()
}
And of course watch out for string values that contain the separator.
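One common way to handle values that contain the separator is to quote such fields and double any embedded quotes, as described in RFC 4180. The helper below is a minimal sketch of that idea; the Csv class and its Escape and Line methods are hypothetical names, not part of any library used above.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class Csv
{
    // Quotes a field when it contains a comma, a double quote, or a line break,
    // and doubles any embedded double quotes.
    public static string Escape(string field)
    {
        if (field == null)
        {
            return string.Empty;
        }
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    // Joins already-ordered column values into one CSV line (without the line break).
    public static string Line(IEnumerable<string> fields)
    {
        return string.Join(",", fields.Select(f => Escape(f)));
    }
}
```

With this in place, a row could be written as output.AppendLine(Csv.Line(new[] { employeeId, customerName })), where the two variables are whatever column values were collected for the current WorkUnit.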

Assuming that you have put your XML data into a List:
StringBuilder str = new StringBuilder();
foreach (var f in list.ToList())
{
str.Append(f.listfield.ToString() + ",");
}
To create a new line (this replaces the trailing comma):
str.Replace(",", Environment.NewLine, str.Length - 1, 1);
To save:
string filename = "DirectoryPath/filename.csv";
File.WriteAllText(filename, str.ToString());

Try this:
var output = new StringBuilder();
output.AppendLine("AllocationID,ApprovedStatus,CustomerId,CustomerName,EmployeeID");
if (workUnit != null)
{
foreach (XmlNode customer in workUnit)
{
var unit = new WorkUnit();
var childNodes = customer.SelectNodes("./*");
if (childNodes != null)
{
for (int i = 0; i<childNodes.Count; ++i)
{
XmlNode childNode = childNodes[i];
if (childNode.Name == "tlp:EmployeeID")
{
unit.EmployeeID = Int32.Parse(childNode.InnerText);
}
if (childNode.Name == "tlp:EmployeeFirstName")
{
unit.EmployeeFirstName = childNode.InnerText;
}
if (childNode.Name == "tlp:EmployeeLastName")
{
unit.EmployeeLastName = childNode.InnerText;
}
if (childNode.Name == "tlp:AllocationID")
{
unit.AllocationID = Int32.Parse(childNode.InnerText);
}
if (childNode.Name == "tlp:TaskName")
{
unit.TaskName = childNode.InnerText;
}
output.Append(childNode.InnerText);
if (i<childNodes.Count - 1)
output.Append(",");
}
output.Append(Environment.NewLine);
}
}
Console.WriteLine(output.ToString());
File.WriteAllText("c:\\Users\\mnowshin\\projects\\WorkUnits.csv", output.ToString());
}

You can use this sequence:
a. Deserialize (i.e. convert from XML to C# objects) your XML.
b. Write a simple loop to write the data to a file.
The advantages of this sequence:
You get a readable list of your data as objects, to which you can add any other access code.
If your XML schema changes at any time, the code is easy to maintain.
The solution
a. Deserialize:
Copy your XML file contents. Note: modify the XML input before copying it. Duplicate the WorkUnit node so that Visual Studio infers a list of WorkUnit nodes nested inside the WorkUnits node.
From the Visual Studio menus, select Edit -> Paste Special -> Paste XML as Classes.
Use the deserialization code:
var workUnitsNode = customersRaw.SelectSingleNode("tlp:WorkUnits", namespaceManager);
XmlSerializer ser = new XmlSerializer(typeof(WorkUnits));
// Deserialize has no XmlNode overload, so wrap the node in an XmlNodeReader
WorkUnits workUnits = (WorkUnits)ser.Deserialize(new XmlNodeReader(workUnitsNode));
b. Write the csv file
StringBuilder csvContent = new StringBuilder();
// add the header line
csvContent.AppendLine("AllocationID,ApprovedStatus,CustomerId,CustomerName,EmployeeID");
foreach (var unit in workUnits.WorkUnit)
{
csvContent.AppendFormat(
"{0}, {1}, {2}, {3}, {4}",
new object[]
{
unit.AllocationID,
unit.ApprovedStatus,
unit.CustomerId,
unit.CustomerName,
unit.EmployeeID
// you get the idea
});
csvContent.AppendLine();
}
File.WriteAllText(@"G:\Projects\StackOverFlow\WpfApp1\WorkUnits.csv", csvContent.ToString());

You can use Cinchoo ETL - if you have room to use open source library
using (var csvWriter = new ChoCSVWriter("sample1.csv").WithFirstLineHeader())
{
using (var xmlReader = new ChoXmlReader("sample1.xml"))
csvWriter.Write(xmlReader);
}
Output:
ID,tlp_EmployeeID,tlp_AllocationID,tlp_TaskID,tlp_ProjectID,tlp_ProjectName,tlp_CustomerId,tlp_CustomerName,tlp_IsBillable,tlp_ApprovedStatus,tlp_LastModifiedBy
130,3,114,239,26,LIK Template,343,Lekt Corp Inc.,1,0,AL
Disclaimer: I'm the author of this library.

Read element within elements using XmlTextReader

I am reading XML data and retrieving values based on the element name. There is one element named <UniqueColumns> which can have child elements called <string>. I want to read those values and add them to an ObservableCollection<String>. If there are no values, then nothing should be done. There are three scenarios, as follows:
Scenario - 1: More than 1 child elements.
<IndexId>4</IndexId>
<UniqueColumns>
<string>Dir_nbr</string>
<string>Dir_name</string>
</UniqueColumns>
<SelectedTableForUniqColumn>TBDirectory</SelectedTableForUniqColumn>
Scenario - 2: Only one child element.
<IndexId>4</IndexId>
<UniqueColumns>
<string>Dir_nbr</string>
</UniqueColumns>
<SelectedTableForUniqColumn>TBDirectory</SelectedTableForUniqColumn>
Scenario - 3: No child element.
<IndexId>4</IndexId>
<UniqueColumns/>
<SelectedTableForUniqColumn>TBDirectory</SelectedTableForUniqColumn>
Code:
//This is a user defined data object and it has parameter which is type of `ObservableCollection<String>`.
ExternalDatabaseTableRequestDO req = new ExternalDatabaseTableRequestDO();
using (XmlTextReader reader = new XmlTextReader(new StringReader(xmlData)))
{
while (reader.Read())
{
int result;
long res;
string parameterValue;
ObservableCollection<String> parameterValueList = new ObservableCollection<String>();
switch (reader.Name.ToLower())
{
case "indexid":
parameterValue = reader.ReadString();
if (!String.IsNullOrWhiteSpace(parameterValue) && Int32.TryParse(parameterValue, out result))
req.IndexId = result;
break;
case "uniquecolumns":
//need loop logic here but not sure how to do that.
if (reader.NodeType == XmlNodeType.Element) // This will give me parent element which is <UniqueColumns>
{
//Stuck here. How to read child elements if exists.
}
break;
case "selectedtableforuniqcolumn":
parameterValue = reader.ReadString();
req.SelectedTableForUniqColumn = parameterValue;
break;
}
}
}
return req;

How about using LINQ to XML?
// XPathSelectElements needs: using System.Xml.XPath;
var xDoc = XDocument.Load(filename);
//var xDoc = XDocument.Parse(xmlstring);
var strings = xDoc.XPathSelectElements("//UniqueColumns/string")
.Select(x => x.Value)
.ToList();

//This will give me all child element values if they exists
var columnList = XElement.Parse(xmlData).Descendants("string").ToList();
if (columnList != null)
{
foreach (var column in columnList)
parameterValueList.Add(column.Value);
}
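If you would rather stay with a streaming reader, as in the question, ReadSubtree can confine the inner loop to the children of UniqueColumns. This is a minimal sketch, assuming the fragment shown in the question is wrapped in a single root element; the UniqueColumnsReader class and its method name are hypothetical.

```csharp
using System.Collections.ObjectModel;
using System.IO;
using System.Xml;

public static class UniqueColumnsReader
{
    // Returns the text of every <string> child of <UniqueColumns>;
    // the collection stays empty for <UniqueColumns/>.
    public static ObservableCollection<string> ReadUniqueColumns(string xmlData)
    {
        var columns = new ObservableCollection<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xmlData)))
        {
            while (reader.Read())
            {
                bool isUniqueColumns = reader.NodeType == XmlNodeType.Element
                    && reader.Name == "UniqueColumns";
                if (!isUniqueColumns || reader.IsEmptyElement)
                {
                    continue;
                }
                // The subtree reader only sees <UniqueColumns> and its content.
                using (XmlReader subtree = reader.ReadSubtree())
                {
                    subtree.Read(); // position on <UniqueColumns>
                    while (!subtree.EOF)
                    {
                        if (subtree.NodeType == XmlNodeType.Element && subtree.Name == "string")
                        {
                            // Moves past </string>, so no extra Read() here.
                            columns.Add(subtree.ReadElementContentAsString());
                        }
                        else
                        {
                            subtree.Read();
                        }
                    }
                }
            }
        }
        return columns;
    }
}
```

The three scenarios from the question then yield two, one and zero entries respectively, and the result can be assigned to the request object's collection property.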

Retrieving Data From XML File

I seem to be having a problem retrieving XML values with C#, which I know is due to my very limited knowledge of C# and XML.
I was given the following XML file
<PowerBuilderRunTimes>
<PowerBuilderRunTime>
<Version>12</Version>
<Files>
<File>EasySoap110.dll</File>
<File>exPat110.dll</File>
<File>pbacc110.dll</File>
</Files>
</PowerBuilderRunTime>
</PowerBuilderRunTimes>
I am to process the XML file and make sure that each of the files listed in it exists in the folder (that's the easy part). It's the processing of the XML file that I am having a hard time with. Here is what I have done thus far:
var runtimeXml = File.ReadAllText(string.Format("{0}\\{1}", configPath, Resource.PBRuntimes));
var doc = XDocument.Parse(runtimeXml);
var topElement = doc.Element("PowerBuilderRunTimes");
var elements = topElement.Elements("PowerBuilderRunTime");
foreach (XElement section in elements)
{
//pbVersion is grabbed earlier. It is the version of PowerBuilder
if( section.Element("Version").Value.Equals(string.Format("{0}", pbVersion ) ) )
{
var files = section.Elements("Files");
var fileList = new List<string>();
foreach (XElement area in files)
{
fileList.Add(area.Element("File").Value);
}
}
}
My issue is that the String List is only ever populated with one value, "EasySoap110.dll", and everything else is ignored. Can someone please help me, as I am at a loss.

Look at this bit:
var files = section.Elements("Files");
var fileList = new List<string>();
foreach (XElement area in files)
{
fileList.Add(area.Element("File").Value);
}
You're iterating over each Files element, and then finding the first File element within it. There's only one Files element - you need to be iterating over the File elements within that.
However, there are definitely better ways of doing this. For example:
var doc = XDocument.Load(Path.Combine(configPath, Resource.PBRuntimes));
var fileList = (from runtime in doc.Root.Elements("PowerBuilderRunTime")
where (int) runtime.Element("Version") == pbVersion
from file in runtime.Element("Files").Elements("File")
select file.Value)
.ToList();
Note that if there are multiple matching PowerBuilderRunTime elements, that will create a list with all the files of all those elements. That may not be what you want. For example, you might want:
var doc = XDocument.Load(Path.Combine(configPath, Resource.PBRuntimes));
var runtime = doc.Root
.Elements("PowerBuilderRunTime")
.Where(r => (int) r.Element("Version") == pbVersion)
.Single();
var fileList = runtime.Element("Files")
.Elements("File")
.Select(x => x.Value)
.ToList();
That will validate that there's exactly one matching runtime.

The problem is that there's only one Files element in your XML, with multiple File children. Your foreach loop only executes once, for the single Files element, not for its children.
Do something like this:
var fileSet = files.Elements("File");
foreach (var file in fileSet) {
fileList.Add(file.Value);
}
which loops over all children elements.

I have always preferred using readers for reading homegrown XML config files. If you're only doing this once it's probably overkill, but readers are faster and cheaper.
public static class PowerBuilderConfigParser
{
public static IList<PowerBuilderConfig> ReadConfigFile(String path)
{
IList<PowerBuilderConfig> configs = new List<PowerBuilderConfig>();
using (FileStream stream = new FileStream(path, FileMode.Open))
{
XmlReader reader = XmlReader.Create(stream);
reader.ReadToDescendant("PowerBuilderRunTime");
do
{
PowerBuilderConfig config = new PowerBuilderConfig();
ReadVersionNumber(config, reader);
ReadFiles(config, reader);
configs.Add(config);
reader.ReadToNextSibling("PowerBuilderRunTime");
} while (reader.ReadToNextSibling("PowerBuilderRunTime"));
}
return configs;
}
private static void ReadVersionNumber(PowerBuilderConfig config, XmlReader reader)
{
reader.ReadToDescendant("Version");
string version = reader.ReadString();
Int32 versionNumber;
if (Int32.TryParse(version, out versionNumber))
{
config.Version = versionNumber;
}
}
private static void ReadFiles(PowerBuilderConfig config, XmlReader reader)
{
reader.ReadToNextSibling("Files");
reader.ReadToDescendant("File");
do
{
string file = reader.ReadString();
if (!string.IsNullOrEmpty(file))
{
config.AddConfigFile(file);
}
} while (reader.ReadToNextSibling("File"));
}
}
public class PowerBuilderConfig
{
private Int32 _version;
private readonly IList<String> _files;
public PowerBuilderConfig()
{
_files = new List<string>();
}
public Int32 Version
{
get { return _version; }
set { _version = value; }
}
public ReadOnlyCollection<String> Files
{
get { return new ReadOnlyCollection<String>(_files); }
}
public void AddConfigFile(String fileName)
{
_files.Add(fileName);
}
}

Another way is to use a XmlSerializer.
[Serializable]
[XmlRoot]
public class PowerBuilderRunTime
{
[XmlElement]
public string Version {get;set;}
[XmlArrayItem("File")]
public string[] Files {get;set;}
public static PowerBuilderRunTime[] Load(string fileName)
{
PowerBuilderRunTime[] runtimes;
using (var fs = new FileStream(fileName, FileMode.Open, FileAccess.Read))
{
var reader = new XmlTextReader(fs);
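// the root override is needed because the default root name for an array type is ArrayOfPowerBuilderRunTime
runtimes = (PowerBuilderRunTime[])new XmlSerializer(typeof(PowerBuilderRunTime[]), new XmlRootAttribute("PowerBuilderRunTimes")).Deserialize(reader);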
}
return runtimes;
}
}
You can get all the runtimes strongly typed, and use each PowerBuilderRunTime's Files property to loop through all the string file names.
var runtimes = PowerBuilderRunTime.Load(string.Format("{0}\\{1}", configPath, Resource.PBRuntimes));

You should try replacing this stuff with a simple XPath query.
// configPath must hold the path of the XML file
System.Xml.XPath.XPathDocument xpd = new System.Xml.XPath.XPathDocument(configPath);
System.Xml.XPath.XPathNavigator xpn = xpd.CreateNavigator();
System.Xml.XPath.XPathExpression exp = xpn.Compile(@"/PowerBuilderRunTimes/PowerBuilderRunTime/Files//File");
System.Xml.XPath.XPathNodeIterator iterator = xpn.Select(exp);
while (iterator.MoveNext())
{
System.Xml.XPath.XPathNavigator nav2 = iterator.Current.Clone();
// access the value with nav2.Value
}
