Issue with processing an XML file in C#

I have a 5 GB XML file that needs to be processed, so I am using XmlReader, but I am having a hard time with it.
The file contains sections like the one below, and I want to read the values of levelid, levelUl, primaryCode and primaryPower from the sections under <ab:pin id="1022">, <ab:pin id="1023">, <ab:pin id="1024">, etc. The problem is that other sections contain elements with the same names (levelid, levelUl, primaryCode, primaryPower, etc.) but different values, so I am getting incorrect values.
How do I correct my code? The following is part of the 5 GB XML file:
<ab:pin id="1022">
<ab:attributes>
<ab:levelid>1022</ab:levelid>
<ab:levelUl>9837</ab:levelUl>
<ab:primaryCode>25</ab:primaryCode>
<ab:primaryPower>330</ab:primaryPower>
.
.
.
.
<ab:pin id="1023">
<ab:attributes>
<ab:levelid>1023</ab:levelid>
<ab:levelUl>9833</ab:levelUl>
<ab:primaryCode>35</ab:primaryCode>
<ab:primaryPower>340</ab:primaryPower>
This is the code I have so far:
XmlReader reader = XmlReader.Create(path);
reader.MoveToContent();
string nsUn = reader.LookupNamespace("ab");
while (!reader.EOF)
{
reader.ReadToFollowing("levelid", nsUn);
if (!reader.EOF)
{
XElement cell = (XElement)XElement.ReadFrom(reader);
level_id = cell.Value;
ins3gericson.Add(new TestField("level_id", level_id, 2));
}
reader.ReadToFollowing("levelUl", nsUn);
if (!reader.EOF)
{
XElement cell = (XElement)XElement.ReadFrom(reader);
ins3gericson.Add(new TestField("levelUl", cell.Value, 2));
}
reader.ReadToFollowing("primaryCode", nsUn);
if (!reader.EOF)
{
XElement cell = (XElement)XElement.ReadFrom(reader);
ins3gericson.Add(new TestField("primaryCode", cell.Value, 2));
}
reader.ReadToFollowing("primaryPower", nsUn);
if (!reader.EOF)
{
XElement cell = (XElement)XElement.ReadFrom(reader);
ins3gericson.Add(new TestField("primaryPower", cell.Value, 2));
}

Here is my suggestion:
using (XmlReader xr = XmlReader.Create("input.xml"))
{
xr.MoveToContent();
XNamespace ab = xr.LookupNamespace("ab");
while (xr.Read())
{
if (xr.NodeType == XmlNodeType.Element && xr.NamespaceURI == ab && xr.LocalName == "pin")
{
XElement pin = (XElement)XNode.ReadFrom(xr);
var data = from atts in pin.Elements(ab + "attributes") select new {
levelid = (string)atts.Element(ab + "levelid"),
levelUl = (string)atts.Element(ab + "levelUl"),
primaryCode = (string)atts.Element(ab + "primaryCode"),
primaryPower = (string)atts.Element(ab + "primaryPower")
};
Console.WriteLine("levelId: {0}; levelUl: {1}, ...", data.First().levelid, data.First().levelUl); // store/output values here
}
}
}
Obviously it all depends on how large the ab:pin elements are, but with huge XML input the individual elements normally fit comfortably into memory. Be careful with XmlReader, though: XNode.ReadFrom already leaves the reader positioned on the node after the element it consumed, so if ab:pin elements are adjacent without any whitespace between them, the xr.Read() call in the loop above steps past the start of the next one and that element is skipped. The code then needs some fine-tuning, along these lines:
using (XmlReader xr = XmlReader.Create("../../XMLFile1.xml"))
{
xr.MoveToContent();
XNamespace ab = xr.LookupNamespace("ab");
while (xr.Read())
{
while (xr.NodeType == XmlNodeType.Element && xr.NamespaceURI == ab && xr.LocalName == "pin")
{
XElement pin = (XElement)XNode.ReadFrom(xr);
var data = from atts in pin.Elements(ab + "attributes") select new {
levelid = (string)atts.Element(ab + "levelid"),
levelUl = (string)atts.Element(ab + "levelUl"),
primaryCode = (string)atts.Element(ab + "primaryCode"),
primaryPower = (string)atts.Element(ab + "primaryPower")
};
Console.WriteLine("levelId: {0}; levelUl: {1}, ...", data.First().levelid, data.First().levelUl);
}
}
}
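The skipping issue described above can also be handled by moving the loop into a small iterator that calls Read() only when it has not just consumed an element. What follows is a minimal sketch rather than the answerer's code: the class name XmlStreaming, the method StreamElements and the urn:example namespace URI are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

static class XmlStreaming
{
    // Hypothetical helper (not from the original answer): yields every element
    // with the given namespace and local name, one at a time.
    public static IEnumerable<XElement> StreamElements(XmlReader reader, XNamespace ns, string localName)
    {
        reader.MoveToContent();
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element
                && reader.LocalName == localName
                && reader.NamespaceURI == ns.NamespaceName)
            {
                // ReadFrom consumes the whole element and leaves the reader on
                // the node that follows it, so Read() must not be called here.
                yield return (XElement)XNode.ReadFrom(reader);
            }
            else
            {
                reader.Read();
            }
        }
    }
}

class Program
{
    static void Main()
    {
        // Two adjacent ab:pin elements with no whitespace between them.
        // "urn:example" is a placeholder namespace URI.
        string xml =
            "<root xmlns:ab=\"urn:example\">" +
            "<ab:pin id=\"1022\"><ab:attributes><ab:levelid>1022</ab:levelid><ab:levelUl>9837</ab:levelUl></ab:attributes></ab:pin>" +
            "<ab:pin id=\"1023\"><ab:attributes><ab:levelid>1023</ab:levelid><ab:levelUl>9833</ab:levelUl></ab:attributes></ab:pin>" +
            "</root>";
        XNamespace ab = "urn:example";

        using (XmlReader xr = XmlReader.Create(new StringReader(xml)))
        {
            foreach (XElement pin in XmlStreaming.StreamElements(xr, ab, "pin"))
            {
                XElement atts = pin.Element(ab + "attributes");
                if (atts == null) continue; // skip pins without an attributes child
                Console.WriteLine("levelid: {0}; levelUl: {1}",
                    (string)atts.Element(ab + "levelid"),
                    (string)atts.Element(ab + "levelUl"));
            }
        }
    }
}
```

Because only one ab:pin subtree is materialized at a time, memory use stays bounded by the largest single element rather than by the size of the file.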

Related

How to read really big KML File

So I have code that successfully reads a KML file using XDocument. However, the KML file is now significantly larger, and I can't run the program without getting a System.OutOfMemoryException. Can anyone help me convert this code to a different reader so that I no longer get this exception?
Here's what worked when the KML file was smaller:
Dictionary<string, List<string>> CountyCoordinates = new Dictionary<string, List<string>>();
List<string> locationList = new List<string>();
var doc = XDocument.Load("gadm36_USA.kml");
XNamespace ns = "http://www.opengis.net/kml/2.2";
var result = doc.Root.Descendants(ns + "Placemark");
List<XElement> extendedDatas = doc.Descendants(ns + "ExtendedData").ToList();
foreach (XElement extendedData in extendedDatas)
{
List<XElement> simpleFields = extendedData.Elements(ns + "SimpleField").ToList();
}
foreach (XElement xmlInfo in result)
{
var region = xmlInfo.Element(ns + "ExtendedData").Element(ns + "SchemaData").Value;
List<XElement> simpleFields = xmlInfo.Element(ns + "ExtendedData").Element(ns + "SchemaData").Elements(ns + "SimpleField").ToList();
//var country = region.Element(ns + "SimpleData").Value;
//var state = region.Element(ns + "SimpleData");
//var cityCounty = region.Element(ns + "SimpleData");
//<Polygon><outerBoundaryIs><LinearRing><coordinates>
locationList = xmlInfo.Element(ns + "MultiGeometry").Element(ns + "Polygon").Element(ns + "outerBoundaryIs").Element(ns + "LinearRing").Element(ns + "coordinates").Value.Split(' ').ToList();
region = region.ToLower();
region = Regex.Replace(region, @"[^\w]", string.Empty);
CountyCoordinates.Add(region, locationList);
}
return CountyCoordinates;
Here is an example of the KML file:
<Placemark>
<Style><LineStyle><color>ff0000ff</color></LineStyle><PolyStyle><fill>0</fill></PolyStyle></Style>
<ExtendedData><SchemaData schemaUrl="#gadm36_USA_2">
<SimpleData name="NAME_0">United States</SimpleData>
<SimpleData name="NAME_1">Arizona</SimpleData>
<SimpleData name="NAME_2">Pima</SimpleData>
</SchemaData></ExtendedData>
<MultiGeometry><Polygon><outerBoundaryIs><LinearRing><coordinates>-112.539321899414,31.7949981689454 -112.604850769043,31.8155326843262 -112.626457214355,31.8217906951905 -112.635643005371,31.8249397277833 -112.753486633301,31.8611507415771 -112.9423828125,31.9185504913331 -113.081657409668,31.9619903564454 -113.08381652832,31.9624309539794 -113.33349609375,32.0400009155274 -113.333587646484,32.0450096130371 -113.333930969238,32.0983505249024 -113.333992004394,32.1020011901855 -113.33381652832,32.3513298034668 -113.333892822266,32.4205894470215 -113.333686828613,32.5053520202638 -113.14338684082,32.5051498413086 -113.133605957031,32.5052490234376 -112.963966369629,32.504539489746 -112.929733276367,32.5048217773438 -112.872100830078,32.5048294067383 -112.572570800781,32.5058212280275 -112.202919006348,32.507080078125 -112.099632263184,32.5076789855958 -111.793571472168,32.5066299438478 -111.789756774902,32.5066184997559 -111.756599426269,32.506549835205 -111.73974609375,32.5069694519043 -111.721809387207,32.5064697265625 -111.672340393066,32.5063400268555 -111.65495300293,32.5067405700683 -111.587532043457,32.5069808959962 -111.567413330078,32.5069007873536 -111.567436218262,32.5018920898438 -111.471229553223,32.5019416809083 -111.464157104492,32.5018997192383 -111.446762084961,32.5022697448731 -111.262466430664,32.5030517578125 -111.222785949707,32.5027809143066 -111.20539855957,32.5026588439942 -111.154846191406,32.5027503967285 -111.154747009277,32.5114097595215 -111.098213195801,32.5118904113769 -111.063407897949,32.512062072754 -110.950866699219,32.5124397277833 -110.854629516602,32.5119590759278 -110.840507507324,32.5113601684571 -110.840469360352,32.513641357422 -110.756187438965,32.5145797729493 -110.704528808594,32.5144500732423 -110.694198608398,32.5143318176271 -110.684951782227,32.5146789550781 -110.548492431641,32.5139198303223 -110.448432922363,32.5144195556642 -110.448112487793,32.474781036377 -110.447952270508,32.4273910522462 
-110.447196960449,32.270149230957 -110.446952819824,32.2551002502443 -110.443702697754,32.2550506591798 -110.44425201416,32.1711921691896 -110.444358825684,32.1657218933105 -110.446006774902,32.0804901123048 -110.448188781738,32.0800704956055 -110.448440551758,32.0668487548828 -110.448196411133,32.05179977417 -110.447708129883,31.9929695129395 -110.446952819824,31.9209003448487 -110.446708679199,31.9053897857667 -110.448127746582,31.7758693695069 -110.448387145996,31.7621917724611 -110.448165893555,31.7462196350099 -110.448455810547,31.7307109832764 -110.534072875977,31.7309474945069 -110.616989135742,31.7306327819824 -110.66438293457,31.7303009033204 -110.669212341309,31.7308082580568 -110.68376159668,31.7305297851564 -110.690216064453,31.7306003570557 -110.704216003418,31.7307624816895 -110.794136047363,31.7308502197266 -110.852279663086,31.7310104370118 -110.851821899414,31.7255306243896 -110.871200561523,31.7257328033448 -110.890579223633,31.7254600524903 -110.955726623535,31.7247200012206 -111.003646850586,31.7247009277344 -111.161399841309,31.7241706848145 -111.161209106445,31.6388511657716 -111.161590576172,31.5507926940918 -111.159545898438,31.5402812957765 -111.16081237793,31.522029876709 -111.263381958008,31.5218391418457 -111.298286437988,31.5216102600098 -111.365417480469,31.5211029052736 -111.364036560059,31.4234199523926 -111.406181335449,31.4369182586671 -111.440193176269,31.4478206634521 -111.448059082031,31.4503402709962 -111.534820556641,31.4781227111816 -111.618415832519,31.5049114227295 -111.687202453613,31.5267524719238 -111.700218200684,31.5308895111085 -111.769149780273,31.5527782440186 -111.952102661133,31.6104793548585 -112.122657775879,31.664270401001 -112.223472595215,31.6960582733155 -112.539321899414,31.7949981689454</coordinates></LinearRing></outerBoundaryIs></Polygon></MultiGeometry>
</Placemark>
Edit: this is what I've done. It successfully does what I want, but it is very computationally expensive and often ends up using 100% of my CPU and crashing the program:
public Dictionary<string, List<string>> LoadZipBoundaries()
{
Dictionary<string, List<string>> CountyCoordinates = new Dictionary<string, List<string>>();
List<string> locationList = new List<string>();
string zip = null;
using (XmlReader reader = XmlReader.Create("utah_zcta.kml"))
{
reader.MoveToContent();
while (reader.Read())
{
if (reader.NodeType == XmlNodeType.Element)
{
if (reader.Name == "SchemaData")
{
XElement el = XNode.ReadFrom(reader) as XElement;
zip = el.Value.Replace("\n\t\t", ",").Split(',')[1];
}
if (reader.Name == "MultiGeometry")
{
XElement coord = XNode.ReadFrom(reader) as XElement;
locationList = coord.Value.Split(' ').ToList();
CountyCoordinates.Add(zip, locationList);
}
}
}
}
return CountyCoordinates;
}

How to retrieve all Elements from XML file using c#

I am trying to retrieve all elements from an XML file, but I can only reach one. Is there any way I can retrieve all of them?
HttpWebResponse objResponse = (HttpWebResponse)objRequest.GetResponse();
using (XmlReader reader = XmlReader.Create(new StreamReader(objResponse.GetResponseStream())))
{
while (reader.Read())
{
#region Get Credit Score
//if (reader.ReadToDescendant("results"))
if (reader.ReadToDescendant("ssnMatchIndicator"))
{
string ssnMatchIndicator = reader.Value;
}
if (reader.ReadToDescendant("fileHitIndicator"))
{
reader.Read();//this moves reader to next node which is text
result = reader.Value; //this might give value than
Res.Response = true;
Res.SocialSecurityScore = result.ToString();
//break;
}
else
{
Res.Response = false;
Res.SocialSecurityScore = "Your credit score might not be available. Please contact support";
}
#endregion
#region Get fileHitIndicator
if (reader.ReadToDescendant("fileHitIndicator"))
{
reader.Read();
Res.fileHitIndicator = reader.Value;
//break;
}
#endregion
}
}
Can somebody help me out with this issue?
I am also using objResponse.GetResponseStream() because the XML comes from a response from server.
Thanks a lot in advance.
Try this:
// XmlDocument is sufficient here; XmlDataDocument is only needed for DataSet integration
XmlDocument xmldoc = new XmlDocument();
// Dispose the stream once the document has been loaded
using (FileStream fs = new FileStream("product.xml", FileMode.Open, FileAccess.Read))
{
xmldoc.Load(fs);
}
XmlNodeList xmlnode = xmldoc.GetElementsByTagName("Product");
for (int i = 0; i < xmlnode.Count; i++)
{
// Join the text of the first three child nodes of each Product element
string str = xmlnode[i].ChildNodes.Item(0).InnerText.Trim() + " " + xmlnode[i].ChildNodes.Item(1).InnerText.Trim() + " " + xmlnode[i].ChildNodes.Item(2).InnerText.Trim();
MessageBox.Show(str);
}
I don't know why what you're doing is not working, but I wouldn't use that method. I've found the following to work well. Even if you're getting the XML from a stream, just read it into a string and load that:
StreamReader reader = new StreamReader(sourcepath);
string xml = reader.ReadToEnd();
reader.Close();
XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);
XmlNodeList list = doc.GetElementsByTagName("*");
foreach (XmlNode nd in list)
{
switch (nd.Name)
{
case "ContactID":
var ContactIdent = nd.InnerText;
break;
case "ContactName":
var ContactName = nd.InnerText;
break;
}
}
To capture what is between the XML tags, if there are no child tags, use the InnerText property (XmlNode.InnerText). To capture what is between the quotes in a node's attributes, use XmlAttribute.Value.
As for iterating through the attributes: if one of your nodes has attributes, such as "Name", "SpectralType" and "Orbit" in the XML here:
<System>
<Star Name="Epsilon Eridani" SpectralType="K2v">
<Planets>
<Planet Orbit="1">Bill</Planet>
<Planet Orbit="2">Moira</Planet>
</Planets>
</Star>
</System>
Detect them using the Attributes property, and iterate through them as shown:
if (nd.Attributes.Count > 0)
{
XmlAttributeCollection coll = nd.Attributes;
foreach (XmlAttribute cn in coll)
{
switch (cn.Name)
{
case "Name":
thisStar.Name = cn.Value;
break;
case "SpectralType":
thisStar.SpectralClass = cn.Value;
break;
}
}
}

Creating a dynamic xml document

Hi, is it possible to create a dynamic XML document with XDocument? I've been trying, but it throws an exception about the document having an incorrect structure.
My code is the following:
public string ReadTest(Stream csvFile)
{
XDocument responseXml = new XDocument( new XDeclaration("1.0", "utf-8", "yes"));
try
{
if (csvFile != null && csvFile.Length != 0) // && so that Length is never read on a null stream
{
responseXml.Add(new XElement("root"));
//using(CsvFileReader reader=new CsvFileReader(File.OpenRead(@"C:\Users\toshibapc\Documents\Visual Studio 2013\Projects\WCFLecturaCSV\WCFLecturaCSV\App_Data\archivo.csv"))){
using (CsvFileReader reader = new CsvFileReader(csvFile))
{
CsvRow row = new CsvRow();
List<String> headers = new List<string>();
while (reader.ReadRow(row))
{
int cont = 0;
XElement dato = new XElement("AccountInfos", new XElement("Info"));
XElement datos=null;
foreach (String s in row)
{
if(s.Equals("AccountIDToMove", StringComparison.OrdinalIgnoreCase)|| s.Contains("AccountNameToMove") || s.Contains("NewParentAccountID") || s.Contains("NewParentAccountName")){
headers.Add(s);
}
else{
if (s != String.Empty)
{
datos = new XElement(headers[cont], s); //.Append("<" + headers[cont] + ">" + s + "<" + headers[cont] + "/>");
dato.Add(datos);
}
}
cont++;
}
if (headers.Count == 4 && datos != null)
responseXml.Add(dato);
} // fin de while
}
} // Check if no file i sent or not info on file
}
catch (Exception ex) {
//oError = ex.Message;
}
return responseXml.ToString();
}
What I would like to accomplish with this code is to get XML like this:
<?xml version="1.0"?>
<root>
<AccountInfos>
<Info>
<AccountIDToMove>312456</AccountIDToMove>
<AccountNameToMove>Burger Count</AccountNameToMove>
<NewParentAccountID>453124</NewParentAccountID>
<NewParentAccountName> Testcom sales 1</NewParentAccountName>
</Info>
<Info>
<AccountIDToMove>874145</AccountIDToMove>
<AccountNameToMove>Mac Count</AccountNameToMove>
<NewParentAccountID>984145</NewParentAccountID>
<NewParentAccountName> Testcom sales 1</NewParentAccountName>
</Info>
</AccountInfos>
</root>
Thank you in advance for any answer or help.
You are adding multiple roots to your document. You initially add one here:
responseXml.Add(new XElement("root"));
And later add more root elements in a loop here:
responseXml.Add(dato);
However, each XML document must have exactly one single root element. Thus you probably want to do:
responseXml.Root.Add(dato);
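The single-root rule can be sketched as follows: create the root once, then add every later element to the root or to one of its descendants. This is an illustration under assumptions, not the asker's final code: the AccountXml class, the Build method and the in-memory rows (which stand in for the parsed CSV) are hypothetical, and only two of the four columns are shown.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

static class AccountXml
{
    // Hypothetical helper: builds a document with exactly one root element.
    public static XDocument Build(IEnumerable<string[]> rows)
    {
        var infos = new XElement("AccountInfos");
        var doc = new XDocument(
            new XDeclaration("1.0", "utf-8", "yes"),
            new XElement("root", infos));

        foreach (string[] row in rows)
        {
            // Later additions go into an existing element, never into the
            // XDocument itself, which already holds its single root.
            infos.Add(new XElement("Info",
                new XElement("AccountIDToMove", row[0]),
                new XElement("AccountNameToMove", row[1])));
        }
        return doc;
    }
}

class Program
{
    static void Main()
    {
        // Sample rows standing in for the parsed CSV (values taken from the question).
        var rows = new List<string[]>
        {
            new[] { "312456", "Burger Count" },
            new[] { "874145", "Mac Count" }
        };
        Console.WriteLine(AccountXml.Build(rows));
    }
}
```

Calling Add with a second XElement directly on the XDocument is what raises the InvalidOperationException about an incorrectly structured document.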

How To Remove Last Node In XML? C#

I am trying to remove the last node from an XML file, but cannot find any good answers for doing this. Here is my code:
XmlReader x = XmlReader.Create(this.PathToSpecialFolder + @"\" + Application.CompanyName + @"\" + Application.ProductName + @"\Recent.xml");
int c = 0;
while (x.Read())
{
if (x.NodeType == XmlNodeType.Element && x.Name == "Path")
{
c++;
if (c <= 10)
{
MenuItem m = new MenuItem() { Header = x.ReadInnerXml() };
m.Click += delegate
{
};
openRecentMenuItem.Items.Add(m);
}
}
}
x.Close();
My XML node structure is as follows...
<RecentFiles>
<File>
<Path>Text Path</Path>
</File>
</RecentFiles>
In my situation, there will be ten nodes maximum, and each time a new one is added, the last must be removed.
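You can try this:
XmlDocument doc = new XmlDocument();
doc.Load(fileName);
XmlNodeList nodes = doc.SelectNodes("/RecentFiles/File");
if (nodes.Count > 0)
{
// The last node is at index Count - 1; indexing with Count itself returns null
XmlNode last = nodes[nodes.Count - 1];
last.ParentNode.RemoveChild(last);
}
doc.Save(fileName);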
It sounds like you want something like:
var doc = XDocument.Load(path);
var lastFile = doc.Descendants("File").LastOrDefault();
if (lastFile != null)
{
lastFile.Remove();
}
// Now save doc or whatever you want to do with it...

Read the data from an XML file into flat files (txt) with formatted data

I have an XML file with nodes and data. I need to write that into a text file as plain data, with the node names as the headers of the data that follows.
Example XML:
<Bank>
<accountholder>Georgina Wax</accountholder>
<accountnumber>408999703657</accountnumber>
<accounttype>cheque</accounttype>
<bankname>National Bank</bankname>
<branch>Africa</branch>
<amount>2750.00</amount>
<date>12/01/2012</date>
</Bank>
To a txt file, formatted as:
accountholder accountnumber accounttype bankname
Georgina Wax 408999703657 cheque National Bank
I can't seem to get spaces between the data and the headers.
Below is what I tried:
StreamWriter writer = File.CreateText(@"C:\\Test.txt");
XmlDocument doc = new XmlDocument();
doc.Load(@"C:\\bank.xml");
writer.WriteLine(string.Join("|",doc.SelectSingleNode("/debitorders/deduction").ChildNodes.Cast<XmlElement>().Select(e => doc.SelectSingleNode("/debitorders/deduction/bankname").ToString())));
foreach (XmlElement book in doc.SelectNodes("/debitorders/deduction"))
{
writer.WriteLine(book.ChildNodes.Cast<XmlElement>().Select(e => e.InnerText).ToArray());
}
Please help.
This will produce the output you want:
private static void LoadAndWriteXML()
{
string headerFiles = "";
string values = "";
using (XmlReader reader = XmlReader.Create(@"C:\bank.xml"))
{
while (reader.Read())
{
if (reader.NodeType == XmlNodeType.Element && !reader.Name.Equals("Bank")) // we have to skip root node means bank node.
{
headerFiles += reader.Name + " ";
values += reader.ReadString() + " ";
}
}
}
// The using block flushes and closes the file even if a write fails
using (StreamWriter writer = new StreamWriter(@"C:\Test.txt"))
{
writer.WriteLine(headerFiles.Trim());
writer.WriteLine(values.Trim());
}
}
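If the values should also line up under their headers, one option is to pad each header and value to a common column width. The following is a minimal sketch, assuming the record is small enough to load as an XElement and has only leaf children; FlatFileFormatter and ToAlignedLines are hypothetical names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class FlatFileFormatter
{
    // Hypothetical helper: returns a header line and a value line whose
    // columns are padded to the wider of the element name and its text.
    public static string[] ToAlignedLines(XElement record)
    {
        List<string> names = record.Elements().Select(e => e.Name.LocalName).ToList();
        List<string> values = record.Elements().Select(e => e.Value).ToList();

        var header = new List<string>();
        var data = new List<string>();
        for (int i = 0; i < names.Count; i++)
        {
            int width = Math.Max(names[i].Length, values[i].Length);
            header.Add(names[i].PadRight(width));
            data.Add(values[i].PadRight(width));
        }

        // Two spaces separate the columns; trailing padding is trimmed.
        return new[]
        {
            string.Join("  ", header).TrimEnd(),
            string.Join("  ", data).TrimEnd()
        };
    }
}

class Program
{
    static void Main()
    {
        // Sample record using values from the question.
        XElement bank = XElement.Parse(
            "<Bank><accountholder>Georgina Wax</accountholder><accountnumber>408999703657</accountnumber>" +
            "<accounttype>cheque</accounttype><bankname>National Bank</bankname></Bank>");

        foreach (string line in FlatFileFormatter.ToAlignedLines(bank))
        {
            Console.WriteLine(line);
        }
    }
}
```

The two returned lines can be written with StreamWriter.WriteLine in place of the concatenated strings used above.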
XDocument xdoc = XDocument.Load(fname);
xdoc.Save(fname1);
This saves the file with the tags indented, because XDocument.Save formats the output by default.
