Split xml into files - method takes less memory - c#

I need to split my XML into files.
This is the structure of my sample XML:
<Data Code="L6POS1">
<Lots RowVersion="464775">
<Lot Id="5" Quantity="10068.0000" GUID="AA616D3D-F442-6AEE-0BAB-1D13F6961C2A" />
<Lot Id="99" Quantity="0.0000" GUID="24A9C957-EC98-85D5-8F96-0120F6E8A572" />
<Lot Id="101" Quantity="0.0000" GUID="124D17A2-1568-DB02-4327-4669FE00F741" />
<Lot Id="103" Quantity="0.0000" GUID="DD1730FF-27CF-1269-7AC2-3152CB6FDC46" />
<Lot Id="105" Quantity="0.0000" GUID="1F25378F-30D4-E4E0-9939-1E9E69C806C1" />
<Lot Id="188" Quantity="0.0000" GUID="2E860029-29B3-54C2-B8D1-0C6ABDA42DFF" />
<Lot Id="189" Quantity="0.0000" GUID="D3C58850-BC23-E8DE-A919-09CCB3F8A1D3" />
</Lots>
</Data>
Expected result, first file:
<Data Code="L6POS1">
<Lots RowVersion="464775">
<Lot Id="5" Quantity="10068.0000" GUID="AA616D3D-F442-6AEE-0BAB-1D13F6961C2A" />
<Lot Id="99" Quantity="0.0000" GUID="24A9C957-EC98-85D5-8F96-0120F6E8A572" />
<Lot Id="101" Quantity="0.0000" GUID="124D17A2-1568-DB02-4327-4669FE00F741" />
<Lot Id="103" Quantity="0.0000" GUID="DD1730FF-27CF-1269-7AC2-3152CB6FDC46" />
</Lots>
</Data>
And SecondFile:
<Data Code="L6POS1">
<Lots RowVersion="464775">
<Lot Id="105" Quantity="0.0000" GUID="1F25378F-30D4-E4E0-9939-1E9E69C806C1" />
<Lot Id="188" Quantity="0.0000" GUID="2E860029-29B3-54C2-B8D1-0C6ABDA42DFF" />
<Lot Id="189" Quantity="0.0000" GUID="D3C58850-BC23-E8DE-A919-09CCB3F8A1D3" />
</Lots>
</Data>
Currently I'm using:
private IEnumerable<XElement> CreateXMLPackagesByType(string syncEntityName, XElement root)
{
var xmlList = new List<XElement>();
IEnumerable<XElement> childNodes = root.Elements();
var childsCount = childNodes.Count();
var skip = 0;
var take = ConfigurationService.MaxImportPackageSize;
var rootAttributes = root.Attributes();
XElement rootWithoutDescendants;
while (skip < childsCount)
{
rootWithoutDescendants = new XElement(root.Name);
rootWithoutDescendants.Add(rootAttributes);
var elems = childNodes.Skip(skip).Take(take);
skip += take;
xmlList.Add(CreatePackage(rootWithoutDescendants, elems));
}
return xmlList;
}
private XElement CreatePackage(XElement type, IEnumerable<XElement> elems)
{
type.Add(elems);
var root = new XElement("Data", type);
root.Add(new XAttribute("Code", ConfigurationService.Code));
return root;
}
Unfortunately, this way I get an OutOfMemoryException with bigger XML files on older hardware. Is there a better way to split an XElement?

The earlier comments suggesting a SAX-style streaming parser are correct: that way you get each event (element, etc.) one at a time, and you don't have to keep anything around afterwards. In .NET the closest built-in equivalent is XmlReader, a forward-only pull parser.
If you're absolutely certain that your data is as neatly broken into lines as your example, a quick-and-dirty method would be to not even parse, but just read a line at a time. Handle the first two, then break up the rest how you want, then handle the last two. But be really sure (in other words, check) that every <Lot> element takes up exactly one physical line; as you probably already know, there's no reason they have to be that way in XML in general.
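A minimal sketch of the streaming approach, assuming the Data/Lots/Lot layout from the question. The class name LotSplitter, the Split method, its parameters, and the part1.xml, part2.xml file naming are all made up for illustration. It reads the input with XmlReader and copies each Lot straight to an XmlWriter, so only one element is held in memory at a time.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class LotSplitter
{
    // Hypothetical helper: streams <Lot> elements from `input` and writes at most
    // `maxLotsPerFile` of them into each output file (part1.xml, part2.xml, ...).
    public static List<string> Split(TextReader input, string outputDirectory, int maxLotsPerFile)
    {
        var files = new List<string>();
        var settings = new XmlWriterSettings { Indent = true };
        XmlWriter writer = null;
        string code = null, rowVersion = null;
        int lotsInFile = 0;

        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType != XmlNodeType.Element) { reader.Read(); continue; }

                switch (reader.Name)
                {
                    case "Data":   // remember the wrapper attributes for every output file
                        code = reader.GetAttribute("Code");
                        reader.Read();
                        break;
                    case "Lots":
                        rowVersion = reader.GetAttribute("RowVersion");
                        reader.Read();
                        break;
                    case "Lot":
                        if (writer == null)
                        {
                            // Start a new package with the same wrapper elements.
                            string path = Path.Combine(outputDirectory, "part" + (files.Count + 1) + ".xml");
                            files.Add(path);
                            writer = XmlWriter.Create(path, settings);
                            writer.WriteStartElement("Data");
                            writer.WriteAttributeString("Code", code);
                            writer.WriteStartElement("Lots");
                            writer.WriteAttributeString("RowVersion", rowVersion);
                            lotsInFile = 0;
                        }
                        // WriteNode copies the current element and moves the reader past it,
                        // so no extra Read() is needed here.
                        writer.WriteNode(reader, false);
                        if (++lotsInFile == maxLotsPerFile) { Finish(writer); writer = null; }
                        break;
                    default:
                        reader.Read();
                        break;
                }
            }
        }
        if (writer != null) Finish(writer);
        return files;
    }

    // Closes the wrapper elements and flushes the file.
    private static void Finish(XmlWriter writer)
    {
        writer.WriteEndElement(); // </Lots>
        writer.WriteEndElement(); // </Data>
        writer.Dispose();
    }
}
```

For a large file you would pass a StreamReader over the input file, for example LotSplitter.Split(new StreamReader(inputPath), outputDirectory, packageSize). The sketch does not clean up a half-written file if an exception occurs midway; real code would want a try/finally around the writer.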

Related

How to access a specific attribute using LINQ to XML

I want to read a specific attribute (the tag name) from an XML file and put the values in a list, but I can't get it right. What am I doing wrong?
The list should look like this:
Tag_1
Tag_2
Tag_3
Code:
XElement xelement = XElement.Load("C:/...../Desktop/Testxml.xml");
var tagNames = from tag in xelement.Elements("tagGroup")
select tag.Attribute("name").Value;
foreach (XElement xEle in tagNames)
{
//....
}
Here is the XML file:
<configuration>
<logGroup>
<group name="cpm Log 1h 1y Avg" logInterval="* 1 * * * ?" />
<group name="cpm Log 1d 2y Avg" logInterval="* 10 * * * ?" />
</logGroup>
<tagGroup>
<tag name="Tag_1">
<property name="VALUE">
<logGroup name="cpm Log 1h 1y Avg" />
<logGroup name="cpm Log 1d 2y Avg" />
</property>
</tag>
<tag name="Tag_2">
<property name="VALUE">
<logGroup name="cpm Log 1h 1y Avg" />
<logGroup name="cpm Log 1d 2y Avg" />
</property>
</tag>
<tag name="Tag_3">
<property name="VALUE">
<logGroup name="cpm Log 1h 1y Avg" />
<logGroup name="cpm Log 1d 2y Avg" />
</property>
</tag>
</tagGroup>
</configuration>
Just change your LINQ query to:
var tagNames = from tag in xelement.Elements("tagGroup").Elements("tag")
select tag.Attribute("name").Value;
Then tagNames is an IEnumerable<string>, and you can iterate over it like this:
foreach (var element in tagNames)
{
//element is a string
}
Your code enumerates the elements called tagGroup and then tries to read an attribute called name from each of them. tagGroup has no name attribute, so tag.Attribute("name") returns null and reading .Value throws a NullReferenceException. The name attributes are on the tag elements one level down, and on the logGroup elements nested three levels down.
This code will not work:
XElement xelement = XElement.Load("C:/...../Desktop/Testxml.xml");
var tagNames = from tag in xelement.Elements("tagGroup")
select tag.Attribute("name").Value;
What you need is something like
var tagNames = xelement.Descendants("tag").Select(x => (string)x.Attribute("name")).ToList();
Or, if you want the others:
var logGroupNames = xelement.Descendants("logGroup").Select(x => (string)x.Attribute("name")).ToList();
var tagGroups = xelement.Elements("tagGroup").ToList();
var logGroups = tagGroups.SelectMany (g => g.Descendants("logGroup")).ToList();
var logAttributes = tagGroups.SelectMany (g => g.Descendants("logGroup").Select(x => x.Attribute("name"))).ToList();
Something like:
var tagNames = xelement.Element("tagGroup").Elements("tag").Select(a => a.Attribute("name").Value);
foreach (var xEle in tagNames)
{
Console.WriteLine(xEle);
}
Try this...
var tagNames = from tag in xelement.Elements("tagGroup").Elements("tag")
select tag.Attribute("name").Value;
or
var tagNames = xelement.Elements("tagGroup")
.Elements("tag")
.Select(tag => tag.Attribute("name").Value);
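The answers above all boil down to the same query. A small self-contained sketch, where the class TagNameReader and the method GetTagNames are names invented for this example; it casts the attribute to string instead of reading .Value, so a tag without a name attribute is skipped rather than throwing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class TagNameReader
{
    // Hypothetical helper: returns the name attribute of every
    // <tag> directly under a <tagGroup> child of the root element.
    public static List<string> GetTagNames(XElement configuration)
    {
        return configuration
            .Elements("tagGroup")
            .Elements("tag")
            // The explicit cast yields null when the attribute is missing.
            .Select(tag => (string)tag.Attribute("name"))
            .Where(name => name != null)
            .ToList();
    }
}
```

Called as TagNameReader.GetTagNames(XElement.Load(path)), this should give Tag_1, Tag_2 and Tag_3 for the file in the question.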

Getting data from xml with specific data name and value key

I want to get data from xml but there are lots of tags, fields and value keys. I couldn't select the value which I want. How can I select the "Error" value from this XML with C#?
<?xml version="1.0" encoding="UTF-8"?>
<Database xmlns="http://www.example.com/2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Datas>
<Data name="sMsg" access="private" xsi:type="collection" type="string">
<Value key="Cycle" value="deger1" />
<Value key="Error" value="deger2" />
<Value key="Info" value="deger3" />
<Value key="Jog" />
<Value key="Warning" />
</Data>
<Data name="tTabla" access="private" xsi:type="array" type="tabla" size="1">
<Field name="dddd" xsi:type="array" type="bool" size="1" />
<Field name="ssss" xsi:type="array" type="bool" size="1" />
<Field name="aaaa" xsi:type="array" type="num" size="1" />
<Field name="rrrr" xsi:type="collection" type="num">
<Value key="Actuel" />
<Value key="Expected" />
</Field>
</Data>
</Datas>
</Database>
You can try it this way:
var doc = XDocument.Parse(xml);
XNamespace d = doc.Root.GetDefaultNamespace();
var result = (string)
doc.Descendants(d + "Data")
.Elements(d + "Value")
.FirstOrDefault(o => (string) o.Attribute("key") == "Error")
?.Attribute("value");
Console.WriteLine(result);
Try this. It returns the XElement whose key attribute equals "Error":
XDocument m = XDocument.Load(@"Path");
var res = m.Descendants().Where(x => x.Name.LocalName.Equals("Value") && x.Attribute("key") != null && x.Attribute("key").Value.Equals("Error")).FirstOrDefault();
If there are multiple "Error" values for your attributes, you can do:
IEnumerable<XAttribute> answer = xml.Descendants().Attributes().Where(node => node.Value == "Error");
foreach (var xAttribute in answer)
{
Console.WriteLine(xAttribute.Value);
}
If you only want the first or there is only one:
string answer = (string)xml.Descendants().Attributes().FirstOrDefault(node => node.Value == "Error");
Note that FirstOrDefault may yield null if it doesn't find any "Error" values inside your XML.
These queries use LINQ to XML; I strongly encourage you to read up on it.
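Because the document declares a default namespace, element names must be qualified while attribute names are not. The sketch below combines that with a null-safe lookup; ValueLookup, GetValue and the dataName parameter are assumptions added for illustration and do not come from the answers.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ValueLookup
{
    // Hypothetical helper: returns the value attribute of the <Value> with the given key
    // inside the <Data> with the given name, or null when there is no such attribute.
    public static string GetValue(string xml, string dataName, string key)
    {
        XDocument doc = XDocument.Parse(xml);
        // Elements live in the document's default namespace; attributes do not.
        XNamespace ns = doc.Root.GetDefaultNamespace();

        return doc.Descendants(ns + "Data")
            .Where(data => (string)data.Attribute("name") == dataName)
            .Elements(ns + "Value")
            .Where(value => (string)value.Attribute("key") == key)
            .Select(value => (string)value.Attribute("value"))
            .FirstOrDefault();
    }
}
```

With the XML from the question, ValueLookup.GetValue(xml, "sMsg", "Error") should return "deger2", and asking for "Jog" returns null because that element has no value attribute.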

Reading XML node attribute values from nodes with the same name

My XML looks like this:
<Settings>
<Display_Settings>
<Screen>
<Name Name="Screen" />
<ScreenTag Tag="Screen Tag" />
<LocalPosition X="12" Y="81" Z="28" />
<Width Width="54" />
<Height Height="912" />
</Screen>
<Screen>
<Name Name="Screen" />
<ScreenTag Tag="Screen Tag" />
<LocalPosition X="32" Y="21" Z="28" />
<Width Width="54" />
<Height Height="912" />
</Screen>
</Display_Settings>
</Settings>
How am I able to read in the two different Local Position X attribute values from two different nodes that have the same name?
Edit
Sorry, forgot to add the code I have at the moment that reads in a singular local position attribute value from one screen node:
var xdoc = XDocument.Load("C:\\Test.xml");
var screenPosition = xdoc.Descendants("Screen").First().Element("LocalPosition");
int screenX1 = int.Parse(screenPosition.Attribute("X").Value);
The XPath would look like this:
/Settings/Display_Settings/Screen/LocalPosition/@X
You can use an online tool like this one to test your XPath expressions: http://www.freeformatter.com/xpath-tester.html#ad-output
There's also a good tutorial here: http://www.w3schools.com/xpath/
Since the question was updated, here is the code:
var xdoc = XDocument.Load(@"C:\darbai_test\so_Test.xml");
var screenPosition = xdoc
.Descendants("Screen")
.Descendants("LocalPosition")
.Attributes("X");
foreach (var xAttribute in screenPosition)
{
Console.WriteLine(xAttribute.Value);
}
Console.ReadKey();
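If you need the values as numbers rather than printed text, the same query can be collected into a list. A short sketch, with ScreenPositions and GetScreenXValues as illustrative names only:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ScreenPositions
{
    // Hypothetical helper: returns the X attribute of the LocalPosition
    // child of every <Screen> element, in document order.
    public static List<int> GetScreenXValues(XDocument doc)
    {
        return doc.Descendants("Screen")
            .Elements("LocalPosition")
            // The explicit cast parses the attribute text as an int.
            .Select(position => (int)position.Attribute("X"))
            .ToList();
    }
}
```

The cast throws if a LocalPosition has no X attribute or the text is not a valid integer, so malformed input fails loudly instead of being skipped.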

Getting the attribute value from two XML nodes with same data

First off, I'm sorry for the name. I couldn't think of a way to describe my issue in a question form. But this is what I'm trying to do.
Here is what my xml is looking like:
<Settings>
<Display_Settings>
<Screen>
<Name Name="Screen" />
<ScreenTag Tag="Screen Tag" />
<LocalPosition X="12" Y="81" Z="28" />
<Width Width="54" />
<Height Height="912" />
</Screen>
<Camera_Name Name="Camera">
<CameraTag Tag="Camera Tag" />
<LocalPosition X="354" Y="108" Z="Z Local Position" />
<Far Far="98" />
<Near Near="16" />
<FOV FOV="78" />
<AspectRatio AspectRatio="1" />
<ScreenDistance ScreenDistance="2" />
</Camera_Name>
</Display_Settings>
</Settings>
What I want is to access the attribute values stored within my local position node. I got some help with this and I can access the screen's local position attribute value with this code:
var xdoc = XDocument.Load("C:\\Test.xml");
int x = (int)xdoc.Descendants("LocalPosition").First().Attribute("X");
This happily returns 12 when I debug it. But I also want my camera's local position to be output as well.
Can someone please show me how to do this?
You can grab the camera and screen positions using Descendants and then access their components with Attribute. Code examples are given below:
var cameraPosition = xdoc.Descendants("Camera_Name")
.First()
.Element("LocalPosition");
var screenPosition = xdoc.Descendants("Screen")
.First()
.Element("LocalPosition");
//parsing and displaying data
int cameraX = int.Parse(cameraPosition.Attribute("X").Value);
int cameraY = int.Parse(cameraPosition.Attribute("Y").Value);
Console.WriteLine ("camera pos: X={0}; Y={1}", cameraX, cameraY);
int screenX = int.Parse(screenPosition.Attribute("X").Value);
int screenY = int.Parse(screenPosition.Attribute("Y").Value);
Console.WriteLine ("screen pos: X={0}; Y={1}", screenX, screenY);
This prints:
camera pos: X=354; Y=108
screen pos: X=12; Y=81
If you use XPath, you can target the nodes directly and retrieve them into an iterator.
http://msdn.microsoft.com/en-us/library/0ea193ac.aspx
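As a sketch of the XPath route with LINQ to XML, the XPathSelectElement extension method from System.Xml.XPath accepts an XPath expression on an XDocument. PositionReader and ReadX are hypothetical names used only for this example.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

public static class PositionReader
{
    // Hypothetical helper: selects one LocalPosition element by XPath
    // and returns its X attribute as an int.
    public static int ReadX(XDocument doc, string xpathToLocalPosition)
    {
        XElement position = doc.XPathSelectElement(xpathToLocalPosition);
        if (position == null)
        {
            throw new ArgumentException("No element matches: " + xpathToLocalPosition);
        }
        return (int)position.Attribute("X");
    }
}
```

For the XML in the question, the paths would be /Settings/Display_Settings/Camera_Name/LocalPosition for the camera and /Settings/Display_Settings/Screen/LocalPosition for the screen.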

C# How to modify xml attributes based on searched string

I want to find all instances of an element attribute that contains a certain string and change it.
Sample xml would be:
<system>
<template>
<url address="http://localhost:7888/Application/basic" />
<url address="http://localhost:7997/sdk/basic" />
<url address="http://localhost:5855/htm/ws" />
<url address="net.tcp://localhost:5256/htm" />
<url address="http://localhost:5215/htm/basic" />
<url address="http://localhost:5235/htm/ws" />
<url address="net.tcp://localhost:5256/htm" />
<url address="http://localhost:5252/Projectappmgr/basic"/>
<url address="http://localhost:5295/Projectappmgr/ws" />
</template>
</system>
I have the following code:
XmlNodeList nodelist = doc.GetElementsByTagName("url");
foreach (XmlNode node in nodelist)
{
if (node.Attributes["address"].Value.Contains("localhost"))
{
string origValue = node.Attributes["address"].Value;
string modValue = String.Empty;
Console.WriteLine("Value of original address is: " + origValue);
modValue = origValue.Replace("localhost", "newURLName");
Console.WriteLine("Value of modified address is: " + modValue);
node.Attributes["address"].InnerText = modValue;
}
}
This modifies the address' value as expected.
<url address="http://newURLName:7888/Application/basic" />
But what I really want is to replace the entire string "localhost:7888" with newURLName. Is there a way to treat the port numbers as wildcards, since they will not all be the same as in the example XML block?
I know I need the replace value to be "localhost:xxxx", but "xxxx" is different in each instance and I'm sort of drawing a blank at the moment.
Thanks
Regular expressions should help here:
modValue = Regex.Replace(url, @"localhost(:\d+){0,1}", newUrlName);
Here you can find more examples. I would also recommend using Expresso to get familiar with regular expressions.
You could use XPath to find the nodes that contain your search string and then use the UriBuilder class to modify your URLs:
var xdoc = XDocument.Parse(xml);
var nodes = xdoc.XPathSelectElements("//url[contains(@address, 'localhost')]");
foreach (var node in nodes)
{
var ub = new UriBuilder(node.Attribute("address").Value);
ub.Host = "newURLName";
node.SetAttributeValue("address", ub.ToString());
}
This will get you
<system>
<template>
<url address="http://newURLName:7888/Application/basic" />
<url address="http://newURLName:7997/sdk/basic" />
<url address="http://newURLName:5855/htm/ws" />
<url address="net.tcp://newURLName:5256/htm" />
<url address="http://newURLName:5215/htm/basic" />
<url address="http://newURLName:5235/htm/ws" />
<url address="net.tcp://newURLName:5256/htm" />
<url address="http://newURLName:5252/Projectappmgr/basic" />
<url address="http://newURLName:5295/Projectappmgr/ws" />
</template>
</system>
for your XML example, without using a regex at all.
Replace
modValue = origValue.Replace("localhost", "newURLName");
by
modValue = Regex.Replace(origValue, "localhost(:[0-9]+){0,1}", "newURLName");
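A runnable sketch of the regex replacement, assuming the goal is to swap "localhost" plus an optional port for a new host name. AddressRewriter and ReplaceHost are names invented for this example; the pattern is the same as above with `?` written in place of `{0,1}`.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AddressRewriter
{
    // "localhost" followed by an optional ":<digits>" port.
    private static readonly Regex LocalhostWithPort = new Regex(@"localhost(:\d+)?");

    // Hypothetical helper: replaces the host and port in one go.
    public static string ReplaceHost(string address, string replacement)
    {
        return LocalhostWithPort.Replace(address, replacement);
    }
}
```

For example, AddressRewriter.ReplaceHost("http://localhost:7888/Application/basic", "newURLName") returns "http://newURLName/Application/basic". Note that Regex.Replace treats `$` in the replacement string as a substitution marker, so a replacement containing `$` would need escaping.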
An alternative to regex would be to use a strongly typed Uri object and the UriBuilder.
var origValue = new Uri(node.Attributes["address"].Value);
var uriBuilder = new UriBuilder(origValue);
uriBuilder.Host = newHost;
uriBuilder.Port = newPort;
modValue = uriBuilder.Uri.ToString(); // modValue is a string, so convert the Uri back
This may seem long-winded, but it is an alternative to a simple regex: it gives you something strongly typed and lets you check that the URI is actually valid (see the Uri class methods and properties). You may also be able to set the host and port number in one step; I have not played around with that.
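If the aim is to drop the port entirely, as the question asks, one option is to set UriBuilder.Port to -1, which the documentation describes as "use the default port for the scheme", so no port is written out. A minimal sketch under that assumption; UriRewriter and ReplaceHostAndDropPort are made-up names.

```csharp
using System;

public static class UriRewriter
{
    // Hypothetical helper: swaps the host and removes the explicit port.
    public static string ReplaceHostAndDropPort(string address, string newHost)
    {
        var builder = new UriBuilder(address)
        {
            Host = newHost,
            Port = -1 // -1 means "default port for the scheme", so none is emitted
        };
        return builder.ToString();
    }
}
```

Unlike the regex, the UriBuilder constructor throws a UriFormatException for a string that is not a valid URI, which may or may not be what you want when processing a config file.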
