Reading single element innertext from large xml file

I have a large single-node .xml file that I have saved as a string. I want to find one specific element in it and output its inner text. For example, I want to read the FrameNo element and show BINGO in a message box. The desired element appears only once in the document. I would prefer to use XmlDocument.
I have tried numerous C# XML examples but cannot get any output.
The XML text is:
<Aircraft z:Id="i1" xmlns="http://xxx.yyyyycontract.gov/2018/03/Boeing.xxxxxxxxxxxxxx.Airframe"
xmlns:i="http://www.xxxxxxx.com/2019/XMLSchema-instance"
xmlns:z="http://xxxxxxx.xxxxxxxxx.com/2005/01/Serialization/"><Timestamp i:nil="true"/>
<Uuid>00000000-0000-0000-0000-000000000000</Uuid><Comments i:nil="true"/><Facility>..........
and so on to the end of the .xml
<FrameNo>BINGO</FrameNo><WDate i:nil="true"/></Aircraft>
This is the method I want the code to run in:
private void buttonLoad_Click(object sender, EventArgs e)
{
}

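I think this is self-explanatory:
using System.Xml.Linq;
// textXML holds the XML text itself, so use Parse (Load expects a file path or URI)
XElement root = XElement.Parse(textXML);
// The root declares a default namespace, so the element name must be qualified with it
XNamespace ns = root.GetDefaultNamespace();
// Element() only looks at direct children of the root
XElement myElement = root.Element(ns + "FrameNo");
if (myElement != null)
    myData = myElement.Value; // XElement exposes Value, not InnerText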

Thanks to jdweng. I wanted to share the final code for others to use. It works inside a method like the one below:
private void buttonMaint_Click(object sender, EventArgs e)
{
    XDocument doc = XDocument.Parse(xmlinputstr); // input string from memory or input file
    XNamespace ns = doc.Root.GetDefaultNamespace();
    string[] Frame = doc.Descendants(ns + "FrameNo").Select(x => (string)x).ToArray(); // text of every FrameNo element
    string frame = string.Join("", Frame); // converts from array to string
    if (string.IsNullOrEmpty(frame)) // check for empty result
    {
        txtFrame.Text = "not found"; // outputs to textbox
    }
    else
    {
        txtFrame.Text = frame; // outputs to textbox
    }
}
The comments are there for clarity.

You need to use the default namespace. See my LINQ to XML solution below:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
using System.IO;
namespace ConsoleApplication1
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.xml";
        static void Main(string[] args)
        {
            string xml = File.ReadAllText(FILENAME);
            XDocument doc = XDocument.Parse(xml);
            // Unprefixed elements live in the default namespace declared on the root
            XNamespace ns = doc.Root.GetDefaultNamespace();
            XElement frameNo = doc.Descendants(ns + "FrameNo").FirstOrDefault();
            string frame = (string)frameNo; // null when the element is missing
            string[] serialNumbers = doc.Descendants(ns + "SerialNumber").Select(x => (string)x).ToArray();
        }
    }
}
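
Since the question mentions a preference for XmlDocument, here is a minimal sketch of the same lookup with XmlDocument and an XmlNamespaceManager. The class name, method name and the prefix "d" are made up for illustration, and the sketch assumes the root element declares a default namespace, as in the sample above.

```csharp
using System.Xml;

public static class XmlDocumentLookupSketch
{
    // Returns the inner text of the first element with the given local name
    // in the root's namespace, or null when no such element exists.
    public static string ReadElementText(string xml, string localName)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // XPath has no notion of a default namespace, so bind the root's
        // namespace URI to an arbitrary prefix ("d" is an assumption).
        var ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("d", doc.DocumentElement.NamespaceURI);

        XmlNode node = doc.SelectSingleNode("//d:" + localName, ns);
        return node?.InnerText;
    }
}
```

Inside the click handler this could be used as MessageBox.Show(XmlDocumentLookupSketch.ReadElementText(xmlinputstr, "FrameNo") ?? "not found");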

Another weird snag has shown up. Some of the elements are named like this:
<a:SupplierServDoc>
The inner text of this element is a base64 packet. There is no problem processing the base64 packet itself.
The code from the answers above outputs base64 correctly for other elements but cannot handle the : in the element name. It throws an error saying the ':' character (hexadecimal value 0x3A) cannot be included in a name.
I have the code below, which outputs the inner text but not as a base64 packet. I have also looked into using the prefix to handle the :, with worse results. When finished, I write the base64 inner text to a .txt file.
XNamespace ad = "http://www.mmmmmmmmmm.com";
XName k = ad + "SupplierServDoc";
string[] WING = doc.Descendants(k).Select(x => (string)x).ToArray();
string wing = string.Join("", WING);
if (string.IsNullOrEmpty(wing))
{
    MessageBox.Show("a:SupplierServDoc Base 64 code not found");
}
else
{
    MessageBox.Show("Test " + wing);
}
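
One possible cause, assuming the URI given to XNamespace does not exactly match the URI declared for the a prefix in the document: Descendants then finds nothing. A sketch that sidesteps the namespace by matching on the local name only is shown here. The class and method names are hypothetical, and the approach is only safe when no other namespace uses the same local name.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class PrefixedElementSketch
{
    // Returns the text of the first element whose local name matches,
    // regardless of prefix or namespace, or null when none is found.
    public static string FindByLocalName(string xml, string localName)
    {
        XDocument doc = XDocument.Parse(xml);
        XElement match = doc.Descendants()
                            .FirstOrDefault(e => e.Name.LocalName == localName);
        // The explicit string conversion returns null for a null element.
        return (string)match;
    }
}
```

The result is the element's text exactly as stored, so it can be passed to Convert.FromBase64String or written to the .txt file as needed.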

Related

How to parse a xml document without a root node?

I have an xml document which has no root node. It looks like this:
<?xml version="1.0"?>
<Line>
<City>Paris</City>
<Country>France</Country>
</Line>
<Line>
<City>Lissabon</City>
<Country>Spain</Country>
</Line>
Now I want to read it Line by Line and write the contents to a database. However, XmlDocument seems to insist that a root node must exist. How can I process this file?

If you want to parse it as an XML document, you can add a root node like Denis proposed in his comment.
If you would just like to read each line and write it to a database, you can handle the file like an ordinary (text) file and read its contents line by line using a StreamReader.
This would look something like this:
string line;
// Read the file and process it line by line.
using (var reader = new StreamReader(FILEPATH))
{
    while ((line = reader.ReadLine()) != null)
    {
        // Depending on what you need, you could strip the XML tags
        // and write the line to the database
    }
}

You could try something like this (simple WinForms app with a button and a rich text box to display output for testing):
using System;
using System.Text;
using System.Xml;
using System.Windows.Forms;
namespace WindowsFormsApp11
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }
        private void button1_Click(object sender, EventArgs e)
        {
            StringBuilder sb = new StringBuilder();
            // Fragment conformance lets the reader accept several top-level elements
            XmlReaderSettings settings = new XmlReaderSettings
            {
                ConformanceLevel = ConformanceLevel.Fragment
            };
            using (XmlReader reader = XmlReader.Create(@"c:\ab\countries.xml", settings))
            {
                while (reader.Read())
                {
                    if (reader.Name != "Line") // Ignore the <Line> nodes
                    {
                        switch (reader.NodeType)
                        {
                            case XmlNodeType.Element:
                                sb.Append(string.Format("{0}:", reader.Name));
                                break;
                            case XmlNodeType.Text:
                                sb.Append(string.Format(" {0}{1}", reader.Value, Environment.NewLine));
                                break;
                        }
                    }
                }
            }
            richTextBox1.Text = sb.ToString();
        }
    }
}

Maybe not the best solution, but you could create a List (or array) from the lines of your XML file and insert the missing root tags:
// Read lines into List
var list = File.ReadLines("doc.xml").ToList();
// Insert missing nodes
list.Insert(1, "<root>"); // Use 1, because 0 is XML directive
list.Insert(list.Count, "</root>"); //Add closing tag to the end
// Create final XML string with LINQ
var xml_str = list.Aggregate("", (acc, s) => acc + s);
// Having a string, we can create, for instance, XElement (or XDocument)
var xml = XElement.Parse(xml_str);
Console.WriteLine(xml.Element("Line").Element("City").Value);
//Output: Paris

XmlDocumentFragment set InnerXml fails with not declared prefix

I have an XML document which basically looks like this:
<ArrayOfAspect xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<Aspect i:type="TransactionAspect">
...
</Aspect>
<Aspect i:type="TransactionAspect">
...
</Aspect>
</ArrayOfAspect>
And I want to append a new Aspect to this list.
In order to do so I load this xml from a file, create a XmlDocumentFragment and load the new Aspect from a file (which is basically a template I fill with data). Then I fill the document fragment with the new aspect and append it as a child.
But when I try to set the xml of this fragment it fails because the prefix i is not defined.
// Load all aspects
var aspectsXml = new XmlDocument();
aspectsXml.Load("aspects.xml");
// Create and fill the fragment
var fragment = aspectsXml.CreateDocumentFragment();
fragment.InnerXml = _templateIFilledWithData; // This fails because i is not defined
// Add the new child
aspectsXml.AppendChild(fragment);
This is what the template looks like:
<Aspect i:type="TransactionAspect">
<Value>$VALUES_PLACEHOLDER$</Value>
...
</Aspect>
Note that I don't want to create POCOs for this and serialize them, since the aspects are actually quite big and nested, and I have the same problem with some other XML files as well.
EDIT:
jdweng proposed using LINQ to XML (which is way better than what I used before, so thanks). Here is the code I am trying with LINQ to XML (still failing because of the undeclared prefix):
var aspects = XDocument.Load("aspects.xml");
var newAspect = XElement.Parse(_templateIFilledWithData); // Fails here - Undeclared prefix 'i'
aspects.Root.Add(newAspect);

Use LINQ to XML and build the new element in code:
using System.Collections.ObjectModel;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication57
{
    class Program
    {
        static void Main(string[] args)
        {
            string xml =
                "<ArrayOfAspect xmlns:i=\"http://www.w3.org/2001/XMLSchema-instance\">" +
                    "<Aspect i:type=\"TransactionAspect\">" +
                    "</Aspect>" +
                    "<Aspect i:type=\"TransactionAspect\">" +
                    "</Aspect>" +
                "</ArrayOfAspect>";
            XDocument doc = XDocument.Parse(xml);
            XElement root = doc.Root;
            // Look up the namespace that the root binds to the prefix "i"
            XNamespace nsI = root.GetNamespaceOfPrefix("i");
            // Append a new Aspect whose type attribute is in that namespace
            root.Add(new XElement("Aspect", new object[] {
                new XAttribute(nsI + "type", "TransactionAspect"),
                new XElement("Value", "$VALUES_PLACEHOLDER$")
            }));
        }
    }
}
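
If the template has to stay a text file, another option is to tell the parser about the prefix before it reads the fragment. The sketch below is only an illustration: the class and method names are made up, and it relies on the XmlReader.Create overload that accepts an XmlParserContext carrying an XmlNamespaceManager.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class FragmentParseSketch
{
    // Parses a single-element fragment that uses a prefix it does not declare.
    // The prefix-to-URI binding is supplied from outside via the parser context.
    public static XElement ParseWithPrefix(string fragmentXml, string prefix, string namespaceUri)
    {
        var nsManager = new XmlNamespaceManager(new NameTable());
        nsManager.AddNamespace(prefix, namespaceUri);

        // Passing null for the name table makes the context reuse the manager's table.
        var context = new XmlParserContext(null, nsManager, null, XmlSpace.None);
        var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };

        using (var reader = XmlReader.Create(new StringReader(fragmentXml), settings, context))
        {
            reader.MoveToContent();                  // skip to the first element
            return (XElement)XNode.ReadFrom(reader); // materialize just that element
        }
    }
}
```

With the document from the question this might be used as aspects.Root.Add(FragmentParseSketch.ParseWithPrefix(_templateIFilledWithData, "i", "http://www.w3.org/2001/XMLSchema-instance"));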

Most efficient way to replace text in xml stream

I have a huge chunk of XML data that I need to "clean". The Xml looks something like this:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
<w:body>
<w:p>
<w:t>F_ck</w:t>
<!-- -->
<w:t>F_ck</w:t>
<!-- -->
<w:t>F_ck</w:t>
</w:p>
</w:body>
</w:document>
I would like to identify the <w:t>-elements with the value "F_ck" and replace the value with something else. The elements I need to clean will be scattered throughout the document.
I need the code to run as fast as possible and with a memory footprint as small as possible, so I am reluctant to use the XDocument (DOM) approaches I have found here and elsewhere.
The data is given to me as a stream containing the Xml data, and my gut feeling tells me that I need the XmlTextReader and the XmlTextWriter.
My original idea was to do a SAX-mode, forward-only run through the Xml data and "pipe" it over to the XmlTextWriter, but I cannot find an intelligent way to do so.
I wrote this code:
var reader = new StringReader(content);
var xmltextReader = new XmlTextReader(reader);
var memStream = new MemoryStream();
var xmlWriter = new XmlTextWriter(memStream, Encoding.UTF8);
while (xmltextReader.Read())
{
    if (xmltextReader.Name == "w:t")
    {
        //xmlWriter.WriteRaw("blah");
    }
    else
    {
        xmlWriter.WriteRaw(xmltextReader.Value);
    }
}
The code above only copies the Value of each node (element text, the declaration and so on), so the output has no angle brackets or tags. I realize that I could write code that calls .WriteElement(), .WriteEndElement() etc. depending on the NodeType, but I fear that will quickly become a mess.
So the question is:
How do I - in a nice way - pipe the xml data read from the XmlTextReader to the XmlTextWriter while still being able to manipulate the data while piping?

Try this:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            string xml =
                "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\"?>" +
                "<w:document xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\">" +
                    "<w:body>" +
                        "<w:p>" +
                            "<w:t>F_ck</w:t>" +
                            "<!-- -->" +
                            "<w:t>F_ck</w:t>" +
                            "<!-- -->" +
                            "<w:t>F_ck</w:t>" +
                        "</w:p>" +
                    "</w:body>" +
                "</w:document>";
            XDocument doc = XDocument.Parse(xml);
            XElement document = doc.Root;
            // Resolve the namespace bound to the "w" prefix on the root
            XNamespace ns_w = document.GetNamespaceOfPrefix("w");
            List<XElement> ts = doc.Descendants(ns_w + "t").ToList();
            foreach (XElement t in ts)
            {
                // Replaces the text of every w:t element; add a check on t.Value to target specific text
                t.Value = "abc";
            }
        }
    }
}
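
If loading the whole document is too expensive, a forward-only copy from an XmlReader to an XmlWriter is one alternative. The following is a rough sketch, not a drop-in solution: the class and method names are invented, it assumes the target elements contain a single plain text node, it ignores DOCTYPE nodes, and it leaves the decision about the XML declaration to the writer's settings.

```csharp
using System.IO;
using System.Xml;

public static class StreamingReplaceSketch
{
    // Copies XML from input to output node by node, replacing the text of
    // elements named {nsUri}localName when that text equals oldValue.
    public static void ReplaceText(Stream input, Stream output,
        string localName, string nsUri, string oldValue, string newValue)
    {
        using (XmlReader reader = XmlReader.Create(input))
        using (XmlWriter writer = XmlWriter.Create(output))
        {
            bool insideTarget = false;
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        // Capture this first: WriteAttributes moves the reader around.
                        bool isEmpty = reader.IsEmptyElement;
                        insideTarget = !isEmpty
                            && reader.LocalName == localName
                            && reader.NamespaceURI == nsUri;
                        writer.WriteStartElement(reader.Prefix, reader.LocalName, reader.NamespaceURI);
                        writer.WriteAttributes(reader, false);
                        if (isEmpty) writer.WriteEndElement();
                        break;
                    case XmlNodeType.Text:
                        bool replace = insideTarget && reader.Value == oldValue;
                        writer.WriteString(replace ? newValue : reader.Value);
                        break;
                    case XmlNodeType.CDATA:
                        writer.WriteCData(reader.Value);
                        break;
                    case XmlNodeType.Whitespace:
                    case XmlNodeType.SignificantWhitespace:
                        writer.WriteWhitespace(reader.Value);
                        break;
                    case XmlNodeType.Comment:
                        writer.WriteComment(reader.Value);
                        break;
                    case XmlNodeType.ProcessingInstruction:
                        writer.WriteProcessingInstruction(reader.Name, reader.Value);
                        break;
                    case XmlNodeType.EndElement:
                        insideTarget = false;
                        writer.WriteFullEndElement();
                        break;
                    // XmlDeclaration and other node types are skipped in this sketch.
                }
            }
        }
    }
}
```

Only the current node is held in memory at any time, which is the main reason to prefer this shape over a DOM when the input is large. For the sample above the call would pass "t" and the WordprocessingML namespace URI as the element name.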

How to read XML document from property in Episerver

I want to read an XML document from a property that is created in the edit mode of Episerver.
I have made one property of type 'URL to Document'.
When I try to fetch it from the code-behind, it gives only the file path. I am not able to read the content of the XML file that is uploaded in the property.
string XMLContent = Currentpage.Getproperty<string>("XMLFile");
Can anyone help out on this?

You need to load the file as well. Something like this:
var path = CurrentPage["XMLFile"] as string;
if (HostingEnvironment.VirtualPathProvider.FileExists(path))
{
    var file = HostingEnvironment.VirtualPathProvider.GetFile(path) as UnifiedFile;
    if (file != null)
    {
        using (var stream = file.Open())
        {
            // Here is your XML document
            var xml = XDocument.Load(stream);
        }
    }
}
You can also load the file content by using the local path on disk, file.LocalPath.

Try this:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            // Placeholder: assign the XML text here (both calls below throw on an empty string)
            string XMLContent = "";
            //using XML
            XmlDocument doc1 = new XmlDocument();
            doc1.LoadXml(XMLContent);
            //using xml linq
            XDocument doc2 = XDocument.Parse(XMLContent);
        }
    }
}

XML like data to CSV Conversion

So I have a device which has an inbuilt logger program which generates status messages about the device and keeps pushing them to a .txt file. These messages include information about the device status, network status amongst many other things. The data in the file looks something like the following:
<XML><DSTATUS>1,4,7,,5</DSTATUS><EVENT> hello,there,my,name,is,jack,</EVENT>
last,name,missing,above <ANOTHERTAG>3,6,7,,8,4</ANOTHERTAG> </XML>
<XML><DSTATUS>1,5,7,,3</DSTATUS><EVENT>hello,there,my,name,is,mary,jane</EVENT>
last,name,not,missing,above<ANOTHERTAG>3,6,7,,8,4</ANOTHERTAG></XML>
... goes on
Note that it is not well formed XML. Also, one element can have multiple parameters and can also have blanks... for example: <NETWORKSTAT>1,456,3,6,,7</NETWORKSTAT>
My objective is to write something in C# WPF that takes this text file, processes the data in it and creates a .csv file with one event per line.
For example, for the above given brief example, the first line in the csv file would be:
1,4,7,,5,hello,there,my,name,is,jack,,last,name,missing,above,3,6,7,,8,4
Also, I do not need help using basic C#. I know how to read a file, etc.. but I have no clue as to how I would approach this problem in regards to the parsing and processing and converting. I'm fairly new to C# so I'm not sure which direction to go. Any help will be appreciated!

Since each top-level XML node in your file is well-formed, you can use an XmlReader with XmlReaderSettings.ConformanceLevel = ConformanceLevel.Fragment to iterate through each top-level node in the file and read it with Linq-to-XML:
public static IEnumerable<string> XmlFragmentsToCSV(string path)
{
    using (var textReader = new StreamReader(path, Encoding.UTF8))
        foreach (var line in XmlFragmentsToCSV(textReader))
            yield return line;
}
public static IEnumerable<string> XmlFragmentsToCSV(TextReader textReader)
{
    XmlReaderSettings settings = new XmlReaderSettings();
    settings.ConformanceLevel = ConformanceLevel.Fragment;
    using (XmlReader reader = XmlReader.Create(textReader, settings))
    {
        while (reader.Read())
        {
            // Skip whitespace and anything else that is not an element
            if (reader.NodeType == XmlNodeType.Element)
            {
                using (var subReader = reader.ReadSubtree())
                {
                    var element = XElement.Load(subReader);
                    yield return string.Join(",", element.DescendantNodes()
                        .OfType<XText>()
                        .Select(n => n.Value.Trim())
                        .Where(t => !string.IsNullOrEmpty(t))
                        .ToArray());
                }
            }
        }
    }
}
To precisely match the output you wanted I had to trim whitespaces at the beginning and end of each text node value.
Also, the Where(t => !string.IsNullOrEmpty(t)) clause is to skip the whitespace node corresponding to the space here: </ANOTHERTAG> </XML>. If that space doesn't exist in the real file, you can omit that clause.

Because of the non-standard format I switched from a LINQ to XML solution to a standard XmlDocument solution. Some of the text sits directly between tags rather than inside one, and XmlDocument's ChildNodes returns those text nodes along with the elements.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication1
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.csv";
        static void Main(string[] args)
        {
            string input =
                "<XML><DSTATUS>1,4,7,,5</DSTATUS><EVENT> hello,there,my,name,is,jack,</EVENT>" +
                "last,name,missing,above <ANOTHERTAG>3,6,7,,8,4</ANOTHERTAG> </XML>" +
                "<XML><DSTATUS>1,5,7,,3</DSTATUS><EVENT>hello,there,my,name,is,mary,jane</EVENT>" +
                "last,name,not,missing,above<ANOTHERTAG>3,6,7,,8,4</ANOTHERTAG></XML>";
            // Wrap the fragments in a single root so they form a well-formed document
            input = "<Root>" + input + "</Root>";
            XmlDocument doc = new XmlDocument();
            doc.LoadXml(input);
            using (StreamWriter writer = new StreamWriter(FILENAME))
            {
                XmlNodeList rows = doc.GetElementsByTagName("XML");
                foreach (XmlNode row in rows)
                {
                    List<string> children = new List<string>();
                    // ChildNodes includes both the elements and the loose text nodes
                    foreach (XmlNode child in row.ChildNodes)
                    {
                        children.Add(child.InnerText.Trim());
                    }
                    writer.WriteLine(string.Join(",", children.ToArray()));
                }
            }
        }
    }
}

Here is my solution that uses LINQ to XML. I create an XDocument by wrapping the fragments in a Root tag.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication1
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.csv";
        static void Main(string[] args)
        {
            string input =
                "<XML><DSTATUS>1,4,7,,5</DSTATUS><EVENT> hello,there,my,name,is,jack,</EVENT>" +
                "last,name,missing,above <ANOTHERTAG>3,6,7,,8,4</ANOTHERTAG> </XML>" +
                "<XML><DSTATUS>1,5,7,,3</DSTATUS><EVENT>hello,there,my,name,is,mary,jane</EVENT>" +
                "last,name,not,missing,above<ANOTHERTAG>3,6,7,,8,4</ANOTHERTAG></XML>";
            input = "<Root>" + input + "</Root>";
            XDocument doc = XDocument.Parse(input);
            using (StreamWriter writer = new StreamWriter(FILENAME))
            {
                List<XElement> rows = doc.Descendants("XML").ToList();
                foreach (XElement row in rows)
                {
                    // Note: Elements() returns child elements only, so text that sits
                    // directly under <XML> (outside any tag) is not included here
                    string[] elements = row.Elements().Select(x => x.Value).ToArray();
                    writer.WriteLine(string.Join(",", elements));
                }
            }
        }
    }
}
