How to parse a xml document without a root node? - c#

I have an xml document which has no root node. It looks like this:
<?xml version="1.0"?>
<Line>
<City>Paris</City>
<Country>France</Country>
</Line>
<Line>
<City>Lissabon</City>
<Country>Spain</Country>
</Line>
No I want to read Line by Line and write the contents to a database. However, XmlDocument seems to insist that there must exist a root node. How can I process this file?

If you want to parse it as an XML document, you can add a root node like Denis proposed in his comment.
If you would just like to read each line and write it to a database, you can handle the file like an ordinary (text) file and read its contents line by line using a StreamReader.
This would look something like this:
string line;
// Read the file and process it line by line.
var reader = new StreamReader(FILEPATH);
while((line = reader.ReadLine()) != null)
{
// Depending on what you need, you could strip the XML tags
// And write the line to the database
}
reader.Close();

You could try something like this (simple WinForms app with a button and a rich text box to display output for testing):
using System;
using System.Text;
using System.Xml;
using System.Windows.Forms;
namespace WindowsFormsApp11
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private void button1_Click(object sender, EventArgs e)
{
StringBuilder sb = new StringBuilder();
XmlReaderSettings settings = new XmlReaderSettings
{
ConformanceLevel = ConformanceLevel.Fragment
};
using (XmlReader reader = XmlReader.Create(#"c:\ab\countries.xml", settings))
{
while(reader.Read())
{
if (reader.Name != "Line") // Ignore the <Line> nodes
{
switch (reader.NodeType)
{
case XmlNodeType.Element:
sb.Append(string.Format("{0}:", reader.Name));
break;
case XmlNodeType.Text:
sb.Append(string.Format(" {0}{1}", reader.Value, Environment.NewLine));
break;
}
}
}
}
richTextBox1.Text = sb.ToString();
}
}
}

May be not the best solution, but you could create a List (or array) from your XML and insert missing nodes:
// Read lines into List
var list = File.ReadLines("doc.xml").ToList();
// Insert missing nodes
list.Insert(1, "<root>"); // Use 1, because 0 is XML directive
list.Insert(list.Count, "</root>"); //Add closing tag to the end
// Create final XML string with LINQ
var xml_str = list.Aggregate("", (acc, s) => acc + s);
// Having a string, we can create, for instance, XElement (or XDocument)
var xml = XElement.Parse(xml_str);
Console.WriteLine(xml.Element("Line").Element("City").Value);
//Output: Paris

Related

Reading single element innertext from large xml file

I have a large single node .xml file that I have saved as a string. I want to parse the .xml file for a specific element read and output the innertext. EG: I want to read the FrameNo element and output BINGO to a messagebox. The desired element will only appear once in the .xml document. I prefer using XmlDocument.
I have tried numerous C# .xml examples but am unable to get a output.
xml text is
<Aircraft z:Id="i1" xmlns="http://xxx.yyyyycontract.gov/2018/03/Boeing.xxxxxxxxxxxxxx.Airframe"
xmlns:i="http://www.xxxxxxx.com/2019/XMLSchema-instance"
xmlns:z="http://xxxxxxx.xxxxxxxxx.com/2005/01/Serialization/"><Timestamp i:nil="true"/>
<Uuid>00000000-0000-0000-0000-000000000000</Uuid><Comments i:nil="true"/><Facility>..........
and so on to the end of the .xml
<FrameNo>BINGO</FrameNo><WDate i:nil="true"/></Aircraft>
this is the code section I want to have the code execute in.
private void buttonLoad_Click(object sender, EventArgs e)
{
}
I think, this is self-explanatory
using System.Xml.Linq;
XElement root = XElement.Load(textXML);
XElement myElement = root.Element("FrameNo");
if (myElement != null)
myData = myElement.InnerText;
Thanks to jdweng I wanted to share the final code for others to use. This will function in a method like below
private void buttonMaint_Click(object sender, EventArgs e)
{
XDocument doc = XDocument.Parse(xmlinputstr); // input string from memory or input file
XNamespace ns = doc.Root.GetDefaultNamespace();
string[] Frame = doc.Descendants(ns + "FrameNo").Select(x => (string)x).ToArray(); // selects element to read + trailing character of >
string frame = string.Join("", Frame); //converts from array to string
if (string.IsNullOrEmpty(frame)) // check for empty result
{
txtFrame.Text = "not found"; //outputs to textbox
}
else
{
txtFrame.Text = (frame); //outputs to textbox
}
}
Comments are there for clarity
You need to use the default namespace. See my xml linq solution below :
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
using System.IO;
namespace ConsoleApplication1
{
class Program
{
const string FILENAME = #"c:\temp\test.xml";
static void Main(string[] args)
{
string xml = File.ReadAllText(FILENAME);
XDocument doc = XDocument.Parse(xml);
XNamespace ns = doc.Root.GetDefaultNamespace();
XElement frameNo = doc.Descendants(ns + "FrameNo").FirstOrDefault();
string frame = (string)frameNo;
string[] serialNumbers = doc.Descendants(ns + "SerialNumber").Select(x => (string)x).ToArray();
}
}
}
Another weird snag has shown up. Some of the elements are named like this.
<a:SupplierServDoc>
the innertext contents of this element is a base64 packet. There is no problem processing the base64 packet.
The code from the above answers does output the base64 correctly but cannot handle the : in the element name. It throws a 3A hex character error.
I have this code that outputs the inntertext but not as a base64 packet. I have also looked into prefix to handle the : but with worse results. I am outputting the base 64 innertext as a .txt file when finished.
XNamespace ad = http://www.mmmmmmmmmm.com";
XName k = ad + "SupplierServDoc";
string[] WING = doc.Descendants(k).Select(x => (string)x).ToArray();
string wing = string.Join("", WING);
if (string.IsNullOrEmpty(syncd))
{
MessageBox.Show("a:SupplierServDoc Base 64 code not found");
}
else
{
MessageBox.Show("Test " + wing);
}

XML like data to CSV Conversion

So I have a device which has an inbuilt logger program which generates status messages about the device and keeps pushing them to a .txt file. These messages include information about the device status, network status amongst many other things. The data in the file looks something like the following:
<XML><DSTATUS>1,4,7,,5</DSTATUS><EVENT> hello,there,my,name,is,jack,</EVENT>
last,name,missing,above <ANOTHERTAG>3,6,7,,8,4</ANOTHERTAG> </XML>
<XML><DSTATUS>1,5,7,,3</DSTATUS><EVENT>hello,there,my,name,is,mary,jane</EVENT>
last,name,not,missing,above<ANOTHERTAG>3,6,7,,8,4</ANOTHERTAG></XML>
... goes on
Note that it is not well formed XML. Also, one element can have multiple parameters and can also have blanks... for example: <NETWORKSTAT>1,456,3,6,,7</NETWORKSTAT>
What my objective is is to write something in C# WPF, that would take this text file, process the data in it and create a .csv file with each event per line.
For example, for the above given brief example, the first line in the csv file would be:
1,4,7,,5,hello,there,my,name,is,jack,,last,name,missing,above,3,6,7,,8,4
Also, I do not need help using basic C#. I know how to read a file, etc.. but I have no clue as to how I would approach this problem in regards to the parsing and processing and converting. I'm fairly new to C# so I'm not sure which direction to go. Any help will be appreciated!
Since each top-level XML node in your file is well-formed, you can use an XmlReader with XmlReaderSettings.ConformanceLevel = ConformanceLevel.Fragment to iterate through each top-level node in the file and read it with Linq-to-XML:
public static IEnumerable<string> XmlFragmentsToCSV(string path)
{
using (var textReader = new StreamReader(path, Encoding.UTF8))
foreach (var line in XmlFragmentsToCSV(textReader))
yield return line;
}
public static IEnumerable<string> XmlFragmentsToCSV(TextReader textReader)
{
XmlReaderSettings settings = new XmlReaderSettings();
settings.ConformanceLevel = ConformanceLevel.Fragment;
using (XmlReader reader = XmlReader.Create(textReader, settings))
{
while (reader.Read())
{ // Skip whitespace
if (reader.NodeType == XmlNodeType.Element)
{
using (var subReader = reader.ReadSubtree())
{
var element = XElement.Load(subReader);
yield return string.Join(",", element.DescendantNodes().OfType<XText>().Select(n => n.Value.Trim()).Where(t => !string.IsNullOrEmpty(t)).ToArray());
}
}
}
}
}
To precisely match the output you wanted I had to trim whitespaces at the beginning and end of each text node value.
Also, the Where(t => !string.IsNullOrEmpty(t)) clause is to skip the whitespace node corresponding to the space here: </ANOTHERTAG> </XML>. If that space doesn't exist in the real file, you can omit that clause.
Due to non standard format had to switch from an XML Linq solution to a standard XML solution. Linq doesn't support TEXT strings that are not in tags.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication1
{
class Program
{
const string FILENAME = #"c:\temp\test.csv";
static void Main(string[] args)
{
string input =
"<XML><DSTATUS>1,4,7,,5</DSTATUS><EVENT> hello,there,my,name,is,jack,</EVENT>" +
"last,name,missing,above <ANOTHERTAG>3,6,7,,8,4</ANOTHERTAG> </XML>" +
"<XML><DSTATUS>1,5,7,,3</DSTATUS><EVENT>hello,there,my,name,is,mary,jane</EVENT>" +
"last,name,not,missing,above<ANOTHERTAG>3,6,7,,8,4</ANOTHERTAG></XML>";
input = "<Root>" + input + "</Root>";
XmlDocument doc = new XmlDocument();
doc.LoadXml(input);
StreamWriter writer = new StreamWriter(FILENAME);
XmlNodeList rows = doc.GetElementsByTagName("XML");
foreach (XmlNode row in rows)
{
List<string> children = new List<string>();
foreach (XmlNode child in row.ChildNodes)
{
children.Add(child.InnerText.Trim());
}
writer.WriteLine(string.Join(",", children.ToArray()));
}
writer.Flush();
writer.Close();
}
}
}
​
Here is my solution that uses XML Linq. I create a XDocument by wrapping the fragments with a Root tag.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication1
{
class Program
{
const string FILENAME = #"c:\temp\test.csv";
static void Main(string[] args)
{
string input =
"<XML><DSTATUS>1,4,7,,5</DSTATUS><EVENT> hello,there,my,name,is,jack,</EVENT>" +
"last,name,missing,above <ANOTHERTAG>3,6,7,,8,4</ANOTHERTAG> </XML>" +
"<XML><DSTATUS>1,5,7,,3</DSTATUS><EVENT>hello,there,my,name,is,mary,jane</EVENT>" +
"last,name,not,missing,above<ANOTHERTAG>3,6,7,,8,4</ANOTHERTAG></XML>";
input = "<Root>" + input + "</Root>";
XDocument doc = XDocument.Parse(input);
StreamWriter writer = new StreamWriter(FILENAME);
List<XElement> rows = doc.Descendants("XML").ToList();
foreach (XElement row in rows)
{
string[] elements = row.Elements().Select(x => x.Value).ToArray();
writer.WriteLine(string.Join(",", elements));
}
writer.Flush();
writer.Close();
}
}
}
​

Xml gets corrupted each time I append a node

I have an Xml file as:
<?xml version="1.0"?>
<hashnotes>
<hashtags>
<hashtag>#birthday</hashtag>
<hashtag>#meeting</hashtag>
<hashtag>#anniversary</hashtag>
</hashtags>
<lastid>0</lastid>
<Settings>
<Font>Arial</Font>
<HashtagColor>red</HashtagColor>
<passwordset>0</passwordset>
<password></password>
</Settings>
</hashnotes>
I then call a function to add a node in the xml,
The function is :
public static void CreateNoteNodeInXDocument(XDocument argXmlDoc, string argNoteText)
{
string lastId=((Convert.ToInt32(argXmlDoc.Root.Element("lastid").Value)) +1).ToString();
string date = DateTime.Now.ToString("MM/dd/yyyy");
argXmlDoc.Element("hashnotes").Add(new XElement("Note", new XAttribute("ID", lastId), new XAttribute("Date",date),new XElement("Text", argNoteText)));
//argXmlDoc.Root.Note.Add new XElement("Text", argNoteText)
List<string> hashtagList = Utilities.GetHashtagsFromText(argNoteText);
XElement reqNoteElement = (from xml2 in argXmlDoc.Descendants("Note")
where xml2.Attribute("ID").Value == lastId
select xml2).FirstOrDefault();
if (reqNoteElement != null)
{
foreach (string hashTag in hashtagList)
{
reqNoteElement.Add(new XElement("hashtag", hashTag));
}
}
argXmlDoc.Root.Element("lastid").Value = lastId;
}
After this I save the xml.
Next time when I try to load the Xml, it fails with an exception:
System.Xml.XmlException: Unexpected XML declaration. The XML declaration must be the first node in the document, and no white space characters are allowed to appear before it.
Here is the code to load the XML:
private static XDocument hashNotesXDocument;
private static Stream hashNotesStream;
StorageFile hashNoteXml = await InstallationFolder.GetFileAsync("hashnotes.xml");
hashNotesStream = await hashNoteXml.OpenStreamForWriteAsync();
hashNotesXDocument = XDocument.Load(hashNotesStream);
and I save it using:
hashNotesXDocument.Save(hashNotesStream);
You don't show all of your code, but it looks like you open the XML file, read the XML from it into an XDocument, edit the XDocument in memory, then write back to the opened stream. Since the stream is still open it will be positioned at the end of the file and thus the new XML will be appended to the file.
Suggest eliminating hashNotesXDocument and hashNotesStream as static variables, and instead open and read the file, modify the XDocument, then open and write the file using the pattern shown here.
I'm working only on desktop code (using an older version of .Net) so I can't test this, but something like the following should work:
static async Task LoadUpdateAndSaveXml(Action<XDocument> editor)
{
XDocument doc;
var xmlFile = await InstallationFolder.GetFileAsync("hashnotes.xml");
using (var reader = new StreamReader(await xmlFile.OpenStreamForReadAsync()))
{
doc = XDocument.Load(reader);
}
if (doc != null)
{
editor(doc);
using (var writer = new StreamWriter(await xmlFile.OpenStreamForWriteAsync()))
{
// Truncate - https://stackoverflow.com/questions/13454584/writing-a-shorter-stream-to-a-storagefile
if (writer.CanSeek && writer.Length > 0)
writer.SetLength(0);
doc.Save(writer);
}
}
}
Also, be sure to create the file before using it.

C# - From XML to Database

I got an XML file which can have several nodes, child nodes, "child child nodes", ... and I'd like to figure out how to get these data in order to store them into my own SQL Server database.
I've read some tutos on internet and also tried some things. At the current moment, I'm able to open and read the file but not to retrieve data. Here's what I'm doing for instance :
class Program
{
static void Main(string[] args)
{
Person p = new Person();
string filePath = #"C:\Users\Desktop\ConsoleApplication1\XmlPersonTest.xml";
XmlDocument xmlDoc = new XmlDocument();
if(File.Exists(filePath))
{
xmlDoc.Load(filePath);
XmlElement elm = xmlDoc.DocumentElement;
XmlNodeList list = elm.ChildNodes;
Console.WriteLine("The root element contains {0} nodes",
list.Count);
}
else
{
Console.WriteLine("The file {0} could not be located",
filePath);
}
Console.Read();
}
}
And here's a small example of what my XML file looks like :
<person>
<name>McMannus</name>
<firstname>Fionn</firstname>
<age>21</age>
<nationality>Belge</nationality>
<car>
<mark>Audi</mark>
<model>A1</model>
<year>2013</year>
<hp>70</hp>
</car>
<car>
<mark>VW</mark>
<model>Golf 7</model>
<year>2014</year>
<hp>99</hp>
</car>
<car>
<mark>BMW</mark>
<model>Série 1</model>
<year>2013</year>
<hp>80</hp>
</car>
</person>
Any advice or tuto to do that guys?
I have made a little method for navigating through xml nodes, using XElement (Linq.Xml):
public string Get(XElement root, string path)
{
if (root== null)
return null;
string[] p = path.Split(new string[] { "/" }, StringSplitOptions.RemoveEmptyEntries);
XElement at = root;
foreach (string n in p)
{
at = at.Element(n);
if (at == null)
return null;
}
return at.Value;
}
Using this, you can get the value of an XElement node via Get(root, "rootNode/nodeA/nodeAChild/etc")
Well, having gone through something similar the other day. You should try the following, initially build a model:
Open your XML Document.
Copy your entire XML Document.
Open Visual Studio.
Click in an area out of your initial class (1b diagram)
Go to Edit in Visual Studio
Paste Special - Paste as XML Classes
1b:
namespace APICore
{
public class APIParser()
{
// Parse logic would go here.
}
// You would click here.
}
When you do that you'll end up with a valid XML Model, which can be accessed through your parser, how you choose to access the XML Web or Local will be up to you. For simplicity I'm going to choose a file:
public class APIParser(string file)
{
// Person should be Xml Root Element Class.
XmlSerializer serialize = new XmlSerializer(typeof(Person));
using(FileStream stream = new FileStream(file, FileMode.Open, FileAccess.ReadWrite, FileShare.ReadWrite))
using(XmlReader reader XmlReader.Create(stream))
{
Person model = serialize.Deserialize(reader) as Person;
}
}
So now you've successfully got the data to iterate through, so you can work with your data. Here is an example of how you would:
// Iterates through each Person
foreach(var people in model.Person)
{
var information = people.Cars.SelectMany(obj => new { obj.Mark, obj.model, obj.year, obj.hp }).ToList();
}
You would do something like that, then write to the database. This won't fit your example perfectly but should point you in a strong direction.

XML element change value in C#

the c# code as below:
using (XmlReader Reader = XmlReader.Create(DocumentPath, XmlSettings))
{
while (Reader.Read())
{
switch (Reader.NodeType)
{
case XmlNodeType.Element:
//doing function
break;
case XmlNodeType.Text:
if(Reader.Value.StartsWith("_"))
{
// I want to replace the Reader.Value to new string
}
break;
case XmlNodeType.EndElement:
//doing function
break;
default:
//doing function
break;
}
}
}
I want to set the new value when XmlNodeType = text.
There are 3 steps to this operations:
Load the document as an object model(XML is a hierarchy, if that makes it easier)
Alter the value in the object model
Resave the model as a file
The method of loading is up to you but i would recommend using XDocument and the related Linq-to-XML classes for smaller tasks like this. It is crazy easy as demonstrated in this stack.
Edit - a useful quote for your scenario
XML elements can contain text content. Sometimes the content is simple
(the element only contains text content), and sometimes the content is
mixed (the contents of the element contains both text and other
elements). In either case, each chunk of text is represented as an
XText node.
From MSDN - XText class
Here is a code sample for a console application using the xml from the comments:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Xml.Linq;
namespace TextNodeChange
{
class Program
{
static void Main(string[] args)
{
string input = #"<Info><title>a</title><content>texttexttexttexttext<tag>TAGNAME</tag>texttexttex‌​ttexttext</content>texttexttexttexttext</Info>";
XDocument doc = XDocument.Parse(input);
Console.WriteLine("Input document");
Console.WriteLine(doc);
//get the all of the text nodes in the content element
var textNodes = doc.Element("Info").Element("content").Nodes().OfType<XText>().ToList();
//update the second text node
textNodes[1].Value = "THIS IS AN ALTERED VALUE!!!!";
Console.WriteLine();
Console.WriteLine("Output document");
Console.WriteLine(doc);
Console.ReadLine();
}
}
}
Why are 3 steps required?
Elements of xml are of variable length in a file. Altering a value may change the length of that part of the file and overwrite other parts. Therefore, you have to deserialize the whole document, make the changes and save it again.
You cant replace the reader.value readonly property. You need to use XmlWriter to create your own xml and replace any value you want.

Categories