I have an XML document which has no root node. It looks like this:
<?xml version="1.0"?>
<Line>
<City>Paris</City>
<Country>France</Country>
</Line>
<Line>
<City>Lissabon</City>
<Country>Spain</Country>
</Line>
Now I want to read it Line by Line and write the contents to a database. However, XmlDocument seems to insist that there must be a root node. How can I process this file?
If you want to parse it as an XML document, you can add a root node like Denis proposed in his comment.
If you would just like to read each line and write it to a database, you can handle the file like an ordinary (text) file and read its contents line by line using a StreamReader.
This would look something like this:
string line;
// Read the file and process it line by line.
// The using block closes the reader even if an exception is thrown.
using (var reader = new StreamReader(FILEPATH))
{
    while ((line = reader.ReadLine()) != null)
    {
        // Depending on what you need, you could strip the XML tags
        // And write the line to the database
    }
}
You could try something like this (simple WinForms app with a button and a rich text box to display output for testing):
using System;
using System.Text;
using System.Xml;
using System.Windows.Forms;
namespace WindowsFormsApp11
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private void button1_Click(object sender, EventArgs e)
{
StringBuilder sb = new StringBuilder();
XmlReaderSettings settings = new XmlReaderSettings
{
ConformanceLevel = ConformanceLevel.Fragment
};
using (XmlReader reader = XmlReader.Create(@"c:\ab\countries.xml", settings))
{
while(reader.Read())
{
if (reader.Name != "Line") // Ignore the <Line> nodes
{
switch (reader.NodeType)
{
case XmlNodeType.Element:
sb.Append(string.Format("{0}:", reader.Name));
break;
case XmlNodeType.Text:
sb.Append(string.Format(" {0}{1}", reader.Value, Environment.NewLine));
break;
}
}
}
}
richTextBox1.Text = sb.ToString();
}
}
}
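If the goal is to write each Line to a database rather than to display it, the same ConformanceLevel.Fragment reader can hand every <Line> element to LINQ to XML. The following is only a minimal sketch, assuming each <Line> has one <City> and one <Country> child. The names LineFragments and ReadLines are made up for illustration, and the actual database insert is left to the caller.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class LineFragments
{
    // Hypothetical helper: streams (City, Country) pairs from a file with no root node.
    public static IEnumerable<(string City, string Country)> ReadLines(TextReader input)
    {
        var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
        using (XmlReader reader = XmlReader.Create(input, settings))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Line")
                {
                    // ReadFrom consumes the whole <Line> element and leaves the
                    // reader on the node that follows it, so no extra Read() here.
                    var line = (XElement)XNode.ReadFrom(reader);
                    yield return ((string)line.Element("City"), (string)line.Element("Country"));
                }
                else
                {
                    // Skip the XML declaration, whitespace and anything else.
                    reader.Read();
                }
            }
        }
    }
}
```

A caller could loop over LineFragments.ReadLines(new StreamReader(path)) and run one INSERT per pair. Only one <Line> is held in memory at a time.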
Maybe not the best solution, but you could read the lines of your XML into a List (or array) and insert the missing root tags:
// Read lines into List
var list = File.ReadLines("doc.xml").ToList();
// Insert missing nodes
list.Insert(1, "<root>"); // Use 1, because 0 is the XML declaration
list.Insert(list.Count, "</root>"); //Add closing tag to the end
// Create final XML string with LINQ
var xml_str = list.Aggregate("", (acc, s) => acc + s);
// Having a string, we can create, for instance, XElement (or XDocument)
var xml = XElement.Parse(xml_str);
Console.WriteLine(xml.Element("Line").Element("City").Value);
//Output: Paris
I have an XML document which basically looks like this:
<ArrayOfAspect xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<Aspect i:type="TransactionAspect">
...
</Aspect>
<Aspect i:type="TransactionAspect">
...
</Aspect>
</ArrayOfAspect>
And I want to append a new Aspect to this list.
To do so, I load this XML from a file, create an XmlDocumentFragment, and load the new Aspect from a file (which is basically a template I fill with data). Then I fill the document fragment with the new aspect and append it as a child.
But when I try to set the xml of this fragment it fails because the prefix i is not defined.
// Load all aspects
var aspectsXml = new XmlDocument();
aspectsXml.Load("aspects.xml");
// Create and fill the fragment
var fragment = aspectsXml.CreateDocumentFragment();
fragment.InnerXml = _templateIFilledWithData; // This fails because i is not defined
// Add the new child
aspectsXml.DocumentElement.AppendChild(fragment);
This is what the template looks like:
<Aspect i:type="TransactionAspect">
<Value>$VALUES_PLACEHOLDER$</Value>
...
</Aspect>
Note that I don't want to create POCOs for this and serialize them, since the aspects are actually quite big and nested, and I have the same problem with some other XML files as well.
EDIT:
jdweng suggested using LINQ to XML (which is much better than what I used before, so thanks). Here is the code I tried with it (still failing because of the undeclared prefix):
var aspects = XDocument.Load("aspects.xml");
var newAspect = XElement.Parse(_templateIFilledWithData); // Fails here - Undeclared prefix 'i'
aspects.Root.Add(newAspect);
Use LINQ to XML:
using System.Collections.ObjectModel;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication57
{
class Program
{
const string URL = "http://goalserve.com/samples/soccer_inplay.xml";
static void Main(string[] args)
{
string xml =
"<ArrayOfAspect xmlns:i=\"http://www.w3.org/2001/XMLSchema-instance\">" +
"<Aspect i:type=\"TransactionAspect\">" +
"</Aspect>" +
"<Aspect i:type=\"TransactionAspect\">" +
"</Aspect>" +
"</ArrayOfAspect>";
XDocument doc = XDocument.Parse(xml);
XElement root = doc.Root;
XNamespace nsI = root.GetNamespaceOfPrefix("i");
root.Add(new XElement("Aspect", new object[] {
new XAttribute(nsI + "type", "TransactionAspect"),
new XElement("Value", "$VALUES_PLACEHOLDER$")
}));
}
}
}
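If you would rather keep the text template than build the element in code, one option is to tell the parser about the i prefix before it reads the template. The sketch below does that with an XmlNamespaceManager passed in through an XmlParserContext. TemplateParser and ParseWithPrefix are hypothetical names, and the sketch assumes the filled template contains exactly one root element.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class TemplateParser
{
    // Hypothetical helper: parses a fragment that uses a prefix it does not declare itself.
    public static XElement ParseWithPrefix(string fragment, string prefix, string namespaceUri)
    {
        // Declare the prefix up front so the reader can resolve "i:type".
        var namespaces = new XmlNamespaceManager(new NameTable());
        namespaces.AddNamespace(prefix, namespaceUri);
        var context = new XmlParserContext(null, namespaces, null, XmlSpace.None);

        using (var reader = XmlReader.Create(new StringReader(fragment), new XmlReaderSettings(), context))
        {
            return XElement.Load(reader);
        }
    }
}
```

With that in place, something like aspects.Root.Add(TemplateParser.ParseWithPrefix(_templateIFilledWithData, "i", "http://www.w3.org/2001/XMLSchema-instance")) should append the new Aspect. Because the root element already declares xmlns:i, the attribute is expected to be written with the i prefix when the document is saved.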
I have a huge chunk of XML data that I need to "clean". The XML looks something like this:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
<w:body>
<w:p>
<w:t>F_ck</w:t>
<!-- -->
<w:t>F_ck</w:t>
<!-- -->
<w:t>F_ck</w:t>
</w:p>
</w:body>
</w:document>
I would like to identify the <w:t>-elements with the value "F_ck" and replace the value with something else. The elements I need to clean will be scattered throughout the document.
I need the code to run as fast as possible and with a memory footprint as small as possible, so I am reluctant to use the XDocument (DOM) approaches I have found here and elsewhere.
The data is given to me as a stream containing the Xml data, and my gut feeling tells me that I need the XmlTextReader and the XmlTextWriter.
My original idea was to do a SAX-mode, forward-only run through the Xml data and "pipe" it over to the XmlTextWriter, but I cannot find an intelligent way to do so.
I wrote this code:
var reader = new StringReader(content);
var xmltextReader = new XmlTextReader(reader);
var memStream = new MemoryStream();
var xmlWriter = new XmlTextWriter(memStream, Encoding.UTF8);
while (xmltextReader.Read())
{
if (xmltextReader.Name == "w:t")
{
//xmlWriter.WriteRaw("blah");
}
else
{
xmlWriter.WriteRaw(xmltextReader.Value);
}
}
The code above only writes the Value of each node (the declaration, text and so on), so the output contains no tags or angle brackets. I realize that I could write code that calls WriteStartElement(), WriteEndElement() etc. depending on the NodeType, but I fear that would quickly become a mess.
So the question is:
How do I - in a nice way - pipe the xml data read from the XmlTextReader to the XmlTextWriter while still being able to manipulate the data while piping?
Try this:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string xml =
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\"?>" +
"<w:document xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\">" +
"<w:body>" +
"<w:p>" +
"<w:t>F_ck</w:t>" +
"<!-- -->" +
"<w:t>F_ck</w:t>" +
"<!-- -->" +
"<w:t>F_ck</w:t>" +
"</w:p>" +
"</w:body>" +
"</w:document>";
XDocument doc = XDocument.Parse(xml);
XElement document = (XElement)doc.FirstNode;
XNamespace ns_w = document.GetNamespaceOfPrefix("w");
List<XElement> ts = doc.Descendants(ns_w + "t").ToList();
foreach (XElement t in ts)
{
t.Value = "abc";
}
}
}
}
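If the document is too large to load into an XDocument as above, a forward-only copy from XmlReader to XmlWriter is an alternative. The sketch below copies one node at a time and swaps the text of matching elements on the way through. XmlPipe and ReplaceText are hypothetical names, and the sketch deliberately ignores DOCTYPE declarations and entity references, which a real WordprocessingML part is assumed not to contain.

```csharp
using System.IO;
using System.Xml;

public static class XmlPipe
{
    // Hypothetical helper: streams input to output, replacing oldValue with newValue
    // in the text of elements named localName in namespaceUri.
    public static void ReplaceText(TextReader input, TextWriter output,
        string localName, string namespaceUri, string oldValue, string newValue)
    {
        using (XmlReader reader = XmlReader.Create(input))
        using (XmlWriter writer = XmlWriter.Create(output))
        {
            bool inTarget = false; // true while directly inside a matching element
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        // Capture this before WriteAttributes moves the reader around.
                        bool isEmpty = reader.IsEmptyElement;
                        inTarget = !isEmpty && reader.LocalName == localName
                                            && reader.NamespaceURI == namespaceUri;
                        writer.WriteStartElement(reader.Prefix, reader.LocalName, reader.NamespaceURI);
                        writer.WriteAttributes(reader, false);
                        if (isEmpty) writer.WriteEndElement();
                        break;
                    case XmlNodeType.EndElement:
                        inTarget = false;
                        writer.WriteFullEndElement();
                        break;
                    case XmlNodeType.Text:
                        writer.WriteString(inTarget && reader.Value == oldValue ? newValue : reader.Value);
                        break;
                    case XmlNodeType.CDATA:
                        writer.WriteCData(reader.Value);
                        break;
                    case XmlNodeType.Whitespace:
                    case XmlNodeType.SignificantWhitespace:
                        writer.WriteWhitespace(reader.Value);
                        break;
                    case XmlNodeType.Comment:
                        writer.WriteComment(reader.Value);
                        break;
                    case XmlNodeType.XmlDeclaration:
                    case XmlNodeType.ProcessingInstruction:
                        writer.WriteProcessingInstruction(reader.Name, reader.Value);
                        break;
                }
            }
        }
    }
}
```

For the document in the question, the call would pass "t" as the local name and the WordprocessingML namespace URI. Matching on local name and namespace rather than on the string "w:t" keeps the code working if the producer picks a different prefix.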
I want to read an XML document from a property that is created in edit mode of Episerver.
I have made a property of type 'URL to Document'.
When I try to fetch it from the code-behind, I get only the file path. I am not able to read the content of the XML file that is uploaded to the property.
string XMLContent = Currentpage.Getproperty<string>("XMLFile");
Can anyone help out on this?
You need to load the file as well. Something like this:
var path = CurrentPage["XMLFile"] as string;
if (HostingEnvironment.VirtualPathProvider.FileExists(path))
{
var file = HostingEnvironment.VirtualPathProvider.GetFile(path) as UnifiedFile;
if (file != null)
{
using (var stream = file.Open())
{
// Here is your XML document
var xml = XDocument.Load(stream);
}
}
}
You can also load the file content by using the local path on disk, file.LocalPath.
Try this:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string XMLContent = ""; // Placeholder: must hold the XML text itself, not the file path
//using XML
XmlDocument doc1 = new XmlDocument();
doc1.LoadXml(XMLContent);
//using xml linq
XDocument doc2 = XDocument.Parse(XMLContent);
}
}
}
So I have a device which has an inbuilt logger program which generates status messages about the device and keeps pushing them to a .txt file. These messages include information about the device status, network status amongst many other things. The data in the file looks something like the following:
<XML><DSTATUS>1,4,7,,5</DSTATUS><EVENT> hello,there,my,name,is,jack,</EVENT>
last,name,missing,above <ANOTHERTAG>3,6,7,,8,4</ANOTHERTAG> </XML>
<XML><DSTATUS>1,5,7,,3</DSTATUS><EVENT>hello,there,my,name,is,mary,jane</EVENT>
last,name,not,missing,above<ANOTHERTAG>3,6,7,,8,4</ANOTHERTAG></XML>
... goes on
Note that it is not well formed XML. Also, one element can have multiple parameters and can also have blanks... for example: <NETWORKSTAT>1,456,3,6,,7</NETWORKSTAT>
My objective is to write something in C# WPF that takes this text file, processes the data in it, and creates a .csv file with one event per line.
For example, for the above given brief example, the first line in the csv file would be:
1,4,7,,5,hello,there,my,name,is,jack,,last,name,missing,above,3,6,7,,8,4
Also, I do not need help with basic C#. I know how to read a file, etc., but I have no clue how to approach the parsing, processing and converting. I'm fairly new to C#, so I'm not sure which direction to go. Any help will be appreciated!
Since each top-level XML node in your file is well-formed, you can use an XmlReader with XmlReaderSettings.ConformanceLevel = ConformanceLevel.Fragment to iterate through each top-level node in the file and read it with LINQ to XML:
public static IEnumerable<string> XmlFragmentsToCSV(string path)
{
using (var textReader = new StreamReader(path, Encoding.UTF8))
foreach (var line in XmlFragmentsToCSV(textReader))
yield return line;
}
public static IEnumerable<string> XmlFragmentsToCSV(TextReader textReader)
{
XmlReaderSettings settings = new XmlReaderSettings();
settings.ConformanceLevel = ConformanceLevel.Fragment;
using (XmlReader reader = XmlReader.Create(textReader, settings))
{
while (reader.Read())
{ // Skip whitespace
if (reader.NodeType == XmlNodeType.Element)
{
using (var subReader = reader.ReadSubtree())
{
var element = XElement.Load(subReader);
yield return string.Join(",", element.DescendantNodes().OfType<XText>().Select(n => n.Value.Trim()).Where(t => !string.IsNullOrEmpty(t)).ToArray());
}
}
}
}
}
To precisely match the output you wanted, I had to trim whitespace at the beginning and end of each text node value.
Also, the Where(t => !string.IsNullOrEmpty(t)) clause is to skip the whitespace node corresponding to the space here: </ANOTHERTAG> </XML>. If that space doesn't exist in the real file, you can omit that clause.
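Because of the non-standard format, I had to switch from my LINQ to XML solution to an XmlDocument solution. In LINQ to XML, Elements() returns only child elements and skips text that is not wrapped in its own tag, whereas XmlNode.ChildNodes includes that text.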
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication1
{
class Program
{
const string FILENAME = @"c:\temp\test.csv";
static void Main(string[] args)
{
string input =
"<XML><DSTATUS>1,4,7,,5</DSTATUS><EVENT> hello,there,my,name,is,jack,</EVENT>" +
"last,name,missing,above <ANOTHERTAG>3,6,7,,8,4</ANOTHERTAG> </XML>" +
"<XML><DSTATUS>1,5,7,,3</DSTATUS><EVENT>hello,there,my,name,is,mary,jane</EVENT>" +
"last,name,not,missing,above<ANOTHERTAG>3,6,7,,8,4</ANOTHERTAG></XML>";
input = "<Root>" + input + "</Root>";
XmlDocument doc = new XmlDocument();
doc.LoadXml(input);
StreamWriter writer = new StreamWriter(FILENAME);
XmlNodeList rows = doc.GetElementsByTagName("XML");
foreach (XmlNode row in rows)
{
List<string> children = new List<string>();
foreach (XmlNode child in row.ChildNodes)
{
children.Add(child.InnerText.Trim());
}
writer.WriteLine(string.Join(",", children.ToArray()));
}
writer.Flush();
writer.Close();
}
}
}
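For comparison, LINQ to XML does expose the loose text between tags, just not through Elements(): it appears as XText nodes when you walk Nodes(). Here is a minimal sketch of building one CSV line that way. MixedContent and ToCsvLine are hypothetical names, and the sketch assumes the row was parsed without LoadOptions.PreserveWhitespace.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class MixedContent
{
    // Hypothetical helper: joins child element values and loose text nodes, in document order.
    public static string ToCsvLine(XElement row)
    {
        var parts = row.Nodes()
            // Child elements contribute their value, loose text contributes itself,
            // anything else (comments and so on) is dropped.
            .Select(n => n is XElement e ? e.Value : (n as XText)?.Value)
            .Where(v => v != null)
            .Select(v => v.Trim())
            .Where(v => v.Length > 0);
        return string.Join(",", parts);
    }
}
```

Calling MixedContent.ToCsvLine on each element returned by doc.Descendants("XML") should give one line per event, including the text that sits between </EVENT> and <ANOTHERTAG>.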
Here is my solution that uses LINQ to XML. I create an XDocument by wrapping the fragments in a Root tag.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication1
{
class Program
{
const string FILENAME = @"c:\temp\test.csv";
static void Main(string[] args)
{
string input =
"<XML><DSTATUS>1,4,7,,5</DSTATUS><EVENT> hello,there,my,name,is,jack,</EVENT>" +
"last,name,missing,above <ANOTHERTAG>3,6,7,,8,4</ANOTHERTAG> </XML>" +
"<XML><DSTATUS>1,5,7,,3</DSTATUS><EVENT>hello,there,my,name,is,mary,jane</EVENT>" +
"last,name,not,missing,above<ANOTHERTAG>3,6,7,,8,4</ANOTHERTAG></XML>";
input = "<Root>" + input + "</Root>";
XDocument doc = XDocument.Parse(input);
StreamWriter writer = new StreamWriter(FILENAME);
List<XElement> rows = doc.Descendants("XML").ToList();
foreach (XElement row in rows)
{
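// Collect text nodes rather than Elements(), so the loose text between tags is not lost
string[] elements = row.DescendantNodes().OfType<XText>().Select(x => x.Value.Trim()).ToArray();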
writer.WriteLine(string.Join(",", elements));
}
writer.Flush();
writer.Close();
}
}
}