I have a device with a built-in logger program that generates status messages about the device and keeps appending them to a .txt file. These messages include the device status and network status, among many other things. The data in the file looks something like the following:
<XML><DSTATUS>1,4,7,,5</DSTATUS><EVENT> hello,there,my,name,is,jack,</EVENT>
last,name,missing,above <ANOTHERTAG>3,6,7,,8,4</ANOTHERTAG> </XML>
<XML><DSTATUS>1,5,7,,3</DSTATUS><EVENT>hello,there,my,name,is,mary,jane</EVENT>
last,name,not,missing,above<ANOTHERTAG>3,6,7,,8,4</ANOTHERTAG></XML>
... goes on
Note that it is not well formed XML. Also, one element can have multiple parameters and can also have blanks... for example: <NETWORKSTAT>1,456,3,6,,7</NETWORKSTAT>
My objective is to write something in C# WPF that takes this text file, processes the data in it, and creates a .csv file with one event per line.
For example, for the above given brief example, the first line in the csv file would be:
1,4,7,,5,hello,there,my,name,is,jack,,last,name,missing,above,3,6,7,,8,4
Also, I do not need help with basic C#. I know how to read a file, etc., but I have no clue how to approach the parsing, processing, and converting. I'm fairly new to C#, so I'm not sure which direction to go. Any help will be appreciated!
Since each top-level XML node in your file is well-formed, you can use an XmlReader with XmlReaderSettings.ConformanceLevel = ConformanceLevel.Fragment to iterate through each top-level node in the file and read it with Linq-to-XML:
public static IEnumerable<string> XmlFragmentsToCSV(string path)
{
using (var textReader = new StreamReader(path, Encoding.UTF8))
foreach (var line in XmlFragmentsToCSV(textReader))
yield return line;
}
public static IEnumerable<string> XmlFragmentsToCSV(TextReader textReader)
{
XmlReaderSettings settings = new XmlReaderSettings();
settings.ConformanceLevel = ConformanceLevel.Fragment;
using (XmlReader reader = XmlReader.Create(textReader, settings))
{
while (reader.Read())
{ // Skip whitespace
if (reader.NodeType == XmlNodeType.Element)
{
using (var subReader = reader.ReadSubtree())
{
var element = XElement.Load(subReader);
yield return string.Join(",", element.DescendantNodes().OfType<XText>().Select(n => n.Value.Trim()).Where(t => !string.IsNullOrEmpty(t)).ToArray());
}
}
}
}
}
To precisely match the output you wanted, I had to trim whitespace at the beginning and end of each text node value.
Also, the Where(t => !string.IsNullOrEmpty(t)) clause is to skip the whitespace node corresponding to the space here: </ANOTHERTAG> </XML>. If that space doesn't exist in the real file, you can omit that clause.
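As a quick check of the approach above, the following sketch runs the same fragment-reading loop over an in-memory string instead of a file. The class name FragmentDemo and the method name ToCsvLines are made up for illustration; everything else is standard System.Xml and System.Xml.Linq.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class FragmentDemo
{
    // Hypothetical helper: returns one CSV line per top-level element in the fragment stream.
    public static List<string> ToCsvLines(string fragments)
    {
        var lines = new List<string>();
        var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
        using (var reader = XmlReader.Create(new StringReader(fragments), settings))
        {
            while (reader.Read())
            {
                if (reader.NodeType != XmlNodeType.Element)
                    continue; // skip whitespace between records

                using (var subReader = reader.ReadSubtree())
                {
                    var element = XElement.Load(subReader);
                    // Collect every text node, trimmed, dropping whitespace-only nodes
                    var fields = element.DescendantNodes()
                        .OfType<XText>()
                        .Select(t => t.Value.Trim())
                        .Where(v => !string.IsNullOrEmpty(v));
                    lines.Add(string.Join(",", fields));
                }
            }
        }
        return lines;
    }
}
```

Passing the two sample records from the question to FragmentDemo.ToCsvLines should give two lines, the first being the line the question asks for. For a real file, the same loop can read from a StreamReader, and the result can be written with File.WriteAllLines.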
Because of the non-standard format, I had to switch from my XML LINQ solution to a standard XmlDocument solution. Elements() in the LINQ version skips text that is not inside a child tag, whereas ChildNodes includes it.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication1
{
class Program
{
const string FILENAME = @"c:\temp\test.csv";
static void Main(string[] args)
{
string input =
"<XML><DSTATUS>1,4,7,,5</DSTATUS><EVENT> hello,there,my,name,is,jack,</EVENT>" +
"last,name,missing,above <ANOTHERTAG>3,6,7,,8,4</ANOTHERTAG> </XML>" +
"<XML><DSTATUS>1,5,7,,3</DSTATUS><EVENT>hello,there,my,name,is,mary,jane</EVENT>" +
"last,name,not,missing,above<ANOTHERTAG>3,6,7,,8,4</ANOTHERTAG></XML>";
input = "<Root>" + input + "</Root>";
XmlDocument doc = new XmlDocument();
doc.LoadXml(input);
StreamWriter writer = new StreamWriter(FILENAME);
XmlNodeList rows = doc.GetElementsByTagName("XML");
foreach (XmlNode row in rows)
{
List<string> children = new List<string>();
foreach (XmlNode child in row.ChildNodes)
{
children.Add(child.InnerText.Trim());
}
writer.WriteLine(string.Join(",", children.ToArray()));
}
writer.Flush();
writer.Close();
}
}
}
Here is my solution that uses XML LINQ. I create an XDocument by wrapping the fragments in a Root tag.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication1
{
class Program
{
const string FILENAME = @"c:\temp\test.csv";
static void Main(string[] args)
{
string input =
"<XML><DSTATUS>1,4,7,,5</DSTATUS><EVENT> hello,there,my,name,is,jack,</EVENT>" +
"last,name,missing,above <ANOTHERTAG>3,6,7,,8,4</ANOTHERTAG> </XML>" +
"<XML><DSTATUS>1,5,7,,3</DSTATUS><EVENT>hello,there,my,name,is,mary,jane</EVENT>" +
"last,name,not,missing,above<ANOTHERTAG>3,6,7,,8,4</ANOTHERTAG></XML>";
input = "<Root>" + input + "</Root>";
XDocument doc = XDocument.Parse(input);
StreamWriter writer = new StreamWriter(FILENAME);
List<XElement> rows = doc.Descendants("XML").ToList();
foreach (XElement row in rows)
{
string[] elements = row.Elements().Select(x => x.Value).ToArray();
writer.WriteLine(string.Join(",", elements));
}
writer.Flush();
writer.Close();
}
}
}
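One caveat with the LINQ version above: Elements() returns only child elements, so text that sits directly inside <XML> (such as last,name,missing,above) does not end up in the row. LINQ to XML does expose that text as XText nodes, though. A minimal sketch of the difference, using a shortened made-up record; the class and method names are hypothetical:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class LooseTextDemo
{
    // Values of child elements only; text directly under the row is skipped.
    public static string ElementsOnly(XElement row)
    {
        return string.Join(",", row.Elements().Select(e => e.Value));
    }

    // Every text node in document order, including text directly under the row.
    public static string AllText(XElement row)
    {
        return string.Join(",", row.DescendantNodes()
            .OfType<XText>()
            .Select(t => t.Value.Trim()));
    }
}
```

For the row <XML><A>1,2</A>loose,text<B>3,4</B></XML>, ElementsOnly gives 1,2,3,4 while AllText gives 1,2,loose,text,3,4.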
Related
I have the following XML in an API
I wish to read the data from the tags and build a CSV file with the values.
My code so far,
using System;
using System.Collections.Generic;
using System.Text;
using System.Web;
using System.IO;
using System.Net;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication3
{
class Program
{
static void Main(string[] args)
{
string url = "https://localhost:5001/api/Scheduler/GetScheduler";
WebRequest request = WebRequest.Create(url);
try
{
WebResponse response = request.GetResponse();
using (var sr = new System.IO.StreamReader(response.GetResponseStream()))
{
XDocument xmlDoc = new XDocument();
try
{
xmlDoc = XDocument.Parse(sr.ReadToEnd());
Console.WriteLine(xmlDoc.Root.Element("ORD_NAME").Value);
}
catch (Exception)
{
// handle if necessary
}
}
}
catch (WebException)
{
// handle if necessary
}
}
}
}
I can see the data being read into xmlDoc, but xmlDoc.Root.Element("ORD_NAME").Value is null.
How do I get the data from the stream?
Thanks.
I think you want something like this:
using (var stream = response.GetResponseStream())
{
XDocument xmlDoc = XDocument.Load(stream);
Console.WriteLine(string.Join("\n", from row in xmlDoc.Root.Elements("Scheduler") select string.Join(";", row.Elements().Select(e => e.Value))));
}
.NET probably has better APIs for constructing CSV than using LINQ to XML directly; the code above does not quote values or escape separators.
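The quoting gap can be closed with a small helper. The sketch below is only an illustration: the names CsvSketch, Quote, and ToLine are hypothetical, and it assumes the common convention of wrapping a field in double quotes and doubling any embedded double quotes, with ; as the default separator to match the snippet above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CsvSketch
{
    // Quote a field if it contains the separator, a double quote, or a line break;
    // embedded double quotes are doubled.
    public static string Quote(string field, char separator = ';')
    {
        if (field == null) return "";
        bool needsQuotes = field.IndexOf(separator) >= 0
            || field.IndexOf('"') >= 0
            || field.IndexOf('\n') >= 0
            || field.IndexOf('\r') >= 0;
        if (!needsQuotes) return field;
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Join already-extracted values into one CSV line.
    public static string ToLine(IEnumerable<string> fields, char separator = ';')
    {
        return string.Join(separator.ToString(), fields.Select(f => Quote(f, separator)));
    }
}
```

In the query above, the inner string.Join(";", ...) could then be replaced by CsvSketch.ToLine(row.Elements().Select(e => e.Value)).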
I have an xml document which has no root node. It looks like this:
<?xml version="1.0"?>
<Line>
<City>Paris</City>
<Country>France</Country>
</Line>
<Line>
<City>Lissabon</City>
<Country>Spain</Country>
</Line>
Now I want to read it line by line and write the contents to a database. However, XmlDocument seems to insist that a root node must exist. How can I process this file?
If you want to parse it as an XML document, you can add a root node like Denis proposed in his comment.
If you would just like to read each line and write it to a database, you can handle the file like an ordinary (text) file and read its contents line by line using a StreamReader.
This would look something like this:
string line;
// Read the file and process it line by line.
var reader = new StreamReader(FILEPATH);
while((line = reader.ReadLine()) != null)
{
// Depending on what you need, you could strip the XML tags
// And write the line to the database
}
reader.Close();
You could try something like this (simple WinForms app with a button and a rich text box to display output for testing):
using System;
using System.Text;
using System.Xml;
using System.Windows.Forms;
namespace WindowsFormsApp11
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private void button1_Click(object sender, EventArgs e)
{
StringBuilder sb = new StringBuilder();
XmlReaderSettings settings = new XmlReaderSettings
{
ConformanceLevel = ConformanceLevel.Fragment
};
using (XmlReader reader = XmlReader.Create(@"c:\ab\countries.xml", settings))
{
while(reader.Read())
{
if (reader.Name != "Line") // Ignore the <Line> nodes
{
switch (reader.NodeType)
{
case XmlNodeType.Element:
sb.Append(string.Format("{0}:", reader.Name));
break;
case XmlNodeType.Text:
sb.Append(string.Format(" {0}{1}", reader.Value, Environment.NewLine));
break;
}
}
}
}
richTextBox1.Text = sb.ToString();
}
}
}
Maybe not the best solution, but you could create a List (or array) from your XML and insert the missing nodes:
// Read lines into List
var list = File.ReadLines("doc.xml").ToList();
// Insert missing nodes
list.Insert(1, "<root>"); // Use 1, because 0 is XML directive
list.Insert(list.Count, "</root>"); //Add closing tag to the end
// Create final XML string with LINQ
var xml_str = list.Aggregate("", (acc, s) => acc + s);
// Having a string, we can create, for instance, XElement (or XDocument)
var xml = XElement.Parse(xml_str);
Console.WriteLine(xml.Element("Line").Element("City").Value);
//Output: Paris
I have a large single-node .xml file that I have saved as a string. I want to parse it for a specific element and output its inner text. For example, I want to read the FrameNo element and output BINGO to a message box. The desired element appears only once in the .xml document. I prefer using XmlDocument.
I have tried numerous C# XML examples but am unable to get any output.
The XML text is:
<Aircraft z:Id="i1" xmlns="http://xxx.yyyyycontract.gov/2018/03/Boeing.xxxxxxxxxxxxxx.Airframe"
xmlns:i="http://www.xxxxxxx.com/2019/XMLSchema-instance"
xmlns:z="http://xxxxxxx.xxxxxxxxx.com/2005/01/Serialization/"><Timestamp i:nil="true"/>
<Uuid>00000000-0000-0000-0000-000000000000</Uuid><Comments i:nil="true"/><Facility>..........
and so on to the end of the .xml
<FrameNo>BINGO</FrameNo><WDate i:nil="true"/></Aircraft>
This is the code section where I want the code to execute:
private void buttonLoad_Click(object sender, EventArgs e)
{
}
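I think this is self-explanatory:
using System.Xml.Linq;
// textXML holds the XML text itself, so parse it rather than loading from a path
XElement root = XElement.Parse(textXML);
// FrameNo lives in the document's default namespace
XElement myElement = root.Element(root.GetDefaultNamespace() + "FrameNo");
if (myElement != null)
myData = myElement.Value; // XElement exposes Value, not InnerText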
Thanks to jdweng. I wanted to share the final code for others to use. It works in a method like the one below:
private void buttonMaint_Click(object sender, EventArgs e)
{
XDocument doc = XDocument.Parse(xmlinputstr); // input string from memory or input file
XNamespace ns = doc.Root.GetDefaultNamespace();
string[] Frame = doc.Descendants(ns + "FrameNo").Select(x => (string)x).ToArray(); // selects the FrameNo element(s) and reads their text
string frame = string.Join("", Frame); //converts from array to string
if (string.IsNullOrEmpty(frame)) // check for empty result
{
txtFrame.Text = "not found"; //outputs to textbox
}
else
{
txtFrame.Text = (frame); //outputs to textbox
}
}
Comments are there for clarity
You need to use the default namespace. See my xml linq solution below :
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
using System.IO;
namespace ConsoleApplication1
{
class Program
{
const string FILENAME = @"c:\temp\test.xml";
static void Main(string[] args)
{
string xml = File.ReadAllText(FILENAME);
XDocument doc = XDocument.Parse(xml);
XNamespace ns = doc.Root.GetDefaultNamespace();
XElement frameNo = doc.Descendants(ns + "FrameNo").FirstOrDefault();
string frame = (string)frameNo;
string[] serialNumbers = doc.Descendants(ns + "SerialNumber").Select(x => (string)x).ToArray();
}
}
}
Another weird snag has shown up. Some of the elements are named like this:
<a:SupplierServDoc>
The inner text of this element is a base64 packet. There is no problem processing the base64 packet.
The code from the above answers does output the base64 correctly but cannot handle the : in the element name. It throws a 3A hex character error.
I have this code that outputs the inner text, but not as a base64 packet. I have also looked into using the prefix to handle the :, but with worse results. I am outputting the base64 inner text as a .txt file when finished.
XNamespace ad = "http://www.mmmmmmmmmm.com";
XName k = ad + "SupplierServDoc";
string[] WING = doc.Descendants(k).Select(x => (string)x).ToArray();
string wing = string.Join("", WING);
if (string.IsNullOrEmpty(wing))
{
MessageBox.Show("a:SupplierServDoc Base 64 code not found");
}
else
{
MessageBox.Show("Test " + wing);
}
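One possible way to approach the prefixed element, sketched under assumptions: the colon is not part of the element name; a: is a prefix bound to a namespace URI by an xmlns:a declaration somewhere in the document. The sketch below looks that URI up from the document instead of hard-coding it, then decodes the base64 text. The class name PrefixedElementDemo and the method name ReadBase64 are hypothetical.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class PrefixedElementDemo
{
    // Returns the decoded bytes of the first <prefix:localName> element,
    // or null if the prefix or the element cannot be found.
    public static byte[] ReadBase64(XDocument doc, string prefix, string localName)
    {
        // Resolve the prefix to its namespace, wherever it is declared
        XNamespace ns = doc.Descendants()
            .Select(e => e.GetNamespaceOfPrefix(prefix))
            .FirstOrDefault(n => n != null);
        if (ns == null) return null;

        // An XName is namespace + local name; the prefix itself is never part of it
        XElement element = doc.Descendants(ns + localName).FirstOrDefault();
        if (element == null) return null;

        return Convert.FromBase64String(element.Value.Trim());
    }
}
```

A call such as PrefixedElementDemo.ReadBase64(doc, "a", "SupplierServDoc") returns the raw bytes, which can be written out with File.WriteAllBytes, or turned into text with an Encoding if the payload is known to be text.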
I generated an XML file through an API call and then tried to read it with the XML Source component in SSIS, but it reads only some of the data sets, not all the data contained in the file.
Here is my file:
<?xml version="1.0"?>
<ABC>
<a>info</a>
</ABC>
But I want the file to look like the one below; only then can I easily read it with the component.
We can edit a single file manually, but not thousands of files.
<?xml version="1.0"?>
<X>
<ABC>
<a>info</a>
</ABC>
</X>
How can I add that 'X' node to the existing file?
I do not have much exposure to .NET technology.
Kindly help me at the earliest.
Thank you,
KiranKumar
Using xml linq
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string xml =
"<?xml version=\"1.0\" encoding=\"utf-8\" ?>" +
"<ABC>" +
"<a>info</a>" +
"</ABC>";
XDocument doc = XDocument.Parse(xml);
XElement root = doc.Root;
root.ReplaceWith(new XElement("X", root));
}
}
}
Try the streaming API.
using (var reader = XmlReader.Create("test.xml"))
using (var writer = XmlWriter.Create("test2.xml"))
{
writer.WriteStartElement("X");
reader.MoveToContent();
writer.WriteNode(reader.ReadSubtree(), true);
writer.WriteEndElement();
}
This approach handles XML without excessive memory consumption.
It also allows modifying the XML on the fly, reading it from the input API stream and writing it to an output stream:
using (var reader = XmlReader.Create(inputStream))
using (var writer = XmlWriter.Create(outputStream))
I want to read an XML document from a property which is created in edit mode of Episerver.
I have made one property of type 'URL to Document'.
When I try to fetch it from the code-behind, it gives only the file path. I am not able to read the content of the XML file which is uploaded in the property.
string XMLContent = Currentpage.Getproperty<string>("XMLFile");
Can anyone help out on this?
You need to load the file as well. Something like this:
var path = CurrentPage["XMLFile"] as string;
if (HostingEnvironment.VirtualPathProvider.FileExists(path))
{
var file = HostingEnvironment.VirtualPathProvider.GetFile(path) as UnifiedFile;
if (file != null)
{
using (var stream = file.Open())
{
// Here is your XML document
var xml = XDocument.Load(stream);
}
}
}
You can also load the file content by using the local path on disk, file.LocalPath.
Try this:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string XMLContent = ""; // placeholder: put the XML text read from the file here
//using XML
XmlDocument doc1 = new XmlDocument();
doc1.LoadXml(XMLContent);
//using xml linq
XDocument doc2 = XDocument.Parse(XMLContent);
}
}
}