How to convert CSV to XML with different headers - C#

I have a CSV file with different column headers and I want to convert it to an XML payload.
The CSV looks like the following.
TEST1,APPLICATION_NAME,START_TIME,STOP_TIME,SERVICE_DESCRIPTION,FILING_STATUS,TIME_OF_LAST_UPDATE,RECORD_STATUS,ERROR_MESSAGE
,,20120101000000ES,20140131000000ES,New FGH Application,,,
,,20140304000000ES,20161231000000ES,New FGH Application,,,
,,20150109000000ES,20201231000000ES,New FGH Application,,,
TEST2,app,TOL,QUEUED
,nits,20120101000000ES,20201231000000ES
I tried to do this with LINQ but couldn't figure out a way. Also, I don't really want to specify the columns as in the following example.
https://msdn.microsoft.com/en-us/library/bb387090
Please note that this CSV has different column headers.
The output I am expecting is:
<Root>
<TEST1>
<APPLICATION_NAME></APPLICATION_NAME>
<START_TIME>20120101000000ES</START_TIME>
<STOP_TIME>20140131000000ES</STOP_TIME>
<SERVICE_DESCRIPTION>New FGH Application</SERVICE_DESCRIPTION>
<FILING_STATUS></FILING_STATUS>
<TIME_OF_LAST_UPDATE></TIME_OF_LAST_UPDATE>
<RECORD_STATUS></RECORD_STATUS>
</TEST1>
<TEST1>
<APPLICATION_NAME></APPLICATION_NAME>
<START_TIME>20140304000000ES</START_TIME>
<STOP_TIME>20161231000000ES</STOP_TIME>
<SERVICE_DESCRIPTION>New FGH Application</SERVICE_DESCRIPTION>
<FILING_STATUS></FILING_STATUS>
<TIME_OF_LAST_UPDATE></TIME_OF_LAST_UPDATE>
<RECORD_STATUS></RECORD_STATUS>
</TEST1>
<TEST1>
<APPLICATION_NAME></APPLICATION_NAME>
<START_TIME>20150109000000ES</START_TIME>
<STOP_TIME>20201231000000ES</STOP_TIME>
<SERVICE_DESCRIPTION>New FGH Application</SERVICE_DESCRIPTION>
<FILING_STATUS></FILING_STATUS>
<TIME_OF_LAST_UPDATE></TIME_OF_LAST_UPDATE>
<RECORD_STATUS></RECORD_STATUS>
</TEST1>
<TEST2>
<app>nits</app>
<TOL>20120101000000ES</TOL>
<QUEUED>20201231000000ES</QUEUED>
</TEST2>
</Root>
Thanks for your help.
Update: this is what I started with.
string[] headers = lines[0].Split(',').Select(x => x.Trim('\"')).ToArray();
var xml = new XElement("root",
lines.Where((line, index) => index > 0).Select(line => new XElement("TEST",
line.Split(',').Select((column, index) => new XElement(headers[index], column)))));

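Expanding on the linked example, you can do this:
string[] source = File.ReadAllLines("text.csv");
// Marker written into the first field of header rows so they can be skipped later.
string IGNORE_ROW = "XXXXX";
// Name of the current section, taken from the first field of the last header row.
string test = "";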
for (int i = 0; i < source.Length; i++)
{
string[] _str = source[i].Split(',');
if (String.IsNullOrWhiteSpace(_str[0])) _str[0] = test;
else
{
test = _str[0];
_str[0] = IGNORE_ROW;
}
source[i] = String.Join(",", _str);
}
XElement data = new XElement("Root",
from str in source
where str.StartsWith(IGNORE_ROW) == false
let fields = str.Split(',')
select new XElement(fields[0],
new XElement("APPLICATION_NAME", fields[1]),
new XElement("START_TIME", fields[2]),
new XElement("STOP_TIME", fields[3]),
new XElement("SERVICE_DESCRIPTION", fields[4]),
new XElement("FILING_STATUS", fields[5]),
new XElement("TIME_OF_LAST_UPDATE", fields[6]),
new XElement("RECORD_STATUS", fields[7])
)
);
Console.WriteLine(data);
It is simply a matter of renaming the relevant elements and including them in the correct order.
// Edited
After reviewing the comment, it appears you are repeating the header within the data so that it can be used as an element name. If you have control over the CSV generation, remove this repeated row and simply output the test value as the first field of each data row.
If you do not have control over the CSV, you can alter the text so that the first field is set on every row. This is what the edited example does.

Use TextFieldParser to read the CSV file and parse it into classes.
Then use XDocument to build an XML document in memory and write it to a file once it is complete.
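As a minimal sketch of this approach (not the answerer's code), the following treats any row whose first field is non-empty as a new header row and builds the tree with XElement, so no column names are hard-coded. It uses a plain string.Split in place of TextFieldParser to stay short, so it assumes no field contains a quoted comma. The class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class CsvSectionsToXml
{
    // Hypothetical helper: a row with a non-empty first field is a header row.
    // Its first field names the record element; the rest name the child elements.
    public static XElement Build(IEnumerable<string> lines)
    {
        var root = new XElement("Root");
        string[] headers = null;

        foreach (var line in lines.Where(l => !string.IsNullOrWhiteSpace(l)))
        {
            // Naive split: assumes no quoted fields containing commas.
            var fields = line.Split(',');

            if (!string.IsNullOrWhiteSpace(fields[0]))
            {
                headers = fields.Select(f => f.Trim()).ToArray();
                continue;
            }

            if (headers == null) continue; // data row before any header: skip it

            var record = new XElement(headers[0]);
            // Pair headers with fields by position, skipping index 0 (the section name).
            // Headers with no matching field in the row are left out.
            for (int i = 1; i < headers.Length && i < fields.Length; i++)
                record.Add(new XElement(headers[i], fields[i]));
            root.Add(record);
        }

        return root;
    }
}
```

Calling CsvSectionsToXml.Build(File.ReadAllLines("text.csv")) on the sample CSV from the question should give three TEST1 elements and one TEST2 element under Root. Header names have to be valid XML element names, otherwise the XElement constructor throws an XmlException.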

Related

How to edit single line XML element with multiple values in c#?

I am editing an existing XML file. I cannot edit an element (the element “range”) that is a multiple value element (a list or an array?), as shown in the XML code below.
<objects>
<PinJoint name="revolute">
<socket_parent_frame>torso_offset</socket_parent_frame>
<socket_child_frame>base_offset</socket_child_frame>
<coordinates>
<Coordinate name="revolute_angle">
<default_value>0</default_value>
<default_speed_value>0</default_speed_value>
<range>-1.5700000000000001 1.5700000000000001</range>
<clamped>true</clamped>
<locked>true</locked>
</Coordinate>
</coordinates>
I want to rewrite the values of the element “range”. In this case, these are two numeric values (doubles) separated by spaces.
I am trying to do this with the following c# code
XDocument xdoc = XDocument.Load(@"G:/My Drive/File.xml");
double[,] r1 = new double[,] { { 0.1745, 1.3963 } };
var pinjoints = xdoc.Root.Descendants("objects").Elements("PinJoint").Descendants("coordinates").Elements("Coordinate").FirstOrDefault(x => x.Attribute("name").Value == "revolute_angle");
pinjoints.SetElementValue("range", r1);
but the “range” element is rewritten as follows:
<range>System.Double[,]</range>
Can you please help me to edit the element “range” by rewriting one or all its numeric values?
Best regards
Andrés
You are passing an array (r1) to the XML element as its value. An element's value is stored as a string, so the array's type name is written instead of its contents.
You can replace your code with the following and it should work:
XDocument xdoc = XDocument.Load(@"G:/My Drive/File.xml");
double[] r1 = new double[] { 0.1745, 1.3963 };
// Join with a space to match the existing layout of <range>.
var str = string.Join(" ", r1);
var pinjoints = xdoc.Root.Descendants("objects").Elements("PinJoint").Descendants("coordinates").Elements("Coordinate").FirstOrDefault(x => x.Attribute("name").Value == "revolute_angle");
pinjoints.SetElementValue("range", str);
Here, string.Join() produces a single space-separated string that can be passed to the XML element.
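One detail worth guarding against: string.Join formats each double with the current culture, which on some machines writes a decimal comma. The sketch below is one possible way to format the values with the invariant culture and a space separator, matching the existing range text in the question. The RangeEditor class and its method names are hypothetical.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

public static class RangeEditor
{
    // Formats each value with the invariant culture ("." as decimal separator)
    // and joins them with a single space.
    public static string FormatRange(params double[] values) =>
        string.Join(" ", values.Select(v => v.ToString("R", CultureInfo.InvariantCulture)));

    // Overwrites the <range> child of the given <Coordinate> element.
    public static void SetRange(XElement coordinate, params double[] values) =>
        coordinate.SetElementValue("range", FormatRange(values));
}
```

With the element found by the query above, RangeEditor.SetRange(pinjoints, 0.1745, 1.3963) would write 0.1745 1.3963 into range.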

Get all value of attribute from each Element from XML

I have an xml file which looks like this:
<HeadercardUnit EndTime="2065-25-45 20:32:44" StartTime="2065-25-45 20:32:23" Rejects="NO" MilliSec="1" Currency="USD" DeclaredDepositAmount="0" denomvalue="1" DepositID="" CustomerID="" HeaderCardID="">
<Counter Number="2" Currency="USD" Output="Stacked" Quality="Accepted" Issue="2006" Value="5" DenomID="" DenomName="5 USD-2006"/>
<Counter Number="31" Currency="USD" Output="Stacked" Quality="Accepted" Issue="2000" Value="1" DenomID="" DenomName="1 USD-2000"/>
<Sum Number="33" Currency="USD" Output="Stacked" Sum="41.00"/>
</HeadercardUnit>
I try to parse it with this code:
string[] content = Directory.GetFiles(Directory.GetCurrentDirectory() + @"\", "*.xml");
XDocument xdoc = XDocument.Load(content[0]);
XElement xml1 = XElement.Load(content[0]);
string xml2 = xml1.ToString();
//Console.WriteLine(xml2);
XElement xml = XElement.Parse(xml2);
var counter = xdoc.Descendants("Counter").Count();
var data = from bps in xdoc.Root.Descendants("Machine")
let Param = bps.Element("ParameterSection")
let Opt = Param?.Element("Operator")
let Hcl = Param?.Element("HeadercardUnit")
let Count = Hcl?.Element("Counter")
select new
{
Type = (string)bps.Attribute("Type"),
SerialNum = (string)bps.Attribute("SerialNumber"),
Startime = (string)Param?.Attribute("StartTime"),
Endtime = (string)Param?.Attribute("EndTime"),
Opt = (string)Opt?.Value,
Number = (string)Count?.Attribute("Number")
};
foreach (var pcl in data)
{
MessageBox.Show(counter.ToString());
for (int i = 0; i < counter; i++)
{
LogService(string.Format("{0},{1},{2},{3},{4},{5}",
pcl.Type, pcl.SerialNum, pcl.Startime, pcl.Endtime, pcl.Opt, pcl.Number));
}
}
The result gives me only one line, repeated twice because the Counter tag has two elements. It looks like this:
BPSC1,309322,2065-25-45 20:32:23,2065-25-45 20:32:44,User1,2
BPSC1,309322,2065-25-45 20:32:23,2065-25-45 20:32:44,User1,2
It's a little hard to give a definite answer given you've omitted a portion of the XML that would allow this to be reproduced. However, this line:
let Count = Hcl?.Element("Counter")
Gets the first Counter element. If you want all of them (as you suggest), then you need to iterate through those:
from Count in Hcl.Elements("Counter")
This will then create an object in data for each Counter element.
I added this line, inspired by @Charles Mager:
from bps in xdoc.Root.Descendants("Machine")
from Countx in bps.Descendants("Counter")
Then I can read all the attributes.
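Put together, the nested from clause could look like the sketch below, which produces one output line per Counter rather than one per Machine. The Machine, Type, SerialNumber, and Counter names come from the question's query; the rest of the document layout is not shown in the question, so treat the shape assumed here as a guess. The CounterReader class is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class CounterReader
{
    // Returns one comma-separated line per <Counter> found under each <Machine>.
    public static List<string> ReadCounters(XDocument xdoc) =>
        (from machine in xdoc.Descendants("Machine")
         from counter in machine.Descendants("Counter")   // every Counter, not just the first
         select string.Join(",",
             (string)machine.Attribute("Type"),
             (string)machine.Attribute("SerialNumber"),
             (string)counter.Attribute("Number"),
             (string)counter.Attribute("Value"))).ToList();
}
```

Because the second from iterates the Counter elements directly, the separate Count() call and the inner for loop from the question are no longer needed.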

How to separate strings from a serialized XML node

I have a serialized XML file. This shows the relevant part:
I am reading this XML file with this (code snippet):
temp = Path.GetFileNameWithoutExtension(s);
var document = new XmlDocument();
document.Load(s);
var root = document.DocumentElement;
var node = root["ScenarioDescription"];
var text = node?.InnerText;
var ArmyNode = root["ArmyFiles"];
var ArmyText = ArmyNode?.InnerText;
However, ArmyText returns the concatenation of all three strings that make up the ArmyFiles node. I need them as three separate strings. How can I do this?
This code works to read all the strings in the node and place them into a list:
foreach (XmlElement A in ArmyNode)
{
var ArmyTemp = A.InnerText;
ArmyList.Add(ArmyTemp);
}
var ArmyText = ArmyNode?.InnerText;
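A slightly more defensive variant of that loop is sketched below. The XML from the question is not shown here, so this assumes ArmyFiles simply contains one child element per file name; the helper name ReadChildTexts is made up. It returns an empty list when the node is missing instead of throwing, and it ignores child nodes that are not elements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;

public static class ArmyFileReader
{
    // Returns the InnerText of each child element of the named node.
    public static List<string> ReadChildTexts(XmlElement root, string nodeName)
    {
        XmlElement parent = root[nodeName];
        if (parent == null) return new List<string>(); // node not present

        // OfType<XmlElement>() skips comments, text, and whitespace nodes.
        return parent.ChildNodes
                     .OfType<XmlElement>()
                     .Select(e => e.InnerText)
                     .ToList();
    }
}
```

With the variables from the question, ArmyFileReader.ReadChildTexts(root, "ArmyFiles") would give the separate strings as a list.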

Need to select Data from XML file C#

I'm trying to read data from my XML file, which is merged with two others, select each venue from the combined data, and add the venue name to a list so I can manipulate it further.
This is one of my XML files
<?xml version="1.0" encoding="utf-8" ?>
<Funrun>
<Venue name="Roker Park">
<Runner charity="Cancer Research">
<Firstname>Roger</Firstname>
<Lastname>Malibu</Lastname>
<Sponsorship>550</Sponsorship>
</Runner>
<Runner charity="Arthritis UK">
<Firstname>Adam</Firstname>
<Lastname>White</Lastname>
<Sponsorship>340</Sponsorship>
</Runner>
</Venue>
</Funrun >
I need to be able to select the venue name and save it to a list. This is what I've got so far:
List<string> VenueNames = new List<string>();
var doc = XDocument.Load("XMLFile1.xml");
var doc2 = XDocument.Load("XMLFile2.xml");
var doc3 = XDocument.Load("XMLFile3.xml");
var combinedUnique = doc.Descendants("Venue")
.Union(doc2.Descendants("Venue"))
.Union(doc3.Descendants("Venue"));
foreach (var venuename in combinedUnique.Elements("Venue"))
{
VenueNames.Add(venuename.Attribute("name").Value);
}
The easiest way I would do it is by including Name and Charity within the XElements they belong to.
What I would recommend you do is first reformat your document so that it looks like this:
<Funrun>
<Venue>
<Name>Roker Park</Name>
<Runner1>
<charity>Cancer Research</charity>
<Firstname>Roger</Firstname>
<Lastname>Malibu</Lastname>
<Sponsorship>550</Sponsorship>
</Runner1>
<Runner2>
<charity>Arthritis UK</charity>
<Firstname>Adam</Firstname>
<Lastname>White</Lastname>
<Sponsorship>340</Sponsorship>
</Runner2>
</Venue>
</Funrun >
Note that you could get even simpler by combining all the elements under "Funrun" (example: "Venue") and just iterate through all of them without having to switch documents.
Next moving over to C#:
List<string> VenueNames = new List<string>();
var doc = XDocument.Load("XMLFile1.xml");
var doc2 = XDocument.Load("XMLFile2.xml");
var doc3 = XDocument.Load("XMLFile3.xml");
foreach (XElement element in doc.Root.Descendants("Venue"))
{
VenueNames.Add(element.Element("Name").Value.ToString());
}
// Copy and paste this loop for each document you want to search, changing "doc" to "doc2" and so on.
In short, this code starts at the root element of the XDocument, finds its descendants named "Venue", and for each one copies the value of its "Name" child element into your list as a string.
List<string> xmlFilePaths = new List<string>
{
@"Path\SomeJson.txt",
@"Path\SomeJson1.txt"
};
var venues = xmlFilePaths.Select(fp => XDocument.Load(fp).Descendants("Venue")?.FirstOrDefault()?.Attribute("name")?.Value).Distinct().ToList();
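If the files keep the original layout, where the venue name is an attribute, the names can also be collected without reformatting the XML. The sketch below is one way to do that across any number of documents; unlike the one-liner above, it takes every Venue in each file, not only the first. The VenueReader class is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class VenueReader
{
    // Collects the distinct "name" attribute values of all <Venue> elements.
    public static List<string> GetVenueNames(IEnumerable<XDocument> docs) =>
        docs.SelectMany(d => d.Descendants("Venue"))
            .Select(v => (string)v.Attribute("name"))   // null when the attribute is missing
            .Where(n => !string.IsNullOrEmpty(n))
            .Distinct()
            .ToList();
}
```

For the three files in the question this could be called as VenueReader.GetVenueNames(new[] { doc, doc2, doc3 }). Distinct compares the name strings, so a venue that appears in several files is listed once.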

LINQ Except query with an XElement

I have a data set that I receive from a service. The data comes in XML format. We are given an XElement object with all the data. The structure of the XML document is very simple. Looks like this:
<root>
<dataPoint>
<id>1</id>
<param1>somedata</param1>
<param2>somedata</param2>
</dataPoint>
<dataPoint>
<id>2</id>
<param1>somedata</param1>
<param2>somedata</param2>
</dataPoint>
</root>
Of course, I have a large number of dataPoints. I also have a list (List) with the ids of the dataPoints being displayed in a GUI. What I'd like to get is the dataPoints that are NOT displayed in the GUI, so I can manipulate only those and not the whole data set.
Thanks
var toDisplay = new List<string>() { "2" };
var xDoc = XElement.Load(.....);
var dPoints = xDoc.Descendants("dataPoint")
.Where(d => !toDisplay.Contains(d.Element("id").Value));
var newXml = new XElement("root",dPoints).ToString();
Here's one way using a mutable dictionary and giving you a dictionary of points and corresponding nodes:
var xml = "<root> \r\n <dataPoint>\r\n <id>1</id>\r\n <param1>somedata</param1>\r\n <param2>so" +
"medata</param2>\r\n </dataPoint>\r\n <dataPoint>\r\n <id>2</id>\r\n <param1>somedata" +
"</param1>\r\n <param2>somedata</param2>\r\n </dataPoint>\r\n</root>";
var doc = XElement.Parse(xml);
var pointsInGui = new List<int> {1, 3, 5};
var dict = doc.Descendants("dataPoint").ToDictionary(e => Convert.ToInt32(e.Element("id").Value));
foreach (var point in pointsInGui) dict.Remove(point);
At this point dict contains 2 => <relevant element>.
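For large data sets, the same filtering can be sketched with a HashSet, which keeps each membership check cheap. This is an illustrative variant, not taken from either answer; the DataPointFilter class name is made up, and it assumes every dataPoint has an id child holding an integer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class DataPointFilter
{
    // Returns the <dataPoint> elements whose id is NOT in displayedIds.
    public static List<XElement> NotDisplayed(XElement root, IEnumerable<int> displayedIds)
    {
        var displayed = new HashSet<int>(displayedIds);
        return root.Elements("dataPoint")
                   // (int) cast parses the element text; it throws if <id> is missing.
                   .Where(d => !displayed.Contains((int)d.Element("id")))
                   .ToList();
    }
}
```

The returned elements are still attached to the original tree, so changes made to them show up in the source XElement.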
