Read multiple xml tables (under the same Root node) into DataTables/DataSet - c#

I have an XML source document with multiple "report" nodes under the Root node. I need to read each "report" node into its own DataTable. It looks like I'll either need to transform my source XML data using an xsl stylesheet to get it in the format that'll work nicely or iterate through my xml elements like so:
namespace XmlParse2
{
class Program
{
static IEnumerable<string> expectedFields = new List<string>() { "Field1", "Field2", "Field3", "Field4" };
static void Main(string[] args)
{
string xml = @"<Root>
<Report1>
<Row>
<Field1>data1-1</Field1>
<Field2>data1-2</Field2>
<Field4>data1-4</Field4>
</Row>
<Row>
<Field1>data2-1</Field1>
<Field2>data2-2</Field2>
</Row>
</Report1>
<Report2>
<Row>
<Field1>data1-1</Field1>
<Field4>data1-4</Field4>
</Row>
<Row>
<Field1>data2-1</Field1>
<Field3>data2-3</Field3>
</Row>
</Report2>
</Root>";
DataTable report1 = new DataTable("Report1");
report1.Columns.Add("Field1");
report1.Columns.Add("Field2");
report1.Columns.Add("Field3");
report1.Columns.Add("Field4");
DataTable report2 = new DataTable("Report2");
report2.Columns.Add("Field1");
report2.Columns.Add("Field2");
report2.Columns.Add("Field3");
report2.Columns.Add("Field4");
var doc = XDocument.Parse(xml);
var report1Data = doc.Root.Elements("Report1").Elements("Row").Select(record => MapRecord(record));
var report2Data = doc.Root.Elements("Report2").Elements("Row").Select(record => MapRecord(record));
report1 = addRows(report1, report1Data);
report2 = addRows(report2, report2Data);
Console.ReadLine();
}
public static Dictionary<string, string> MapRecord(XElement element)
{
var output = new Dictionary<string, string>();
foreach (var field in expectedFields)
{
bool hasField = element.Elements(field).Any();
if (hasField)
{
output.Add(field, element.Elements(field).First().Value);
}
}
return output;
}
public static DataTable addRows(DataTable table, IEnumerable<Dictionary<string, string>> data)
{
foreach (Dictionary<string, string> dict in data)
{
DataRow row = table.NewRow();
foreach(var item in dict)
{
row[item.Key] = item.Value;
}
table.Rows.Add(row);
}
return table;
}
}
}
The problem seems to be that both Report1 and Report2 have child nodes named "Row". My attempts with DataSet.ReadXml fail because it groups all nodes named Row into one DataTable instead of creating a separate DataTable per report.
What am I missing?

XDocument xdoc = XDocument.Load(path_to_xml);
var tables = xdoc.Root.Elements()
.Select(report => {
DataTable table = new DataTable(report.Name.LocalName);
var fields = report
.Descendants("Row")
.SelectMany(row => row.Elements()
.Select(e => e.Name.LocalName))
.Distinct();
foreach(string field in fields)
table.Columns.Add(field);
foreach(var row in report.Descendants("Row"))
{
DataRow dr = table.NewRow();
foreach(var field in row.Elements())
dr[field.Name.LocalName] = (string)field;
table.Rows.Add(dr);
}
return table;
});
This query returns an IEnumerable<DataTable>. Each DataTable contains only those columns that have values in the XML. Column names are taken from the XML and can differ from table to table. For your sample, the structure looks like this:
DataTable: Report1
Columns: Field1, Field2, Field4
DataTable: Report2
Columns: Field1, Field3, Field4
Every row's data is added to its report's table.
You can extract some of the code into methods to make it easier to understand:
XDocument xdoc = XDocument.Load(path_to_xml);
var tables = xdoc.Root.Elements()
.Select(report => CreateTableFrom(report));
And methods:
private static DataTable CreateTableFrom(XElement report)
{
DataTable table = new DataTable(report.Name.LocalName);
table.Columns.AddRange(GetColumnsOf(report));
foreach (var row in report.Descendants("Row"))
{
DataRow dr = table.NewRow();
foreach (var field in row.Elements())
dr[field.Name.LocalName] = (string)field;
table.Rows.Add(dr);
}
return table;
}
private static DataColumn[] GetColumnsOf(XElement report)
{
return report.Descendants("Row")
.SelectMany(row => row.Elements().Select(e => e.Name.LocalName))
.Distinct()
.Select(field => new DataColumn(field))
.ToArray();
}
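If a DataSet is wanted rather than a loose sequence of tables, the same idea can be sketched as below. This is a minimal sketch and not part of the answer above: the class name ReportLoader and the method LoadReports are hypothetical, and it assumes every report keeps its rows in direct child elements named "Row", as in the sample XML.

```csharp
using System.Data;
using System.Linq;
using System.Xml.Linq;

public static class ReportLoader
{
    // Hypothetical helper: builds one DataTable per child of the root
    // element and gathers the tables into a single DataSet.
    public static DataSet LoadReports(XDocument xdoc)
    {
        var dataSet = new DataSet(xdoc.Root.Name.LocalName);
        foreach (XElement report in xdoc.Root.Elements())
        {
            var table = new DataTable(report.Name.LocalName);

            // Columns are the distinct field names seen in this report's rows.
            var fields = report.Elements("Row")
                               .SelectMany(row => row.Elements())
                               .Select(e => e.Name.LocalName)
                               .Distinct();
            foreach (string field in fields)
                table.Columns.Add(field);

            // Fields missing from a row are left as DBNull.
            foreach (XElement row in report.Elements("Row"))
            {
                DataRow dr = table.NewRow();
                foreach (XElement field in row.Elements())
                    dr[field.Name.LocalName] = (string)field;
                table.Rows.Add(dr);
            }
            dataSet.Tables.Add(table);
        }
        return dataSet;
    }
}
```

Each table can then be looked up by report name, for example LoadReports(xdoc).Tables["Report1"].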

Related

How to fill a DataTable with XElement/XAttribute?

I'm new to C#, I never worked with a DataTable before.
I want a DataGridView with specific names.
DataTable table = new DataTable();
List<string> bla = new List<string>();
XDocument config = XDocument.Load(configFile);
Dictionary<string, string> dict = config.Descendants("Columns").FirstOrDefault().Elements()
.GroupBy(x => (string)x.Attribute("XPath"), y => (string)y.Attribute("Name"))
.ToDictionary(x => x.Key, y => y.FirstOrDefault());
//I dont know if I need this:
foreach (string key in dict.Keys)
{
table.Columns.Add(key, typeof(string));
}
foreach (XElement position in positions.Where(e => e.HasAttributes))
{
foreach (XAttribute attribute in position.Attributes().Where(a => dict.ContainsKey($"@{a.Name.LocalName}")))
{
string name = attribute.Name.LocalName;
string value = (string)attribute;
string xName = dict["@" + name];
bla.Add(xName);
}
The columns should have the name from xName.
How can I do this?
I've tried this:
foreach (var item in bla)
{
DataRow row = table.NewRow();
row.SetField<string>(item); //this didn't work.
//foreach (string key in dict.Keys)
//{
// row.SetField<string>(key, item[key]);
//}
}
I just want the names from xName as the column headings of my output.
Example for xName: Position, Status, Order, Number, ...
Those should be the headings, with the values underneath.
If I understand you correctly, you already have your list of column names but don't know how to create a DataTable with the correct column names.
Below is an example of how to add columns and rows to a DataTable with specific column header names.
As discussed in the comments, it also shows a way to get the data you need into a structure that lets you populate your table.
//Class to hold data
public class MyRecordContent
{
public MyRecordContent()
{
//initialise list
RecordsColumns = new List<string>();
}
//Holds a list of strings for each column of the record.
//It starts at position 0 to however many columns you have
public List<string> RecordsColumns { get; set; }
}
//This creates an empty table with the columns
var myTable = new DataTable("Table1");
foreach (var item in bla)
{
if (!myTable.Columns.Contains(item))
{
myTable.Columns.Add(new DataColumn(item, typeof(string)));
}
}
//Here you build up a list of all records and their field content from your xml.
//yourXMLRecordCollection is a placeholder, assumed here to be an IEnumerable<XElement>.
var myListOfRecords = new List<MyRecordContent>();
foreach (var xmlNode in yourXMLRecordCollection)
{
var thisRecord = new MyRecordContent();
foreach (var xmlCol in xmlNode.Elements())//Each column value
{
thisRecord.RecordsColumns.Add(xmlCol.Value);
}
myListOfRecords.Add(thisRecord);
}
foreach (MyRecordContent record in myListOfRecords)
{
var row = myTable.NewRow();
//Here we set each row's column values in the datatable.
//Map each row's column value to the value in the list at the same position.
for (var colPosition = 0; colPosition < myTable.Columns.Count; colPosition++) //Number of columns added.
{
row[colPosition] = record.RecordsColumns[colPosition];
}
myTable.Rows.Add(row);
}
In the above, iterate through your list of column names and add each column to the table. You may want to add a switch statement to the loop to change the data type of a column based on its name if required. Then create a new row from that table and set each field's value accordingly.
Finally, add the new row to the DataTable.
Hope that helps.
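If the values are kept together with their names instead of in a plain list of names, the rows can be filled by column name rather than by position. The following is only a sketch under that assumption: the class NamedColumnTable and the shape of its input (one name-to-value map per row) are hypothetical and do not come from the question's code.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class NamedColumnTable
{
    // Hypothetical helper: each record maps a column name (e.g. "Position")
    // to the value for that column in one row.
    public static DataTable Build(IEnumerable<IDictionary<string, string>> records)
    {
        var rows = records.ToList();
        var table = new DataTable("Table1");

        // First pass: one string column per distinct name.
        foreach (string name in rows.SelectMany(r => r.Keys).Distinct())
            table.Columns.Add(name, typeof(string));

        // Second pass: fill each row by column name, so the order of
        // entries inside a record does not matter.
        foreach (var record in rows)
        {
            DataRow row = table.NewRow();
            foreach (var pair in record)
                row[pair.Key] = pair.Value;
            table.Rows.Add(row);
        }
        return table;
    }
}
```

The resulting table can be assigned to a DataGridView's DataSource property, which uses the column names as headings.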

Very Big JSON (92,000 rows) does not load

I have a simple function to return my datatable as JSON from c# using the Serializer as follows -
public static string ConvertToJSON (DataTable dt)
{
System.Web.Script.Serialization.JavaScriptSerializer serializer = new System.Web.Script.Serialization.JavaScriptSerializer();
List<Dictionary<string, object>> rows = new List<Dictionary<string, object>>();
Dictionary<string, object> row;
foreach (DataRow dr in dt.Rows)
{
row = new Dictionary<string, object>();
foreach (DataColumn col in dt.Columns)
{
row.Add(col.ColumnName, dr[col]);
}
rows.Add(row);
}
return serializer.Serialize(rows);
}
Which I use as follows
return ConvertToJSON(objDataTable);
where objDataTable is my Datatable
I also have
return JsonConvert.SerializeObject(strArrMapObject, Formatting.None);
where I am using the library Newtonsoft.Json and strArrMapObject is an Itemarray built from objDataTable
Both the above methods work fine for small datatables and I get the output like this -
["11-06-2014 00:00:00","17:45:00","Beta","357637031475680","404490480844084","78","IN","","8143888569","48"]
But when I do it for a big datatable (eg. 92,000 rows), nothing happens!
There is no response and there is no timeout error also.
So when I use
alert (response);
[in Javascript] or even
document.getElementById('divDataHolder').innerHTML = response;
[in Javascript]
absolutely nothing happens!
Please help!
Rewrite your request so that you can ask for one page at a time:
select 100 rows page 1 // selects items 1-100
select 100 rows page 2 // selects items 101-200
This would solve more than one problem.
public static string ConvertToJSON (DataTable dt, int page = 0, int count = 100)
//...
foreach (DataRow o in dt.AsEnumerable().Skip(page * count).Take(count))
Edit: You can use the following method to get the JSON:
//add reference System.Data
//add reference System.Web.Extensions
//add reference System.Data.DataSetExtensions
public static string ConvertToJson(DataTable dt, int page = 0, int count = 100)
{
var serializer = new System.Web.Script.Serialization.JavaScriptSerializer();
var rows = new List<Dictionary<string, object>>();
foreach (DataRow dr in dt.AsEnumerable().Skip(page * count).Take(count).ToList())
{
rows.Add(dt.Columns.Cast<DataColumn>().ToDictionary(col => col.ColumnName, col => dr[col]));
}
return serializer.Serialize(rows);
}
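The paging arithmetic on its own can be sketched without any serializer, which makes it easier to check. The names DataTablePaging, PageCount, and GetPage below are hypothetical; pages are zero-based, matching the page * count expression used above.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class DataTablePaging
{
    // Number of pages needed to cover every row (the last page may be partial).
    public static int PageCount(DataTable dt, int count = 100)
    {
        return (dt.Rows.Count + count - 1) / count;
    }

    // Rows of one zero-based page, each as a column-name -> value dictionary,
    // ready to be handed to whichever JSON serializer is in use.
    public static List<Dictionary<string, object>> GetPage(DataTable dt, int page = 0, int count = 100)
    {
        return dt.AsEnumerable()
                 .Skip(page * count)
                 .Take(count)
                 .Select(dr => dt.Columns.Cast<DataColumn>()
                                 .ToDictionary(col => col.ColumnName, col => dr[col]))
                 .ToList();
    }
}
```

The client would then request pages 0 through PageCount - 1 and append each response, rather than loading all 92,000 rows at once.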

C# treeview getting duplicate nodes

At the beginning of this week I was having a problem with a TreeView not displaying children. That was solved with recursion. However, a new and unexpected problem arose: the methods I'm using produce duplicate nodes for some specific DataTables.
Having this DataTable of two columns:
ParentOT ChildOT
20120601 20120602
20120601 20120603
20120601 20120604
20120601 20120611
20120601 20120612
20120602 20120605
20120602 20120606
20120602 20120607
20120602 20120608
20120602 20120610
20120603 20120607
20120603 20120608
20120603 20120609
If I try to display it in a TreeView, I get the right tree, but five times in a row (once for each record in which the top-level parent appears in the ParentOT column).
The Methods are these:
private TreeView cargarOtPadres(TreeView trv, int otPadre, DataTable datos)
{
if (datos.Rows.Count > 0)
{
foreach (DataRow dr in datos.Select("OTPadre="+ otPadre))
{
TreeNode nodoPadre = new TreeNode();
nodoPadre.Text = dr["OTPadre"].ToString();
trv.Nodes.Add(nodoPadre);
cargarSubOts(ref nodoPadre, int.Parse(dr["OTPadre"].ToString()), datos);
}
}
return trv;
}
private void cargarSubOts(ref TreeNode nodoPadre, int otPadre, DataTable datos)
{
DataRow[] otHijas = datos.Select("OTPadre=" + otPadre);
foreach (DataRow drow in otHijas)
{
TreeNode hija = new TreeNode();
hija.Text = drow["OTHija"].ToString();
nodoPadre.Nodes.Add(hija);
cargarSubOts(ref hija, int.Parse(drow["OTHija"].ToString()), datos);
}
}
With tables where the top-level parent appears only once, it works fine. How can I prevent the TreeView from duplicating nodes?
I'll leave the answer here for the sake of completeness. This solution came courtesy of @King King
public static class TreeViewExtension
{
public static void LoadFromDataTable(this TreeView tv, DataTable dt)
{
var parentNodes = dt.AsEnumerable()
.GroupBy(row => (string)row[0])
.ToDictionary(g => g.Key, value => value.Select(x => (string)x[1]));
Stack<KeyValuePair<TreeNode, IEnumerable<string>>> lookIn = new Stack<KeyValuePair<TreeNode, IEnumerable<string>>>();
HashSet<string> removedKeys = new HashSet<string>();
foreach (var node in parentNodes)
{
if (removedKeys.Contains(node.Key)) continue;
TreeNode tNode = new TreeNode(node.Key);
lookIn.Push(new KeyValuePair<TreeNode, IEnumerable<string>>(tNode, node.Value));
while (lookIn.Count > 0)
{
var nodes = lookIn.Pop();
foreach (var n in nodes.Value)
{
IEnumerable<string> children;
TreeNode childNode = new TreeNode(n);
nodes.Key.Nodes.Add(childNode);
if (parentNodes.TryGetValue(n, out children))
{
lookIn.Push(new KeyValuePair<TreeNode, IEnumerable<string>>(childNode, children));
removedKeys.Add(n);
}
}
}
tv.Nodes.Add(tNode);
}
}
}
Create this class, and afterwards use it like this:
treeView1.LoadFromDataTable(DataTable);
Be sure to use it with a DataTable whose columns are of type string. If you have a table with int columns, you can do something like this:
DataTable stringDataTable = intDataTable.Clone();
stringDataTable.Columns[0].DataType = typeof(string);
stringDataTable.Columns[1].DataType = typeof(string);
foreach (DataRow dr in intDataTable.Rows)
{
stringDataTable.ImportRow(dr);
}
treeView1.LoadFromDataTable(stringDataTable);

Populating xml elements representative of a table structure containing a header and body

I am trying to build a xml structure that represents a table with rows and cells using linq to xml. Example:
<table>
<tablerow>
<cell></cell>
<cell></cell>
<cell></cell>
</tablerow>
<tablerow>
<cell></cell>
<cell></cell>
<cell></cell>
</tablerow>
</table>
The first row serves as a header and the second row contains the field values. In the object I am pulling the names from, the names are the same for both rows.
I am trying to figure out the best approach to add the field names to the cells of both tablerow elements so that I get this output:
<table>
<tablerow>
<cell>Field1</cell>
<cell>Field2</cell>
<cell>Field3</cell>
</tablerow>
<tablerow>
<cell>Field1</cell>
<cell>Field2</cell>
<cell>Field3</cell>
</tablerow>
</table>
I am currently retrieving all the cell elements
var cells = doc.Descendants("tablerow").Descendants("cell");
and then using a foreach to insert the content with a normal .Add()
foreach (var c in cells)
{
c.Add(/* XElement content... */);
}
My question is if I only have 3 fields (but 6 cells) what would be the best approach for populating them into the 6 cells.
I generate the cells dynamically so I can control and ensure that there will always
be one cell for each field in each row.
I'd appreciate any suggestions or ideas
-Cheers
You could do it like so:
var fieldNames = new[] { "Field1", "Field2", "Field3" };
var doc = XDocument.Load("c:/somewhere.xml").Root;
foreach (var row in doc.Elements("tablerow"))
{
var i=0; // index into fieldNames array
foreach (var cell in row.Elements("cell"))
{
cell.Add(new XText(fieldNames[i++])); // take one, and increment
}
}
doc.Save("c:/somewhere.xml");
If it helps, you can create it from scratch using:
var fieldNames = new[] { "Field1", "Field2", "Field3" };
var doc = new XElement("table");
doc.Add(CreateTableRowElement(fieldNames));
doc.Add(CreateTableRowElement(fieldNames));
doc.Save("c:/file.xml");
and the helper function
private XElement CreateTableRowElement(string[] fieldNames)
{
return new XElement("tablerow", fieldNames.Select(name =>
new XElement("cell", new XText(name))));
}
Thanks to CSJ for guiding me in the right direction. His answer is the correct one, but I wanted to share my code because my implementation was just a tad different. Here is the full working sample I mocked up in a console app. - Cheers
static void Main(string[] args)
{
XDocument xd = CreateXml();
List<string> stuff = GenerateList();
PopulateArray(stuff, xd);
}
private static XDocument PopulateArray(List<string>mylist, XDocument xmlFile)
{
var row = xmlFile.Descendants("tablerow");
foreach (var r in row)
{
var i = 0;
var cell = r.Descendants("cell");
foreach (var c in cell)
{
c.Add(new XText(mylist[i++]));
}
}
return xmlFile;
}
private static XDocument CreateXml()
{
XDocument doc = new XDocument(
new XElement("table",
new XElement("tablerow",
new XElement("cell"),
new XElement("cell"),
new XElement("cell")
),
new XElement("tablerow",
new XElement("cell"),
new XElement("cell"),
new XElement("cell")
)
)
);
return doc;
}
private static List<string> GenerateList()
{
return new List<string> {
"Orange",
"Grape",
"Banana"
};
}

Reading XML Data and Storing in DataTable

I have a log file like this:
This is the segment 1
============================
<MAINELEMENT><ELEMENT1>10-10-2013 10:10:22.444</ELEMENT1><ELEMENT2>1111</ELEMENT2>
<ELEMENT3>Message 1</ELEMENT3></MAINELEMENT>
<MAINELEMENT><ELEMENT1>10-10-2013 10:10:22.555</ELEMENT1><ELEMENT2>1111</ELEMENT2>
<ELEMENT3>Message 2</ELEMENT3></MAINELEMENT>
This is the segment 2
============================
<MAINELEMENT><ELEMENT1>10-11-2012 10:10:22.444</ELEMENT1><ELEMENT2>2222</ELEMENT2>
<ELEMENT3>Message 1</ELEMENT3></MAINELEMENT>
<MAINELEMENT><ELEMENT1>10-11-2012 10:10:22.555</ELEMENT1><ELEMENT2>2222</ELEMENT2>
<ELEMENT3>Message 2</ELEMENT3></MAINELEMENT>
How can I read this into a DataTable while completely excluding the "This is the segment 1" and "This is the segment 2" lines and the "======" lines?
I would like the DataTable to have the columns "ELEMENT1", "ELEMENT2", and "ELEMENT3", filled with the content between those tags in the order in which the lines appear in the file.
Inserting must not change the order of the records in the table.
HtmlAgilityPack seems to be a good tool for what you need:
using HtmlAgilityPack;
class Program
{
static void Main(string[] args)
{
var doc = new HtmlDocument();
doc.Load("log.txt");
var dt = new DataTable();
bool hasColumns = false;
foreach (HtmlNode row in doc
.DocumentNode
.SelectNodes("//mainelement"))
{
if (!hasColumns)
{
hasColumns = true;
foreach (var column in row.ChildNodes
.Where(node => node.GetType() == typeof(HtmlNode)))
{
dt.Columns.Add(column.Name);
}
}
dt.Rows.Add(row.ChildNodes
.Where(node => node.GetType() == typeof(HtmlNode))
.Select(node => node.InnerText).ToArray());
}
}
}
You could do this, where stringData is the data from the file you have:
var array = stringData.Split(new[] { "============================" }, StringSplitOptions.RemoveEmptyEntries);
var document = new XDocument(new XElement("Root"));
foreach (var item in array)
{
if(!item.Contains("<"))
continue;
var subDocument = XDocument.Parse("<Root>" + item.Substring(0, item.LastIndexOf('>') + 1) + "</Root>");
foreach (var element in subDocument.Root.Descendants("MAINELEMENT"))
{
document.Root.Add(element);
}
}
var table = new DataTable();
table.Columns.Add("ELEMENT1");
table.Columns.Add("ELEMENT2");
table.Columns.Add("ELEMENT3");
var rows =
document.Descendants("MAINELEMENT").Select(el =>
{
var row = table.NewRow();
row["ELEMENT1"] = el.Element("ELEMENT1").Value;
row["ELEMENT2"] = el.Element("ELEMENT2").Value;
row["ELEMENT3"] = el.Element("ELEMENT3").Value;
return row;
});
foreach (var row in rows)
{
table.Rows.Add(row);
}
foreach (DataRow dataRow in table.Rows)
{
Console.WriteLine("{0},{1},{2}", dataRow["ELEMENT1"], dataRow["ELEMENT2"], dataRow["ELEMENT3"]);
}
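Another option, offered here only as a sketch and not taken from either answer above, is to pull each record out with a regular expression and parse it on its own. It assumes every record is a well-formed <MAINELEMENT>...</MAINELEMENT> fragment, possibly spanning two lines as in the sample; the class name LogParser is hypothetical.

```csharp
using System.Data;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class LogParser
{
    public static DataTable Parse(string logText)
    {
        var table = new DataTable("Log");
        table.Columns.Add("ELEMENT1");
        table.Columns.Add("ELEMENT2");
        table.Columns.Add("ELEMENT3");

        // Singleline lets '.' match newlines, because one record may be
        // split across two lines. The lazy '.*?' stops at the first closing
        // tag, so each match is exactly one record. Segment titles and
        // '=====' lines never match and are skipped.
        var matches = Regex.Matches(
            logText, "<MAINELEMENT>.*?</MAINELEMENT>", RegexOptions.Singleline);

        // Matches come back in document order, so row order follows the file.
        foreach (Match match in matches)
        {
            XElement record = XElement.Parse(match.Value);
            table.Rows.Add(
                (string)record.Element("ELEMENT1"),
                (string)record.Element("ELEMENT2"),
                (string)record.Element("ELEMENT3"));
        }
        return table;
    }
}
```

It could be called as LogParser.Parse(File.ReadAllText("log.txt")), using the same file name as the HtmlAgilityPack example.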
I'm not sure where your problem is.
You can use XElement to read the XML and create the DataTable manually.
For reading the XML, see "Xml Parsing using XElement".
Then you can create the DataTable dynamically.
Here's an example of creating a DataTable in code:
https://sites.google.com/site/bhargavaclub/datatablec
But why do you want to use a DataTable? There are a lot of downsides...
