XML from DataTable using Linq - c#

This code
XmlDataDocument xmlDataDocument = new XmlDataDocument(ds);
does not work for me, because the node names are derived from the columns' encoded ColumnName property and will look like "last_x0020_name", for instance. I cannot use this in the resulting Excel spreadsheet. To turn the column names into something friendlier, I need to generate the XML myself.
I like LINQ to XML, and one of the responses to this question contained the following snippets:
XDocument doc = new XDocument(new XDeclaration("1.0", "UTF-8", "yes"),
    new XElement("products", from p in collection
        select new XElement("product",
            new XAttribute("guid", p.ProductId),
            new XAttribute("title", p.Title),
            new XAttribute("version", p.Version))));
The entire goal is to dynamically derive the column names from the dataset, so hardcoding them is not an option. Can this be done with Linq and without making the code much longer?

It ought to be possible.
To use your DataSet as a source you need LINQ to DataSet.
Then you need a nested query:
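// untested
// AsEnumerable() comes from System.Data.DataSetExtensions (LINQ to DataSet).
var table = ds.Tables["ProductsTable"];
var data = new XElement("products",
    from row in table.AsEnumerable()
    select new XElement("product",
        // Columns is a non-generic collection, so the range variable needs an explicit type.
        from DataColumn column in table.Columns
        // ColumnName must already be a valid XML element name, or XElement throws.
        select new XElement(column.ColumnName, row[column])));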
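If the friendly column names are not valid XML element names (for example "File #"), one way around it might be to keep the element name fixed and carry the column name in an attribute instead. A minimal sketch under that assumption; the element names "rows", "row" and "field" and the sample table are made up for illustration and are not from the question:

```csharp
using System;
using System.Data;
using System.Linq;
using System.Xml.Linq;

public static class DataTableXml
{
    // Builds <rows><row><field name="...">value</field>...</row></rows>.
    // The column name is stored in an attribute value, so it may contain
    // spaces, '#', and other characters that are illegal in element names.
    public static XElement ToXml(DataTable table)
    {
        return new XElement("rows",
            from DataRow row in table.Rows
            select new XElement("row",
                from DataColumn column in table.Columns
                select new XElement("field",
                    new XAttribute("name", column.ColumnName),
                    row[column])));
    }

    public static void Main()
    {
        // Hypothetical sample data.
        var table = new DataTable("Employees");
        table.Columns.Add("File #", typeof(string));
        table.Columns.Add("last name", typeof(string));
        table.Rows.Add("00123", "Smith");

        Console.WriteLine(ToXml(table));
    }
}
```

An XSLT could then pick out values by matching on the name attribute rather than on the element name.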

I appreciate the answers, but I had to abandon this approach altogether. I did manage to produce the XML that I wanted (albeit not with Linq), but of course there is a reason why the default implementation of the XmlDataDocument constructor uses the EncodedColumnName - namely that special characters are not allowed in element names in XML. But since I wanted to use the XML to convert what used to be a simple CSV file to the XML Spreadsheet format using XSLT (customer complains about losing leading 0's in ZIP codes etc when loading the original CSV into Excel), I had to look into ways that preserve the data in Excel.
But the ultimate goal of this is to produce a CSV file for upload to the payroll processor, and they mandate the column names to be something that is not XML-compliant (e.g. "File #"). The data is reviewed by humans before the upload, and they use Excel.
I resorted to hard-coding the column names in the XSLT after all.

Related

How do I append to a CSV file using Filehelpers with multiple record types that have distinct headers?

As the question says, using the FileHelpers library I am attempting to generate a CSV file alongside a report file. The report file may have different (but finite) inputs/data structures, and hence my CSV generation method is not explicitly typed. The CSV contains all of the report data as well as the report's header information. For my headers, I am using the class object properties because they are descriptive enough for my end-use purpose.
My relevant code snippet is below:
// File location, where the .csv goes and gets stored.
string filePath = Path.Combine(destPath, fileName);
// First, write report header details based on header list
Type type = DetermineListType(headerValues);
var headerEngine = new FileHelperEngine(type);
headerEngine.HeaderText = headerEngine.GetFileHeader();
headerEngine.WriteFile(filePath, (IEnumerable<object>)headerValues);
// Next, append the report data below the report header data.
type = DetermineListType(reportData);
var reportDataEngine = new FileHelperEngine(type);
reportDataEngine.HeaderText = reportDataEngine.GetFileHeader();
reportDataEngine.AppendToFile(filePath, (IEnumerable<object>)reportData);
When this is executed, the CSV is successfully generated; however, the .AppendToFile() method does not add the reportDataEngine.HeaderText. I do not see this functionality for .AppendToFile() in the documentation, and I am wondering if anyone has a known workaround for this, or a suggestion for how to output the headers of two different class objects in a single CSV file using FileHelpers.
The desired output would look something like this, but in a single CSV file (this would be a contiguous CSV, obviously, not tables):
Report_Name    Operator  Timestamp
Access Report  User1     14:50:12 28 Dec 2020

UserID  Login_Time  Logout_Time
User4   09:33:23    10:45:34
User2   11:32:11    11:44:11
User4   15:14:22    16:31:09
User1   18:55:32    19:10:10
I have looked also at the MultiRecordEngine in FileHelpers and while I think this may be helpful, I cannot figure out based on the examples how to actually write a multirecord CSV file in the required fashion I have above; if it is possible at all.
Thank you!
The best way is to merge the columns into one big table, then make your classes match the columns so you can separate them out again when reading. CSV only allows the first row to define the column names, and even that is optional depending on your use case. Look at CsvHelper (https://joshclose.github.io/CsvHelper/); it has a lot of built-in features and plenty of examples. Let me know if you need additional help.
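If the two-section layout from the question really is required, it can also be produced without any CSV library by writing each section's header line and rows to the same writer. This is only a sketch in plain .NET, not FileHelpers or CsvHelper code; WriteSection and the two record classes are hypothetical names, quoting of values is left out, and the column order relies on reflection returning properties in declaration order, which is the usual behaviour but not a documented guarantee:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class SectionedCsv
{
    // Writes one header line (the public property names) followed by one line per record.
    // Values are written as-is; quoting and escaping are omitted for brevity.
    public static void WriteSection<T>(TextWriter writer, IEnumerable<T> records)
    {
        var props = typeof(T).GetProperties();
        writer.WriteLine(string.Join(",", props.Select(p => p.Name)));
        foreach (var record in records)
            writer.WriteLine(string.Join(",", props.Select(p => p.GetValue(record))));
    }

    // Hypothetical record types mirroring the two tables in the question.
    public class ReportHeader
    {
        public string Report_Name { get; set; }
        public string Operator { get; set; }
    }

    public class AccessRow
    {
        public string UserID { get; set; }
        public string Login_Time { get; set; }
    }

    public static void Main()
    {
        // Swap the StringWriter for new StreamWriter(filePath) to write a file.
        using (var writer = new StringWriter())
        {
            WriteSection(writer, new[] { new ReportHeader { Report_Name = "Access Report", Operator = "User1" } });
            WriteSection(writer, new[] { new AccessRow { UserID = "User4", Login_Time = "09:33:23" } });
            Console.Write(writer.ToString());
        }
    }
}
```

Because both sections go through the same writer, nothing needs to be appended afterwards, which sidesteps the missing header on append.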

Define dateTime format in xsd

At my work we load excel files and save them in the database.
This is basically the flow:
We import data into a DataSet from an Excel file, where each sheet is loaded into its own DataTable inside the DataSet. After populating the DataSet, I want to validate the data inside it, let's say the first DataTable. I get XML from the DataTable by using the WriteXml() method of the DataTable class and load this XML into an XDocument. I then use the Validate() method of the XDocument class with a predefined XSD, which is loaded into an XmlSchemaSet object.
The problem is that the data in the excel is stored in a format that is different from the format of dateTime in xsd.
We get Excel files with datetime columns formatted like this: '12/01/2015 12:44:45', whereas the dateTime format in xsd should be like this: '2015-01-12T12:44:45'
Is it possible to define custom dateTime format in an xsd file?
For example, instead of '2015-01-12T12:44:45', I would like it to be '12/01/2015 12:44:45', so my xml element would look like this:
<createDate>12/01/2015 12:44:45</createDate>
I also wouldn't mind if the time part were ignored altogether.
Another custom xsd format I need looks like this: 378,216.00
Is it possible to define it in my xsd file?
Here is the code where we validate the xml retrieved from the DataTable:
public string[] ValidateExcelFromXsdFile(string schemaUri)
{
    _validationErrors.Clear();
    var schemas = new XmlSchemaSet();
    schemas.Add("", schemaUri);
    var doc = XDocument.Parse(GetXml(_dataSetFromExcel.Tables[0]));
    doc.Validate(schemas, (sender, args) => _validationErrors.Add(args.Message));
    return _validationErrors.ToArray();
}
You can define a pattern for strings in the format 'dd/mm/yyyy hh:mm:ss' using a regular expression, but the resulting value won't be an xs:dateTime, and checking for full validity (leap years etc) is a bit of a nightmare (it can be done, but leads to a regular expression that's about a mile long).
A better solution here might be the transform-then-validate pattern, where you preprocess the input document (into standard XSD format) before validating it. You can even do some of the validation during the preprocessing phase if you choose.
The Saxon schema processor has a preprocess facet which allows you to declare some rearrangement of a value prior to schema processing, which is exactly what you need here (for both your use cases), but unfortunately it's not standard.
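In C#, the preprocessing step of transform-then-validate could be sketched roughly as follows. It assumes the dates are day/month/year, as the example pair in the question suggests, and that the numbers use ',' for thousands and '.' for decimals; the class name, the method names and the "amount" element are hypothetical:

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

public static class ExcelValueNormalizer
{
    // Converts "12/01/2015 12:44:45" (day/month/year) to "2015-01-12T12:44:45".
    public static string ToXsdDateTime(string excelValue)
    {
        var parsed = DateTime.ParseExact(excelValue, "dd/MM/yyyy HH:mm:ss", CultureInfo.InvariantCulture);
        return parsed.ToString("yyyy-MM-dd'T'HH:mm:ss", CultureInfo.InvariantCulture);
    }

    // Converts "378,216.00" to "378216.00", which is a valid xs:decimal.
    public static string ToXsdDecimal(string excelValue)
    {
        var parsed = decimal.Parse(excelValue, NumberStyles.Number, CultureInfo.InvariantCulture);
        return parsed.ToString(CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        var doc = XDocument.Parse(
            "<row><createDate>12/01/2015 12:44:45</createDate><amount>378,216.00</amount></row>");

        // Rewrite the values in place; ToList() avoids editing the tree while it is being enumerated.
        foreach (var e in doc.Descendants("createDate").ToList()) e.Value = ToXsdDateTime(e.Value);
        foreach (var e in doc.Descendants("amount").ToList()) e.Value = ToXsdDecimal(e.Value);

        // The normalized document can now be passed to Validate() with standard xs:dateTime and xs:decimal types.
        Console.WriteLine(doc);
    }
}
```

ParseExact and Parse throw a FormatException on malformed input, so catching that exception is one place where part of the validation could happen during preprocessing.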
Check this site, which covers the dateTime data type: http://www.w3schools.com/schema/schema_dtypes_date.asp
The Date and Time Data Types paragraph there will also allow you to adjust the rule to your needs.

Saving a dataset as XML can I override tag names produced to get smaller XML

So I'm saving out a dataset to an xml file in an application where producing a smaller file is important. Right now it just saves the XML elements under the names of the dataset's tables and fields. I'm wondering if there's a way to easily save out these tags as something other than the default dataset names and load them back in using the same names.
For example right now I'll do something like this to save out the dataset
XmlTextWriter xml = new XmlTextWriter("myfile.xml", Encoding.UTF8) { Formatting = Formatting.None };
myDataset.WriteXml(xml);
and it'll produce long xml that looks like this...
<MYDataSet xmlns="http://tempuri.org/MYDataSet.xsd">
<MyElement><ReallyLongTagNames>other info and tags</ReallyLongTagNames></MyElement>
...
</MYDataSet>
I want to be able to save and load stuff that looks more like this to conserve space, but I can't find a good way to do it. "Good" meaning that the dataset itself doesn't become cryptic, while I can still load and save XML like this.
<MYDataSet xmlns="http://tempuri.org/MYDataSet.xsd">
<a><b>other info and tags</b></a>
...
</MYDataSet>
Is there a good way to get xml to print out this way when saving off a dataset?
How about renaming the columns in the DataSet before exporting to XML?
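A rough sketch of that renaming idea, assuming an untyped DataSet and a hand-maintained map from long names to short ones (ShortNames, Rename and the map contents are hypothetical; a typed DataSet with a generated schema may need more work than this):

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;
using System.Linq;

public static class ShortNames
{
    // Renames tables and columns according to the map; names not in the map are left alone.
    // Renaming to a name that already exists in the same table or DataSet will throw.
    public static void Rename(DataSet ds, IDictionary<string, string> map)
    {
        foreach (DataTable table in ds.Tables.Cast<DataTable>().ToList())
        {
            foreach (DataColumn column in table.Columns.Cast<DataColumn>().ToList())
            {
                if (map.TryGetValue(column.ColumnName, out var shortColumn))
                    column.ColumnName = shortColumn;
            }
            if (map.TryGetValue(table.TableName, out var shortTable))
                table.TableName = shortTable;
        }
    }

    public static void Main()
    {
        var ds = new DataSet("MYDataSet");
        var table = ds.Tables.Add("MyElement");
        table.Columns.Add("ReallyLongTagNames", typeof(string));
        table.Rows.Add("other info");

        var toShort = new Dictionary<string, string> { ["MyElement"] = "a", ["ReallyLongTagNames"] = "b" };
        Rename(ds, toShort);

        var writer = new StringWriter();
        ds.WriteXml(writer);
        Console.WriteLine(writer);

        // Restore the readable names after saving (or after loading) with the inverted map.
        Rename(ds, toShort.ToDictionary(kv => kv.Value, kv => kv.Key));
    }
}
```

The same inverted map would be applied after ReadXml, so the rest of the application only ever sees the long names.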
If file size is important then use the Binary Formatter. This will reduce the size dramatically.
// Note: BinaryFormatter is marked obsolete in current .NET versions.
string path = @"D:\myfile.dat";
BinaryFormatter bf = new BinaryFormatter();
mydataset.RemotingFormat = SerializationFormat.Binary;
using (FileStream fs = File.Create(path))
{
    bf.Serialize(fs, mydataset);
}

Reading Xml into a datagrid in C#

What's the best way to read Xml from either an XmlDocument or a String into a DataGrid?
Does the xml have to be in a particular format?
Do I have to use a DataSet as an intermediary?
I'm working on a client that consumes Xml sent over from a server which is being developed by one of my colleagues, so I can get him to change the format of the Xml to match what a DataGrid requires.
It depends on which version of .NET you are running on. If you can use Linq2Xml then it is easy. Just create an XDocument and select the child nodes as a list of an anonymous type.
If you can't use Linq2Xml then you have a few other options. Using a DataSet is one; this can work well, but it depends on the xml you are receiving. Another option is to create a class that describes the entity you will read from the xml and step through the xml nodes manually. A third option would be to use Xml serialization and deserialize the xml into a list of objects. This can work well as long as you have classes that are set up for it.
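To illustrate the Linq2Xml route, here is a small sketch that parses a string and projects the child nodes into a list of an anonymous type. The XML payload and its element names are invented for the example; the real shape depends on what the server sends:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class XmlToRows
{
    public static void Main()
    {
        // Hypothetical payload; the element names are illustrative only.
        var doc = XDocument.Parse(
            "<customers>" +
            "<customer><name>Ann</name><city>Oslo</city></customer>" +
            "<customer><name>Bob</name><city>Rome</city></customer>" +
            "</customers>");

        // One anonymous object per <customer>; the cast yields null if a child element is missing.
        var rows = doc.Root.Elements("customer")
            .Select(c => new
            {
                Name = (string)c.Element("name"),
                City = (string)c.Element("city")
            })
            .ToList();

        foreach (var row in rows)
            Console.WriteLine(row.Name + ", " + row.City);
    }
}
```

In a UI project the rows list would be assigned to the grid's DataSource instead of being printed.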
The easiest option will be either to create an XDocument or to create a DataSet as you suggest.
Obviously your XML needs to be valid :)
After that, define a DataSet and a DataGrid. Use the ReadXml method on the DataSet to fill it with your XML, finish with a DataBind, and you are good to go.
DataSet myDataSet = new DataSet();
// ReadXml(string) expects a file name, so wrap the XML text in a StringReader.
myDataSet.ReadXml(new StringReader(myXMLString));
myDataGrid.DataSource = myDataSet;
myDataGrid.DataBind();
You can simply use the XmlDataSource object as the grid's data source. That allows you to set the file and the XPath, in order to choose the XML that is the source of your data. You can then use the <%# XPath("blah") %> function to write out your data explicitly, if you like.
We have a partial answer to get the data into a dataset but it reads it in as a set of tables with relational links.
DataSet ds = new DataSet();
XmlTextReader xmlreader = new XmlTextReader(xmlSource, XmlNodeType.Document, null);
ds.ReadXml(xmlreader);

Read XMl File to String Array

I have an XML file and I want to read the data and assign it to string arrays, so Operative would be assigned to one array and JobLocation to another.
<Demo>
<JOBOperatives>
<Operative>
<Clock>aaaa</Clock>
<Name>aaaaa</Name>
<MobileNumber>00000000010</MobileNumber>
<OperativeTrade>3</OperativeTrade>
<OperativeTicket>1</OperativeTicket>
</Operative>
</JOBOperatives>
<JobLocation>
<UPRN>aaa</UPRN>
<Address1>aaaa</Address1>
<Address2>aaaa</Address2>
<Address3>aaaa</Address3>
<Address4>aaa</Address4>
<Address5>aa</Address5>
<PostCode>JR4 4ED</PostCode>
</JobLocation>
I take it you mean where each property from the xml is its own element in the array?
That doesn't seem like a very good data structure, especially as xml schema definitions allow for the items to arrive in any order; your expected indexes could get all screwed up. A strongly-typed object seems more appropriate and is well supported in .Net. At very least you should use a dictionary, so the keys are preserved.
In this case the number of items in each tree is very small and you could end up with many of them, so a dictionary is probably not the best choice. You could do objects, but that would be a lot of extra code just to set it up and I get the impression the xml may come from different sources and be different based on the source (or something where the structure could change regularly, hence the initial desire for loosely-typed validation).
Ultimately your destination is a database, so I think in this case I'll show you an example using a dataset:
string xml = GetXmlString(); // <Demo><JOBOperatives><Operative><Clock>aaaa</Clock>...
StringReader sr = new StringReader(xml);
DataSet ds = new DataSet();
ds.ReadXml(sr);
Play around with that: look in the dataset's .Tables collection, at each table's .TableName property, .Columns collection, and .Rows collection, and at each column's .ColumnName and .DataType properties.
The OuterXml property of the XmlNode class might help you here.
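If plain string arrays are still what you want, LINQ to XML can produce them directly. A minimal sketch, assuming the document is well-formed (the sample in the question is missing its closing Demo tag) and that only the first Operative matters; ChildValues is a hypothetical helper name and the XML is shortened:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class DemoArrays
{
    // Returns the text of each child element of the first element with the given name,
    // or an empty array if no such element exists.
    public static string[] ChildValues(XDocument doc, string elementName)
    {
        var parent = doc.Descendants(elementName).FirstOrDefault();
        return parent == null
            ? new string[0]
            : parent.Elements().Select(e => e.Value).ToArray();
    }

    public static void Main()
    {
        // Shortened version of the XML from the question, with the root element closed.
        var doc = XDocument.Parse(
            "<Demo>" +
            "<JOBOperatives><Operative><Clock>aaaa</Clock><Name>aaaaa</Name></Operative></JOBOperatives>" +
            "<JobLocation><UPRN>aaa</UPRN><PostCode>JR4 4ED</PostCode></JobLocation>" +
            "</Demo>");

        string[] operative = ChildValues(doc, "Operative");
        string[] location = ChildValues(doc, "JobLocation");

        Console.WriteLine(string.Join("|", operative));
        Console.WriteLine(string.Join("|", location));
    }
}
```

The array positions follow document order, so the caveat above about elements arriving in a different order still applies.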
