So I am reading a xml file with unknown length and reading each element into a list structure. Right now once I get to the end of the file I continue reading, this causes an exception. Right now I just catch this exception and continue with my life but is there a cleaner way to do this?
try
{
while(!textReader.EOF)
{
// Used to store info from each command as they are read from the xml file
ATAPassThroughCommands command = new ATAPassThroughCommands ();
// the following is just commands being read and their contents being saved
XmlNodeType node = textReader.NodeType;
textReader.ReadStartElement( "Command" );
node = textReader.NodeType;
name = textReader.ReadElementString( "Name" );
node = textReader.NodeType;
CommandListContext.Add(name);
command.m_Name = name;
command.m_CMD = Convert .ToByte(textReader.ReadElementString("CMD" ),16);
command.m_Feature = Convert .ToByte(textReader.ReadElementString("Feature" ),16);
textReader.ReadEndElement(); //</command>
m_ATACommands.Add(command);
}
}
catch ( Exception ex)
{
//</ATAPassThrough> TODO: this is an ugly fix come up with something better later
textReader.ReadEndElement();
//cUtils.DisplayError(ex.Message);
}
xml file:
<ATAPassThrough>
<Command>
<Name>Smart</Name>
<CMD>B0</CMD>
<Feature>D0</Feature>
</Command>
<Command>
<Name>Identify</Name>
<CMD>B1</CMD>
<Feature>D0</Feature>
</Command>
.
.
.
.
</ATAPassThrough>
I would recomend using XDocument for reading XML data... for instance in your case since you already have a TextReader for your XML you can just pass that into the XDocument.Load method... your entire function above looks like this..
var doc = XDocument.Load(textReader);
foreach (var commandXml in doc.Descendants("Command"))
{
var command = new ATAPassThroughCommands();
var name = commandXml.Descendants("Name").Single().Value;
// I'm not sure what this does but it looks important...
CommandListContext.Add(name);
command.m_Name = name;
command.m_CMD =
Convert.ToByte(commandXml.Descendants("CMD").Single().Value, 16);
command.m_Feature =
Convert.ToByte(commandXml.Descendants("Feature").Single().Value, 16);
m_ATACommands.Add(command);
}
Significantly easier. Let the framework do the heavy lifting for you.
Probably the easiest way if you have normal and consistant XML is to use the XML Serializer.
First Create Objects that match your XML
[Serializable()]
public class Command
{
[System.Xml.Serialization.XmlElement("Name")]
public string Name { get; set; }
[System.Xml.Serialization.XmlElement("CMD")]
public string Cmd { get; set; }
[System.Xml.Serialization.XmlElement("Feature")]
public string Feature { get; set; }
}
[Serializable()]
[System.Xml.Serialization.XmlRoot("ATAPassthrough")]
public class CommandCollection
{
[XmlArrayItem("Command", typeof(Command))]
public Command[] Command { get; set; }
}
The a method to return the CommandCollection
public class CommandSerializer
{
public commands Deserialize(string path)
{
CommandCollection commands = null;
XmlSerializer serializer = new XmlSerializer(typeof(CommandCollection ));
StreamReader reader = new StreamReader(path);
reader.ReadToEnd();
commands = (CommandCollection)serializer.Deserialize(reader);
reader.Close();
return commands ;
}
}
Not sure if this is exactly correct, I don't have the means to test it, but is should be really close.
Related
I created a class, very simple, and I'm attempting to read from a text file into a list using the class List.
I use StreamReader inputFile to open the file, but when I try to use ReadLine I get an error that I cannot convert from string to... .Bowler (which is the designation in my list.
I created the class so that I can access the list from multiple forms.
I'm obviously new to C#, and programming in general.
//the ReadBowlers method reads the names of bowlers
//into the listBowlers.
private void ReadBowlers(List<Bowler> listBowlers)
{
try
{
//Open the Bowlers.txt file.
StreamReader inputFile = File.OpenText("Bowlers.txt");
//read the names into the list
while (!inputFile.EndOfStream)
{
listBowlers.Add(inputFile.ReadLine());
}
//close the file.
inputFile.Close();
}
catch (Exception ex)
{
//display error message
MessageBox.Show(ex.Message);
}
The line that's giving me the error is:
listBowlers.Add(inputFile.ReadLine());
You can create a method that takes a string read from your file and returns a Bowler.
For example, suppose your line of data looks like this:
Bob Smith,5,XYZ
public Bowler Parse(string inputLine)
{
// split the line of text into its individual pieces
var lineSegments = inputLine.Split(',');
// create a new Bowler using those values
var result = new Bowler
{
Name = lineSegments[0],
Id = lineSegments[1],
SomeOtherBowlerProperty = lineSegments[2]
}
return result;
}
Now you can do this:
var line = inputFile.ReadLine();
var bowler = Parse(line);
listBowlers.Add(bowler);
But it gets even better! What if Bowler has lots of properties? What if you don't want to keep track of which position each column is in?
CsvHelper is a great Nuget package, and I'm sure there are others like it. They let us use someone else's tested code instead of writing it ourself. (I didn't lead with this because writing it first is a great way to learn, but learning to use what's available is good too.)
If your data has column headers, CsvHelper will figure out which columns contain which properties for you.
So suppose you have this data in a file:
FirstName,LastName,Id,StartDate
Bob,Smith,5,1/1/2019
John,Galt,6,2/1/2019
And this class:
public class Bowler
{
public int Id { get; set; }
public string FirstName { get; set; }
public string LastName { get; set; }
public DateTime? StartDate { get; set; }
}
You could write this code:
public List<Bowler> GetBowlersFromFile(string filePath)
{
using(var fileReader = File.OpenText(filePath))
using (var reader = new CsvReader(fileReader))
{
return reader.GetRecords<Bowler>().ToList();
}
}
It looks at the header row and figures out which column is which.
ReadLine() gives you the string so you can't directly assign it to a custom class unless an explicit conversion is specified.
Better make a new Bowler instance, assign its values by parsing the string separately and then add that instance to the list.
//the ReadBowlers method reads the names of bowlers
//into the listBowlers.
private void ReadBowlers(List<Bowler> listBowlers)
{
try
{
//Open the Bowlers.txt file.
string path = #"Bowlers.txt";
//read the names into the list
using (FileStream fs = File.OpenRead(path))
{
byte[] b = new byte[1024];
UTF8Encoding temp = new UTF8Encoding(true);
while (fs.Read(b,0,b.Length) > 0)
{
listBowlers.Add(temp.GetString(b));
}
}
}
catch (Exception ex)
{
//display error message
MessageBox.Show(ex.Message);
}
}
I have several text files with the following data structure:
{
huge
json
block that spans across multiple lines
}
--#newjson#--
{
huge
json
block that spans across multiple lines
}
--#newjson#--
{
huge
json
block that spans across multiple lines
} etc....
So it is actually json blocks that are row delimited by "--##newjson##--" string .
I am trying to write a customer extractor to parse this. The problem is that I can't use string data type to feed json deserializer because it has a maximum size of 128 KB and the json blocks do not fit in this. What is the best approach to parse this file using a custom extractor?
I have tried using the code below, but it doesn't work. Even the row delimiter "--#newjson#--" doesn't seem to work right.
public SampleExtractor(Encoding encoding, string row_delim = "--#newjson#--", char col_delim = ';')
{
this._encoding = ((encoding == null) ? Encoding.UTF8 : encoding);
this._row_delim = this._encoding.GetBytes(row_delim);
this._col_delim = col_delim;
}
public override IEnumerable<IRow> Extract(IUnstructuredReader input, IUpdatableRow output)
{
//Read the input by json
foreach (Stream current in input.Split(_encoding.GetBytes("--#newjson#--")))
{
var serializer = new JsonSerializer();
using (var sr = new StreamReader(current))
using (var jsonTextReader = new JsonTextReader(sr))
{
var jsonrow = serializer.Deserialize<JsonRow>(jsonTextReader);
output.Set(0, jsonrow.status.timestamp);
}
yield return output.AsReadOnly();
}
}
You dont need a custom extractor to do that.
The best solution is add one json by line. Then you can use a text extractor and extract line by line. You can also pick your own delimiter.
REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];
#JsonLines=
EXTRACT
[JsonLine] string
FROM
#Full_Path
USING
Extractors.Text(delimiter:'\b', quoting : false);
#ParsedJSONLines =
SELECT
Microsoft.Analytics.Samples.Formats.Json.JsonFunctions.JsonTuple([JsonLine]) AS JSONLine
FROM
#JsonLines
#AccessToProperties=
SELECT
JSONLine["Property"] AS Property
FROM
#ParsedJSONLines;
Here is how you can achieve the solution:
1) Create a c# equivalent of your JSON object
Note:- Assuming all your json object are same in your text file.
E.g:
Json Code
{
"id": 1,
"value": "hello",
"another_value": "world",
"value_obj": {
"name": "obj1"
},
"value_list": [
1,
2,
3
]
}
C# Equivalent
public class ValueObj
{
public string name { get; set; }
}
public class RootObject
{
public int id { get; set; }
public string value { get; set; }
public string another_value { get; set; }
public ValueObj value_obj { get; set; }
public List<int> value_list { get; set; }
}
2) Change your de-serializing code like below after you have done the split based on the delimiter
using (JsonReader reader = new JsonTextReader(sr))
{
while (!sr.EndOfStream)
{
o = serializer.Deserialize<List<MyObject>>(reader);
}
}
This would deserialize the json data in c# class object which would solve your purpose.
Later which you can serialize again or print it in text or ...any file.
Hope it helps.
When I serialize the value : If there is no value present in for data then it's coming like below format.
<Note>
<Type>Acknowledged by PPS</Type>
<Data />
</Note>
But what I want xml data in below format:
<Note>
<Type>Acknowledged by PPS</Type>
<Data></Data>
</Note>
Code For this i have written :
[Serializable]
public class Notes
{
[XmlElement("Type")]
public string typeName { get; set; }
[XmlElement("Data")]
public string dataValue { get; set; }
}
I am not able to figure out what to do for achieve data in below format if data has n't assign any value.
<Note>
<Type>Acknowledged by PPS</Type>
<Data></Data>
</Note>
You can do this by creating your own XmlTextWriter to pass into the serialization process.
public class MyXmlTextWriter : XmlTextWriter
{
public MyXmlTextWriter(Stream stream) : base(stream, Encoding.UTF8)
{
}
public override void WriteEndElement()
{
base.WriteFullEndElement();
}
}
You can test the result using:
class Program
{
static void Main(string[] args)
{
using (var stream = new MemoryStream())
{
var serializer = new XmlSerializer(typeof(Notes));
var writer = new MyXmlTextWriter(stream);
serializer.Serialize(writer, new Notes() { typeName = "Acknowledged by PPS", dataValue="" });
var result = Encoding.UTF8.GetString(stream.ToArray());
Console.WriteLine(result);
}
Console.ReadKey();
}
If you saved your string somewhere (e.g a file) you can use this simple Regex.Replace:
var replaced = Regex.Replace(File.ReadAllText(name), #"<([^<>/]+)\/>", (m) => $"<{m.Groups[1].Value.Trim()}></{m.Groups[1].Value.Trim()}>");
File.WriteAllText(name, replaced);
IMO it's not possibe to generate your desired XML using Serialization. But, you can use LINQ to XML to generate the desired schema like this -
XDocument xDocument = new XDocument();
XElement rootNode = new XElement(typeof(Notes).Name);
foreach (var property in typeof(Notes).GetProperties())
{
if (property.GetValue(a, null) == null)
{
property.SetValue(a, string.Empty, null);
}
XElement childNode = new XElement(property.Name, property.GetValue(a, null));
rootNode.Add(childNode);
}
xDocument.Add(rootNode);
XmlWriterSettings xws = new XmlWriterSettings() { Indent=true };
using (XmlWriter writer = XmlWriter.Create("D:\\Sample.xml", xws))
{
xDocument.Save(writer);
}
Main catch is in case your value is null, you should set it to empty string. It will force the closing tag to be generated. In case value is null closing tag is not created.
Kludge time - see Generate System.Xml.XmlDocument.OuterXml() output thats valid in HTML
Basically after XML doc has been generated go through each node, adding an empty text node if no children
// Call with
addSpaceToEmptyNodes(xmlDoc.FirstChild);
private void addSpaceToEmptyNodes(XmlNode node)
{
if (node.HasChildNodes)
{
foreach (XmlNode child in node.ChildNodes)
addSpaceToEmptyNodes(child);
}
else
node.AppendChild(node.OwnerDocument.CreateTextNode(""))
}
(Yes I know you shouldn't have to do this - but if your sending the XML to some other system that you can't easily fix then have to be pragmatic about things)
You can add a dummy field to prevent the self-closing element.
[XmlText]
public string datavalue= " ";
Or if you want the code for your class then Your class should be like this.
public class Notes
{
[XmlElement("Type")]
public string typeName { get; set; }
[XmlElement("Data")]
private string _dataValue;
public string dataValue {
get {
if(string.IsNullOrEmpty(_dataValue))
return " ";
else
return _dataValue;
}
set {
_dataValue = value;
}
}
}
In principal, armen.shimoon's answer worked for me. But if you want your XML output pretty printed without having to use XmlWriterSettings and an additional Stream object (as stated in the comments), you can simply set the Formatting in the constructor of your XmlTextWriter class.
public MyXmlTextWriter(string filename) : base(filename, Encoding.UTF8)
{
this.Formatting = Formatting.Indented;
}
(Would have posted this as a comment but am not allowed yet ;-))
Effectively the same as Ryan's solution which uses the standard XmlWriter (i.e. there's no need for a derived XmlTextWriter class), but written using linq to xml (XDocument)..
private static void AssignEmptyElements(this XNode node)
{
if (node is XElement e)
{
e.Nodes().ToList().ForEach(AssignEmptyElements);
if (e.IsEmpty)
e.Value = string.Empty;
}
}
usage..
AssignEmptyElements(document.FirstNode);
How can I read the following xml file into a List:
Partial XML file (data.log)
<ApplicationLogEventObject>
<EventType>Message</EventType>
<DateStamp>10/13/2016 11:15:00 AM</DateStamp>
<ShortDescription>N/A</ShortDescription>
<LongDescription>Sending 'required orders' email.</LongDescription>
</ApplicationLogEventObject>
<ApplicationLogEventObject>
<EventType>Message</EventType>
<DateStamp>10/13/2016 11:15:10 AM</DateStamp>
<ShortDescription>N/A</ShortDescription>
<LongDescription>Branches Not Placed Orders - 1018</LongDescription>
</ApplicationLogEventObject>
<ApplicationLogEventObject>
<EventType>Message</EventType>
<DateStamp>10/13/2016 11:15:10 AM</DateStamp>
<ShortDescription>N/A</ShortDescription>
<LongDescription>Branches Not Placed Orders - 1019</LongDescription>
</ApplicationLogEventObject>
...
And here is the data access layer (DAL):
public List<FLM.DataTypes.ApplicationLogEventObject> Get()
{
try
{
XmlTextReader xmlTextReader = new XmlTextReader(#"C:\data.log");
List<FLM.DataTypes.ApplicationLogEventObject> recordSet = new List<ApplicationLogEventObject>();
xmlTextReader.Read();
while (xmlTextReader.Read())
{
xmlTextReader.MoveToElement();
FLM.DataTypes.ApplicationLogEventObject record = new ApplicationLogEventObject();
record.EventType = xmlTextReader.GetAttribute("EventType").ToString();
record.DateStamp = Convert.ToDateTime(xmlTextReader.GetAttribute("DateStamp"));
record.ShortDescription = xmlTextReader.GetAttribute("ShortDescription").ToString()
record.LongDescription = xmlTextReader.GetAttribute("LongDescription").ToString();
recordSet.Add(record);
}
return recordSet;
}
catch (Exception ex)
{
throw ex;
}
}
And the Data Types which will hold the child elements from the XML file:
public class ApplicationLogEventObject
{
public string EventType { get; set; }
public DateTime DateStamp { get; set; }
public string ShortDescription { get; set; }
public string LongDescription { get; set; }
}
After I've read the child nodes into a List I would then like to return it and display it in a DataGridView.
Any help regarding this question will be much appreciated.
Your log file is not an XML document. Since an XML document must have one and only one root element, it's a series of XML documents concatenated together. Such a series of documents can be read by XmlReader by setting XmlReaderSettings.ConformanceLevel == ConformanceLevel.Fragment. Having done so, you can read through the file and deserialize each root element individually using XmlSerializer as follows:
static List<ApplicationLogEventObject> ReadEvents(string fileName)
{
return ReadObjects<ApplicationLogEventObject>(fileName);
}
static List<T> ReadObjects<T>(string fileName)
{
var list = new List<T>();
var serializer = new XmlSerializer(typeof(T));
var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
using (var textReader = new StreamReader(fileName))
using (var xmlTextReader = XmlReader.Create(textReader, settings))
{
while (xmlTextReader.Read())
{ // Skip whitespace
if (xmlTextReader.NodeType == XmlNodeType.Element)
{
using (var subReader = xmlTextReader.ReadSubtree())
{
var logEvent = (T)serializer.Deserialize(subReader);
list.Add(logEvent);
}
}
}
}
return list;
}
Using the following version of ApplicationLogEventObject:
public class ApplicationLogEventObject
{
public string EventType { get; set; }
[XmlElement("DateStamp")]
public string DateStampString {
get
{
// Replace with culturally invariant desired formatting.
return DateStamp.ToString(CultureInfo.InvariantCulture);
}
set
{
DateStamp = Convert.ToDateTime(value, CultureInfo.InvariantCulture);
}
}
[XmlIgnore]
public DateTime DateStamp { get; set; }
public string ShortDescription { get; set; }
public string LongDescription { get; set; }
}
Sample .Net fiddle.
Notes:
The <DateStamp> element values 10/13/2016 11:15:00 AM are not in the correct format for dates and times in XML, which is ISO 8601. Thus I introduced a surrogate string DateStampString property to manually handle the conversion from and to your desired format, and then marked the original DateTime property with XmlIgnore.
Using ReadSubtree() prevents the possibility of reading past the end of each root element when the XML is not indented.
According to the documentation for XmlTextReader:
Starting with the .NET Framework 2.0, we recommend that you use the System.Xml.XmlReader class instead.
Thus I recommend replacing use of that type with XmlReader.
The child nodes of your <ApplicationLogEventObject> are elements not attributes, so XmlReader.GetAttribute() was not an appropriate method to use to read them.
Given that your log files are not formatting their times in ISO 8601, you should at least make sure they are formatted in a culturally invariant format so that log files can be exchanged between computers with different regional settings. Doing your conversions using CultureInfo.InvariantCulture ensures this.
Been looking around alot to find answers to my problem. I found a great way to convert my List to an xml file, but I also want to get it back to a list.
My current xml file looks like this:
<Commands>
<Command command="!command1" response="response1" author="niccon500" count="1237"/>
<Command command="!command2" response="response2" author="niccon500" count="1337"/>
<Command command="!command3" response="response3" author="niccon500" count="1437"/>
</Commands>
Generated with:
var xml = new XElement("Commands", commands.Select(x =>
new XElement("Command",
new XAttribute("command", x.command),
new XAttribute("response", x.response),
new XAttribute("author", x.author),
new XAttribute("count", x.count))));
File.WriteAllText(file_path, xml.ToString());
So, how would I get the info back from the xml file?
You can use the following piece of code:
XDocument doc = XDocument.Load(#"myfile.xml");
to load the XML file into an XDocument object.
Then you can use .Descendants("Commands").Elements() to access all elements contained inside the <Commands> element and add them to a list:
List<Command> lstCommands = new List<Command>();
foreach(XElement elm in doc.Descendants("Commands").Elements())
{
lstCommands.Add(new Command
{
command = elm.Attribute("command").Value,
response = elm.Attribute("response").Value,
author = elm.Attribute("author").Value,
count = int.Parse(elm.Attribute("count").Value)
});
}
When using xml for persistent storage, start with defining POC classes and decorate them with proper attributes - like
[XmlType("Command")]
public class CommandXml
{
[XmlAttribute("command")]
public string Command { get; set; }
[XmlAttribute("response")]
public string Response { get; set; }
[XmlAttribute("author")]
public string Author { get; set; }
[XmlAttribute("count")]
public int Count { get; set; }
}
[XmlRoot("Commands")]
public class CommandListXml : List<CommandXml>
{
}
And then de-serialization is simple as:
var txt = #"<Commands>
<Command command=""!command1"" response=""response1"" author=""niccon500"" count=""1237""/>
<Command command=""!command2"" response=""response2"" author=""niccon500"" count=""1337""/>
<Command command=""!command3"" response=""response3"" author=""niccon500"" count=""1437""/>
</Commands>";
CommandListXml commands;
using (var reader = new StringReader(txt))
{
var serializer = new XmlSerializer(typeof(CommandListXml));
commands = (CommandListXml)serializer.Deserialize(reader);
}
You can use XDocument.Load method to load the document and then create command list using LINQ to XML syntax :
XDocument doc = XDocument.Load(file_path);
var commands = doc.Root.Elements("Commands").Select(e =>
new Command()
{
command = e.Attribute("command").Value,
response = e.Attribute("response").Value,
author = e.Attribute("author").Value,
count= int.Parse(e.Attribute("count").Value)
}).ToList();