Deserialize XML file - c#

I have an XML file containing 4 different strings, but I am having trouble deserializing the file. Could someone help me with this?
I looked online for answers, but none of them worked, I'm not sure what to do about it.
Here is the XML file I am trying to deserialize:
<?xml version="1.0" encoding="utf-8"?>
<saveData xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<strFolder1>1st Location</strFolder1>
<strFolder2>2nd Location</strFolder2>
<strTabName>newTab0</strTabName>
<strTabText>Main</strTabText>
</saveData>

var ser = new XmlSerializer(typeof(saveData));
var obj = (saveData)ser.Deserialize(stream);
public class saveData
{
public string strFolder1;
public string strFolder2;
public string strTabName;
public string strTabText;
}

I'd recommend looking at XmlReader. Some other approaches are easier in different ways, but you can build anything from XmlReader. Such as:
while(rdr.Read())
if(rdr.NodeType == XmlNodeType.Element)
switch(rdr.LocalName)
{
case "strFolder1":
strFolder1 = rdr.ReadContentAsString();
break;
case "strFolder2":
strFolder2 = rdr.ReadContentAsString();
break;
case "strTabName":
strTabName = rdr.ReadContentAsString();
break;
case "strTabText":
strTabText = rdr.ReadContentAsString();
break;
}
(We could take some short-cuts if guaranteed the ordering, I did it the hard way to show that the hard way isn't that hard).
Using XmlDocument, XmlSerializer and XDocument are easier in a lot of cases, but I recommend learning this first because it'll handle everything and is never less efficient. If you learn it first the worse that'll happen is you do a bit more work than necessary to end up with something a bit more efficient than strictly necessary (you'll do a micro-optimisation out of ignorance of the simpler ways). If you learn the others first the worse that'll happen is you do a lot more work than necessary to end up with something a lot less efficient than needed.

namespace Cars1
{
public class saveData
{
public string strFolder1;
public string strFolder2;
public string strTabName;
public string strTabText;
}
[Serializable]
class Program
{
static void Main(string[] args)
{
saveData obj = new saveData();
FileStream fopen = new FileStream("abc.xml", FileMode.Open);
XmlSerializer x = new XmlSerializer(obj.GetType());
StreamReader read_from_xml = new StreamReader(fopen);
obj = (saveData)x.Deserialize(read_from_xml);
Console.WriteLine(obj.strFolder1 + "=>" + obj.strFolder2 + "=>" + obj.strTabName+"=>"+obj.strTabText);
Console.ReadKey();
fopen.Close();
}
}
}

Related

How can I make XmlSerializer Deserialize tell me about typos on tag names

I know that most of the time, the Deserialize method of XmlSerializer will complain if there's something wrong (for example, if there is a typo). However, I've found an example where it doesn't complain, when I would have expected it to; and I'd like to know if there's a way of being told about the problem.
The example code below contains 3 things: an good example which works as expected, and example which would complain (commented out) and an example which does not complain, which is the one I want to know how to tell that there is something wrong.
Note: I appreciate that one possible route would be XSD validation; but that really feels like a sledgehammer to crack what seems like a simpler problem. For example, if I was writing a deserializer which had unexpected data that it didn't know what to do with, I'd make my code complain about it.
I've used NUnit (NuGet package) for assertions; but you don't really need it, just comment out the Assert lines - you can see what I'm expecting.
using System.IO;
using System.Linq;
using System.Text;
using System.Xml.Serialization;
using NUnit.Framework;
public static class Program
{
public static void Main()
{
string goodExampleXml = #"<?xml version=""1.0"" encoding=""utf-8""?><Example><Weathers><Weather>Sunny</Weather></Weathers></Example>";
var goodExample = Load(goodExampleXml);
Assert.That(goodExample, Is.Not.Null);
Assert.That(goodExample.Weathers, Is.Not.Null);
Assert.That(goodExample.Weathers, Has.Length.EqualTo(1));
Assert.That(goodExample.Weathers.First(), Is.EqualTo(Weather.Sunny));
string badExampleXmlWhichWillComplainXml = #"<?xml version=""1.0"" encoding=""utf-8""?><Example><Weathers><Weather>Suny</Weather></Weathers></Example>";
// var badExampleWhichWillComplain = Load(badExampleXmlWhichWillComplainXml); // this would complain, quite rightly, so I've commented it out
string badExampleXmlWhichWillNotComplain = #"<?xml version=""1.0"" encoding=""utf-8""?><Example><Weathers><Weathe>Sunny</Weathe></Weathers></Example>";
var badExample = Load(badExampleXmlWhichWillNotComplain);
Assert.That(badExample, Is.Not.Null);
Assert.That(badExample.Weathers, Is.Not.Null);
// clearly, the following two assertions will fail because I mis-typed the tag name; but I want to know there has been a problem before this point.
Assert.That(badExample.Weathers, Has.Length.EqualTo(1));
Assert.That(badExample.Weathers.First(), Is.EqualTo(Weather.Sunny));
}
private static Example Load(string serialized)
{
byte[] byteArray = Encoding.UTF8.GetBytes(serialized);
var xmlSerializer = new XmlSerializer(typeof(Example));
using var stream = new MemoryStream(byteArray, false);
return (Example)xmlSerializer.Deserialize(stream);
}
}
public enum Weather
{
Sunny,
Cloudy,
Rainy,
Windy,
Stormy,
Snowy,
}
public class Example
{
[System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Performance", "CA1819:PropertiesShouldNotReturnArrays", Justification = "Serialized XML")]
[XmlArray("Weathers")]
[XmlArrayItem("Weather")]
public Weather[] Weathers { get; set; }
}
Having looked at Microsoft's published source code for XmlSerializer, it became apparent that there are events that you can subscribe to (which is what I was hoping for); but they aren't exposed on the XmlSerializer itself... you have to inject a struct containing them into the constructor.
So I've been able to modify the code from the question to have an event handler which gets called when an unknown node is encountered (which is exactly what I was after). You need one extra using, over the ones given in the question...
using System.Xml;
and then here is the modified "Load" method...
private static Example Load(string serialized)
{
XmlDeserializationEvents events = new XmlDeserializationEvents();
events.OnUnknownNode = (sender, e) => System.Diagnostics.Debug.WriteLine("Unknown Node: " + e.Name);
var xmlSerializer = new XmlSerializer(typeof(Example));
using var reader = XmlReader.Create(new StringReader(serialized));
return (Example)xmlSerializer.Deserialize(reader, events);
}
So now I just need to do something more valuable than just write a line to the Debug output.
Note that more events are available, as described on the XmlDeserializationEvents page, and I'll probably pay attention to each of them.
I tested following and it works
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Serialization;
using System.IO;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string xml =#"<?xml version=""1.0"" encoding=""utf-8"" ?>
<Example>
<Weathers>Sunny</Weathers>
<Weathers>Cloudy</Weathers>
<Weathers>Rainy</Weathers>
<Weathers>Windy</Weathers>
<Weathers>Stormy</Weathers>
<Weathers>Snowy</Weathers>
</Example>";
StringReader sReader = new StringReader(xml);
XmlReader reader = XmlReader.Create(sReader);
XmlSerializer serializer = new XmlSerializer(typeof(Example));
Example example = (Example)serializer.Deserialize(reader);
}
}
public enum Weather
{
Sunny,
Cloudy,
Rainy,
Windy,
Stormy,
Snowy,
}
public class Example
{
[XmlElement("Weathers")]
public Weather[] Weathers { get; set; }
}
}

.NET assemblies: understanding type visibility

I am trying to reproduce something that System.Xml.Serialization already does, but for a different source of data.
For now task is limited to deserialization only.
I.e. given defined source of data that I know how to read. Write a library that takes a random type, learns about it fields/properties via reflection, then generates and compiles "reader" class that can take data source and an instance of that random type and writes from data source into the object's fields/properties.
here is a simplified extract from my ReflectionHelper class
public class ReflectionHelper
{
public abstract class FieldReader<T>
{
public abstract void Fill(T entity, XDataReader reader);
}
public static FieldReader<T> GetFieldReader<T>()
{
Type t = typeof(T);
string className = GetCSharpName(t);
string readerClassName = Regex.Replace(className, #"\W+", "_") + "_FieldReader";
string source = GetFieldReaderCode(t.Namespace, className, readerClassName, fields);
CompilerParameters prms = new CompilerParameters();
prms.GenerateInMemory = true;
prms.ReferencedAssemblies.Add("System.Data.dll");
prms.ReferencedAssemblies.Add(Assembly.GetExecutingAssembly().GetModules(false)[0].FullyQualifiedName);
prms.ReferencedAssemblies.Add(t.Module.FullyQualifiedName);
CompilerResults compiled = new CSharpCodeProvider().CompileAssemblyFromSource(prms, new string[] {source});
if (compiled.Errors.Count > 0)
{
StringWriter w = new StringWriter();
w.WriteLine("Error(s) compiling {0}:", readerClassName);
foreach (CompilerError e in compiled.Errors)
w.WriteLine("{0}: {1}", e.Line, e.ErrorText);
w.WriteLine();
w.WriteLine("Generated code:");
w.WriteLine(source);
throw new Exception(w.GetStringBuilder().ToString());
}
return (FieldReader<T>)compiled.CompiledAssembly.CreateInstance(readerClassName);
}
private static string GetFieldReaderCode(string ns, string className, string readerClassName, IEnumerable<EntityField> fields)
{
StringWriter w = new StringWriter();
// write out field setters here
return #"
using System;
using System.Data;
namespace " + ns + #".Generated
{
public class " + readerClassName + #" : ReflectionHelper.FieldReader<" + className + #">
{
public void Fill(" + className + #" e, XDataReader reader)
{
" + w.GetStringBuilder().ToString() + #"
}
}
}
";
}
}
and the calling code:
class Program
{
static void Main(string[] args)
{
ReflectionHelper.GetFieldReader<Foo>();
Console.ReadKey(true);
}
private class Foo
{
public string Field1 = null;
public int? Field2 = null;
}
}
The dynamic compilation of course fails because Foo class is not visible outside of Program class. But! The .NET XML deserializer somehow works around that - and the question is: How?
After an hour of digging System.Xml.Serialization via Reflector I came to accept that I lack some kind of basic knowledge here and not really sure what am I looking for...
Also it is entirely possible that I am reinventing a wheel and/or digging in a wrong direction, in which case please do speak up!
You don’t need to create a dynamic assembly and dynamically compile code in order to deserialise an object. XmlSerializer does not do that either — it uses the Reflection API, in particular it uses the following simple concepts:
Retrieving the set of fields from any type
Reflection provides the GetFields() method for this purpose:
foreach (var field in myType.GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic))
// ...
I’m including the BindingFlags parameter here to ensure that it will include non-public fields, because otherwise it will return only public ones by default.
Setting the value of a field in any type
Reflection provides the function SetValue() for this purpose. You call this on a FieldInfo instance (which is returned from GetFields() above) and give it the instance in which you want to change the value of that field, and the value to set it to:
field.SetValue(myObject, myValue);
This is basically equivalent to myObject.Field = myValue;, except of course that the field is identified at runtime instead of compile-time.
Putting it all together
Here is a simple example. Notice you need to extend this further to work with more complex types such as arrays, for example.
public static T Deserialize<T>(XDataReader dataReader) where T : new()
{
return (T) deserialize(typeof(T), dataReader);
}
private static object deserialize(Type t, XDataReader dataReader)
{
// Handle the basic, built-in types
if (t == typeof(string))
return dataReader.ReadString();
// etc. for int and all the basic types
// Looks like the type t is not built-in, so assume it’s a class.
// Create an instance of the class
object result = Activator.CreateInstance(t);
// Iterate through the fields and recursively deserialize each
foreach (var field in t.GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic))
field.SetValue(result, deserialize(field.FieldType, dataReader));
return result;
}
Notice I had to make some assumptions about XDataReader, most notably that it can just read a string like that. I’m sure you’ll be able to change it so that it works with your particular reader class.
Once you’ve extended this to support all the types you need (including int? in your example class), you can deserialize an object by calling:
Foo myFoo = Deserialize<Foo>(myDataReader);
and you can do this even when Foo is a private type as it is in your example.
If I try to use sgen.exe (the standalone XML serialization assembly compiler), I get the following error message:
Warning: Ignoring 'TestApp.Program'.
- TestApp.Program is inaccessible due to its protection level. Only public types can be processed.
Warning: Ignoring 'TestApp.Program+Foo'.
- TestApp.Program+Foo is inaccessible due to its protection level. Only public types can be processed.
Assembly 'c:\...\TestApp\bin\debug\TestApp.exe' does not contain any types that can be serialized using XmlSerializer.
Calling new XmlSerializer(typeof(Foo)) in your example code results in:
System.InvalidOperationException: TestApp.Program+Foo is inaccessible due to its protection level. Only public types can be processed.
So what gave you the idea that XmlSerializer can handle this?
However, remember that at runtime, there are no such restrictions. Trusted code using reflection is free to ignore access modifiers. This is what .NET binary serialization is doing.
For example, if you generate IL code at runtime using DynamicMethod, then you can pass skipVisibility = true to avoid any checks for visibility of fields/classes.
I've been working a bit on this. I'm not sure if it will help but, anyway I think it could be the way. Recently I worked with Serialization and DeSerealization of a class I had to send over the network. As there were two different programs (the client and the server), at first I implemented the class in both sources and then used serialization. It failed as the .Net told me it had not the same ID (I'm not sure but it was some sort of assembly id).
Well, after googling a bit I found that it was because the serialized class was on different assemblies, so the solution was to put that class in a independent library and then compile both client and server with that library. I've used the same idea with your code, so I put both Foo class and FieldReader class in a independent library, let's say:
namespace FooLibrary
{
public class Foo
{
public string Field1 = null;
public int? Field2 = null;
}
public abstract class FieldReader<T>
{
public abstract void Fill(T entity, IDataReader reader);
}
}
compile it and add it to the other source (using FooLibrary;)
this is the code I've used. It's not exactly the same as yours, as I don't have the code for GetCSharpName (I used t.Name instead) and XDataReader, so I used IDataReader (just for the compiler to accept the code and compile it) and also change EntityField for object
public class ReflectionHelper
{
public static FieldReader<T> GetFieldReader<T>()
{
Type t = typeof(T);
string className = t.Name;
string readerClassName = Regex.Replace(className, #"\W+", "_") + "_FieldReader";
object[] fields = new object[10];
string source = GetFieldReaderCode(t.Namespace, className, readerClassName, fields);
CompilerParameters prms = new CompilerParameters();
prms.GenerateInMemory = true;
prms.ReferencedAssemblies.Add("System.Data.dll");
prms.ReferencedAssemblies.Add(Assembly.GetExecutingAssembly().GetModules(false)[0].FullyQualifiedName);
prms.ReferencedAssemblies.Add(t.Module.FullyQualifiedName);
prms.ReferencedAssemblies.Add("FooLibrary1.dll");
CompilerResults compiled = new CSharpCodeProvider().CompileAssemblyFromSource(prms, new string[] { source });
if (compiled.Errors.Count > 0)
{
StringWriter w = new StringWriter();
w.WriteLine("Error(s) compiling {0}:", readerClassName);
foreach (CompilerError e in compiled.Errors)
w.WriteLine("{0}: {1}", e.Line, e.ErrorText);
w.WriteLine();
w.WriteLine("Generated code:");
w.WriteLine(source);
throw new Exception(w.GetStringBuilder().ToString());
}
return (FieldReader<T>)compiled.CompiledAssembly.CreateInstance(readerClassName);
}
private static string GetFieldReaderCode(string ns, string className, string readerClassName, IEnumerable<object> fields)
{
StringWriter w = new StringWriter();
// write out field setters here
return #"
using System;
using System.Data;
namespace " + ns + ".Generated
{
public class " + readerClassName + #" : FieldReader<" + className + #">
{
public override void Fill(" + className + #" e, IDataReader reader)
" + w.GetStringBuilder().ToString() +
}
}";
}
}
by the way, I found a tiny mistake, you should use new or override with the Fill method, as it is abstract.
Well, I must admit that GetFieldReader returns null, but at least the compiler compiles it.
Hope that this will help you or at least it guides you to the good answer
regards

OFX File parser in C#

I am looking for an OFX file parser library in C#. I have search the web but there seems to be none. Does anyone know of any good quality C# OFX file parser. I need to process some bank statements files which are in OFX format.
Update
I have managed to find a C# library for parsing OFX parser.
Here is the link ofx sharp. This codebase seems to be the best case to startup my solution.
I tried to use the ofx sharp library, but realised it doesn't work is the file is not valid XML ... it seems to parse but has empty values ...
I made a change in the OFXDocumentParser.cs where I first fix the file to become valid XML and then let the parser continue. Not sure if you experienced the same issue?
Inside of the method:
private string SGMLToXML(string file)
I added a few lines first to take file to newfile and then let the SqmlReader process that after the following code:
string newfile = ParseHeader(file);
newfile = SGMLToXMLFixer.Fix_SONRS(newfile);
newfile = SGMLToXMLFixer.Fix_STMTTRNRS(newfile);
newfile = SGMLToXMLFixer.Fix_CCSTMTTRNRS(newfile);
//reader.InputStream = new StringReader(ParseHeader(file));
reader.InputStream = new StringReader(newfile);
SGMLToXMLFixer is new class I added into the OFXSharp library. It basically scans all the tags that open and verifies it has a closing tag too.
namespace OFXSharp
{
public static class SGMLToXMLFixer
{
public static string Fix_SONRS(string original)
{ .... }
public static string Fix_STMTTRNRS(string original)
{ .... }
public static string Fix_CCSTMTTRNRS(string original)
{ .... }
private static string Fix_Transactions(string file, string transactionTag, int lastIdx, out int lastIdx_new)
{ .... }
private static string Fix_Transactions_Recursive(string file_modified, int lastIdx, out int lastIdx_new)
{ .... }
}
}
Try http://www.codeproject.com/KB/aspnet/Ofx_to_DataSet.aspx. The code uses Framework 3.5 and transforms an ofx into a dataset, this may help with what you're trying to do.

Writing Logs to an XML File with .NET

I am storing logs in an xml file...
In a traditional straight text format approach, you would typically just have a openFile... then writeLine method...
How is it possible to add a new entry into the xml document structure, like you would just with the text file approach?
use an XmlWriter.
example code:
public class Quote
{
public string symbol;
public double price;
public double change;
public int volume;
}
public void Run()
{
Quote q = new Quote
{
symbol = "fff",
price = 19.86,
change = 1.23,
volume = 190393,
};
WriteDocument(q);
}
public void WriteDocument(Quote q)
{
var settings = new System.Xml.XmlWriterSettings
{
OmitXmlDeclaration = true,
Indent= true
};
using (XmlWriter writer = XmlWriter.Create(Console.Out, settings))
{
writer.WriteStartElement("Stock");
writer.WriteAttributeString("Symbol", q.symbol);
writer.WriteElementString("Price", XmlConvert.ToString(q.price));
writer.WriteElementString("Change", XmlConvert.ToString(q.change));
writer.WriteElementString("Volume", XmlConvert.ToString(q.volume));
writer.WriteEndElement();
}
}
example output:
<Stock Symbol="fff">
<Price>19.86</Price>
<Change>1.23</Change>
<Volume>190393</Volume>
</Stock>
see
Writing with an XmlWriter
for more info.
One of the problems with writing a log file in XML format is that you can't just append lines to the end of the file, because the last line has to have a closing root element (for the XML to be valid)
This blog post by Filip De Vos demonstrates quite a good solution to this:
High Performance writing to XML Log files (edit: link now dead so removed)
Basically, you have two XML files linked together using an XML-include thing:
Header file:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE log [
<!ENTITY loglines SYSTEM "loglines.xml">
]>
<log>
&loglines;
</log>
Lines file (in this example, named loglines.xml):
<logline date="2007-07-01 13:56:04.313" text="start process" />
<logline date="2007-07-01 13:56:25.837" text="do something" />
<logline date="2007-07-01 13:56:25.853" text="the end" />
You can then append new lines to the 'lines file', but (most) XML parsers will be able to open the header file and read the lines correctly.
Filip notes that: This XML will not be parsed correctly by every XML parser on the planet. But all the parsers I have used do it correctly.
The big difference is the way you are thinking about your log data. In plain text files you are indeed just adding new lines. XML is a tree structure however, and you need to think about like such. What you are adding is probably another NODE, i.e.:
<log>
<time>12:30:03 PST</time>
<user>joe</user>
<action>login</action>
<log>
Because it is a tree what you need to ask is what parent are you adding this new node to. This is usually all defined in your DTD (Aka, how you are defining the structure of your data). Hopefully this is more helpful then just what library to use as once you understand this principle the interface of the library should make more sense.
Why reinvent the wheel? Use TraceSource Class (System.Diagnostics) with the XmlWriterTraceListener.
Sorry to post a answer for old thread. i developed the same long time ago. here i like to share my full code for logger saved log data in xml file date wise.
logger class code
using System.IO;
using System.Xml;
using System.Threading;
public class BBALogger
{
public enum MsgType
{
Error ,
Info
}
public static BBALogger Instance
{
get
{
if (_Instance == null)
{
lock (_SyncRoot)
{
if (_Instance == null)
_Instance = new BBALogger();
}
}
return _Instance;
}
}
private static BBALogger _Instance;
private static object _SyncRoot = new Object();
private static ReaderWriterLockSlim _readWriteLock = new ReaderWriterLockSlim();
private BBALogger()
{
LogFileName = DateTime.Now.ToString("dd-MM-yyyy");
LogFileExtension = ".xml";
LogPath= Path.GetDirectoryName(System.Reflection.Assembly.GetExecutingAssembly().Location) + "\\Log";
}
public StreamWriter Writer { get; set; }
public string LogPath { get; set; }
public string LogFileName { get; set; }
public string LogFileExtension { get; set; }
public string LogFile { get { return LogFileName + LogFileExtension; } }
public string LogFullPath { get { return Path.Combine(LogPath, LogFile); } }
public bool LogExists { get { return File.Exists(LogFullPath); } }
public void WriteToLog(String inLogMessage, MsgType msgtype)
{
_readWriteLock.EnterWriteLock();
try
{
LogFileName = DateTime.Now.ToString("dd-MM-yyyy");
if (!Directory.Exists(LogPath))
{
Directory.CreateDirectory(LogPath);
}
var settings = new System.Xml.XmlWriterSettings
{
OmitXmlDeclaration = true,
Indent = true
};
StringBuilder sbuilder = new StringBuilder();
using (StringWriter sw = new StringWriter(sbuilder))
{
using (XmlWriter w = XmlWriter.Create(sw, settings))
{
w.WriteStartElement("LogInfo");
w.WriteElementString("Time", DateTime.Now.ToString());
if (msgtype == MsgType.Error)
w.WriteElementString("Error", inLogMessage);
else if (msgtype == MsgType.Info)
w.WriteElementString("Info", inLogMessage);
w.WriteEndElement();
}
}
using (StreamWriter Writer = new StreamWriter(LogFullPath, true, Encoding.UTF8))
{
Writer.WriteLine(sbuilder.ToString());
}
}
catch (Exception ex)
{
}
finally
{
_readWriteLock.ExitWriteLock();
}
}
public static void Write(String inLogMessage, MsgType msgtype)
{
Instance.WriteToLog(inLogMessage, msgtype);
}
}
Calling or using this way
BBALogger.Write("pp1", BBALogger.MsgType.Error);
BBALogger.Write("pp2", BBALogger.MsgType.Error);
BBALogger.Write("pp3", BBALogger.MsgType.Info);
MessageBox.Show("done");
may my code help you and other :)
Without more information on what you are doing I can only offer some basic advice to try.
There is a method on most of the XML objects called "AppendChild". You can use this method to add the new node you create with the log comment in it. This node will appear at the end of the item list. You would use the parent element of where all the log nodes are as the object to call on.
Hope that helps.
XML needs a document element (Basically top level tag starting and ending the document).
This means a well formed XML document need have a beginning and end, which does not sound very suitable for logs, where the current "end" of the log is continously extended.
Unless you are writing batches of self contained logs where you write everything to be logged to one file in a short period of time, I'd consider something else than XML.
If you are writing a log of a work-unit done, or a log that doesn't need to be inspected until the whole thing has finished, you could use your approach though - simply openfile, write the log lines, close the file when the work unit is done.
For editing an xml file, you could also use LINQ. You can take a look on how here:
http://www.linqhelp.com/linq-tutorials/adding-to-xml-file-using-linq-and-c/

How can I find the position where a string is malformed XML (in C#)?

I'm writing a lightweight XML editor, and in cases where the user's input is not well formed, I would like to indicate to the user where the problem is, or at least where the first problem is. Does anyone know of an existing algorithm for this? If looking at code helps, if I could fill in the FindIndexOfInvalidXml method (or something like it), this would answer my question.
using System;
namespace TempConsoleApp
{
class Program
{
static void Main(string[] args)
{
string text = "<?xml version=\"1.0\"?><tag1><tag2>Some text.</taagg2></tag1>";
int index = FindIndexOfInvalidXml(text);
Console.WriteLine(index);
}
private static int FindIndexOfInvalidXml(string theString)
{
int index = -1;
//Some logic
return index;
}
}
}
I'd probably just cheat. :) This will get you a line number and position:
string s = "<?xml version=\"1.0\"?><tag1><tag2>Some text.</taagg2></tag1>";
System.Xml.XmlDocument doc = new System.Xml.XmlDocument();
try
{
doc.LoadXml(s);
}
catch(System.Xml.XmlException ex)
{
MessageBox.Show(ex.LineNumber.ToString());
MessageBox.Show(ex.LinePosition.ToString());
}
Unless this is an academic exercise, I think that writing your own XML parser is probably not the best way to go about this. I would probably check out the XmlDocument class within the System.Xml namespace and try/catch exceptions for the Load() or LoadXml() methods. The exception's message property should contain info on where the error occurred (row/col numbers) and I suspect it'd be easier to use a regular expression to extract those error messages and the related positional info.
You should be able to simply load the string into an XmlDocument or an XmlReader and catch XmlException. The XmlException class has a LineNumber property and a LinePosition property.
You can also use XmlValidatingReader if you want to validate against a schema in addition to checking that a document is well-formed.
You'd want to load the string into an XmlDocument object via the load method and then catch any exceptions.
public bool isValidXml(string xml)
{
System.Xml.XmlDocument xDoc = null;
bool valid = false;
try
{
xDoc = new System.Xml.XmlDocument();
xDoc.loadXml(xmlString);
valid = true;
}
catch
{
// trap for errors
}
return valid;
}

Categories