Xml gets corrupted each time I append a node - c#

I have an Xml file as:
<?xml version="1.0"?>
<hashnotes>
<hashtags>
<hashtag>#birthday</hashtag>
<hashtag>#meeting</hashtag>
<hashtag>#anniversary</hashtag>
</hashtags>
<lastid>0</lastid>
<Settings>
<Font>Arial</Font>
<HashtagColor>red</HashtagColor>
<passwordset>0</passwordset>
<password></password>
</Settings>
</hashnotes>
I then call a function to add a node to the XML. The function is:
public static void CreateNoteNodeInXDocument(XDocument argXmlDoc, string argNoteText)
{
string lastId=((Convert.ToInt32(argXmlDoc.Root.Element("lastid").Value)) +1).ToString();
string date = DateTime.Now.ToString("MM/dd/yyyy");
argXmlDoc.Element("hashnotes").Add(new XElement("Note", new XAttribute("ID", lastId), new XAttribute("Date",date),new XElement("Text", argNoteText)));
//argXmlDoc.Root.Note.Add new XElement("Text", argNoteText)
List<string> hashtagList = Utilities.GetHashtagsFromText(argNoteText);
XElement reqNoteElement = (from xml2 in argXmlDoc.Descendants("Note")
where xml2.Attribute("ID").Value == lastId
select xml2).FirstOrDefault();
if (reqNoteElement != null)
{
foreach (string hashTag in hashtagList)
{
reqNoteElement.Add(new XElement("hashtag", hashTag));
}
}
argXmlDoc.Root.Element("lastid").Value = lastId;
}
After this I save the xml.
Next time when I try to load the Xml, it fails with an exception:
System.Xml.XmlException: Unexpected XML declaration. The XML declaration must be the first node in the document, and no white space characters are allowed to appear before it.
Here is the code to load the XML:
private static XDocument hashNotesXDocument;
private static Stream hashNotesStream;
StorageFile hashNoteXml = await InstallationFolder.GetFileAsync("hashnotes.xml");
hashNotesStream = await hashNoteXml.OpenStreamForWriteAsync();
hashNotesXDocument = XDocument.Load(hashNotesStream);
and I save it using:
hashNotesXDocument.Save(hashNotesStream);

You don't show all of your code, but it looks like you open the XML file, read the XML from it into an XDocument, edit the XDocument in memory, then write back to the opened stream. Since the stream is still open it will be positioned at the end of the file and thus the new XML will be appended to the file.
I suggest eliminating hashNotesXDocument and hashNotesStream as static variables. Instead, open and read the file, modify the XDocument, then open and write the file using the pattern shown here.
I'm working only on desktop code (using an older version of .Net) so I can't test this, but something like the following should work:
static async Task LoadUpdateAndSaveXml(Action<XDocument> editor)
{
    XDocument doc;
    var xmlFile = await InstallationFolder.GetFileAsync("hashnotes.xml");
    // Read the document and close the read stream before writing.
    using (var readStream = await xmlFile.OpenStreamForReadAsync())
    {
        doc = XDocument.Load(readStream);
    }
    if (doc != null)
    {
        editor(doc);
        using (var writeStream = await xmlFile.OpenStreamForWriteAsync())
        {
            // Truncate - https://stackoverflow.com/questions/13454584/writing-a-shorter-stream-to-a-storagefile
            // CanSeek, Length and SetLength are members of Stream, not of StreamWriter.
            if (writeStream.CanSeek && writeStream.Length > 0)
                writeStream.SetLength(0);
            doc.Save(writeStream);
        }
    }
}
Also, be sure to create the file before using it.
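To see why the file ends up with a second XML declaration in the middle, here is a minimal sketch that reproduces the append in memory. It assumes a MemoryStream as a stand-in for the file stream and is written as a C# 9 top-level program; the CountDeclarations helper is hypothetical and exists only for this illustration.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Linq;

// Hypothetical in-memory stand-in for hashnotes.xml.
byte[] original = Encoding.UTF8.GetBytes(
    "<?xml version=\"1.0\"?><hashnotes><lastid>0</lastid></hashnotes>");
var stream = new MemoryStream();
stream.Write(original, 0, original.Length);
stream.Position = 0;

// Loading reads to the end of the stream and leaves the position there.
XDocument doc = XDocument.Load(stream);
doc.Root.Element("lastid").Value = "1";

// Saving to the same stream now writes after the old content.
doc.Save(stream);
int declarationsAfterAppend = CountDeclarations(stream);

// Truncating first (which also rewinds a MemoryStream) leaves a single document.
stream.SetLength(0);
doc.Save(stream);
int declarationsAfterTruncate = CountDeclarations(stream);

Console.WriteLine($"append: {declarationsAfterAppend}, truncate: {declarationsAfterTruncate}");

// Counts how many XML declarations the stream content contains.
static int CountDeclarations(MemoryStream ms)
{
    string text = Encoding.UTF8.GetString(ms.ToArray());
    int count = 0;
    for (int i = text.IndexOf("<?xml", StringComparison.Ordinal); i >= 0;
         i = text.IndexOf("<?xml", i + 1, StringComparison.Ordinal))
    {
        count++;
    }
    return count;
}
```

The second declaration in the appended case is what the parser later rejects with "Unexpected XML declaration".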

Related

C# Parse XML file into object from given tag

I have an xml file and the dataset that I want to make into an object is encapsulated by another tag, so when I try and parse it, of course it throws an InvalidOperationException, due to the unexpected member.
I've tried reading various MS Docs about XML, as well as googling my problem, but I couldn't find out how to solve it without too much hassle.
My code:
public static ClassToDeserialize GetObjectFromXml(string path)
{
    XmlSerializer xmlSerializer = new XmlSerializer(typeof(ClassToDeserialize));
    // Dispose the file handle once deserialization is done.
    using (System.IO.FileStream file = System.IO.File.OpenRead(path))
    {
        ClassToDeserialize loadedObjectXml = xmlSerializer.Deserialize(file) as ClassToDeserialize;
        return loadedObjectXml;
    }
}
So how could I tell this program to start deserializing only from a specific tag, as that contains the object's related xml data?
You might try reading the XML up to the point where you find your node, then retrieve its outer XML and pass that to XmlSerializer. Let's say you have a simple XML file like this one:
<rootnode>
<!-- some nodes inside -->
<uselessNode>
<thatsWhatIWant>
<!-- some fields inside -->
<uselessNodeInside/>
<usefullNodeInside/>
</thatsWhatIWant>
</uselessNode>
</rootnode>
What you could do is open up XmlReader:
XmlReader reader = XmlReader.Create("path/to/myfile.xml");
Then read contents up to your POI and store that in some variable:
string wantedNodeContents = string.Empty;
while (reader.Read())
{
if(reader.NodeType == XmlNodeType.Element && reader.Name == "thatsWhatIWant")
{
wantedNodeContents = reader.ReadOuterXml();
break;
}
}
Having this you should be able to use XmlSerializer like so:
XmlSerializer xmlSerializer = new XmlSerializer(typeof(ClassToDeserialize));
System.IO.TextReader textReader = new System.IO.StringReader(wantedNodeContents);
ClassToDeserialize loadedObjectXml = xmlSerializer.Deserialize(textReader) as ClassToDeserialize;
You can alternatively (or in addition to that) try to add some handlers for UnknownNode and UnknownAttribute:
xmlSerializer.UnknownNode+= new XmlNodeEventHandler(UnknownNode);
xmlSerializer.UnknownAttribute+= new XmlAttributeEventHandler(UnknownAttribute);
void UnknownNode(object sender, XmlNodeEventArgs e) { }
void UnknownAttribute(object sender, XmlAttributeEventArgs e) { }
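As a variation on the reader approach above, the sketch below skips to the wanted element with ReadToFollowing and exposes only that element through ReadSubtree. It is a minimal illustration, not the answerer's code: the inline XML and the value 42 are made up, it is written as a C# 9 top-level program, and it loads the subtree into an XElement to stay self-contained. The same subtree reader could be passed to XmlSerializer.Deserialize(XmlReader) instead, provided the class's root element name matches the wanted tag.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Hypothetical sample based on the XML layout shown above.
string xml =
    "<rootnode><uselessNode><thatsWhatIWant>" +
    "<usefullNodeInside>42</usefullNodeInside>" +
    "</thatsWhatIWant></uselessNode></rootnode>";

XElement wanted = null;
using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
{
    // Advance to the first <thatsWhatIWant> element, skipping everything before it.
    if (reader.ReadToFollowing("thatsWhatIWant"))
    {
        // ReadSubtree returns a reader limited to that element and its children.
        using (XmlReader subtree = reader.ReadSubtree())
        {
            wanted = XElement.Load(subtree);
        }
    }
}

Console.WriteLine(wanted);
```

This avoids building an intermediate string with ReadOuterXml, which may matter for large nodes.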

How to parse a xml document without a root node?

I have an xml document which has no root node. It looks like this:
<?xml version="1.0"?>
<Line>
<City>Paris</City>
<Country>France</Country>
</Line>
<Line>
<City>Lissabon</City>
<Country>Spain</Country>
</Line>
Now I want to read it Line by Line and write the contents to a database. However, XmlDocument seems to insist that there must be a root node. How can I process this file?
If you want to parse it as an XML document, you can add a root node like Denis proposed in his comment.
If you would just like to read each line and write it to a database, you can handle the file like an ordinary (text) file and read its contents line by line using a StreamReader.
This would look something like this:
string line;
// Read the file and process it line by line.
using (var reader = new StreamReader(FILEPATH))
{
    while ((line = reader.ReadLine()) != null)
    {
        // Depending on what you need, you could strip the XML tags
        // and write the line to the database
    }
}
You could try something like this (simple WinForms app with a button and a rich text box to display output for testing):
using System;
using System.Text;
using System.Xml;
using System.Windows.Forms;
namespace WindowsFormsApp11
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private void button1_Click(object sender, EventArgs e)
{
StringBuilder sb = new StringBuilder();
XmlReaderSettings settings = new XmlReaderSettings
{
ConformanceLevel = ConformanceLevel.Fragment
};
using (XmlReader reader = XmlReader.Create(@"c:\ab\countries.xml", settings))
{
while(reader.Read())
{
if (reader.Name != "Line") // Ignore the <Line> nodes
{
switch (reader.NodeType)
{
case XmlNodeType.Element:
sb.Append(string.Format("{0}:", reader.Name));
break;
case XmlNodeType.Text:
sb.Append(string.Format(" {0}{1}", reader.Value, Environment.NewLine));
break;
}
}
}
}
richTextBox1.Text = sb.ToString();
}
}
}
Maybe not the best solution, but you could create a List (or array) from your XML and insert the missing nodes:
// Read lines into List
var list = File.ReadLines("doc.xml").ToList();
// Insert missing nodes
list.Insert(1, "<root>"); // Use 1, because 0 is the XML declaration
list.Insert(list.Count, "</root>"); //Add closing tag to the end
// Create final XML string with LINQ
var xml_str = list.Aggregate("", (acc, s) => acc + s);
// Having a string, we can create, for instance, XElement (or XDocument)
var xml = XElement.Parse(xml_str);
Console.WriteLine(xml.Element("Line").Element("City").Value);
//Output: Paris
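The fragment reader and LINQ to XML can also be combined so that each Line arrives as a whole element, which may be closer to what a database insert needs. The following is a sketch under a few assumptions: the fragment is inlined as a string without the XML declaration for brevity, the rows list stands in for the database write, and it is written as a C# 9 top-level program.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Sample data copied from the question; no root element.
string fragment =
    "<Line><City>Paris</City><Country>France</Country></Line>" +
    "<Line><City>Lissabon</City><Country>Spain</Country></Line>";

var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
var rows = new List<(string City, string Country)>();

using (XmlReader reader = XmlReader.Create(new StringReader(fragment), settings))
{
    reader.MoveToContent();
    while (!reader.EOF)
    {
        if (reader.NodeType == XmlNodeType.Element && reader.Name == "Line")
        {
            // ReadFrom consumes the whole element and leaves the reader on the node after it.
            var line = (XElement)XNode.ReadFrom(reader);
            rows.Add(((string)line.Element("City"), (string)line.Element("Country")));
        }
        else
        {
            reader.Read();
        }
    }
}

foreach (var row in rows)
    Console.WriteLine($"{row.City}, {row.Country}");
```

Each tuple in rows would map to one database row; only one Line element is held in memory at a time.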

C# - From XML to Database

I got an XML file which can have several nodes, child nodes, "child child nodes", ... and I'd like to figure out how to get these data in order to store them into my own SQL Server database.
I've read some tutorials on the internet and also tried a few things. At the moment, I'm able to open and read the file but not to retrieve data. Here's what I'm doing, for instance:
class Program
{
static void Main(string[] args)
{
Person p = new Person();
string filePath = @"C:\Users\Desktop\ConsoleApplication1\XmlPersonTest.xml";
XmlDocument xmlDoc = new XmlDocument();
if(File.Exists(filePath))
{
xmlDoc.Load(filePath);
XmlElement elm = xmlDoc.DocumentElement;
XmlNodeList list = elm.ChildNodes;
Console.WriteLine("The root element contains {0} nodes",
list.Count);
}
else
{
Console.WriteLine("The file {0} could not be located",
filePath);
}
Console.Read();
}
}
And here's a small example of what my XML file looks like :
<person>
<name>McMannus</name>
<firstname>Fionn</firstname>
<age>21</age>
<nationality>Belge</nationality>
<car>
<mark>Audi</mark>
<model>A1</model>
<year>2013</year>
<hp>70</hp>
</car>
<car>
<mark>VW</mark>
<model>Golf 7</model>
<year>2014</year>
<hp>99</hp>
</car>
<car>
<mark>BMW</mark>
<model>Série 1</model>
<year>2013</year>
<hp>80</hp>
</car>
</person>
Any advice or tutorial on how to do that?
I have made a little method for navigating through XML nodes, using XElement (System.Xml.Linq):
public string Get(XElement root, string path)
{
if (root== null)
return null;
string[] p = path.Split(new string[] { "/" }, StringSplitOptions.RemoveEmptyEntries);
XElement at = root;
foreach (string n in p)
{
at = at.Element(n);
if (at == null)
return null;
}
return at.Value;
}
Using this, you can get the value of an XElement node via Get(root, "rootNode/nodeA/nodeAChild/etc")
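A short usage sketch for the helper, assuming a trimmed-down version of the person XML from the question and a C# 9 top-level program. Note that the path is resolved relative to the element passed in, so the name of that element is not part of the path.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Trimmed-down sample based on the question's XML.
XElement person = XElement.Parse(
    "<person><name>McMannus</name>" +
    "<car><mark>Audi</mark><model>A1</model></car>" +
    "<car><mark>VW</mark><model>Golf 7</model></car></person>");

string name = Get(person, "name");
string firstMark = Get(person, "car/mark");   // Element() follows the first matching child
string missing = Get(person, "car/color");    // null, because <color> does not exist

// Repeated children need Elements(), since Get only follows the first match.
var marks = person.Elements("car").Select(c => Get(c, "mark")).ToList();

Console.WriteLine($"{name}: {string.Join(", ", marks)}");

// Same logic as the helper above, written as a local function.
static string Get(XElement root, string path)
{
    if (root == null)
        return null;
    XElement at = root;
    foreach (string n in path.Split(new[] { "/" }, StringSplitOptions.RemoveEmptyEntries))
    {
        at = at.Element(n);
        if (at == null)
            return null;
    }
    return at.Value;
}
```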
I went through something similar the other day. Start by building a model:
1. Open your XML document.
2. Copy the entire XML document.
3. Open Visual Studio.
4. Click in an area outside your initial class (see 1b below).
5. Go to Edit in Visual Studio.
6. Choose Paste Special, then Paste XML As Classes.
1b:
namespace APICore
{
public class APIParser
{
// Parse logic would go here.
}
// You would click here.
}
When you do that you'll end up with a valid XML model that your parser can use. How you access the XML (from the web or from a local file) is up to you. For simplicity I'm going to use a file:
public class APIParser
{
    public Person Parse(string file)
    {
        // Person should be the XML root element class.
        XmlSerializer serializer = new XmlSerializer(typeof(Person));
        using (FileStream stream = new FileStream(file, FileMode.Open, FileAccess.ReadWrite, FileShare.ReadWrite))
        using (XmlReader reader = XmlReader.Create(stream))
        {
            return serializer.Deserialize(reader) as Person;
        }
    }
}
You now have the data in a model you can iterate over. For example:
// Iterates through each Person (member names depend on the generated classes)
foreach (var people in model.Person)
{
    // Select projects each car to one object; SelectMany would require a collection per car.
    var information = people.Cars.Select(obj => new { obj.Mark, obj.model, obj.year, obj.hp }).ToList();
}
You would do something like that, then write to the database. This won't fit your example perfectly but should point you in a strong direction.

XDocument will not parse html entities (e.g. ) but XmlDocument will

I am currently converting our old parsers from XmlDocument to XDocument. I do this mainly to get LINQ querying and the added line number info.
My xml contains an element like this:
<?xml version="1.0"?>
<fulltext>
hello this is a failed textnode
and I don't know how to parse it.
</fulltext>
My problem is that while XmlDocument seems to have no problem reading that node with:
var xmlDocument = new XmlDocument();
var physicalPath = GetPhysicalPath(uploadFolderFile);
try
{
xmlDocument.Load(physicalPath);
}
catch (XmlException xmlException)
{
_log.Warn("Problems with the document", xmlException);
}
The example above parses the document fine but when I try to do:
XDocument xmlDocument;
var physicalPath = GetPhysicalPath(uploadFolderFile);
var xmlStream = new System.IO.StreamReader(physicalPath);
try
{
xmlDocument = XDocument.Load(xmlStream, LoadOptions.SetLineInfo | LoadOptions.SetBaseUri);
}
catch (XmlException xmlException)
{
_log.Warn("Trying to clean document for HexaDecimal", xmlException);
}
It fails to read the document because of the character
The special character seems to be allowed in XML version 1.1, but changing the version in the declaration doesn't help.
I have thought about just parsing the document with XmlDocument and then converting it; but that seems to be counterintuitive. Can anybody help with this problem?
Ok...so I sort of found a solution to this problem.
First of all I try to parse the xml using the following code:
private XDocument GetXmlDocument(String physicalPath)
{
XDocument xmlDocument;
var xmlStream = new System.IO.StreamReader(physicalPath);
try
{
xmlDocument = XDocument.Load(xmlStream, LoadOptions.SetLineInfo);
}
catch (XmlException)
{
//_log.Warn("Trying to clean document for HexaDecimal", xmlException);
xmlDocument = XmlSanitizingStream.TryToCleanXMLBeforeParsing(physicalPath);
}
return xmlDocument;
}
If it fails to load the document, then I will try to clean it using the technique used in this blogpost:
http://seattlesoftware.wordpress.com/2008/09/11/hexadecimal-value-0-is-an-invalid-character/
It will not remove the character I mentioned before, but it will remove any character not allowed by the XML standard.
Then, after sanitizing the XML, I add an XMLReader and set its settings to not check characters:
public static XDocument TryToCleanXMLBeforeParsing(String physicalPath)
{
string xml;
Encoding encoding;
using (var reader = new XmlSanitizingStream(File.OpenRead(physicalPath)))
{
xml = reader.ReadToEnd();
encoding = reader.CurrentEncoding;
}
byte[] encodedString;
if (encoding.Equals(Encoding.UTF8)) encodedString = Encoding.UTF8.GetBytes(xml);
else if (encoding.Equals(Encoding.UTF32)) encodedString = Encoding.UTF32.GetBytes(xml);
else encodedString = Encoding.Unicode.GetBytes(xml);
var ms = new MemoryStream(encodedString);
ms.Flush();
ms.Position = 0;
var settings = new XmlReaderSettings {CheckCharacters = false};
XmlReader xmlReader = XmlReader.Create(ms, settings);
var xmlDocument = XDocument.Load(xmlReader);
ms.Close();
return xmlDocument;
}
Since I've cleaned the document removing illegal characters before I add the ignore characters to the reader, I am pretty sure that I do not read a malformed XML document. Worst case scenario is I get a malformed XML and it will throw an error anyways.
I only use this for parsing and it should only be used to read the data. This will not make the XML well-formed and will in many cases throw exceptions elsewhere in your code. I am only using this because I cannot change what the customer is sending us and I have to read it as is.
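The effect of CheckCharacters can be seen in isolation with a small sketch. The character reference &#x1F; used here is a hypothetical example of a reference that XML 1.0 does not allow; it is not necessarily the character from the question. The code is written as a C# 9 top-level program.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Hypothetical input containing a character reference that XML 1.0 forbids.
string xml = "<fulltext>hello &#x1F; world</fulltext>";

// Default settings check characters, so parsing is expected to fail.
bool strictFailed = false;
try
{
    XDocument.Parse(xml);
}
catch (XmlException)
{
    strictFailed = true;
}

// With CheckCharacters disabled, the reference is passed through unchecked.
var settings = new XmlReaderSettings { CheckCharacters = false };
XDocument lenient;
using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
{
    lenient = XDocument.Load(reader);
}
bool containsControlChar = lenient.Root.Value.IndexOf('\u001F') >= 0;

Console.WriteLine($"strict failed: {strictFailed}, control char kept: {containsControlChar}");
```

The loaded document still contains the control character, so saving it again with default writer settings may throw, which matches the caveat above about using this for reading only.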

XML Parsing - Read a Simple XML File and Retrieve Values

I've written a Task Scheduling program for learning purposes. Currently I'm saving the scheduled tasks just as plain text and then parsing it using Regex. This looks messy (code wise) and is not very coherent.
I would like to load the scheduled tasks from an XML file instead, I've searched quite a bit to find some solutions but I couldn't get it to work how I wanted.
I wrote an XML file structured like this to store my data in:
<Tasks>
<Task>
<Name>Shutdown</Name>
<Location>C:/WINDOWS/system32/shutdown.exe</Location>
<Arguments>-s -f -t 30</Arguments>
<RunWhen>
<Time>8:00:00 a.m.</Time>
<Date>18/03/2011</Date>
<Days>
<Monday>false</Monday>
<Tuesday>false</Tuesday>
<Wednesday>false</Wednesday>
<Thursday>false</Thursday>
<Friday>false</Friday>
<Saturday>false</Saturday>
<Sunday>false</Sunday>
<Everyday>true</Everyday>
<RunOnce>false</RunOnce>
</Days>
</RunWhen>
<Enabled>true</Enabled>
</Task>
</Tasks>
The way I'd like to parse the data is like so:
1. Open Tasks.xml
2. Load the first Task tag.
3. In that task retrieve the values of the Name, Location and Arguments tags.
4. Then open the RunWhen tag and retrieve the values of the Time and Date tags.
5. After that open the Days tag and retrieve the value of each individual tag within.
6. Retrieve the value of Enabled.
7. Load the next task and repeat steps 3 -> 7 until all the Task tags in Tasks have been parsed.
I'm very sure you can do it this way; I just can't work it out, as there are so many different ways to do things in XML that I got a bit overwhelmed. From what I've gathered so far, I would most likely be using XPathDocument and XPathNodeIterator, right?
If someone can show me an example or explain to me how this would be done I would be very happy.
An easy way to parse the XML is to use LINQ to XML.
For example, say you have the following XML file:
<library>
<track id="1" genre="Rap" time="3:24">
<name>Who We Be RMX (feat. 2Pac)</name>
<artist>DMX</artist>
<album>The Dogz Mixtape: Who's Next?!</album>
</track>
<track id="2" genre="Rap" time="5:06">
<name>Angel (ft. Regina Bell)</name>
<artist>DMX</artist>
<album>...And Then There Was X</album>
</track>
<track id="3" genre="Break Beat" time="6:16">
<name>Dreaming Your Dreams</name>
<artist>Hybrid</artist>
<album>Wide Angle</album>
</track>
<track id="4" genre="Break Beat" time="9:38">
<name>Finished Symphony</name>
<artist>Hybrid</artist>
<album>Wide Angle</album>
</track>
</library>
For reading this file, you can use the following code:
public void Read(string fileName)
{
XDocument doc = XDocument.Load(fileName);
foreach (XElement el in doc.Root.Elements())
{
Console.WriteLine("{0} {1}", el.Name, el.Attribute("id").Value);
Console.WriteLine(" Attributes:");
foreach (XAttribute attr in el.Attributes())
Console.WriteLine(" {0}", attr);
Console.WriteLine(" Elements:");
foreach (XElement element in el.Elements())
Console.WriteLine(" {0}: {1}", element.Name, element.Value);
}
}
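Applied to the Tasks file from the question, the same LINQ to XML approach might look like the sketch below. It assumes a shortened copy of the sample XML inlined as a string (only two of the day flags are kept), projects each Task into an anonymous object, and is written as a C# 9 top-level program. Time and Date are kept as strings because their formats are locale-specific.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Shortened copy of the question's Tasks.xml.
string xml = @"<Tasks>
  <Task>
    <Name>Shutdown</Name>
    <Location>C:/WINDOWS/system32/shutdown.exe</Location>
    <Arguments>-s -f -t 30</Arguments>
    <RunWhen>
      <Time>8:00:00 a.m.</Time>
      <Date>18/03/2011</Date>
      <Days>
        <Monday>false</Monday>
        <Everyday>true</Everyday>
      </Days>
    </RunWhen>
    <Enabled>true</Enabled>
  </Task>
</Tasks>";

XDocument doc = XDocument.Parse(xml);

// One anonymous object per <Task>, following the steps listed in the question.
var tasks = doc.Root.Elements("Task").Select(t => new
{
    Name = (string)t.Element("Name"),
    Location = (string)t.Element("Location"),
    Arguments = (string)t.Element("Arguments"),
    Time = (string)t.Element("RunWhen").Element("Time"),
    Date = (string)t.Element("RunWhen").Element("Date"),
    // Every child of <Days> becomes a name -> bool entry.
    Days = t.Element("RunWhen").Element("Days").Elements()
            .ToDictionary(d => d.Name.LocalName, d => (bool)d),
    Enabled = (bool)t.Element("Enabled")
}).ToList();

foreach (var task in tasks)
    Console.WriteLine($"{task.Name} {task.Arguments} enabled={task.Enabled}");
```

The explicit casts on XElement convert the element text, so (bool) works for the lowercase true/false values used in the file.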
I usually use XmlDocument for this. The interface is pretty straightforward:
var doc = new XmlDocument();
doc.LoadXml(xmlString);
You can access nodes similar to a dictionary:
var tasks = doc["Tasks"];
and loop over all children of a node.
Try XML serialization, for example:
[Serializable]
public class Task
{
    public string Name { get; set; }
    public string Location { get; set; }
    public string Arguments { get; set; }
    public DateTime RunWhen { get; set; }
}
public void WriteXMl(Task task)
{
    XmlSerializer serializer = new XmlSerializer(typeof(Task));
    // Serialize straight to the file; the using block flushes and closes the writer.
    using (StreamWriter writer = new StreamWriter(@"C:\Temp\Task.xml", false, Encoding.Unicode))
    {
        serializer.Serialize(writer, task);
    }
}
public Task GetTask()
{
    // The serializer must be created here too; it was local to WriteXMl.
    XmlSerializer serializer = new XmlSerializer(typeof(Task));
    using (StreamReader reader = new StreamReader(@"C:\Temp\Task.xml", Encoding.Unicode))
    {
        return (Task)serializer.Deserialize(reader);
    }
}
Are you familiar with the DataSet class?
The DataSet can also load XML documents and you may find it easier to iterate.
http://msdn.microsoft.com/en-us/library/system.data.dataset.readxml.aspx
DataSet dt = new DataSet();
dt.ReadXml(@"c:\test.xml");
class Program
{
static void Main(string[] args)
{
//Load XML from local
string sourceFileName="";
string element=string.Empty;
var FolderPath=@"D:\Test\RenameFileWithXmlAttribute";
string[] files = Directory.GetFiles(FolderPath, "*.xml");
foreach (string xmlfile in files)
{
try
{
sourceFileName = xmlfile;
XElement xele = XElement.Load(sourceFileName);
string convertToString = xele.ToString();
XElement parseXML = XElement.Parse(convertToString);
element = parseXML.Descendants("Meta").Where(x => (string)x.Attribute("name") == "XMLTAG").Last().Value;
DirectoryInfo CurrentDate = Directory.CreateDirectory(DateTime.Now.ToString("yyyy-MM-dd"));
string saveWithThisName= Path.Combine(CurrentDate.FullName, element);
File.Copy(sourceFileName, saveWithThisName,true);
}
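catch(Exception ex)
{
// Report the failure instead of silently swallowing it.
Console.WriteLine("Could not process {0}: {1}", sourceFileName, ex.Message);
}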
}
}
}
