C# - From XML to Database

I have an XML file that can contain several nodes, child nodes, children of those child nodes, and so on. I'd like to figure out how to extract this data in order to store it in my own SQL Server database.
I've read some tutorials on the internet and tried a few things. At the moment I can open and read the file, but I can't retrieve the data. Here's what I'm doing so far:
class Program
{
static void Main(string[] args)
{
Person p = new Person();
string filePath = @"C:\Users\Desktop\ConsoleApplication1\XmlPersonTest.xml";
XmlDocument xmlDoc = new XmlDocument();
if(File.Exists(filePath))
{
xmlDoc.Load(filePath);
XmlElement elm = xmlDoc.DocumentElement;
XmlNodeList list = elm.ChildNodes;
Console.WriteLine("The root element contains {0} nodes",
list.Count);
}
else
{
Console.WriteLine("The file {0} could not be located",
filePath);
}
Console.Read();
}
}
And here's a small example of what my XML file looks like :
<person>
<name>McMannus</name>
<firstname>Fionn</firstname>
<age>21</age>
<nationality>Belge</nationality>
<car>
<mark>Audi</mark>
<model>A1</model>
<year>2013</year>
<hp>70</hp>
</car>
<car>
<mark>VW</mark>
<model>Golf 7</model>
<year>2014</year>
<hp>99</hp>
</car>
<car>
<mark>BMW</mark>
<model>Série 1</model>
<year>2013</year>
<hp>80</hp>
</car>
</person>
Any advice or a tutorial on how to do that?

I have written a small method for navigating through XML nodes, using XElement (System.Xml.Linq):
public string Get(XElement root, string path)
{
if (root== null)
return null;
string[] p = path.Split(new string[] { "/" }, StringSplitOptions.RemoveEmptyEntries);
XElement at = root;
foreach (string n in p)
{
at = at.Element(n);
if (at == null)
return null;
}
return at.Value;
}
Using this, you can get the value of an XElement node via Get(root, "rootNode/nodeA/nodeAChild/etc")
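As a rough illustration of how such a path helper behaves, here is a minimal sketch run against a trimmed-down copy of the question's person file. The class name XmlPathDemo and the shortened sample string are assumptions made for this example. Note that the path is resolved relative to the element you pass in, and Element returns only the first match, so only the first car is reachable this way.

```csharp
using System;
using System.Xml.Linq;

public static class XmlPathDemo
{
    // Same idea as Get above: follow child element names separated by '/'.
    public static string Get(XElement root, string path)
    {
        XElement at = root;
        foreach (string name in path.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries))
        {
            if (at == null) return null;   // a step on the way was missing
            at = at.Element(name);         // first child with that name, or null
        }
        return at?.Value;
    }

    // Hypothetical sample, shortened from the question's XML.
    public const string Sample =
        "<person><name>McMannus</name>" +
        "<car><mark>Audi</mark><model>A1</model></car>" +
        "<car><mark>VW</mark><model>Golf 7</model></car></person>";

    public static void Run()
    {
        XElement person = XElement.Parse(Sample);
        Console.WriteLine(Get(person, "name"));                      // McMannus
        Console.WriteLine(Get(person, "car/mark"));                  // Audi (first car only)
        Console.WriteLine(Get(person, "car/color") ?? "(missing)");  // (missing)
    }
}
```

To visit every car you would iterate person.Elements("car") and call Get on each element with a path such as "mark".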

I went through something similar the other day. Start by building a model:
Open your XML document.
Copy the entire XML document.
Open Visual Studio.
Click in an area outside your initial class (see diagram 1b).
Go to Edit in Visual Studio.
Choose Paste Special, then Paste XML As Classes.
1b:
namespace APICore
{
public class APIParser
{
// Parse logic would go here.
}
// You would click here.
}
When you do that you'll end up with a valid XML model that your parser can access. Whether you read the XML from the web or from a local file is up to you. For simplicity I'm going to use a file:
public class APIParser
{
public Person Parse(string file)
{
// Person should be the class generated for the XML root element.
XmlSerializer serializer = new XmlSerializer(typeof(Person));
using (FileStream stream = new FileStream(file, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
using (XmlReader reader = XmlReader.Create(stream))
{
return serializer.Deserialize(reader) as Person;
}
}
}
Now you have the data in a model you can iterate over. Here is an example of how you might do that:
// Iterates through each car of the deserialized person.
// Member names (Cars, Mark, ...) depend on the generated classes.
foreach (var car in model.Cars)
{
var information = new { car.Mark, car.Model, car.Year, car.Hp };
}
You would do something like that, then write to the database. This won't fit your example perfectly, but it should point you in the right direction.
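If you would rather skip the generated classes, the same data can be flattened with LINQ to XML into one row per car, which maps naturally onto a database table. This is only a sketch: the CarRow class, its property names and the PersonXmlRows helper are invented for this example, and the element names are taken from the question's sample file.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical row type: one instance per <car>, tagged with the owner's name.
public sealed class CarRow
{
    public string Owner { get; set; }
    public string Mark { get; set; }
    public string Model { get; set; }
    public int Year { get; set; }
    public int Hp { get; set; }
}

public static class PersonXmlRows
{
    public static List<CarRow> ReadCars(string xml)
    {
        XElement person = XElement.Parse(xml);
        string owner = (string)person.Element("name");

        return person.Elements("car")
            .Select(car => new CarRow
            {
                Owner = owner,
                Mark = (string)car.Element("mark"),
                Model = (string)car.Element("model"),
                // The int casts throw if the element is missing or not numeric.
                Year = (int)car.Element("year"),
                Hp = (int)car.Element("hp")
            })
            .ToList();
    }
}
```

Each CarRow could then be passed to a parameterized INSERT statement against your own table; the table and column names are up to you and are not shown here.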

Related

How to parse a xml document without a root node?

I have an xml document which has no root node. It looks like this:
<?xml version="1.0"?>
<Line>
<City>Paris</City>
<Country>France</Country>
</Line>
<Line>
<City>Lissabon</City>
<Country>Spain</Country>
</Line>
Now I want to read it Line by Line and write the contents to a database. However, XmlDocument seems to insist that a root node must exist. How can I process this file?
If you want to parse it as an XML document, you can add a root node like Denis proposed in his comment.
If you would just like to read each line and write it to a database, you can handle the file like an ordinary (text) file and read its contents line by line using a StreamReader.
This would look something like this:
string line;
// Read the file and process it line by line.
using (var reader = new StreamReader(FILEPATH))
{
while ((line = reader.ReadLine()) != null)
{
// Depending on what you need, you could strip the XML tags
// And write the line to the database
}
}
You could try something like this (simple WinForms app with a button and a rich text box to display output for testing):
using System;
using System.Text;
using System.Xml;
using System.Windows.Forms;
namespace WindowsFormsApp11
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private void button1_Click(object sender, EventArgs e)
{
StringBuilder sb = new StringBuilder();
XmlReaderSettings settings = new XmlReaderSettings
{
ConformanceLevel = ConformanceLevel.Fragment
};
using (XmlReader reader = XmlReader.Create(@"c:\ab\countries.xml", settings))
{
while(reader.Read())
{
if (reader.Name != "Line") // Ignore the <Line> nodes
{
switch (reader.NodeType)
{
case XmlNodeType.Element:
sb.Append(string.Format("{0}:", reader.Name));
break;
case XmlNodeType.Text:
sb.Append(string.Format(" {0}{1}", reader.Value, Environment.NewLine));
break;
}
}
}
}
richTextBox1.Text = sb.ToString();
}
}
}
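Since the goal is to write each Line to a database, it may be more convenient to get one object per Line element rather than a text dump. The sketch below keeps the ConformanceLevel.Fragment setting from the example above and materializes each Line with XNode.ReadFrom. The FragmentReader class and its ReadLines method are names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class FragmentReader
{
    // Returns one (City, Country) pair per top-level <Line> element.
    public static List<KeyValuePair<string, string>> ReadLines(TextReader input)
    {
        var result = new List<KeyValuePair<string, string>>();
        var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };

        using (XmlReader reader = XmlReader.Create(input, settings))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Line")
                {
                    // ReadFrom consumes the whole <Line> element and leaves the
                    // reader on the node that follows it, so no extra Read() here.
                    var line = (XElement)XNode.ReadFrom(reader);
                    result.Add(new KeyValuePair<string, string>(
                        (string)line.Element("City"),
                        (string)line.Element("Country")));
                }
                else
                {
                    reader.Read();   // skip whitespace and anything that is not a <Line>
                }
            }
        }
        return result;
    }
}
```

For a file you would pass a StreamReader, for example FragmentReader.ReadLines(new StreamReader(path)); only one Line element is held in memory at a time.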
Maybe not the best solution, but you could read the XML lines into a List (or array) and insert the missing nodes:
// Read lines into List
var list = File.ReadLines("doc.xml").ToList();
// Insert missing nodes
list.Insert(1, "<root>"); // Use 1, because 0 is XML directive
list.Insert(list.Count, "</root>"); //Add closing tag to the end
// Create final XML string with LINQ
var xml_str = list.Aggregate("", (acc, s) => acc + s);
// Having a string, we can create, for instance, XElement (or XDocument)
var xml = XElement.Parse(xml_str);
Console.WriteLine(xml.Element("Line").Element("City").Value);
//Output: Paris

How to display a large XML file (>21MB) in a tree view quickly

I need to display a large XML file (>21MB) in a tree view control in a C# Windows Forms application. My code works for small XML files, but when I try to open a big XML file (>1 MB), it takes far too long.
Can anyone suggest how I can optimise this, or suggest any changes or alternatives?
Below is the code snippet:
private void CreateTreeViewFromATXML(string strSrcFileName)
{
XmlDataDocument xmldoc = new XmlDataDocument();
XmlNode xmlnode ;
FileStream fs = new FileStream(strSrcFileName, FileMode.Open, FileAccess.Read);
xmldoc.Load(fs);
xmlnode = xmldoc.ChildNodes[1];
XMLTreeView.Nodes.Clear();
XMLTreeView.Nodes.Add(new TreeNode(xmldoc.DocumentElement.Name));
TreeNode tNode ;
tNode = XMLTreeView.Nodes[0];
AddNode(xmlnode, tNode);
}
private void AddNode(XmlNode inXmlNode, TreeNode inTreeNode)
{
//XmlNode xNode ;
TreeNode tNode ;
XmlNodeList nodeList ;
int i = 0;
if (inXmlNode.HasChildNodes)
{
nodeList = inXmlNode.ChildNodes;
foreach (XmlNode XNode in inXmlNode.ChildNodes)
{
tNode = new TreeNode(XNode.Name);
inTreeNode.Nodes.Add(tNode);
AddNode(XNode, tNode);
}
}
else
{
inTreeNode.Text = inXmlNode.InnerText.ToString();
}
}
I would wrap your code like this:
XMLTreeView.BeginUpdate();
try
{
CreateTreeViewFromATXML(strSrcFileName);
}
catch (Exception e)
{
//Handle any error
}
finally
{
XMLTreeView.EndUpdate();
}
If you're not in an update block it's repainting the GUI on every node add and that's expensive. You also have recursion in AddNode but if the XML isn't too deeply nested it shouldn't be an issue.
I would suggest using XDocument and LINQ to XML for faster parsing. You can use the following code to parse the XML:
using System.Xml;
using System.Xml.Linq;
using System.Data;
XDocument xdoc = XDocument.Load(XMLFile);
var item = from items in xdoc.Element("EPICORTLOG").Descendants("POS")
where (string)items.Element("Id") == strSelectedPOSID
select items.Elements("TRADE").Elements("ITEM").ToList().ToList();
You can then follow the explanation at the following link to work with the result:
http://www.dotnetcurry.com/showarticle.aspx?ID=564
The article explains LINQ to XML programming. Using the above method you can load XML files as big as 10MB in a short time.
Recently I used the TreeView component to implement my HTML editor in C#, and I used a serialized data structure to improve the performance of opening and saving an XML file.
In my experience, reading and writing the file through this serializable structure can open a file of up to 20 MB within 2 seconds on my computer. This solution can open an XML file of over 2 GB in my C# application. I hope it helps.
Example of a serializable structure for TreeView and TreeNode:
[Serializable]
public class TreeViewData
{
public TreeNodeData[] Nodes;
public TreeViewData(){ }
public TreeViewData(TreeView treeview)
{
//your code
}
public TreeViewData(TreeNode treenode)
{
//your code
}
public void PopulateTree(TreeView treeview)
{
//your code
}
public void PopulateSubTree(TreeNode treenode)
{
//your code
}
}
[Serializable]
public class TreeNodeData
{
public string Text;
public int ImageIndex;
public int SelectedImageIndex;
public string Tag;
public TreeNodeData[] Nodes;
public TreeNodeData() {}
public TreeNodeData(TreeNode node)
{
// your code
}
public TreeNode ToTreeNode()
{
// your code
}
}
Example of reading the XML file by deserializing it:
System.Xml.Serialization.XmlSerializer ser = new System.Xml.Serialization.XmlSerializer(typeof(TreeViewData));
System.IO.FileStream file = new System.IO.FileStream(strFilename, System.IO.FileMode.Open);
System.Xml.XmlTextReader reader = new System.Xml.XmlTextReader(file);
TreeViewData treeData = (TreeViewData)ser.Deserialize(reader);
treeData.PopulateTree(TreeView1);
Example of writing the XML file by serializing the tree:
System.Xml.Serialization.XmlSerializer ser = new System.Xml.Serialization.XmlSerializer(typeof(TreeViewData));
System.IO.FileStream file = new System.IO.FileStream(strFilename, System.IO.FileMode.Create);
System.Xml.XmlTextWriter writer = new System.Xml.XmlTextWriter(file, null);
ser.Serialize(writer, new TreeViewData(TreeView1));

Xml gets corrupted each time I append a node

I have an Xml file as:
<?xml version="1.0"?>
<hashnotes>
<hashtags>
<hashtag>#birthday</hashtag>
<hashtag>#meeting</hashtag>
<hashtag>#anniversary</hashtag>
</hashtags>
<lastid>0</lastid>
<Settings>
<Font>Arial</Font>
<HashtagColor>red</HashtagColor>
<passwordset>0</passwordset>
<password></password>
</Settings>
</hashnotes>
I then call a function to add a node to the XML. The function is:
public static void CreateNoteNodeInXDocument(XDocument argXmlDoc, string argNoteText)
{
string lastId=((Convert.ToInt32(argXmlDoc.Root.Element("lastid").Value)) +1).ToString();
string date = DateTime.Now.ToString("MM/dd/yyyy");
argXmlDoc.Element("hashnotes").Add(new XElement("Note", new XAttribute("ID", lastId), new XAttribute("Date",date),new XElement("Text", argNoteText)));
//argXmlDoc.Root.Note.Add new XElement("Text", argNoteText)
List<string> hashtagList = Utilities.GetHashtagsFromText(argNoteText);
XElement reqNoteElement = (from xml2 in argXmlDoc.Descendants("Note")
where xml2.Attribute("ID").Value == lastId
select xml2).FirstOrDefault();
if (reqNoteElement != null)
{
foreach (string hashTag in hashtagList)
{
reqNoteElement.Add(new XElement("hashtag", hashTag));
}
}
argXmlDoc.Root.Element("lastid").Value = lastId;
}
After this I save the xml.
Next time when I try to load the Xml, it fails with an exception:
System.Xml.XmlException: Unexpected XML declaration. The XML declaration must be the first node in the document, and no white space characters are allowed to appear before it.
Here is the code to load the XML:
private static XDocument hashNotesXDocument;
private static Stream hashNotesStream;
StorageFile hashNoteXml = await InstallationFolder.GetFileAsync("hashnotes.xml");
hashNotesStream = await hashNoteXml.OpenStreamForWriteAsync();
hashNotesXDocument = XDocument.Load(hashNotesStream);
and I save it using:
hashNotesXDocument.Save(hashNotesStream);
You don't show all of your code, but it looks like you open the XML file, read the XML from it into an XDocument, edit the XDocument in memory, and then write back to the same open stream. Because the stream is still open, it is positioned at the end of the file after the load, so the new XML is appended to the existing content instead of replacing it.
I suggest eliminating hashNotesXDocument and hashNotesStream as static variables. Instead, open and read the file, modify the XDocument, then open and write the file using the pattern shown here.
I work only on desktop code (using an older version of .NET), so I can't test this, but something like the following should work:
static async Task LoadUpdateAndSaveXml(Action<XDocument> editor)
{
XDocument doc;
var xmlFile = await InstallationFolder.GetFileAsync("hashnotes.xml");
using (var reader = new StreamReader(await xmlFile.OpenStreamForReadAsync()))
{
doc = XDocument.Load(reader);
}
if (doc != null)
{
editor(doc);
using (var stream = await xmlFile.OpenStreamForWriteAsync())
{
// Truncate - https://stackoverflow.com/questions/13454584/writing-a-shorter-stream-to-a-storagefile
// CanSeek, Length and SetLength are members of Stream, not of StreamWriter.
if (stream.CanSeek && stream.Length > 0)
stream.SetLength(0);
doc.Save(stream);
}
}
}
Also, be sure to create the file before using it.
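The append effect described above can be reproduced without any file or StorageFile at all. The following sketch uses a MemoryStream as a stand-in for the opened file; the InPlaceSaveDemo class and the shortened XML are assumptions made for this illustration. It loads a document, adds a Note element and saves back to the same stream, either as-is or after rewinding and truncating.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class InPlaceSaveDemo
{
    // Loads XML from a stream, edits it and saves it back to the SAME stream.
    // Returns true if the stream still holds one well-formed document afterwards.
    public static bool EditAndReload(bool rewindAndTruncate)
    {
        var stream = new MemoryStream();
        byte[] original = Encoding.UTF8.GetBytes(
            "<?xml version=\"1.0\"?><hashnotes><lastid>0</lastid></hashnotes>");
        stream.Write(original, 0, original.Length);
        stream.Position = 0;

        // After loading a small document like this, the position is at the end.
        XDocument doc = XDocument.Load(stream);
        doc.Root.Add(new XElement("Note", "hello"));

        if (rewindAndTruncate)
        {
            stream.Position = 0;   // go back to the start...
            stream.SetLength(0);   // ...and discard the old content
        }
        doc.Save(stream);          // without the rewind this appends a second document

        stream.Position = 0;
        try
        {
            XDocument.Load(stream);
            return true;
        }
        catch (XmlException)
        {
            return false;          // two documents back to back are not well-formed
        }
    }
}
```

Calling EditAndReload(false) mirrors the failing pattern from the question, while EditAndReload(true) mirrors the truncate-then-save approach.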

What is the most efficient way to take XML from API and store it locally?

I am trying to find the fastest way to read XML from the Merriam-Webster dictionary and store it in a local file for later use. Below, I try to implement a module that does a few things:
Read 2000 words from a local directory
Look up each of the words in the Merriam-Webster dictionary using the API
Store the definition(s) in a local XML file for later use.
I'm not sure whether XML is the best way to store this data, but it seemed like the simplest thing to do. At first, I thought I would do it in separate steps (1. look up each word and store the word and its definitions in a data structure; 2. dump all the data into XML). However, this poses a problem, because it is just too much to store on the runtime (call) stack.
So, in this scenario, I try to speed things up by looking up each word and then saving it to the XML file one by one. This, however, is also slow: it takes around 10 minutes per 500-600 words.
public void load_module() // stores words/definitions into xml file
{ // 1. Pick up word from text file 2. Look up word's definition 3. Store in Xml
string workdirect = Directory.GetCurrentDirectory();
workdirect = workdirect.Substring(0, workdirect.LastIndexOf("bin"));
workdirect += "words1.txt";
using (StreamReader read = new StreamReader(workdirect)) // 1. Pick up word from text file
{
while (!read.EndOfStream)
{
string line = read.ReadLine();
var definitions = load(line.ToLower()); // 2. Retrieve Words Definitions
store_xml(line, definitions);
wordlist.Add(line);
}
}
}
public List<string> load(string word)
{
XmlDocument doc = new XmlDocument();
List<string> definitions = new List<string>();
XmlNodeList node = null;
doc.Load("http://www.dictionaryapi.com/api/v1/references/collegiate/xml/"+word+"?key=*****************"); // Asterisks to hide the actual API key
if (doc.SelectSingleNode("entry_list").SelectSingleNode("entry").SelectSingleNode("def") == null)
{
return definitions;
}
node = doc.SelectSingleNode("entry_list").SelectSingleNode("entry").SelectSingleNode("def").SelectNodes("dt");
// TO DO : implement definitions if there is no node "def" in first node entry "entry_list"
foreach (XmlNode item in node)
{
definitions.Add(item.InnerXml.ToString().ToLower());
}
return definitions;
}
public void store_xml(string word, List<string> definitions)
{
string local = Directory.GetCurrentDirectory();
string name = "dictionary_word.xml";
local = local.Substring(0, local.LastIndexOf("bin"));
bool exists = File.Exists(local + name);
if (exists)
{
XmlDocument doc = new XmlDocument();
doc.Load(local + name);
XmlElement wordindoc = doc.CreateElement("Word");
wordindoc.SetAttribute("xmlns", word);
XmlElement defs = doc.CreateElement("Definitions");
foreach (var item in definitions)
{
XmlElement def = doc.CreateElement("Definition");
def.InnerText = item;
defs.AppendChild(def);
}
wordindoc.AppendChild(defs);
doc.DocumentElement.AppendChild(wordindoc);
doc.Save(local+name);
}
else
{
using (XmlWriter writer = XmlWriter.Create(@local + name))
{
writer.WriteStartDocument();
writer.WriteStartElement("Dictionary");
writer.WriteStartElement("Word", word);
writer.WriteStartElement("Definitions");
foreach (var def in definitions)
{
writer.WriteElementString("Definition", def);
}
writer.WriteEndElement();
writer.WriteEndElement();
writer.WriteEndElement();
writer.WriteEndDocument();
}
}
}
}
When handling large amounts of data that need to be exported to XML, I would normally keep the data in memory as a collection of custom objects rather than as an XmlDocument:
// Named WordDefinition because a property cannot share its enclosing class's name.
public class WordDefinition
{
public string Word { get; set; }
public string Definition { get; set; }
}
I would then use XmlWriter to write the collection to the XML file:
XmlWriterSettings settings = new XmlWriterSettings();
settings.Indent = true;
settings.IndentChars = (" ");
settings.Encoding = Encoding.UTF8;
using (XmlWriter writer = XmlWriter.Create(@"C:\output\output.xml", settings))
{
writer.WriteStartDocument();
// TODO - use XMLWriter functions to write out each word and definition
writer.Flush();
}
If you are still short on memory, you might be able to write out the XML in batches (e.g. every 500 definitions).
I found the Microsoft article on Improving XML Performance a very useful reference, particularly the section on Design Considerations.
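To make the XmlWriter suggestion concrete, here is one way the write-out step might look. It is a sketch rather than a finished design: the WordEntry class, the DictionaryXmlWriter helper and the "text" attribute are invented for this example, while the Dictionary, Word and Definition element names follow the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical in-memory shape: one entry per word with all of its definitions.
public sealed class WordEntry
{
    public string Word { get; set; }
    public List<string> Definitions { get; set; } = new List<string>();
}

public static class DictionaryXmlWriter
{
    // Streams all entries to the output in a single pass.
    public static void Write(TextWriter output, IEnumerable<WordEntry> entries)
    {
        var settings = new XmlWriterSettings { Indent = true };
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("Dictionary");
            foreach (WordEntry entry in entries)
            {
                writer.WriteStartElement("Word");
                writer.WriteAttributeString("text", entry.Word);
                foreach (string definition in entry.Definitions)
                {
                    writer.WriteElementString("Definition", definition);
                }
                writer.WriteEndElement();   // </Word>
            }
            writer.WriteEndElement();       // </Dictionary>
            writer.WriteEndDocument();
        }
    }
}
```

Because the output file is opened once and written sequentially, this avoids reloading and re-saving the whole document for every word, which is what store_xml in the question does. The time spent on the HTTP lookups themselves is a separate cost that this does not address.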

XML Parsing - Read a Simple XML File and Retrieve Values

I've written a Task Scheduling program for learning purposes. Currently I'm saving the scheduled tasks just as plain text and then parsing it using Regex. This looks messy (code wise) and is not very coherent.
I would like to load the scheduled tasks from an XML file instead, I've searched quite a bit to find some solutions but I couldn't get it to work how I wanted.
I wrote an XML file structured like this to store my data in:
<Tasks>
<Task>
<Name>Shutdown</Name>
<Location>C:/WINDOWS/system32/shutdown.exe</Location>
<Arguments>-s -f -t 30</Arguments>
<RunWhen>
<Time>8:00:00 a.m.</Time>
<Date>18/03/2011</Date>
<Days>
<Monday>false</Monday>
<Tuesday>false</Tuesday>
<Wednesday>false</Wednesday>
<Thursday>false</Thursday>
<Friday>false</Friday>
<Saturday>false</Saturday>
<Sunday>false</Sunday>
<Everyday>true</Everyday>
<RunOnce>false</RunOnce>
</Days>
</RunWhen>
<Enabled>true</Enabled>
</Task>
</Tasks>
The way I'd like to parse the data is like so:
Open Tasks.xml
Load the first Task tag.
In that task retrieve the values of the Name, Location and Arguments tags.
Then open the RunWhen tag and retrieve the values of the Time and Date tags.
After that open the Days tag and retrieve the value of each individual tag within.
Retrieve the value of Enabled.
Load the next task and repeat steps 3 -> 7 until all the Task tags in Tasks have been parsed.
I'm fairly sure it can be done this way; I just can't work it out, because there are so many different ways to do things in XML that I got a bit overwhelmed. What I've got so far is that I would most likely be using XPathDocument and XPathNodeIterator, right?
If someone can show me an example or explain how this would be done, I would be very happy.
An easy way to parse the XML is to use LINQ to XML.
For example, say you have the following XML file:
<library>
<track id="1" genre="Rap" time="3:24">
<name>Who We Be RMX (feat. 2Pac)</name>
<artist>DMX</artist>
<album>The Dogz Mixtape: Who's Next?!</album>
</track>
<track id="2" genre="Rap" time="5:06">
<name>Angel (ft. Regina Bell)</name>
<artist>DMX</artist>
<album>...And Then There Was X</album>
</track>
<track id="3" genre="Break Beat" time="6:16">
<name>Dreaming Your Dreams</name>
<artist>Hybrid</artist>
<album>Wide Angle</album>
</track>
<track id="4" genre="Break Beat" time="9:38">
<name>Finished Symphony</name>
<artist>Hybrid</artist>
<album>Wide Angle</album>
</track>
</library>
For reading this file, you can use the following code:
public void Read(string fileName)
{
XDocument doc = XDocument.Load(fileName);
foreach (XElement el in doc.Root.Elements())
{
Console.WriteLine("{0} {1}", el.Name, el.Attribute("id").Value);
Console.WriteLine(" Attributes:");
foreach (XAttribute attr in el.Attributes())
Console.WriteLine(" {0}", attr);
Console.WriteLine(" Elements:");
foreach (XElement element in el.Elements())
Console.WriteLine(" {0}: {1}", element.Name, element.Value);
}
}
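Applied to the Tasks file from the question, the same LINQ to XML approach could look roughly like the sketch below. The ScheduledTask class and the TaskFileReader helper are names assumed for this example. Time and Date are kept as strings here because their format ("8:00:00 a.m.", "18/03/2011") is culture-specific and would need an explicit parse.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical model for one <Task> element.
public sealed class ScheduledTask
{
    public string Name { get; set; }
    public string Location { get; set; }
    public string Arguments { get; set; }
    public string Time { get; set; }
    public string Date { get; set; }
    public Dictionary<string, bool> Days { get; set; }
    public bool Enabled { get; set; }
}

public static class TaskFileReader
{
    public static List<ScheduledTask> Parse(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Root.Elements("Task").Select(task =>
        {
            XElement runWhen = task.Element("RunWhen");
            return new ScheduledTask
            {
                Name = (string)task.Element("Name"),
                Location = (string)task.Element("Location"),
                Arguments = (string)task.Element("Arguments"),
                Time = (string)runWhen.Element("Time"),
                Date = (string)runWhen.Element("Date"),
                // One entry per child of <Days>: Monday, ..., Everyday, RunOnce.
                Days = runWhen.Element("Days").Elements()
                    .ToDictionary(d => d.Name.LocalName, d => (bool)d),
                Enabled = (bool)task.Element("Enabled")
            };
        }).ToList();
    }
}
```

For a file on disk, XDocument.Load(path) can replace XDocument.Parse. The sketch assumes every Task has RunWhen, Days and Enabled elements; a missing one would throw.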
I usually use XmlDocument for this. The interface is pretty straightforward:
var doc = new XmlDocument();
doc.LoadXml(xmlString);
You can access nodes similar to a dictionary:
var tasks = doc["Tasks"];
and loop over all children of a node.
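A short sketch of that loop, using the Tasks structure from the question, might look like this. The XmlDocumentDemo class and the TaskNames method are names chosen for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class XmlDocumentDemo
{
    // Collects the text of the <Name> child of every element under <Tasks>.
    public static List<string> TaskNames(string xmlString)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xmlString);

        var names = new List<string>();
        XmlElement tasks = doc["Tasks"];           // null if the root is not <Tasks>
        if (tasks == null) return names;

        foreach (XmlNode child in tasks.ChildNodes)
        {
            if (child.NodeType != XmlNodeType.Element) continue;   // skip comments etc.
            XmlElement name = child["Name"];       // first child element called Name
            if (name != null) names.Add(name.InnerText);
        }
        return names;
    }
}
```

The indexer returns only the first child element with the given name, so repeated elements are better handled by looping over ChildNodes as shown.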
Try XML serialization, for example:
[Serializable]
public class Task
{
public string Name{get; set;}
public string Location {get; set;}
public string Arguments {get; set;}
public DateTime RunWhen {get; set;}
}
public void WriteXMl(Task task)
{
XmlSerializer serializer = new XmlSerializer(typeof(Task));
// Write straight to the file; the using block flushes and closes the writer.
using (StreamWriter writer = new StreamWriter(@"C:\Temp\Task.xml", false, Encoding.Unicode))
{
serializer.Serialize(writer, task);
}
}
public Task GetTask()
{
// The serializer must be created here too; it is not shared with WriteXMl.
XmlSerializer serializer = new XmlSerializer(typeof(Task));
using (StreamReader stream = new StreamReader(@"C:\Temp\Task.xml", Encoding.Unicode))
{
return (Task)serializer.Deserialize(stream);
}
}
Are you familiar with the DataSet class?
The DataSet can also load XML documents and you may find it easier to iterate.
http://msdn.microsoft.com/en-us/library/system.data.dataset.readxml.aspx
DataSet dt = new DataSet();
dt.ReadXml(@"c:\test.xml");
class Program
{
static void Main(string[] args)
{
//Load XML from local
string sourceFileName="";
string element=string.Empty;
var FolderPath=@"D:\Test\RenameFileWithXmlAttribute";
string[] files = Directory.GetFiles(FolderPath, "*.xml");
foreach (string xmlfile in files)
{
try
{
sourceFileName = xmlfile;
XElement xele = XElement.Load(sourceFileName);
string convertToString = xele.ToString();
XElement parseXML = XElement.Parse(convertToString);
element = parseXML.Descendants("Meta").Where(x => (string)x.Attribute("name") == "XMLTAG").Last().Value;
DirectoryInfo CurrentDate = Directory.CreateDirectory(DateTime.Now.ToString("yyyy-MM-dd"));
string saveWithThisName= Path.Combine(CurrentDate.FullName, element);
File.Copy(sourceFileName, saveWithThisName,true);
}
catch(Exception ex)
{
}
}
}
}
