Reading xml file? - c#

I have this XML file that I created programmatically using C#:
<Years>
<Year Year="2011">
<Month Month="10">
<Day Day="10" AccessStartTime="01:15 PM" ExitTime="01:15 PM" />
<Day Day="11" AccessStartTime="01:15 PM" ExitTime="01:15 PM" />
<Day Day="12" AccessStartTime="01:15 PM" ExitTime="01:15 PM" />
<Day Day="13" AccessStartTime="01:15 PM" ExitTime="01:15 PM" />
</Month>
<Month Month="11">
<Day Day="12" AccessStartTime="01:16 PM" ExitTime="01:16 PM" />
</Month>
</Year>
</Years>
I am having problems getting specific data out of it with XmlReader, or maybe I am doing it the wrong way, because each time the reader reads one single line. What I want is a list of all days in a specific month and year.

Use LINQ to XML, or post the code you have tried.
// Enumerate every Day element; its parent is the Month and its grandparent the Year.
var list = from day in XDocument.Load(@"c:\filename.xml").Descendants("Day")
           select new
           {
               Year = (string)day.Parent.Parent.Attribute("Year"),
               Month = (string)day.Parent.Attribute("Month"),
               Day = (string)day.Attribute("Day")
           };
foreach (var t in list)
{
    Console.WriteLine(t.Year + " " + t.Month + " " + t.Day);
}

I agree with AVD's suggestion of using LINQ to XML. Finding all the days for a specific year and month is simple:
XDocument doc = XDocument.Load("file.xml");
// doc.Root is the <Years> element; year and month are the int values to filter on.
var days = doc.Root.Elements("Year").Where(y => (int) y.Attribute("Year") == year)
              .Elements("Month").Where(m => (int) m.Attribute("Month") == month)
              .Elements("Day");
(This assumes that Month and Year attributes are specified on all Month and Year elements.)
The result is a sequence of the Day elements for the specified month and year.
In most cases I'd actually write one method call per line, but in this case I thought it looked better to have one full filter of both element and attribute per line.
Note that in LINQ, some queries end up being more readable using query expressions, and some are more readable in the "dot notation" I've used above.
You asked for an explanation of AVD's code, so you may be similarly perplexed by mine - rather than explain the bits of LINQ to XML and LINQ that my code happens to use, I strongly recommend that you read good tutorials on both LINQ and LINQ to XML. They're wonderful technologies which will help your code all over the place.
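For readers who want something runnable, here is a minimal sketch of the same query wrapped in a method. The names AttendanceQuery and DaysIn are made up for illustration, and the sketch assumes the <Years> root from the question and that every Year, Month and Day element carries its attribute.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class AttendanceQuery
{
    // Hypothetical helper: returns the Day numbers recorded for one year and month.
    public static List<int> DaysIn(XDocument doc, int year, int month)
    {
        return doc.Root
            .Elements("Year").Where(y => (int)y.Attribute("Year") == year)
            .Elements("Month").Where(m => (int)m.Attribute("Month") == month)
            .Elements("Day")
            .Select(d => (int)d.Attribute("Day"))
            .ToList();
    }
}
```

Calling AttendanceQuery.DaysIn(XDocument.Load("file.xml"), 2011, 10) on the file from the question should give the days of October 2011; a year or month that is not present gives an empty list.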

Take a look at this example, which shows an XML file with a root node and how to get the data out of it using XmlReader:
using System;
using System.Xml;
class Program
{
static void Main()
{
// Create an XML reader for this file.
using (XmlReader reader = XmlReader.Create("perls.xml"))
{
while (reader.Read())
{
// Only detect start elements.
if (reader.IsStartElement())
{
// Get element name and switch on it.
switch (reader.Name)
{
case "perls":
// Detect this element.
Console.WriteLine("Start <perls> element.");
break;
case "article":
// Detect this article element.
Console.WriteLine("Start <article> element.");
// Search for the attribute name on this current node.
string attribute = reader["name"];
if (attribute != null)
{
Console.WriteLine(" Has attribute name: " + attribute);
}
// Next read will contain text.
if (reader.Read())
{
Console.WriteLine(" Text node: " + reader.Value.Trim());
}
break;
}
}
}
}
}
}
Input text [perls.xml]
<?xml version="1.0" encoding="utf-8" ?>
<perls>
<article name="backgroundworker">
Example text.
</article>
<article name="threadpool">
More text.
</article>
<article></article>
<article>Final text.</article>
</perls>
Output
Start <perls> element.
Start <article> element.
Has attribute name: backgroundworker
Text node: Example text.
Start <article> element.
Has attribute name: threadpool
Text node: More text.
Start <article> element.
Text node:
Start <article> element.
Text node: Final text.
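The same reading loop can be pointed at the Years/Month/Day file from the question. The sketch below is only an illustration under assumptions: the class and method names are invented, and it relies on Day elements always appearing inside a Month that is inside a Year, as in the sample. It remembers the most recent Year and Month attributes while streaming and collects the Day attributes that fall under the requested pair.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class AttendanceReader
{
    // Hypothetical helper: streams the document once and returns matching Day values.
    public static List<string> DaysIn(TextReader input, string year, string month)
    {
        var days = new List<string>();
        string currentYear = null;
        string currentMonth = null;
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                if (reader.NodeType != XmlNodeType.Element) continue;
                switch (reader.Name)
                {
                    case "Year":
                        currentYear = reader.GetAttribute("Year");
                        break;
                    case "Month":
                        currentMonth = reader.GetAttribute("Month");
                        break;
                    case "Day":
                        // Keep the day only when it sits under the requested year and month.
                        if (currentYear == year && currentMonth == month)
                            days.Add(reader.GetAttribute("Day"));
                        break;
                }
            }
        }
        return days;
    }
}
```

For a file on disk, pass File.OpenText(path) as the input.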

Related

Retrieving set of XML nodes from a plain text file

I have a plain text file as below,
<body labelR={Right} LabelL={Left}> </body/> Video provides a powerful way to help you prove your point. When you click Online Video, you can paste in the embed code for the video you want to add. You can also type a keyword to search online for the video that best fits your document. <body TestR={TestRight} TestL={TestLeft}> </body/>
It is read from the file system as:
var plainText = File.ReadAllText(@"D:\TestTxt.txt");
I'm trying to figure out whether there is a way to filter out and get a list of a particular set of elements that are in XML syntax. The desired outcome is as below:
A list of 2 items in this case, with,
<body labelR={Right} LabelL={Left}>
</body/>
<body TestR={TestRight} TestL={TestLeft}>
</body/>
Basically the XML elements with <body> </body>
I cannot use LINQ to XML here since this plain text content is not valid XML. I have read that Regex might work, but I'm not sure of the proper way to use it here.
Any advice is greatly appreciated.
I think the best way to handle this is to turn your txt file into a valid XML file by adding a little code; then you can easily read it.
this and this will help you to do that.
using (XmlReader reader = XmlReader.Create(@"YOUFILEPATH.xml"))
{
    while (reader.Read())
    {
        // Act only on START tags.
        if (reader.IsStartElement())
        {
            // "Key" and "Location" are placeholder element names; use your own.
            switch (reader.Name)
            {
                case "Key":
                    Console.WriteLine("Element tag name is: " + reader.ReadString());
                    break;
                case "Location":
                    Console.WriteLine("Your Location is : " + reader.ReadString());
                    break;
            }
        }
        Console.WriteLine("");
    }
}
A plain string-based solution could be:
var s = "<body labelR={Right} LabelL={Left}> </body/> Video provides ... your document. <body TestR={TestRight} TestL={TestLeft}> </body/>";
int start = 0;
while ((start = s.IndexOf("<body", start )) >= 0)
{
var end = s.IndexOf("</body/>", start + "<body".Length) + "</body/>".Length;
Console.WriteLine(s[start..end]);
start = end;
}
This finds the next <body starting from the previous "node" (if any). Then it finds the (end of the) next </body/>.
Finally it prints the substring.
Repeat until no start marker was found, so it prints:
<body labelR={Right} LabelL={Left}> </body/>
<body TestR={TestRight} TestL={TestLeft}> </body/>
You may want to add some checks - what if the end marker is missing?
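One way to add that check is sketched below. The class and method names are invented for illustration, and the markers default to the ones in the question. The loop stops as soon as a start marker has no matching end marker instead of computing a bogus end index.

```csharp
using System;
using System.Collections.Generic;

public static class BodyBlocks
{
    // Hypothetical helper: collects every substring that runs from `open` to the end of the next `close`.
    public static List<string> Extract(string text, string open = "<body", string close = "</body/>")
    {
        var blocks = new List<string>();
        int start = 0;
        while ((start = text.IndexOf(open, start, StringComparison.Ordinal)) >= 0)
        {
            int closeAt = text.IndexOf(close, start + open.Length, StringComparison.Ordinal);
            if (closeAt < 0)
                break; // end marker missing: stop rather than slice with a wrong index

            int end = closeAt + close.Length;
            blocks.Add(text.Substring(start, end - start));
            start = end; // continue searching after this block
        }
        return blocks;
    }
}
```

Silently ignoring an unterminated block is a design choice made for this sketch; throwing an exception at that point may suit stricter input handling better.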

Avoid skipping elements in xml reader

Let's suppose that I have XML like this:
<articles>
<article>
<id>1</id>
<name>A1</name>
<price>10</price>
</article>
<article>
<id>2</id>
<name>A2</name>
</article>
<article>
<id>3</id>
<name>A3</name>
<price>30</price>
</article>
</articles>
As you can see article A2 is missing price tag.
I have a real world case where I parse xml file where some tags in some articles are missing (which I didn't know earlier). I wrote a very simple parser like this:
using (XmlReader reader = XmlReader.Create(new StringReader(myXml)))
{
    while (true)
    {
        bool articleExists = reader.ReadToFollowing("article");
        if (!articleExists) return;
        reader.ReadToFollowing("id");
        string id = reader.ReadElementContentAsString();
        reader.ReadToFollowing("name");
        string name = reader.ReadElementContentAsString();
        reader.ReadToFollowing("price");
        string price = reader.ReadElementContentAsString();
        //do something with these values
    }
}
But if there is no price tag in article A2, XmlReader will jump to the price tag in article A3, so I get articles mixed up and some data skipped, right?
How can I protect against this, so that if some tag in an article node is absent, a default value is used instead?
I would still like to use XmlReader if possible. My real file is 200 MB, so I need a simple, fast and efficient solution that won't hang the system.
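One common way to keep XmlReader streaming while avoiding the jump into the next article is to load one <article> at a time into an XElement with XNode.ReadFrom and look up its children there. The sketch below is only an illustration under assumptions: the class name, the tuple shape and the defaultValue parameter are made up, and it needs C# 7 for the tuple syntax. Only one article is held in memory at a time, so it should stay workable for a large file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class ArticleStream
{
    // Hypothetical helper: yields one (id, name, price) tuple per <article>,
    // substituting defaultValue for any child element that is missing.
    public static IEnumerable<(string Id, string Name, string Price)> Read(
        TextReader input, string defaultValue = "")
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "article")
                {
                    // ReadFrom consumes the whole element and leaves the reader on the
                    // node after it, so Read() must not be called again in this branch.
                    var article = (XElement)XNode.ReadFrom(reader);
                    yield return (
                        (string)article.Element("id") ?? defaultValue,
                        (string)article.Element("name") ?? defaultValue,
                        (string)article.Element("price") ?? defaultValue);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }
}
```

Usage would look like foreach (var a in ArticleStream.Read(File.OpenText(path), "0")) { ... }, where path is your own file.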

Parse datetime from XML element

var xml = @"<?xml version=""1.0"" encoding=""UTF-8"" standalone=""yes""?>
<metadata created=""2014-11-03T18:13:02.769Z"" xmlns=""http://example.com/ns/mmd-2.0#"" xmlns:ext=""http://example.com/ns/ext#-2.0"">
<customer-list count=""112"" offset=""0"">
<customer id=""5f6ab597-f57a-40da-be9e-adad48708203"" type=""Person"" ext:score=""100"">
<name>Bobby Smith</name>
<gender>male</gender>
<country>US</country>
<date-span>
<begin>1965-02-18</begin>
<end>false</end>
</date-span>
</customer>
<customer id=""22"" type=""Person"" ext:score=""100"">
<name>Tina Smith</name>
<gender>Female</gender>
<country>US</country>
<date-span>
<end>false</end>
</date-span>
</customer>
<customer id=""30"" type=""Person"" ext:score=""500"">
<name>George</name>
<gender>Male</gender>
<country>US</country>
<date-span>
<begin>1965</begin>
<end>false</end>
</date-span>
</customer>
</customer-list>
</metadata>";
I'm using the above XML. The problem I have is that the date (I'm referring to the <date-span> <begin> element) can be in any format. So I am trying to use the code below to take care of the date format:
GetCustomers = from c in XDoc.Descendants(ns + "customer")
select
new Customer
{
Name = c.Element(ns + "name").Value,
DateOfBirth = Convert.ToDateTime(c.Element(ns + "date-span").Elements(ns + "begin").Any() ? c.Element(ns + "date-span").Element(ns + "begin").Value : DateTime.Now.ToString())
};
The above works, but it crashed as soon as the XML contained 1965. Unfortunately I have no control over the XML. So I tried to use TryParse to convert 1965 to dd/mm/1965, where dd and mm could be today's day and the current month, but I can't seem to get it working:
BeginDate = Convert.ToDateTime(c.Element(ns + "life-span").Elements(ns + "begin").Any() ? DateTime.TryParse( c.Element(ns + "life-span").Element(ns + "begin").Value, culture, styles, out dateResult) : DateTime.Now).ToString())
Could anyone guide me here how to resolve the issue?
Edit 1
var ModifyBeginDate = XDoc.Descendants(ns + "artist").Elements(ns + "date-span").Elements(ns + "begin");
The above retrieves all the dates, but how do I assign the values back to the XML after I have changed them? (I don't think I can use this variable in my code, because when I iterate through the XML it would go straight back to the original XML.)
If the data can be in any format, then you will have to preprocess the data before trying to parse it into a DateTime.
If I were going to implement this, the first thing I would do is break the input into an array of integers. If there is only one item in the array, I would check its length; if it is 4 digits long, I would assume it is a year and instantiate a new DateTime of January 1 of that year. If I found an array with lengths 4,2,2 or 2,2,4, I would parse them accordingly. Obviously some of it will be left to guessing, but if you can't control the format of the XML there will always be something left to chance.
You could use something like this (but modified to return only the integer parts and skip the separators, which could be /, -, etc.) to split the date string into an array containing the integer values:
https://stackoverflow.com/a/13548184/184746
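As a simpler alternative to splitting the string by hand, and a different technique from the one described above, DateTime.TryParseExact accepts a list of candidate formats. The sketch below is an assumption-laden illustration: the class name, the method name and the format list are made up, and the list covers only the shapes that appear in the sample XML plus year-month. A bare year such as 1965 parses as January 1 of that year, and anything unrecognised falls back to a caller-supplied value.

```csharp
using System;
using System.Globalization;

public static class LooseDate
{
    // Assumed set of formats; extend it to match the data you actually receive.
    static readonly string[] Formats = { "yyyy-MM-dd", "yyyy-MM", "yyyy" };

    // Hypothetical helper: returns the parsed date, or fallback when text is empty or unrecognised.
    public static DateTime ParseOrDefault(string text, DateTime fallback)
    {
        DateTime result;
        if (!string.IsNullOrWhiteSpace(text) &&
            DateTime.TryParseExact(text.Trim(), Formats, CultureInfo.InvariantCulture,
                                   DateTimeStyles.None, out result))
        {
            return result;
        }
        return fallback;
    }
}
```

In the query from the question this could be used as DateOfBirth = LooseDate.ParseOrDefault((string)c.Element(ns + "date-span").Element(ns + "begin"), DateTime.Now), since casting a missing XElement to string yields null.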

Doing regex style compare while looping through a XML file in C#

I have an XML file that I loop through, and on a matching child node I get the value of an attribute. The problem is matching these values when they contain a * or ? character, in a regex-like style. Can someone tell me how to do this? For example, if a request comes in for g.portal.com, it should match the second node. I am using .NET 2.0.
Below is my XML file
<Test>
<Test Text="portal.com" Sample="1" />
<Test Text="*.portal.com" Sample="201309" />
<Test Text="portal-0?.com" Sample="201309" />
</Test>
XmlDocument xDoc = new XmlDocument();
xDoc.Load(PathToXMLFile);
foreach (XmlNode node in xDoc.DocumentElement.ChildNodes)
{
if (node.Attributes["Sample"].InnerText == value)
{
}
}
What you need to do is first convert each Text attribute into a valid Regex pattern and then use it to match your input. Something like this:
string input = "g.portal.com";
XmlNode foundNode = null;
foreach (XmlNode node in xDoc.DocumentElement.ChildNodes)
{
string value = node.Attributes["Text"].Value;
string pattern = Regex.Escape(value)
    .Replace(@"\*", ".*")
    .Replace(@"\?", ".");
if (Regex.IsMatch(input, "^" + pattern + "$"))
{
foundNode = node;
break; //remove if you want to continue searching
}
}
After executing the above code, foundNode should contain the second node from the xml file.
So you have an XML file that sets up patterns, right? You'll want to feed those patterns into Regexes and then stream a number of requests through them. Did I get that correct?
Assuming the XML file doesn't change, it only needs to be converted into the corresponding Regexes once. For example *.portal.com would translate to
new Regex("\\w+\\.portal\\.com");
You'll just have to escape the dots and replace * with \\w+ and ? with \\w, if I guessed the semantics of your match patterns correctly.
Look up the correct replacements at http://msdn.microsoft.com/en-us/library/az24scfc(v=vs.110).aspx
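The "convert once, match many times" idea can be sketched as follows. This is an illustration, not a definitive implementation: the class and method names are invented, it uses the .* and . replacements from the first answer rather than \w+ and \w, and matching case-insensitively is an extra assumption that seems reasonable for host names. It avoids LINQ and var so that it should also compile for .NET 2.0.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WildcardMatcher
{
    // Turns a pattern such as "*.portal.com" into an anchored Regex.
    public static Regex ToRegex(string wildcard)
    {
        string pattern = Regex.Escape(wildcard)
            .Replace(@"\*", ".*")   // * matches any run of characters
            .Replace(@"\?", ".");   // ? matches exactly one character
        return new Regex("^" + pattern + "$", RegexOptions.IgnoreCase);
    }

    // Returns the index of the first pattern that matches input, or -1 if none does.
    public static int FirstMatch(IList<Regex> compiled, string input)
    {
        for (int i = 0; i < compiled.Count; i++)
        {
            if (compiled[i].IsMatch(input))
                return i;
        }
        return -1;
    }
}
```

Build the list once after loading the XML (one ToRegex call per Text attribute), then call FirstMatch for each incoming request and use the returned index to pick the node.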

Form control as an xml value

I have this example code in my c# win form...
List<string> listFurniture = new List<string>();
XDocument xml = XDocument.Load(Application.StartupPath + @"\Furniture.xml");
foreach (XElement quality in xml.Descendants("Quality"))
listFurniture.Add(quality.Value);
maintextbox.Text = listFurniture[0];
... And this example xml
<Furniture>
<Table>
<Quality>textbox1.Text + "and" + textbox2.Text + "but" + textbox3.Text</Quality>
...
</Table>
</Furniture>
My dilemma is, the maintextbox is producing the actual string "textbox1.Text", instead of the value of textbox1.
I want the xml value to be read as:
maintextbox.Text = textbox1.Text + "and" + textbox2.Text + "but" + textbox3.Text;
not as:
maintextbox.Text = "textbox1.Text + "and" + textbox2.Text + "but" + textbox3.Text";
I tried using a text file as well with StreamReader and I got the same result.
The reason for coding my project this way is because the sequence of the textboxes changes and so does the "and" and the "but". When that change happens, I wouldn't have to rewrite the code and recompile the program. I would just make the changes in xml.
Your XML parsing is fine. What you need is to process the Quality strings:
string[] parts = quality.Split('+');
Regex regex = new Regex(@"^""(.*)""$");
var textBoxes = Controls.OfType<TextBox>().ToList();
for (int i = 0; i < parts.Length; i++)
{
string part = parts[i].Trim();
var match = regex.Match(part);
if (match.Success)
{
parts[i] = match.Groups[1].Value;
continue;
}
var textBox = textBoxes.FirstOrDefault(tb => tb.Name + ".Text" == part);
if (textBox != null) // possibly its an error if textbox not found
parts[i] = textBox.Text;
}
maintextbox.Text = String.Join(" ", parts);
What happens here:
We split the quality string on + characters to get an array of string parts.
With a regular expression we check whether a part looks like something in quotes, "something". If it does, it is "and", "but" or another connective word, and we keep the text without the quotes.
Finally, we check all text boxes for one whose name matches the part. If one matches, we replace the part with the text from that text box.
We join the parts to get the result string.
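The steps above can also be tried outside a form. The sketch below is a stand-alone variation under assumptions: the class name is invented, and a dictionary from control name to current text stands in for the Controls collection. Like the original, it splits on every + character, so a quoted literal that itself contains + would be broken apart.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QualityExpression
{
    static readonly Regex Quoted = new Regex(@"^""(.*)""$");

    // Hypothetical helper: textByControlName maps e.g. "textbox1" to that text box's current text.
    public static string Evaluate(string expression, IDictionary<string, string> textByControlName)
    {
        const string suffix = ".Text";
        string[] parts = expression.Split('+');
        for (int i = 0; i < parts.Length; i++)
        {
            string part = parts[i].Trim();
            Match match = Quoted.Match(part);
            if (match.Success)
            {
                parts[i] = match.Groups[1].Value; // quoted literal such as "and"
                continue;
            }

            string text;
            if (part.EndsWith(suffix, StringComparison.Ordinal) &&
                textByControlName.TryGetValue(part.Substring(0, part.Length - suffix.Length), out text))
            {
                parts[i] = text;  // reference to a known text box
            }
            else
            {
                parts[i] = part;  // unknown token: kept as written, trimmed
            }
        }
        return string.Join(" ", parts);
    }
}
```

In the form itself, the dictionary could be filled from Controls.OfType<TextBox>() just before evaluating each Quality string.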
BTW you can parse Xml in one line:
var listFurniture = xml.Descendants("Quality")
.Select(q => (string)q)
.ToList();
Update:
Since I received a comment asking me to explain the code, here is a short explanation.
First, XML is a language designed for structure. That structure makes it easy to pass data between languages or applications. Your original question states that your textbox is showing the literal string textbox.Text from your XML.
The XML needs to be structured; an example structure would be:
<Furniture>
<Table>
<Color> Red </Color>
<Quality> 10 </Quality>
<Material> Wood </Material>
</Table>
</Furniture>
When you read this XML, the reader first finds the root tag; all the other components are nodes. You need to step through these nodes to pick out the values you want to show in your textbox.
That is what this code is doing; I'll break it down step by step.
// StringBuilder that collects the text built from the XML.
StringBuilder output = new StringBuilder();
The next part will be as follows:
// Create an XmlReader. Wrapping it in 'using' ensures the object is disposed of once you are done, rather than leaving the document open.
using (XmlReader reader = XmlReader.Create(new StringReader(xmlString)))
{
    // Move to the Table node, which contains all of the child elements we want.
    reader.ReadToFollowing("Table");
    // Color is a child element in the structure above, so move to it and read its content.
    reader.ReadToFollowing("Color");
    string color = reader.ReadElementContentAsString().Trim();
    output.AppendLine("The color of the table " + color);
    // Nothing fancy here: we move to the element and append its content to our StringBuilder.
    reader.ReadToFollowing("Material");
    output.AppendLine("The material: " + reader.ReadElementContentAsString());
    // Same approach as for Color: move to the element, then read its content.
}
// Now we output our block.
OutputTextBlock.Text = output.ToString();
Now all the data is pushed into a string; you can use the same code to put those values into a textbox as well.
That is how you read XML into your application, but you mentioned two things earlier. It sounds like you are trying to use the textbox to write to the document, which can be done with XmlWriter.
The reason you keep getting the literal text is that, as far as the XML is concerned, textbox.Text is simply the value of the node. Your structure states that this string is the value.
To achieve your goal, you would need one method to write the value to the document and another to read it, so that the data moves in and out of your document and is represented correctly.
<Quality>Textbox1.Text</Quality> does not cause the textbox value to be read into your document automatically. You are assigning a string value to the node. You have to write the actual values to the document before they can be read.
MSDN has examples of how to parse the data properly; hopefully I've clarified some of the reasons you are having this issue.
More code, straight from MSDN:
StringBuilder output = new StringBuilder();
String xmlString =
@"<?xml version='1.0'?>
<!-- This is a sample XML document -->
<Items>
<Item>test with a child element <more/> stuff</Item>
</Items>";
// Create an XmlReader
using (XmlReader reader = XmlReader.Create(new StringReader(xmlString)))
{
XmlWriterSettings ws = new XmlWriterSettings();
ws.Indent = true;
using (XmlWriter writer = XmlWriter.Create(output, ws))
{
// Parse the file and display each of the nodes.
while (reader.Read())
{
switch (reader.NodeType)
{
case XmlNodeType.Element:
writer.WriteStartElement(reader.Name);
break;
case XmlNodeType.Text:
writer.WriteString(reader.Value);
break;
case XmlNodeType.XmlDeclaration:
case XmlNodeType.ProcessingInstruction:
writer.WriteProcessingInstruction(reader.Name, reader.Value);
break;
case XmlNodeType.Comment:
writer.WriteComment(reader.Value);
break;
case XmlNodeType.EndElement:
writer.WriteFullEndElement();
break;
}
}
}
}
OutputTextBlock.Text = output.ToString();
or
StringBuilder output = new StringBuilder();
String xmlString =
@"<bookstore>
<book genre='autobiography' publicationdate='1981-03-22' ISBN='1-861003-11-0'>
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
</bookstore>";
// Create an XmlReader
using (XmlReader reader = XmlReader.Create(new StringReader(xmlString)))
{
reader.ReadToFollowing("book");
reader.MoveToFirstAttribute();
string genre = reader.Value;
output.AppendLine("The genre value: " + genre);
reader.ReadToFollowing("title");
output.AppendLine("Content of the title element: " + reader.ReadElementContentAsString());
}
OutputTextBlock.Text = output.ToString();
