C# Parallel.Foreach with XML


I'm brand new to C# though have some minor experience with other languages and have hit a brick wall.
The code below works exactly as expected:
XmlDocument doc = new XmlDocument();
doc.Load("config.xml");
foreach (XmlNode node in doc.DocumentElement.ChildNodes)
{
string name = node.Attributes["name"].Value;
string ips = node.Attributes["ip"].Value;
string port = node.Attributes["port"].Value;
Console.WriteLine(name + " | " + ips + ":" + port);
}
I get out exactly what I am expecting with zero errors, however the following code has got me stumped. I am hoping someone can explain what I am doing wrong as I feel like I am perhaps missing something fundamental.
XmlDocument doc = new XmlDocument();
doc.Load("config.xml");
node = doc.DocumentElement.ChildNodes;
Parallel.ForEach(node,
(item) => {
string name = item.Attributes["name"].Value;
string ips = item.Attributes["ip"].Value;
string port = item.Attributes["port"].Value;
Console.WriteLine(name + " | " + ips + ":" + port);
});
I am simply trying to run each iteration of the loop in parallel. When I try to compile, I get the following error:
CS0411 The type arguments for method 'Parallel.ForEach<TSource>
(IEnumerable<TSource>, Action<TSource>)' cannot be inferred from the usage.
Try specifying the type arguments explicitly.
Example XML below:
<item name="pc01" ip="192.168.0.10" port="80"><!--PC01--></item>
<item name="pc02" ip="192.168.0.11" port="80"><!--PC02--></item>
<item name="pc03" ip="192.168.0.12" port="80"><!--PC03--></item>
<item name="pc04" ip="192.168.0.13" port="80"><!--PC04--></item>
Any assistance would be greatly appreciated.

Parallel.ForEach needs a generic IEnumerable<T>, but XmlNodeList only implements the non-generic IEnumerable, so you need to Cast it. Full solution below.
static void Main(string[] args)
{
var xml="<root><item name='pc01' ip='192.168.0.10' port='80'><!--PC01--></item><item name='pc02' ip='192.168.0.11' port='80'><!--PC02--></item><item name='pc03' ip='192.168.0.12' port='80'><!--PC03--></item><item name='pc04' ip='192.168.0.13' port='80'><!--PC04--></item></root>";
XmlDocument doc=new XmlDocument();
// doc.Load("config.xml");
doc.LoadXml(xml);
var nodeList=doc.DocumentElement.ChildNodes;
Parallel.ForEach(nodeList.Cast<XmlNode>(),
item => {
string name=item.Attributes["name"].Value;
string ips=item.Attributes["ip"].Value;
string port=item.Attributes["port"].Value;
Console.WriteLine(name + " | " + ips + ":" + port);
});
Console.ReadLine();
}

Are the items displaying in the console out of sequence?
You can safely call Console.WriteLine from multiple threads but I wouldn't count on the items actually getting written to the console in the expected sequence. I'd expect them to usually get written in the expected sequence and then sometimes not. That's the behavior of multithreaded execution. It will do what you expect it to do, but never count on it happening in the expected sequence, even if you test over and over and over and it does happen in the expected sequence.
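If the output order matters, one option is to do the work in parallel but print from a single thread. The following is only a sketch of that idea using PLINQ's AsOrdered(); the class and method names (OrderedParallelDemo, FormatNodes) are made up for illustration, and it assumes every child of the root is an element carrying the three attributes from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;

static class OrderedParallelDemo
{
    // Formats every child of the root element in parallel, but returns the
    // lines in document order because of AsOrdered().
    // Assumption: all children are elements with name/ip/port attributes.
    public static List<string> FormatNodes(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        return doc.DocumentElement.ChildNodes
            .Cast<XmlNode>()   // XmlNodeList is a non-generic IEnumerable
            .AsParallel()
            .AsOrdered()       // results keep the source order
            .Select(item => item.Attributes["name"].Value + " | "
                          + item.Attributes["ip"].Value + ":"
                          + item.Attributes["port"].Value)
            .ToList();
    }

    static void Main()
    {
        var xml = "<root>"
                + "<item name='pc01' ip='192.168.0.10' port='80' />"
                + "<item name='pc02' ip='192.168.0.11' port='80' />"
                + "</root>";

        // Printing happens on one thread, so the order is deterministic.
        foreach (var line in FormatNodes(xml))
            Console.WriteLine(line);
    }
}
```

For a handful of nodes and a trivial loop body like this, a plain foreach is likely faster; parallelism only pays off when each iteration does real work.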

Related

C# XmlReader reads XML wrong and different based on how I invoke the reader's methods

So my current understanding of how the C# XmlReader works is that it takes a given XML File and reads it node-by-node when I wrap it in a following construct:
using System.Xml;
using System;
using System.Diagnostics;
...
XmlReaderSettings settings = new XmlReaderSettings();
settings.IgnoreComments = true;
settings.IgnoreWhitespace = true;
settings.IgnoreProcessingInstructions = true;
using (XmlReader reader = XmlReader.Create(path, settings))
{
while (reader.Read())
{
// All reader methods I call here will reference the current node
// until I move the pointer to some further node by calling methods like
// reader.Read(), reader.MoveToContent(), reader.MoveToElement() etc
}
}
Why will the following two snippets (within the above construct) produce two very different results, even though they both call the same methods?
I used this example file for testing.
Debug.WriteLine(new string(' ', reader.Depth * 2) + "<" + reader.NodeType.ToString() + "|" + reader.Name + ">" + reader.ReadString() + "</>");
(Snippet 1)
vs
(Snippet 2)
string xmlcontent = reader.ReadString();
string xmlname = reader.Name.ToString();
string xmltype = reader.NodeType.ToString();
int xmldepth = reader.Depth;
Debug.WriteLine(new string(' ', xmldepth * 2) + "<" + xmltype + "|" + xmlname + ">" + xmlcontent + "</>");
Output of Snippet 1:
<XmlDeclaration|xml></>
<Element|rss></>
<Element|head></>
<Text|>Test Xml File</>
<Element|description>This will test my xml reader</>
<EndElement|head></>
<Element|body></>
<Element|g:id>1QBX23</>
<Element|g:title>Example Title</>
<Element|g:description>Example Description</>
<EndElement|item></>
<Element|item></>
<Text|>2QXB32</>
<Element|g:title>Example Title</>
<Element|g:description>Example Description</>
<EndElement|item></>
<EndElement|body></>
<EndElement|xml></>
<EndElement|rss></>
Yes, this is formatted as it was in my output window. As can be seen, it skipped certain elements and output a wrong depth for a few others. However, the NodeTypes are correct, unlike Snippet Number 2, which outputs:
<XmlDeclaration|xml></>
<Element|xml></>
<Element|title></>
<EndElement|title>Test Xml File</>
<EndElement|description>This will test my xml reader</>
<EndElement|head></>
<Element|item></>
<EndElement|g:id>1QBX23</>
<EndElement|g:title>Example Title</>
<EndElement|g:description>Example Description</>
<EndElement|item></>
<Element|g:id></>
<EndElement|g:id>2QXB32</>
<EndElement|g:title>Example Title</>
<EndElement|g:description>Example Description</>
<EndElement|item></>
<EndElement|body></>
<EndElement|xml></>
<EndElement|rss></>
Once again, the depth is messed up, but it's not as critical as with Snippet Number 1. It also skipped some elements and assigned wrong NodeTypes.
Why can't it output the expected result? And why do these two snippets produce two totally different outputs with different depths, NodeTypes and skipped nodes?
I'd appreciate any help on this. I searched a lot for any answers on this but it seems like I'm the only one experiencing these issues. I'm using the .NET Framework 4.6.2 with Asp.net Web Forms in Visual Studio 2017.

Firstly, you are using XmlReader.ReadString(), a method that Microsoft's own documentation discourages:
XmlReader.ReadString Method
... reads the contents of an element or text node as a string. However, we recommend that you use the ReadElementContentAsString method instead, because it provides a more straightforward way to handle this operation.
However, beyond warning us off the method, the documentation doesn't precisely specify what it actually does. To determine that, we need to go to the reference source:
public virtual string ReadString() {
if (this.ReadState != ReadState.Interactive) {
return string.Empty;
}
this.MoveToElement();
if (this.NodeType == XmlNodeType.Element) {
if (this.IsEmptyElement) {
return string.Empty;
}
else if (!this.Read()) {
throw new InvalidOperationException(Res.GetString(Res.Xml_InvalidOperation));
}
if (this.NodeType == XmlNodeType.EndElement) {
return string.Empty;
}
}
string result = string.Empty;
while (IsTextualNode(this.NodeType)) {
result += this.Value;
if (!this.Read()) {
break;
}
}
return result;
}
This method does the following:
1. If the current node is an empty element node, return an empty string.
2. If the current node is an element that is not empty, advance the reader.
3. If the now-current node is the end of the element, return an empty string.
4. While the current node is a text node, add the text to a string and advance the reader. As soon as the current node is not a text node, return the accumulated string.
Thus we can see that this method is designed to advance the reader. We can also see that, given mixed-content XML like <head>text <b>BOLD</b> more text</head>, ReadString() will only partially read the <head> element, leaving the reader positioned on <b>. This oddity is likely why Microsoft recommends against the method.
We can also see why your two snippets function differently. In the first, you get reader.Depth and reader.NodeType before calling ReadString() and advancing the reader. In the second you get these properties after advancing the reader.
Since your intent is to iterate through the nodes and get the value of each, rather than ReadString() or ReadElementContentAsString() you should just use XmlReader.Value:
gets the text value of the current node.
Thus your corrected code should look like:
string xmlcontent = reader.Value;
string xmlname = reader.Name.ToString();
string xmltype = reader.NodeType.ToString();
int xmldepth = reader.Depth;
Console.WriteLine(new string(' ', xmldepth * 2) + "<" + xmltype + "|" + xmlname + ">" + xmlcontent + "</>");
XmlReader is tricky to work with. You always need to check the documentation to determine exactly where a given method positions the reader. For instance, XmlReader.ReadElementContentAsString() moves the reader past the end of the element, whereas XmlReader.ReadSubtree() moves the reader to the end of the element. But as a general rule any method named Read is going to advance the reader, so you need to be careful using a Read method inside an outer while (reader.Read()) loop.
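To make the positioning rule concrete, here is a small sketch of a loop that mixes Read() with ReadElementContentAsString() without skipping nodes. The element name "title" and the names ReaderPositionDemo and ReadTitles are illustrative assumptions, not part of the question's code. The key point is that the loop only calls Read() when nothing else has already advanced the reader.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class ReaderPositionDemo
{
    // Collects the text of every <title> element (hypothetical element name).
    public static List<string> ReadTitles(string xml)
    {
        var titles = new List<string>();
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "title")
                {
                    // This call already moves past </title>. Calling Read()
                    // as well would skip whatever node comes next.
                    titles.Add(reader.ReadElementContentAsString());
                }
                else
                {
                    // Nothing consumed the current node, so advance manually.
                    reader.Read();
                }
            }
        }
        return titles;
    }

    static void Main()
    {
        var xml = "<head><title>A</title><title>B</title></head>";
        foreach (var title in ReadTitles(xml))
            Console.WriteLine(title);
    }
}
```

With the two adjacent <title> elements in Main, a while (reader.Read()) loop around the same ReadElementContentAsString() call would find only the first one, because the extra Read() steps over the second start tag.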

Escaping and double quotes together with Linq

I am making a small piece of code where I look for all nodes in XML containing "folder name=\"u".
I have problems with the string literals. I tried verbatim strings (@) and escaping or doubling the quotes, without any success. Here is the code:
public class Folders
{
public static IEnumerable<string> FolderNames(string xml, char startingLetter)
{
string[] MyString;
List<string> MyList = new List<string>();
string item = "";
StringSplitOptions.None)).ToList();
MyString = xml.Split('>') ;
var matchingvalues = MyString
.Where(stringToCheck => stringToCheck.Contains("<folder name=\\\""));
return matchingvalues;
}
public static void Main(string[] args)
{
string xml =
"<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
"<folder name=\"c\">" +
"<folder name=\"program files\">" +
"<folder name=\"uninstall information\" />" +
"</folder>" +
"<folder name=\"users\" />" +
"</folder>";
foreach (string name in Folders.FolderNames(xml, 'u'))
Console.WriteLine(name);
Console.ReadLine();
}
How should I write
var matchingvalues = MyString.Where(stringToCheck => stringToCheck.Contains("
?

You're not even using your startingLetter parameter in FolderNames
You say you're looking for "folder name=\"u", but your code looks for "<folder name=\\\"". Leaving the missing "u" aside, that pattern also contains a literal backslash, which doesn't exist in your xml: the backslashes in your source are only there to escape the quotes.
You haven't posted your real code because your method doesn't even work. WTF is this??
StringSplitOptions.None)).ToList();
You don't use the item variable.
Hopefully the above is enough to show where you went wrong. Better still, use .NET's xml parsing abilities to get the values. Currently your method lies; it doesn't just return "Folder Names", it returns a mess of half-xml as well.

The main problem isn't really the escaping, but the fact that you have reinvented the wheel a little bit.
There are multiple XML parsers in C#.
LINQ to XML is one of them. With it you could write something simple like:
string xml =
"<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
"<folder name=\"c\">" +
"<folder name=\"program files\">" +
"<folder name=\"uninstall information\" />" +
"</folder>" +
"<folder name=\"users\" />" +
"</folder>";
XElement xElement = XElement.Parse(xml);
IEnumerable<string> values = xElement.
Descendants("folder").
Where(element => element.Attribute("name")?.Value?.StartsWith("u") == true).
Select(element => element.Attribute("name").Value);
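The same LINQ to XML approach can be folded back into the FolderNames method from the question so that the startingLetter parameter is actually used. This is a sketch rather than a drop-in replacement: it assumes the match should be case-sensitive and that folders without a name attribute are simply skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class Folders
{
    // Returns the name attribute of every <folder> element whose name
    // starts with startingLetter (case-sensitive, by assumption).
    public static IEnumerable<string> FolderNames(string xml, char startingLetter)
    {
        return XDocument.Parse(xml)
            .Descendants("folder")
            .Select(element => (string)element.Attribute("name")) // null if missing
            .Where(name => !string.IsNullOrEmpty(name) && name[0] == startingLetter)
            .ToList();
    }

    public static void Main(string[] args)
    {
        string xml =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
            "<folder name=\"c\">" +
            "<folder name=\"program files\">" +
            "<folder name=\"uninstall information\" />" +
            "</folder>" +
            "<folder name=\"users\" />" +
            "</folder>";

        foreach (string name in FolderNames(xml, 'u'))
            Console.WriteLine(name);
    }
}
```

Because the parser hands back attribute values directly, no quote escaping is needed in the search at all.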

Convert unformatted string (not valid json) to json

I have a string which I get from a webservice which looks like this:
({
id=1;
name="myName";
position="5";
})
which is not a parsable json. I wanted to ask if there are any ways besides going character to character and correcting them to convert such string into a parsable json like this:
{
"id":1,
"name":"myName",
"position":"5"
}

Check this link it will be helpful :
https://forum.unity3d.com/threads/json-web-services.366073/

You could run a series of regex replacements, one per change, but you would need captures for the property names and so on, and the performance will be horrible.
If the format is known and reliable (e.g. you know what happens with collections/arrays and sub-objects) and the service provider does not offer a client or SDK, then your best bet is to write your own parser. It's not that hard to create one from scratch. Alternatively, you can use a parser library like Irony.net or eto.parse. Both of these allow you to construct a grammar in C#, so it is fully self-contained without the need for compiler-compilers and generated code. There is also a class of parsers called "monadic" parsers, like Sprache, which are simpler in nature (once you wrap your head around them).
Whichever approach is taken you'll end up with a way of recognising each property and object boundary where you can do what you need to do: set a property; create a JToken; whatever...
Then you can wrap the whole lot in a MediaTypeFormatter and call the service via HttpClient and get objects out.

Finally I had to write my own function to convert it to a parsable json, here's the function I wrote:
public string convertToJson(string mJson)
{
mJson = mJson.Replace("(","[");
mJson = mJson.Replace(")","]");
string mJson2 = mJson.Trim('[',']');
string[] modules = mJson2.Split(',');
for(int i = 0;i<modules.Length;i++)
{
Debug.Log("module["+i+"]: " + modules[i]);
}
for(int m=0;m<modules.Length;m++)
{
char[] mCharacter = {'{','}'};
modules[m] = modules[m].Replace("{",string.Empty).Replace("}",string.Empty).Trim();
Debug.Log("module["+m+"] after trim: " + modules[m]);
string[] items = modules[m].TrimEnd(';').Split(';');
modules[m] = "{";
for(int j=0;j<items.Length;j++)
{
Debug.Log("item["+j+"]: " + items[j]);
string[] keyValue = items[j].Split('=');
Debug.Log("key,value: " + keyValue[0] + ", " + keyValue[1]);
modules[m] = modules[m] + "\"" + keyValue[0].Trim() + "\":" + keyValue[1].Trim() + ",";
}
modules[m] = modules[m].Substring(0,modules[m].Length-1) + "}";
Debug.Log("modules["+m+"] final: " + modules[m]);
}
string finalJson = "[";
for(int m=0;m<modules.Length;m++)
{
finalJson = finalJson + modules[m] + ",";
}
finalJson = finalJson.Substring(0,finalJson.Length-1) + "]";
Debug.Log("finalJson: " + finalJson);
return finalJson;
}
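For the single flat object shown in the question, the same key/value rewriting can be sketched more compactly with one regular expression. This is an illustration under narrow assumptions, not a general parser: it expects key=value; pairs where each value is either a double-quoted string or a bare token such as a number, and it ignores nesting and arrays entirely. The names LooseObjectConverter and ToJson are invented for the example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class LooseObjectConverter
{
    // Matches: key = value ;
    // where value is a double-quoted string or a bare token (e.g. a number).
    private static readonly Regex Pair = new Regex(
        @"(\w+)\s*=\s*(""(?:[^""\\]|\\.)*""|[^;""]+?)\s*;");

    // Converts one flat object such as ({ id=1; name="myName"; }) to JSON.
    // Bare tokens are copied through unchanged, so they must already be
    // valid JSON values (numbers, true, false, null).
    public static string ToJson(string input)
    {
        var pairs = Pair.Matches(input)
            .Cast<Match>()
            .Select(m => "\"" + m.Groups[1].Value + "\":" + m.Groups[2].Value);
        return "{" + string.Join(",", pairs) + "}";
    }

    static void Main()
    {
        string raw = "({\n id=1;\n name=\"myName\";\n position=\"5\";\n})";
        Console.WriteLine(ToJson(raw));
    }
}
```

If the service can return several objects or nested values, a real parser as suggested in the earlier answer is the safer route.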

How to display query in output

So I have a query and am trying to display it in the Debug Output. When I run the file it gives me a list of output starting with iisexpress.exe: https://gyazo.com/fd9eb832dfcc08571b31490103b85b49
but no actual result. I am trying to run a query in Visual Studio 2015 for the first time using dotnetRDF. My code is below:
public static void Main(String[] args)
{
Debug.WriteLine("SQLAQL query example");
//Define a remote endpoint
//Use the DBPedia SPARQL endpoint with the default Graph set to DBPedia
SparqlRemoteEndpoint endpoint = new SparqlRemoteEndpoint(new Uri("http://dbpedia.org/sparql"), "http://dbpedia.org");
//SPARQL query to show countries, population, capital for countries where population is more than 100000 and limit results to 50
String queryString = "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> " +
"PREFIX type: <http://dbpedia.org/class/yago/> " +
"PREFIX prop: <http://dbpedia.org/property/> " +
"SELECT ?country_name ?population ?cptl " +
"WHERE { " +
"?country rdf:type type:Country108544813. " +
"?country rdfs:label ?country_name. " +
"?country prop:populationEstimate ?population. " +
"?country dbo:capital ?cptl " +
"FILTER (?population > 1000000000) . " +
"}" +
"LIMIT 50 ";
Debug.WriteLine("queryString: [" + queryString + "]");
//Make a SELECT query against the Endpoint
SparqlResultSet results = endpoint.QueryWithResultSet(queryString);
foreach (SparqlResult result in results)
{
Debug.WriteLine(result.ToString());
}
}
I'm just learning SPARQL, so this may be a very basic question.
Many Thanks:)

You need to make sure that your code is compiled and run in Debug mode. If it is not, Debug.WriteLine() will have no effect, because calls to it are only compiled in when the DEBUG symbol is defined. The output you have provided is incomplete; for future reference, it is better to copy and paste the text into your question rather than posting a screenshot.
Since this appears to be a console application why not just use Console.WriteLine() instead?

Is there a common .NET regex that can be used to match the following numbers?

I would like to match the numbers 123456789 and 012 using only one regex in the following strings. I am not sure how to handle all the following scenarios with a single regex:
<one><num>123456789</num><code>012</code></one>
<two><code>012</code><num>123456789</num></two>
<three num="123456789" code="012" />
<four code="012" num="123456789" />
<five code="012"><num>123456789</num></five>
<six num="123456789"><code>012</code></six>
They also don't have to be on the same line like above, for example:
<seven>
<num>123456789</num>
<code>012</code>
</seven>

Parsing XML with regex is not a good idea. You can use XPath or LINQ to XML (XLinq); LINQ to XML is easier. You must reference System.Xml.Linq and System.Xml and add the corresponding using directives. I wrote the code here, not in Visual Studio, so there may be minor bugs...
// var xml = ** load xml string
var document = XDocument.Parse(xml);
foreach (var i in document.Root.Elements())
{
    // Prefer the attribute; fall back to the child element of the same name.
    // Casting a missing XAttribute or XElement to string yields null.
    var num = (string)i.Attribute("num") ?? (string)i.Element("num");
    var code = (string)i.Attribute("code") ?? (string)i.Element("code");
    Console.WriteLine("Num: {0}", num);
    Console.WriteLine("Code: {0}", code);
}

At a more abstract level, the problem is to parse either an attribute or a child node named num or code. Since C# already has libraries for parsing XML documents (and such solutions are acceptable according to your comments), it's more natural to take advantage of them. The following function returns the value of the specified attribute or child node.
static string ParseNode(XmlElement e, string AttributeOrNodeName)
{
if (e.HasAttribute(AttributeOrNodeName))
{
return e.GetAttribute(AttributeOrNodeName);
}
var node = e[AttributeOrNodeName];
if (node != null)
{
return node.InnerText;
}
throw new Exception("The input element doesn't have specified attribute or node.");
}
Test code might look like this:
var doc = new XmlDocument();
var xmlString = "<test><node><num>123456789</num><code>012</code></node>\r\n"
+ "<node><code>012</code><num>123456789</num></node>\r\n"
+ "<node num=\"123456789\" code=\"012\" />\r\n"
+ "<node code=\"012\" num=\"123456789\" />\r\n"
+ "<node code=\"012\"><num>123456789</num></node>\r\n"
+ "<node num=\"123456789\"><code>012</code></node>\r\n"
+ @"<node>
<num>123456789</num>
<code>012</code>
</node>
</test>";
doc.LoadXml(xmlString);
foreach (var num in doc.DocumentElement.ChildNodes.Cast<XmlElement>().Select(x => ParseNode(x, "num")))
{
Console.WriteLine(num);
}
Console.WriteLine();
foreach (var code in doc.DocumentElement.ChildNodes.Cast<XmlElement>().Select(x => ParseNode(x, "code")))
{
Console.WriteLine(code);
}
In my environment (.NET 4), the code captures all the num and code values.

This seems to be doing the trick:
new Regex(@"(?s)<(\w+)(?=.{0,30}(<num>\s*|num="")(\d+))(?=.{0,30}(<code>\s*|code="")(\d+)).*?(/>|</\1>)")
Groups 3 and 5 have "num" and "code" values respectively. It is also reasonably strict, as one of the main concerns when writing regex is to not capture something you don't want (capturing what you want is easy).
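A short usage sketch for that pattern, applied to one element at a time, might look like the following. The pattern itself is copied from the answer; the wrapper names (NumCodeRegexDemo, TryExtract) are made up here. Note that the .{0,30} lookaheads limit how far from the opening tag the values may appear, so heavily indented or longer input may not match.

```csharp
using System;
using System.Text.RegularExpressions;

static class NumCodeRegexDemo
{
    // The pattern from the answer above, unchanged.
    private static readonly Regex Pattern = new Regex(
        @"(?s)<(\w+)(?=.{0,30}(<num>\s*|num="")(\d+))(?=.{0,30}(<code>\s*|code="")(\d+)).*?(/>|</\1>)");

    // Extracts num (group 3) and code (group 5) from a single element.
    public static bool TryExtract(string xml, out string num, out string code)
    {
        Match m = Pattern.Match(xml);
        num = m.Success ? m.Groups[3].Value : null;
        code = m.Success ? m.Groups[5].Value : null;
        return m.Success;
    }

    static void Main()
    {
        string[] samples =
        {
            "<one><num>123456789</num><code>012</code></one>",
            "<four code=\"012\" num=\"123456789\" />",
            "<five code=\"012\"><num>123456789</num></five>",
            "<seven>\n<num>123456789</num>\n<code>012</code>\n</seven>"
        };

        foreach (string sample in samples)
        {
            string num, code;
            if (TryExtract(sample, out num, out code))
                Console.WriteLine("num=" + num + " code=" + code);
            else
                Console.WriteLine("no match");
        }
    }
}
```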
