c# linq to xml - c#

I have an xml string that I wish to traverse using LINQ to XML (I have never used this, so wish to learn). However when I try to use
XDocument xDoc = XDocument.Load(adminUsersXML);
var users = from result in xDoc.Descendants("Result")
select new
{
test = result.Element("USER_ID").Value
};
I get an error message saying "illegal characters in path". Reading up on it, it's because I cannot pass a standard string in this way. Is there a way to use LINQ to XML with a standard string?
Thanks.

My guess is that adminUsersXML is the XML itself rather than a path to a file containing XML. If that's the case, just use:
XDocument doc = XDocument.Parse(adminUsersXML);

As stated on MSDN, you must use the Parse method to create an XDocument from a string.

I think adminUsersXML is not a file path but a string containing XML, which should be converted to an XDocument with XDocument.Parse(adminUsersXML).
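Putting the answers together, a minimal sketch of the whole query might look like the following. The class and method names (AdminUsers, GetUserIds) are made up for illustration, and the XML shape (Result elements each containing a USER_ID) is assumed from the question's query rather than known.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class AdminUsers
{
    // Hypothetical helper: parses XML held in a string and returns every USER_ID value.
    public static List<string> GetUserIds(string adminUsersXML)
    {
        // Parse reads XML text; Load(string) would treat the argument as a path or URI.
        XDocument xDoc = XDocument.Parse(adminUsersXML);

        return xDoc.Descendants("Result")
                   // The (string) cast yields null instead of throwing when USER_ID is missing.
                   .Select(result => (string)result.Element("USER_ID"))
                   .Where(id => id != null)
                   .ToList();
    }
}
```

Casting the element to string instead of reading .Value is a design choice here: it avoids a NullReferenceException for a Result that has no USER_ID child.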

Related

combining xmls files in a loop and removing the nodes not needed

I'm trying to combine multiple XML files in a loop.
I put the first XML in a string, then append the next one to the same string.
I remove the XML declaration before appending, using
XmlDocument doc = new XmlDocument();
doc.LoadXml(currentdaydata);
var declarations = doc.ChildNodes.OfType<XmlNode>()
.Where(x => x.NodeType == XmlNodeType.XmlDeclaration)
.ToList();
declarations.ForEach(x => doc.RemoveChild(x));
Each XML response is in the format below, but I can't seem to remove the root element.
xml 1 = <response><movie>....</movie></response>
xml 2 = <response><movie>....</movie></response>
xml 3 = <response><movie>....</movie></response>
outputdata += xml(i);
outputdata =
<response><movie>....</movie></response><response><movie>....</movie></response><response><movie>....</movie></response>
I tried to remove it using a string replace, but no luck:
outputdata.Replace("</response><response>", "");
....
Don't try to manipulate XML as a string: sooner or later you'll get some input you can't handle, or you'll produce some output that your customers can't handle, and the questions to SO that result from this will keep us all busy for years. Always use a real XML parser, even for the simplest of jobs.
If you download an XQuery processor such as Saxon then you can do this as a one-liner:
<response>{$docs/response/*}</response>
where $docs is supplied as the sequence of parsed input documents.
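If pulling in an XQuery processor is not an option, the same idea can be sketched in C# with LINQ to XML. This is not the answerer's method, only a parser-based equivalent; ResponseMerger and Merge are illustrative names, and the response root element name is taken from the question's samples.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ResponseMerger
{
    // Hypothetical helper: merges the children of each <response> document
    // under one new <response> root.
    public static XElement Merge(IEnumerable<string> xmlResponses)
    {
        return new XElement("response",
            xmlResponses
                // Parse accepts a leading XML declaration, so nothing has to be stripped by hand.
                .Select(xml => XDocument.Parse(xml).Root)
                .Where(root => root != null)
                // Elements that already have a parent are copied when added to the new root.
                .SelectMany(root => root.Elements()));
    }
}
```

Because each input is parsed separately, a malformed response fails at its own Parse call instead of corrupting the combined output.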

XML String with no parent node to JSON with C#

I have an XML string that does not contain a parent node. This XML is a representation of a JSON request for an API. It seems pointless, but it is done this way to make the file easy for non-programmers to read. To convert the XML to JSON, pretty much everything I have seen says to load the string into an XmlDocument and then use the following to get the JSON.
string jsonText = JsonConvert.SerializeXmlNode(doc);
The problem I have here is that the XML is not really valid, and because of this it cannot be loaded into an XmlDocument. What I really want is to be able to do this.
string jsonText = JsonConvert.SerializeXmlNode(doc.InnerXml);
This doesn't work since InnerXml is a string and not an object. I have been able to get it to work by creating a root element and then using a substring to cut the resulting string, but this seems pointless. There has to be a better way to do this without adding XML only to remove it from the JSON afterwards. Is it possible to convert a piece of XML like the one below into JSON like the example below?
<rootnode>
<fielda>a</fielda>
<fieldb>b</fieldb>
</rootnode>
Converts to
{
"fielda": "a",
"fieldb": "b"
}
There's an overload of SerializeXmlNode that takes a boolean omitRootObject:
string jsonText = JsonConvert.SerializeXmlNode(doc, Formatting.None, true);
JsonConvert.SerializeXmlNode has an overload that you can use to omit the root.
string jsonText = JsonConvert.SerializeXmlNode(doc, Formatting.None, true);
The third parameter, omitRootObject, controls whether the root object is omitted.

"An error occurred while parsing EntityName" after grabbing content from valid XML

I am reading an XML string with XDocument
XmlReader reader = XmlReader.Create(new StringReader(xmltext));
reader.Read();
XDocument xdoc = XDocument.Load(reader);
Then I grab the content of some tags and put them within tags in a different string.
When I try to Load this string in the same way I did with the first, I get an error "An error occurred while parsing EntityName. Line 1, position 344.".
I think it should be parsed correctly since it has been parsed before, so I guess I am missing something here.
I am reading and copying the content of the first XML with (string)i.Element("field").
I am using .net 4
When I grab the content of the xml that I want to use for building another Xml string I use (string)i.Element("field") and this is converting my Xml into string. My next Xml Parsing does not recognize it as an Element anymore so I solved the problem by not using (string) before I read my element, just i.Element("field") and this works.
It sounds like you've got something like this:
<OriginalDocument>
<Foo>A &amp; B</Foo>
</OriginalDocument>
That A &amp; B represents the text A & B. So when you grab the text from the element, you'll get the string "A & B". If you then use that to build a new element like this:
string foo = "<Foo>" + fooText + "</Foo>";
then you'll end up with invalid XML like this:
<Foo>A & B</Foo>
Basically, you shouldn't be constructing XML in text form. It's not clear what you're really trying to achieve, but you can copy an element from one place to another pretty easily in XElement form; you shouldn't need to build a string and then reparse it.
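As a rough sketch of that advice: build the new element with the XElement constructor and let LINQ to XML escape the text when it serializes. The names ElementCopyDemo, Rebuild and NewDocument are invented for this example; Foo and OriginalDocument come from the sample above.

```csharp
using System;
using System.Xml.Linq;

public static class ElementCopyDemo
{
    // Hypothetical helper: copies the text of <Foo> into a new document
    // without ever building XML by string concatenation.
    public static string Rebuild(string originalXml)
    {
        XElement original = XElement.Parse(originalXml);

        // The cast returns the decoded text, e.g. "A & B".
        string fooText = (string)original.Element("Foo");

        // The constructor takes the raw text; escaping happens on output.
        XElement rebuilt = new XElement("NewDocument", new XElement("Foo", fooText));

        return rebuilt.ToString(SaveOptions.DisableFormatting);
    }
}
```

For the sample document, the ampersand comes back out as &amp; in the result, so the output parses again without any manual Replace calls.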
So after spending hours on this issue:
it turns out that if you have an unescaped ampersand ("&") or any other XML special character within the text of your XML string, parsing will always fail when you try to read the XML.
To solve this, replace the special characters in the text content (not in the markup itself) with their escaped forms, escaping the ampersand first so that the other entities are not escaped twice:
YourXmlString = YourXmlString.Replace("&", "&amp;").Replace("'", "&apos;").Replace("\"", "&quot;").Replace(">", "&gt;").Replace("<", "&lt;");

Reading Xml files with umlaut chars

I asked this question yesterday and got a reply.
Writing encoded values for umlauts
In the code the parse method works if it's a string like so:
XDocument xDoc = XDocument.Parse("<description>Top Shelf-ÖÄÜookcase</description>");
To pass the input xml file as string, I have to read it first. The read method will fail if there are umlauts in the input xml.
How do I get past that?
Tried both Load and Parse methods of XDocument.
Load:
Invalid character in the given encoding. Line 3, position 35.
Parse:
Data at the root level is invalid. Line 1, position 1.
Here is a sample xml after using CDATA:
<?xml version="1.0" encoding="utf-8"?>
<kal>
<description><![CDATA[Top Shelf-ÖÄÜookcase]]> </description>
</kal>
Change encoding to "iso-8859-1"
Have you tried wrapping the description data with a CDATA?
<description><![CDATA[Top Shelf-ÖÄÜookcase]]> </description>
Special characters don't particularly parse well in XML unless you wrap them with CDATA.
As Besi stated, you have to use the correct encoding of the XML file in order to handle the umlauts correctly.
Even though you said that the creation of the incoming XML file is not in your hands, you can still control the encoding used for parsing by using a dedicated StreamReader:
// create your XDocument
XDocument Doc;
// setup a StreamReader for your file, specifying the encoding you need
using (StreamReader Reader = new StreamReader(@"C:\your-file.xml", System.Text.Encoding.GetEncoding("ISO-8859-1")))
{
// PARSE the STRING that is RETURNED from the StreamReader.ReadToEnd()-method
Doc = XDocument.Parse(Reader.ReadToEnd());
}
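A slightly shorter variant of the same idea, offered as a sketch only: XDocument.Load also accepts a TextReader, so the StreamReader can be handed over directly instead of reading the whole file into a string first. Latin1Loader is an illustrative name, and ISO-8859-1 is assumed to be the file's real encoding, as in the answer above.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Linq;

public static class Latin1Loader
{
    // Hypothetical helper: loads an XML file whose bytes are ISO-8859-1.
    public static XDocument Load(string path)
    {
        // Assumption: the file really is ISO-8859-1; adjust if the producer uses another code page.
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");

        using (var reader = new StreamReader(path, latin1))
        {
            // The reader has already decoded the bytes to characters before the XML parser sees them.
            return XDocument.Load(reader);
        }
    }
}
```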

extract a value from a string using regex

I'm trying to extract a value from a string using regex. The string looks like this:
<faultcode><![CDATA[900015The new password is not long enough. PasswordMinimumLength is 6.]]></faultcode>
I am trying to display only the error message to the end user.
Since you probably want everything between <![CDATA[ and ]]>, this should fit:
<!\[CDATA\[(.+?)\]\]>
The only sensible thing is to load it into an XElement (or XDocument, XmlDocument) and extract the Value from the CDATA element.
XElement e = XElement.Parse(xmlSnippet);
string rawMsg = (e.FirstNode as XCData).Value;
string msg = rawMsg.Substring("900015".Length);
First, and foremost, using regex to parse XML / HTML is bad.
Now, by error message I assume you mean the text, not including the numbers. An expression like so would probably do the trick:
\<([^>]+)\><!\[CDATA\[\d*(.*)\]\]>\</\1\>
The error message will be in the second group. This will work with the sample that you have given, but I'd sooner use XDocument or XmlDocument to parse it. If you are using C#, there really isn't a good reason to not use either of those classes.
Updated to correspond with the question edit:
var xml = XElement.Parse(yourString);
var allText = xml.Value;
var stripLeadingNumbers = Regex.Match(xml.Value, @"^\d*(.*)").Groups[1].Value;
