How to extract a node from an XML string in C#

I have an XML string like the one below:
<?xml version="1.0" encoding="utf-8" ?>
<NodeA xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://www.air-watch.com/webapi/resources">
<AdditionalInfo>
<Links>
<Link xsi:type="link">
</Link>
</Links>
</AdditionalInfo>
<TotalResults>100</TotalResults>
<NodeB>
<NodeC>
<Id>1</Id>
<A>valueA</A>
<B>valueB</B>
</NodeC>
<NodeC>
<Id>2</Id>
<A>valueA</A>
<B>valueA</B>
</NodeC>
</NodeB>
</NodeA>
I want to extract NodeB and its child nodes (the NodeC elements). How can I do it? The solution below does something similar, but it needs the XML string to be loaded into an XDocument first:
XDocument doc=XDocument.Parse(xmlstr);
String response=doc.Elements("question")
.Where(x=>x.Attribute("id")==id)
.Single()
.Element("response")
.Value;
Is there a way to do it without loading it into a document, for example with some operation on the string object itself?

Why can't you use this?
XDocument doc=XDocument.Parse(xmlstr);
String response=doc.Elements("question")
.Where(x=>x.Attribute("id")==id)
.Single()
.Element("response")
.Value;
If you really cannot load it into a document, you can use regular expressions instead.
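For the XML in the question, note that NodeA declares a default namespace, so element names have to be qualified with it when querying. The following is only a minimal sketch, assuming that loading the string into an XDocument is acceptable after all; the XML is shortened, and the class name is made up for illustration:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class ExtractNodeB
{
    static void Main()
    {
        // Shortened version of the XML from the question.
        string xmlstr =
            "<NodeA xmlns=\"http://www.air-watch.com/webapi/resources\">" +
            "<TotalResults>100</TotalResults>" +
            "<NodeB>" +
            "<NodeC><Id>1</Id><A>valueA</A><B>valueB</B></NodeC>" +
            "<NodeC><Id>2</Id><A>valueA</A><B>valueA</B></NodeC>" +
            "</NodeB>" +
            "</NodeA>";

        XDocument doc = XDocument.Parse(xmlstr);

        // The default namespace declared on NodeA applies to every child element.
        XNamespace ns = "http://www.air-watch.com/webapi/resources";

        // NodeB is a direct child of the root element.
        XElement nodeB = doc.Root.Element(ns + "NodeB");

        Console.WriteLine(nodeB.Elements(ns + "NodeC").Count()); // 2

        foreach (XElement nodeC in nodeB.Elements(ns + "NodeC"))
        {
            // Prints the Id of each NodeC: 1, then 2.
            Console.WriteLine((string)nodeC.Element(ns + "Id"));
        }
    }
}
```

Calling nodeB.ToString() would give the NodeB markup back as a string, including its NodeC children.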

Related

Removing Attribute value based on value from an XML using VB.Net

I have an XML document as below:
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope
xmlns="http://com/uhg/uht/uhtSoapMsg_V1"
xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
<env:Header>
<uhtHeader
xmlns="http://com/uhg/uht/uhtHeader_V1">
<consumer>COMET</consumer>
<auditId></auditId>
<sendTimestamp>2020-09-03T18:15:40.942-05:00</sendTimestamp>
<environment>P</environment>
<businessService version="24">getClaimHistory</businessService>
<status>success</status>
</uhtHeader>
</env:Header>
<env:Body>
<srvcRspn
xmlns="http://com/uhg/uht/getClaimHistory_V24">
<srvcErrList arrayType="srvcErrOccur[1]" type="Array">
<srvcErrOccur>
<orig>Foundation</orig>
<rtnCd>00</rtnCd>
<explCd>000</explCd>
<desc></desc>
</srvcErrOccur>
</srvcErrList>
</srvcRspn>
</env:Body>
</env:Envelope>
I want to blank out every attribute value that contains "http", like below:
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope
xmlns=""
xmlns:env="">
<env:Header>
<uhtHeader
xmlns="">
<consumer>COMET</consumer>
<auditId></auditId>
<sendTimestamp>2020-09-03T18:15:40.942-05:00</sendTimestamp>
<environment>P</environment>
<businessService version="24">getClaimHistory</businessService>
<status>success</status>
</uhtHeader>
</env:Header>
<env:Body>
<srvcRspn
xmlns="">
<srvcErrList arrayType="srvcErrOccur[1]" type="Array">
<srvcErrOccur>
<orig>Foundation</orig>
<rtnCd>00</rtnCd>
<explCd>000</explCd>
<desc></desc>
</srvcErrOccur>
</srvcErrList>
</srvcRspn>
</env:Body>
</env:Envelope>
I have tried several ways, but none of them has worked for me. Can anyone suggest the fastest way to do it in VB.NET/C#?
The actual response is very large (at least 100,000 lines of XML), and iterating with For Each would take a good amount of time. Is there a parsing method or LINQ query that can do it faster?
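I found a way to do it using Regex, as below:
Return Regex.Replace(xmlDoc, "((?<=<|<\/)|(?<= ))[A-Za-z0-9]+:| xmlns(:[A-Za-z0-9]+)?="".*?""", "")
Note that this strips the namespace prefixes and the xmlns declarations entirely rather than leaving them empty. It serves my purpose completely. Thanks, Cleptus, for your quick reference.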
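Since the question asks about both VB.NET and C#, here is a sketch of the same Regex.Replace call in C#. The pattern is copied from the VB.NET call above; the input string and the class name are made up for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

class StripNamespaces
{
    // Same pattern as the VB.NET call above, written as a C# verbatim string
    // (a doubled "" stands for one literal quote).
    const string Pattern =
        @"((?<=<|<\/)|(?<= ))[A-Za-z0-9]+:| xmlns(:[A-Za-z0-9]+)?="".*?""";

    static void Main()
    {
        // Hypothetical, shortened input for illustration only.
        string xml =
            "<env:Envelope xmlns:env=\"http://schemas.xmlsoap.org/soap/envelope/\">" +
            "<env:Body /></env:Envelope>";

        // Removes "prefix:" after '<', '</' or a space, and removes whole
        // xmlns declarations that are preceded by a space.
        string result = Regex.Replace(xml, Pattern, "");

        Console.WriteLine(result); // <Envelope><Body /></Envelope>
    }
}
```

Because the pattern works on raw text, it would also strip any "word:" that follows a space inside element content, so it is worth checking against real data before relying on it.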

Reading data from XML using C#

I have to read the ordertext ("This is an example text") from this XML File:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<order id="">
<users>
<user id="123456" nick="nick" done="false" />
</users>
<machines>
<machine id="1234" sd="1234" ref="" done="false" />
</machines>
<todos />
<ordertexts>
<ordertext>This is an example text </ordertext>
</ordertexts>
</order>
My C# Code looks like this:
XmlDocument xDoc = new XmlDocument();
xDoc.Load(file);
XmlElement node = (XmlElement)xDoc.SelectSingleNode("/order/ordertexts/ordertext");
When I write the selected data in another XML File it looks like this:
<order>
<oldOrderText>System.Xml.XmlElement</oldOrderText>
</order>
What did I do wrong? Is the XPath incorrect?
I am a C# newbie, so I need all the help I can get!
Thanks in advance, geibi
What you're looking for is XmlElement.InnerText.
When you get the node using this:
XmlElement node = (XmlElement)xDoc.SelectSingleNode("/order/ordertexts/ordertext");
You still need to use this:
string neededText = node.InnerText;
to get the value of that node.
Suppose that you're writing the results in a console application. If you try to write the node variable, this way:
Console.WriteLine(node);
Because node is an XmlElement object rather than a string, its ToString method is called. XmlElement does not override ToString, so the call returns the type name, which is why your new XML contained System.Xml.XmlElement instead of the desired text.
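The difference can be seen in a small console sketch. The XML is a shortened copy of the order file, loaded from a string with LoadXml instead of from a file, and the class name is made up for illustration:

```csharp
using System;
using System.Xml;

class ReadOrderText
{
    static void Main()
    {
        // Shortened version of the order XML from the question.
        string xml =
            "<order id=\"\"><ordertexts>" +
            "<ordertext>This is an example text </ordertext>" +
            "</ordertexts></order>";

        XmlDocument xDoc = new XmlDocument();
        xDoc.LoadXml(xml);

        XmlElement node =
            (XmlElement)xDoc.SelectSingleNode("/order/ordertexts/ordertext");

        // Writing the element itself falls back to Object.ToString().
        Console.WriteLine(node); // System.Xml.XmlElement

        // InnerText returns the text content; Trim drops the trailing space.
        Console.WriteLine(node.InnerText.Trim()); // This is an example text

        // Copy the text, not the element object, into the new document.
        XmlDocument outDoc = new XmlDocument();
        XmlElement root = outDoc.CreateElement("order");
        XmlElement oldText = outDoc.CreateElement("oldOrderText");
        oldText.InnerText = node.InnerText;
        root.AppendChild(oldText);
        outDoc.AppendChild(root);
        Console.WriteLine(outDoc.OuterXml);
    }
}
```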

How to remove element from XML

I have the following XML file:
<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2" xmlns:gx="http://www.google.com/kml/ext/2.2" xmlns:kml="http://www.opengis.net/kml/2.2" xmlns:atom="http://www.w3.org/2005/Atom">
<Document>
<open>1</open>
<Placemark>
<name>L14A</name>
<description>ID:01F40BF0
PLACEMENT:Home Woods
RSSI:-82
</description>
<Style>
<IconStyle>
<Icon>
<href>http://chart.apis.google.com/chart?chst=d_map_pin_letter&amp;chld=3|0000CC|FFFFFF</href>
</Icon>
</IconStyle>
</Style>
<Point>
<coordinates>-73.16551208,44.71051217,0</coordinates>
</Point>
</Placemark>
</Document>
</kml>
The file is bigger than that, but this sample represents its structure. I'm trying to remove the <Style> element, but I can't find a way to get it right.
I have tried the method from the following question:
"How to remove an element from an xml using Xdocument when we have multiple elements with same name but different attributes"
The code is:
XDocument xdoc = XDocument.Load("kkk.kml");
xdoc.Descendants("Style").Remove();
xdoc.Save("kkk-mod.kml");
The Descendants collection is always empty.
Also, when I save the file, a "kml:" prefix is added to each of my elements (see below).
<kml:Placemark>
<kml:name>L14A</kml:name>
<kml:description>ID:01F40BF0
</kml:description>
<kml:Point>
<kml:coordinates>-73.200,44.500,0</kml:coordinates>
</kml:Point>
</kml:Placemark>
How can I get these two things right: the removal of the element, and the "kml:" prefix added in the final file?
You need to include the namespace in order to access the node. Based on the sample XML you posted, the namespace is http://www.opengis.net/kml/2.2, so something like this should get you going:
XDocument xdoc = XDocument.Load("kkk.kml");
XNamespace ns = "http://www.opengis.net/kml/2.2";
xdoc.Descendants(ns + "Style").Remove();
xdoc.Save("kkk-mod.kml");
If you want to remove the "kml" prefix from the modified document, you can use the following code snippet. This will remove all the namespaces from the document.
XDocument xdoc = XDocument.Load("kkk.kml");
XNamespace ns = "http://www.opengis.net/kml/2.2";
xdoc.Descendants(ns + "Style").Remove();
XElement newDoc = RemoveAllNamespaces(xdoc.Root);
// Save the namespace-free copy; saving xdoc would write the original tree.
newDoc.Save("kkk-mod.kml");
public static XElement RemoveAllNamespaces(XElement e)
{
return new XElement(e.Name.LocalName,
(from n in e.Nodes()
select ((n is XElement) ? RemoveAllNamespaces(n as XElement) : n)),
(e.HasAttributes) ?
(from a in e.Attributes()
where (!a.IsNamespaceDeclaration)
select new XAttribute(a.Name.LocalName, a.Value)) : null);
}
Taken from this SO answer.
The resulting XML file looks like this:
<?xml version="1.0" encoding="utf-8"?>
<kml>
<Document>
<open>1</open>
<Placemark>
<name>L14A</name>
<description>ID:01F40BF0
PLACEMENT:Home Woods
RSSI:-82
</description>
<Point>
<coordinates>-73.16551208,44.71051217,0</coordinates>
</Point>
</Placemark>
</Document>
</kml>
Alternatively, you can use XSLT, a language designed for restructuring XML, which requires no looping. XSLT is a declarative, special-purpose programming language (the same type as SQL) used to re-format, style, and re-structure XML documents for various end uses. Practically all general-purpose languages have XSLT processors available, including C#, Java, Python, PHP, Perl, and VB.
Below is a solution for future readers in which the XSLT script runs an identity transform to copy the entire document as is, and then applies an empty template to the <Style> element, thereby removing it:
XSLT script (save as .xsl or .xslt file)
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
xmlns="http://www.opengis.net/kml/2.2"
xmlns:gx="http://www.google.com/kml/ext/2.2"
xmlns:kml="http://www.opengis.net/kml/2.2"
xmlns:atom="http://www.w3.org/2005/Atom">
<xsl:output version="1.0" encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*"/>
<!-- Identity Transform -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<!-- Empty Template for Style Element -->
<xsl:template match="kml:Style"/>
</xsl:transform>
C# Script (see tutorial)
using System;
using System.Xml;
using System.Xml.Xsl;
namespace XSLTransformation
{
    class Class1
    {
        static void Main(string[] args)
        {
            // XslCompiledTransform replaces the obsolete XslTransform class.
            XslCompiledTransform myXslTransform = new XslCompiledTransform();
            // Compile the stylesheet, then transform the input file into the output file.
            myXslTransform.Load("XSLTScript.xsl");
            myXslTransform.Transform("InputXML.xml", "OutputXML.xml");
        }
    }
}

How to read a sitemap using VB.NET

I have been trying to open the following XML file in VB.NET using the Linq library.
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="http://wegotflash.com/sitemap.xsl"?>
<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://wegotflash.com</loc>
<lastmod>2012-02-19</lastmod>
<changefreq>daily</changefreq>
<priority>1</priority>
</url>
<url>
<loc>http://wegotflash.com/cat/1/shooter/newest-1</loc>
<lastmod>2012-02-19</lastmod>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url>
</urlset>
The code that I'm using works with normal XML files, but whenever I add the xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" attribute to the root node, nothing is getting returned by the application. Here is the VB.NET code that is reading the XML file:
Dim XMLFile As XDocument = XDocument.Load(TextBox1.Text)
For Each url As XElement In XMLFile.Descendants("url")
If url.HasElements Then
MessageBox.Show(url.Element("loc").Value)
End If
Next
That is because sitemap.xml has the default namespace http://www.sitemaps.org/schemas/sitemap/0.9. You should define an XNamespace and then use it in your queries, i.e.:
C# code:
XNamespace ns = "http://www.sitemaps.org/schemas/sitemap/0.9";
foreach (var element in XMLFile.Descendants(ns + "url"))
{
...
}
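To make the effect of the namespace concrete, here is a small self-contained sketch. It parses a shortened copy of the sitemap from a string instead of loading the file named in TextBox1, and the class name is made up for illustration:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class ReadSitemap
{
    static void Main()
    {
        // Shortened version of the sitemap from the question.
        string xml =
            "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">" +
            "<url><loc>http://wegotflash.com</loc></url>" +
            "<url><loc>http://wegotflash.com/cat/1/shooter/newest-1</loc></url>" +
            "</urlset>";

        XDocument sitemap = XDocument.Parse(xml);
        XNamespace ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

        // Without the namespace the name does not match any element.
        Console.WriteLine(sitemap.Descendants("url").Count());      // 0

        // With the namespace both <url> elements are found.
        Console.WriteLine(sitemap.Descendants(ns + "url").Count()); // 2

        foreach (XElement url in sitemap.Descendants(ns + "url"))
        {
            // Child elements need the same namespace.
            Console.WriteLine(url.Element(ns + "loc").Value);
        }
    }
}
```

The same applies in VB.NET: the child lookup url.Element("loc") also needs the namespace-qualified name.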

How to read nested XML using XDocument in Silverlight?

Hi, I currently have a nested XML document with the following structure:
<?xml version="1.0" encoding="utf-8" ?>
<Response>
<Result>
<item id="something" />
<price na="something" />
<?xml version="1.0" encoding="UTF-8" ?>
<DIDL-Lite xmlns="urn:schemas-upnp-org:metadata-1-0/DIDL-Lite/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:upnp="urn:schemas-upnp-org:metadata-1-0/upnp/" xmlns:dlna="urn:schemas-dlna-org:metadata-1-0/">
</Result>
<NumberReturned>10</NumberReturned>
<TotalMatches>10</TotalMatches>
</Response>
Any help on how to read this using XDocument or XmlReader would be really helpful.
Thanks,
Subhendu
XDocument and XmlReader are both XML parsers that expect well-formed XML as input. What you have shown is not well-formed XML: it contains a second XML declaration in the middle of the document. So the first task is to extract the nested XML, and because the input is not valid XML you cannot rely on any parser to do this job. You'll need to resort to string manipulation and/or regular expressions.
My suggestion would be to fix the procedure generating this invalid XML in the first place. Another suggestion is never to generate an XML file manually but to use an appropriate tool for this (XmlWriter, XDocument, ...).
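As a rough illustration of the second suggestion, the sketch below builds a similar response with XElement so that the nested DIDL-Lite content ends up as an ordinary child element, which XDocument can then read back. It assumes the producer of the XML can be changed; the element values and the class name are made up for illustration:

```csharp
using System;
using System.Xml.Linq;

class BuildNestedResponse
{
    static void Main()
    {
        // Namespace taken from the DIDL-Lite element in the question.
        XNamespace didl = "urn:schemas-upnp-org:metadata-1-0/DIDL-Lite/";

        // Build the document with XElement instead of concatenating strings.
        // The nested content becomes a child element, with no second
        // XML declaration in the middle of the document.
        XElement response = new XElement("Response",
            new XElement("Result",
                new XElement("item", new XAttribute("id", "something")),
                new XElement("price", new XAttribute("na", "something")),
                new XElement(didl + "DIDL-Lite")),
            new XElement("NumberReturned", 10),
            new XElement("TotalMatches", 10));

        // Round-trip through a string to show that the result parses cleanly.
        XDocument doc = XDocument.Parse(response.ToString());

        XElement nested = doc.Root.Element("Result").Element(didl + "DIDL-Lite");
        Console.WriteLine(nested.Name.LocalName);                  // DIDL-Lite
        Console.WriteLine((int)doc.Root.Element("TotalMatches"));  // 10
    }
}
```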
