How to read a sitemap using VB.NET - c#

I have been trying to open the following XML file in VB.NET using the Linq library.
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="http://wegotflash.com/sitemap.xsl"?>
<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://wegotflash.com</loc>
<lastmod>2012-02-19</lastmod>
<changefreq>daily</changefreq>
<priority>1</priority>
</url>
<url>
<loc>http://wegotflash.com/cat/1/shooter/newest-1</loc>
<lastmod>2012-02-19</lastmod>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url>
</urlset>
The code that I'm using works with normal XML files, but whenever I add the xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" attribute to the root node, nothing is getting returned by the application. Here is the VB.NET code that is reading the XML file:
Dim XMLFile As XDocument = XDocument.Load(TextBox1.Text)
For Each url As XElement In XMLFile.Descendants("url")
If url.HasElements Then
MessageBox.Show(url.Element("loc").Value)
End If
Next

That is because sitemap.xml has a default namespace, http://www.sitemaps.org/schemas/sitemap/0.9, so every element name in the file belongs to that namespace. You should define an XNamespace and use it in your queries, e.g.:
C# code:
XNamespace ns = "http://www.sitemaps.org/schemas/sitemap/0.9";
foreach (var element in XMLFile.Descendants(ns + "url"))
{
...
}
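A fuller sketch of the same idea, assuming a shortened inline copy of the sitemap in place of the file (the class name is made up for illustration). It also shows that the child lookup needs the namespace, not just the Descendants call:

```csharp
using System;
using System.Xml.Linq;

class SitemapReader
{
    static void Main()
    {
        // Inline sample standing in for the sitemap file from the question.
        const string xml = @"<urlset xmlns='http://www.sitemaps.org/schemas/sitemap/0.9'>
  <url><loc>http://wegotflash.com</loc><priority>1</priority></url>
  <url><loc>http://wegotflash.com/cat/1/shooter/newest-1</loc><priority>0.8</priority></url>
</urlset>";

        XDocument doc = XDocument.Parse(xml);
        XNamespace ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

        foreach (XElement url in doc.Descendants(ns + "url"))
        {
            // Child lookups need the namespace too; Element("loc") would return null here.
            XElement loc = url.Element(ns + "loc");
            if (loc != null)
            {
                Console.WriteLine(loc.Value);
            }
        }
    }
}
```

In VB.NET the equivalent is to declare `Dim ns As XNamespace = "http://www.sitemaps.org/schemas/sitemap/0.9"` and pass `ns + "url"` and `ns + "loc"` to Descendants and Element.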

Related

How to extract a node from xml string C#

I have an XML string like the one below:
<?xml version="1.0" encoding="utf-8" ?>
<NodeA xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://www.air-watch.com/webapi/resources">
<AdditionalInfo>
<Links>
<Link xsi:type="link">
</Link>
</Links>
</AdditionalInfo>
<TotalResults>100</TotalResults>
<NodeB>
<NodeC>
<Id>1</Id>
<A>valueA</A>
<B>valueB</B>
</NodeC>
<NodeC>
<Id>2</Id>
<A>valueA</A>
<B>valueA</B>
</NodeC>
</NodeB>
</NodeA>
I want to extract NodeB and its child nodes (the NodeC elements). How can I do it? The solution below performs a somewhat similar operation, but it needs the XML string to be loaded into an XDocument first:
XDocument doc=XDocument.Parse(xmlstr);
String response=doc.Elements("question")
.Where(x=>x.Attribute("id")==id)
.Single()
.Element("response")
.Value;
Is there a way to do it without having to load it in a doc? Some operation on string object itself.
Why can't you use this?
XDocument doc=XDocument.Parse(xmlstr);
String response=doc.Elements("question")
.Where(x=>x.Attribute("id")==id)
.Single()
.Element("response")
.Value;
Otherwise, you can use regular expressions.
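For reference, a minimal sketch of the XDocument route applied to the XML from the question rather than to the "question"/"response" snippet. It assumes a trimmed inline copy of that XML, and the class name is made up for illustration. Because the root declares a default namespace, the element names have to be qualified with it:

```csharp
using System;
using System.Xml.Linq;

class ExtractNodeB
{
    static void Main()
    {
        // Trimmed copy of the XML string from the question.
        const string xmlstr = @"<NodeA xmlns='http://www.air-watch.com/webapi/resources'>
  <TotalResults>100</TotalResults>
  <NodeB>
    <NodeC><Id>1</Id><A>valueA</A><B>valueB</B></NodeC>
    <NodeC><Id>2</Id><A>valueA</A><B>valueA</B></NodeC>
  </NodeB>
</NodeA>";

        XNamespace ns = "http://www.air-watch.com/webapi/resources";

        // NodeB is a direct child of the root element.
        XElement nodeB = XDocument.Parse(xmlstr).Root.Element(ns + "NodeB");

        // Walk the NodeC children and print each Id.
        foreach (XElement nodeC in nodeB.Elements(ns + "NodeC"))
        {
            Console.WriteLine((string)nodeC.Element(ns + "Id"));
        }
    }
}
```

Calling `nodeB.ToString()` would give NodeB and its children back as an XML string.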

How to remove element from XML

I have the following XML file:
<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2" xmlns:gx="http://www.google.com/kml/ext/2.2" xmlns:kml="http://www.opengis.net/kml/2.2" xmlns:atom="http://www.w3.org/2005/Atom">
<Document>
<open>1</open>
<Placemark>
<name>L14A</name>
<description>ID:01F40BF0
PLACEMENT:Home Woods
RSSI:-82
</description>
<Style>
<IconStyle>
<Icon>
<href>http://chart.apis.google.com/chart?chst=d_map_pin_letter&amp;chld=3|0000CC|FFFFFF</href>
</Icon>
</IconStyle>
</Style>
<Point>
<coordinates>-73.16551208,44.71051217,0</coordinates>
</Point>
</Placemark>
</Document>
</kml>
The file is bigger than that but it does represent the structure. I'm trying to remove the element <Style> but I can't find a way to get it right.
I have tried the method from the question "How to remove an element from an xml using Xdocument when we have multiple elements with same name but different attributes".
The code is:
XDocument xdoc = XDocument.Load("kkk.kml");
xdoc.Descendants("Style").Remove();
xdoc.Save("kkk-mod.kml");
The Descendants collection is always empty.
Also, when I save the file, "kml:" is prepended to each of my element names (see below).
<kml:Placemark>
<kml:name>L14A</kml:name>
<kml:description>ID:01F40BF0
</kml:description>
<kml:Point>
<kml:coordinates>-73.200,44.500,0</kml:coordinates>
</kml:Point>
</kml:Placemark>
How can I get both of these right: the removal of the element, and the "kml:" prefix added in the final file?
You need to include the namespace in order to access the node. Based on the sample XML you posted, the namespace is http://www.opengis.net/kml/2.2, so something like this should get you going:
XDocument xdoc = XDocument.Load("kkk.kml");
XNamespace ns = "http://www.opengis.net/kml/2.2";
xdoc.Descendants(ns + "Style").Remove();
xdoc.Save("kkk-mod.kml");
If you want to remove the "kml" prefix from the modified document, you can use the following code snippet. This will remove all the namespaces from the document.
XDocument xdoc = XDocument.Load("kkk.kml");
XNamespace ns = "http://www.opengis.net/kml/2.2";
xdoc.Descendants(ns + "Style").Remove();
XElement newDoc = RemoveAllNamespaces(xdoc.Root);
// Save the namespace-free copy, not the original document.
newDoc.Save("kkk-mod.kml");
public static XElement RemoveAllNamespaces(XElement e)
{
    return new XElement(e.Name.LocalName,
        (from n in e.Nodes()
         select ((n is XElement) ? RemoveAllNamespaces(n as XElement) : n)),
        (e.HasAttributes) ?
            (from a in e.Attributes()
             where (!a.IsNamespaceDeclaration)
             select new XAttribute(a.Name.LocalName, a.Value)) : null);
}
Taken from this SO answer.
The resulting XML file looks like this:
<?xml version="1.0" encoding="utf-8"?>
<kml>
<Document>
<open>1</open>
<Placemark>
<name>L14A</name>
<description>ID:01F40BF0
PLACEMENT:Home Woods
RSSI:-82
</description>
<Point>
<coordinates>-73.16551208,44.71051217,0</coordinates>
</Point>
</Placemark>
</Document>
</kml>
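If the goal is only to lose the "kml:" prefix while keeping the elements in the KML namespace, an alternative sketch is to remove just the redundant xmlns:kml declaration from the root. This assumes the prefix appears because the root declares the same namespace URI twice (once as the default, once as "kml") and the serializer picks the prefixed declaration. The inline document and class name are made up for illustration:

```csharp
using System;
using System.Xml.Linq;

class KmlPrefixDemo
{
    static void Main()
    {
        // Reduced stand-in for kkk.kml: the same URI is declared as default and as "kml".
        const string kml = @"<kml xmlns='http://www.opengis.net/kml/2.2' xmlns:kml='http://www.opengis.net/kml/2.2'>
  <Document><Placemark><name>L14A</name><Style><IconStyle/></Style></Placemark></Document>
</kml>";

        XDocument xdoc = XDocument.Parse(kml);
        XNamespace ns = "http://www.opengis.net/kml/2.2";
        xdoc.Descendants(ns + "Style").Remove();

        // Drop only the xmlns:kml declaration; the default xmlns declaration stays in place.
        XAttribute prefixDecl = xdoc.Root.Attribute(XNamespace.Xmlns + "kml");
        if (prefixDecl != null)
        {
            prefixDecl.Remove();
        }

        Console.WriteLine(xdoc);
    }
}
```

Unlike RemoveAllNamespaces, this keeps the document in the http://www.opengis.net/kml/2.2 namespace, which KML consumers may expect.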
Alternatively, you can use XSLT, a language designed for restructuring XML, which needs no explicit looping. XSLT is a declarative, special-purpose language (in the same category as SQL) used to reformat, style, and restructure XML documents. Most general-purpose languages provide XSLT processors, including C#, Java, Python, PHP, Perl, and VB.
Below is a solution for future readers: the XSLT script runs an identity transform to copy the entire document as is, and an empty template matching the <Style> element removes it:
XSLT script (save as .xsl or .xslt file)
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
xmlns="http://www.opengis.net/kml/2.2"
xmlns:gx="http://www.google.com/kml/ext/2.2"
xmlns:kml="http://www.opengis.net/kml/2.2"
xmlns:atom="http://www.w3.org/2005/Atom">
<xsl:output version="1.0" encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*"/>
<!-- Identity Transform -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<!-- Empty Template for Style Element -->
<xsl:template match="kml:Style"/>
</xsl:transform>
C# Script (see tutorial)
using System.Xml.Xsl;

namespace XSLTransformation
{
    class Class1
    {
        static void Main(string[] args)
        {
            // XslCompiledTransform replaces the obsolete XslTransform class.
            XslCompiledTransform transform = new XslCompiledTransform();
            transform.Load("XSLTScript.xsl");
            // Read InputXML.xml, apply the stylesheet, and write OutputXML.xml.
            transform.Transform("InputXML.xml", "OutputXML.xml");
        }
    }
}

Modify XSLT using C# Code

I am working in Visual Studio 2012 with C#.
I want to update the value of a node in an XSLT file.
The file abc.xslt looks like this:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs">
<xsl:output method="xml" encoding="UTF-8" indent="yes" />
<xsl:template match="/">
<DocumentElement>
<PositionMaster>
<Name>
<xsl:value-of select = "'Ryan'"/>
</Name>
</PositionMaster>
</DocumentElement>
</xsl:template>
</xsl:stylesheet>
The code I have written to modify this XSLT in C# is:
XmlDocument xslDoc = new XmlDocument();
xslDoc.Load("abc.xslt");
XmlNamespaceManager nsMgr = new XmlNamespaceManager(xslDoc.NameTable);
nsMgr.AddNamespace("xsl", "http://www.w3.org/1999/XSL/Transform");
I want to change the value of the Name field to David. What should I write next?
XmlElement valueOf = xslDoc.SelectSingleNode("/xsl:stylesheet/xsl:template[@match = '/']/DocumentElement/PositionMaster/Name/xsl:value-of", nsMgr) as XmlElement;
if (valueOf != null)
{
    valueOf.SetAttribute("select", "'David'");
    xslDoc.Save("new.xslt");
}
else
{
    // handle case here that element was not found
}
You seem to be going about this a very odd way. Why not just use a stylesheet parameter (a global xsl:param element)?
And if you do need to modify a source stylesheet, as you sometimes do, surely it makes more sense to use XSLT for the purpose?
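A minimal sketch of the stylesheet-parameter approach, assuming the stylesheet is changed to declare a global xsl:param and read it with `$name`. The parameter name `name`, the inline stylesheet, the dummy `<root/>` input, and the class name are all assumptions made for illustration:

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

class ParamDemo
{
    static void Main()
    {
        // Variant of abc.xslt that reads the name from a global parameter (default 'Ryan').
        const string xslt = @"<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='xml' indent='yes'/>
  <xsl:param name='name' select=""'Ryan'""/>
  <xsl:template match='/'>
    <DocumentElement><PositionMaster><Name><xsl:value-of select='$name'/></Name></PositionMaster></DocumentElement>
  </xsl:template>
</xsl:stylesheet>";

        var transform = new XslCompiledTransform();
        using (XmlReader stylesheet = XmlReader.Create(new StringReader(xslt)))
        {
            transform.Load(stylesheet);
        }

        // Supply the value at run time instead of editing the stylesheet file.
        var args = new XsltArgumentList();
        args.AddParam("name", "", "David");

        using (XmlReader input = XmlReader.Create(new StringReader("<root/>")))
        using (XmlWriter output = XmlWriter.Create(Console.Out, transform.OutputSettings))
        {
            transform.Transform(input, args, output);
        }
    }
}
```

With this design the stylesheet on disk never changes; each run passes whatever value it needs through XsltArgumentList.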

XDocument.Load losing Declaration

I have an XML template file like so:
<?xml version="1.0" encoding="us-ascii"?>
<AutomatedDispenseResponse>
<header shipmentNumber=""></header>
<items></items>
</AutomatedDispenseResponse>
When I use XDocument.Load, for some reason the
<?xml version="1.0" encoding="us-ascii"?>
is dropped.
How do I load the file into an XDocument without losing the declaration at the top?
I suspect it's not really dropping the declaration on load - it's when you're writing the document out that you're missing it. Here's a sample app which works for me:
using System;
using System.Xml.Linq;
class Test
{
static void Main()
{
XDocument doc = XDocument.Load("test.xml");
Console.WriteLine(doc.Declaration);
}
}
And test.xml:
<?xml version="1.0" encoding="us-ascii" ?>
<Foo>
<Bar />
</Foo>
Output:
<?xml version="1.0" encoding="us-ascii"?>
The declaration isn't shown by XDocument.ToString(), and may be replaced when you use XDocument.Save because you may be using something like a TextWriter which already knows which encoding it's using. If you save to a stream or just to a filename, it's preserved in my experience.
It is loaded. You can see it and access parts of it using:
XDocument.Parse(myDocument).Declaration
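A short sketch that puts the two observations side by side, assuming an inline copy of the template and a hypothetical output path `out.xml` (the class name is also made up):

```csharp
using System;
using System.Xml.Linq;

class DeclarationDemo
{
    static void Main()
    {
        const string xml =
            "<?xml version=\"1.0\" encoding=\"us-ascii\"?>" +
            "<AutomatedDispenseResponse><header shipmentNumber=\"\"></header><items></items></AutomatedDispenseResponse>";

        XDocument doc = XDocument.Parse(xml);

        // The declaration is kept on the document after loading.
        Console.WriteLine(doc.Declaration);

        // ToString() serializes the element tree only, without the declaration.
        Console.WriteLine(doc);

        // Saving to a file name (hypothetical path) writes the declaration back out.
        doc.Save("out.xml");
    }
}
```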

How to read nested XML using XDocument in Silverlight?

Hi, I currently have a nested XML document with the following structure:
<?xml version="1.0" encoding="utf-8" ?>
<Response>
<Result>
<item id="something" />
<price na="something" />
<?xml version="1.0" encoding="UTF-8" ?>
<DIDL-Lite xmlns="urn:schemas-upnp-org:metadata-1-0/DIDL-Lite/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:upnp="urn:schemas-upnp-org:metadata-1-0/upnp/" xmlns:dlna="urn:schemas-dlna-org:metadata-1-0/">
</Result>
<NumberReturned>10</NumberReturned>
<TotalMatches>10</TotalMatches>
</Response>
Any help on how to read this using XDocument or XmlReader would be really helpful.
Thanks,
Subhendu
XDocument and XmlReader are both XML parsers that expect well-formed XML as input. What you have shown is not a well-formed XML file: an XML declaration may only appear at the very start of a document, and here a second one appears inside the Result element. So the first task is to extract the nested XML, and because the input is not valid XML you cannot rely on any parser to do this job. You'll need to resort to string manipulation and/or regular expressions.
My suggestion would be to fix the procedure that generates this invalid XML in the first place. Another suggestion is to never generate an XML file manually but to use an appropriate tool for it (XmlWriter, XDocument, ...).
