Reading XML file and indenting - c#

I've been having problems with the indentation of my XML files. Every time I load them from a certain server, the XML nodes are jumbled together on a few lines. I want to write a quick application to indent the nodes properly. That is:
<name>Bob</name>
<age>24</age>
<address>
  <stnum>2</stnum>
  <street>herp derp st</street>
</address>
Currently it's coming out as:
<name>bob</name><age>24</age>
<address>
<stnum>2</stnum><street>herp derp st</street>
</address>
Since I can't touch the internal program that gives me these XML files, and re-indenting them by hand would take ages, I want to write a quick program to do this for me. When I use the XmlDocument classes, I only get the information in the nodes, not the layout. So my question is: what's a good way to read the file and re-indent it? All the XML files have the same structure.
Thanks.

You can use the XmlTextWriter class, specifically with its Formatting property set to Formatting.Indented.
Here is some sample code I found in this blog post:
http://www.yetanotherchris.me/home/2009/9/9/formatting-xml-in-c.html
public static string FormatXml(string inputXml)
{
    XmlDocument document = new XmlDocument();
    document.Load(new StringReader(inputXml));
    StringBuilder builder = new StringBuilder();
    using (XmlTextWriter writer = new XmlTextWriter(new StringWriter(builder)))
    {
        writer.Formatting = Formatting.Indented;
        document.Save(writer);
    }
    return builder.ToString();
}

With LINQ to XML, it's basically a one-liner:
public static string Reformat(string xml)
{
    return XDocument.Parse(xml).ToString();
}
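Since the question is about files rather than strings, the same idea can be sketched directly against file paths. This is a minimal sketch, not a definitive implementation: the class and method names (XmlReindenter, ReindentFile) are made up for illustration, and it assumes each file is a well-formed document with a single root element. XDocument.Load drops insignificant whitespace by default, and XDocument.Save indents by default.

```csharp
using System.Xml.Linq;

public static class XmlReindenter
{
    // Hypothetical helper: loads a well-formed XML file and writes it back indented.
    public static void ReindentFile(string inputPath, string outputPath)
    {
        // Default LoadOptions discard the original (insignificant) whitespace.
        XDocument document = XDocument.Load(inputPath);

        // Default SaveOptions write one element per line with indentation.
        document.Save(outputPath);
    }
}
```

Calling XmlReindenter.ReindentFile("in.xml", "out.xml") for each downloaded file would be enough for a batch tool; the file names here are placeholders.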

Visual Studio or any decent XML editor will format (tabify) XML documents easily. There are also on-line tools available:
http://www.xmlformatter.net/
http://www.shell-tools.net/index.php?op=xml_format

If you are using Visual Studio, just open the XML file and press Ctrl+A, then Ctrl+K, Ctrl+F, and that's it for formatting.

You can also use XSLT:
// This XSLT copies everything, but indented
StringReader sr = new StringReader(xsl);
XmlReader reader = XmlReader.Create(sr);
// XslCompiledTransform replaces the obsolete XslTransform class
XslCompiledTransform xslt = new XslCompiledTransform();
xslt.Load(reader);
xslt.Transform(xmlFileUnidentedPath, xmlFileIdentedPath);
With xsl defined as follows (the XML declaration must be the very first thing in the string):
string xsl = @"<?xml version=""1.0""?>
<xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
  <xsl:output method=""xml"" omit-xml-declaration=""no"" indent=""yes"" encoding=""US-ASCII""/>
  <xsl:strip-space elements=""*""/>
  <xsl:template match=""/"">
    <xsl:copy-of select="".""/>
  </xsl:template>
</xsl:stylesheet>";

Related

Xslt and c#: Transform the whole input xml to string in output xml

I'm trying to transform the whole input XML to a string in the output XML.
I'm almost there: I have managed to get all the content into the string element, but I'm missing the XML declaration. I need it because of the charset information.
Does anyone have an idea how to manage this?
I currently use this C# method to do the work:
public static string ConvertNodeToXmlString(XPathNodeIterator node)
{
    node.MoveNext();
    return node.Current.OuterXml;
}
and it's called from XSLT:
<xsl:variable name="result" xmlns:myScriptPrefix="http://HelperClass" select="myScriptPrefix:ConvertNodeToXmlString(.)" />
All help is much appreciated!
Well, which encoding do you want? You could use XPathNavigator.WriteSubtree: http://msdn.microsoft.com/en-us/library/system.xml.xpath.xpathnavigator.writesubtree%28v=vs.110%29.aspx
node.MoveNext();
using (StringWriter sw = new StringWriter())
{
    using (XmlWriter xw = XmlWriter.Create(sw))
    {
        node.Current.WriteSubtree(xw);
    }
    return sw.ToString();
}
but as .NET Strings are UTF-16 encoded you might get <?xml version="1.0" encoding="UTF-16"?>.
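If the declaration should name a different encoding, one option is to write to a stream with an explicit encoding and decode the bytes afterwards. The following is a sketch under assumptions: the helper name (ToXmlStringWithUtf8Declaration) is invented here, UTF-8 is only an example choice, and the returned .NET string is of course still UTF-16 in memory, so only the declaration text changes.

```csharp
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.XPath;

public static class SubtreeSerializer
{
    // Hypothetical helper: serializes the subtree so that the XML declaration
    // names UTF-8 instead of the UTF-16 that a StringWriter would report.
    public static string ToXmlStringWithUtf8Declaration(XPathNavigator node)
    {
        var settings = new XmlWriterSettings
        {
            // 'false' suppresses the byte order mark at the start of the output.
            Encoding = new UTF8Encoding(false)
        };

        using (var stream = new MemoryStream())
        {
            using (XmlWriter writer = XmlWriter.Create(stream, settings))
            {
                node.WriteSubtree(writer);
            }
            // Decode with the same encoding that was used for writing.
            return settings.Encoding.GetString(stream.ToArray());
        }
    }
}
```

From the extension method in the question this would be called as ToXmlStringWithUtf8Declaration(node.Current) after node.MoveNext().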

need to add comments in an existing xml document

I need to add comments to an existing XML document. A sample XML is shown below; I need to write the code in C#. XML serialization was used to generate this XML.
Any help would be great.
Thanks in advance.
<?xml version="1.0" encoding="utf-8"?>
<Person>
  <Name>Job</Name>
  <Address>10dcalp</Address>
  <Age>12</Age>
</Person>
Try it like this:
string input = @"<?xml version=""1.0"" encoding=""utf-8""?><Person><Name>Job</Name><Address>10dcalp</Address><Age>12</Age></Person>";
XDocument doc = XDocument.Parse(input);
XElement age = doc.Root.Element("Age");
XComment comm = new XComment("This is comment before Age");
age.AddBeforeSelf(comm);
This code parses the document, finds the element named "Age", which is expected to be directly under the root element ("Person"), and adds a comment before it.
You can also use XmlWriter to write the comment in the following way:
MemoryStream stream = new MemoryStream();
XmlWriter writer = XmlWriter.Create(stream);
writer.WriteStartDocument();
writer.WriteComment("Add comment here");
Now pass this XmlWriter instance to your serializer, so that the serialized object is written after the comment.
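Put together, the XmlWriter approach might look like the sketch below. The Person class is an assumption modeled on the sample XML in the question, and the class and method names (CommentedSerializer, Serialize) are invented for illustration. Note that the comment ends up before the root element, not inside it; for a comment next to a specific child element, the XDocument approach above is the better fit.

```csharp
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

// Assumed shape of the serialized type, based on the sample XML.
public class Person
{
    public string Name { get; set; }
    public string Address { get; set; }
    public int Age { get; set; }
}

public static class CommentedSerializer
{
    // Hypothetical helper: writes a comment, then lets XmlSerializer write the object.
    public static string Serialize(Person person, string comment)
    {
        var settings = new XmlWriterSettings
        {
            Indent = true,
            Encoding = new UTF8Encoding(false) // no byte order mark
        };

        using (var stream = new MemoryStream())
        {
            using (XmlWriter writer = XmlWriter.Create(stream, settings))
            {
                writer.WriteStartDocument();
                writer.WriteComment(comment);
                new XmlSerializer(typeof(Person)).Serialize(writer, person);
            }
            return settings.Encoding.GetString(stream.ToArray());
        }
    }
}
```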

Generating XSL block in C#

I am trying to generate the following XSL block in my C# application. Can anyone tell me how to do it?
<XSL-Script xmlns:xsl="http://www.w3.org/......">
  <xsl:value-of select="$VAR"/>
</XSL-Script>
I tried to use the regular C# XML classes, but they remove the xsl: prefix from the tag name, because they treat xsl: as a namespace prefix. They also don't let me use "$" in front of VAR in the value of the "select" attribute.
Thanks a lot.
Here is a simple C# program that "generates" a complete XSLT stylesheet and then performs this transformation on a "generated" XML document and outputs the result of the transformation to a file:
using System.IO;
using System.Xml;
using System.Xml.Xsl;

class testTransform
{
    static void Main(string[] args)
    {
        string xslt =
@"<xsl:stylesheet version='1.0'
 xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:variable name='vX' select='1'/>
  <xsl:template match='/'>
    <xsl:value-of select='$vX'/>
  </xsl:template>
</xsl:stylesheet>";
        string xml = @"<t/>";

        XmlDocument xmlDoc = new XmlDocument();
        xmlDoc.LoadXml(xml);
        XmlDocument xslDoc = new XmlDocument();
        xslDoc.LoadXml(xslt);

        XslCompiledTransform xslTrans = new XslCompiledTransform();
        xslTrans.Load(xslDoc);

        // Dispose the writer so the output is flushed to the file
        using (StreamWriter output = new StreamWriter("output.txt"))
        {
            xslTrans.Transform(xmlDoc, null, output);
        }
    }
}
When this application is built and executed, it creates a file named "output.txt" whose content is the expected, correct result of the dynamically generated XSLT transformation:
<?xml version="1.0" encoding="utf-8"?>1
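For the block from the question itself, a namespace-aware API also works as long as the element is created in the XSLT namespace and the prefix is declared on the root. Below is a minimal LINQ to XML sketch; the class and method names (XslBlockBuilder, Build) are made up, and because the namespace URI in the question is elided, this assumes the standard XSLT namespace used in the stylesheet above. The "$" in the attribute value needs no special handling because it is ordinary attribute text.

```csharp
using System.Xml.Linq;

public static class XslBlockBuilder
{
    // Assumption: the standard XSLT 1.0 namespace, as in the stylesheet above.
    private static readonly XNamespace Xsl = "http://www.w3.org/1999/XSL/Transform";

    // Hypothetical helper: builds <XSL-Script> containing one xsl:value-of element.
    public static string Build(string variableName)
    {
        var script = new XElement("XSL-Script",
            // Declaring the prefix makes the serializer emit "xsl:" on child elements.
            new XAttribute(XNamespace.Xmlns + "xsl", Xsl),
            new XElement(Xsl + "value-of",
                new XAttribute("select", "$" + variableName)));

        return script.ToString(SaveOptions.DisableFormatting);
    }
}
```

XslBlockBuilder.Build("VAR") returns the block as a single-line string; dropping SaveOptions.DisableFormatting gives indented output instead.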

How to read processing instruction from an XML file using .NET 3.5

How can I check whether an XML file has a processing instruction?
Example:
<?xml-stylesheet type="text/xsl" href="Sample.xsl"?>
<Root>
  <Child/>
</Root>
I need to read the processing instruction
<?xml-stylesheet type="text/xsl" href="Sample.xsl"?>
from the XML file.
Please help me to do this.
How about:
XmlProcessingInstruction instruction = doc.SelectSingleNode("processing-instruction('xml-stylesheet')") as XmlProcessingInstruction;
You can use the FirstChild property of the XmlDocument class together with the XmlProcessingInstruction class:
XmlDocument doc = new XmlDocument();
doc.Load("example.xml");
if (doc.FirstChild is XmlProcessingInstruction)
{
    XmlProcessingInstruction processInfo = (XmlProcessingInstruction) doc.FirstChild;
    Console.WriteLine(processInfo.Data);
    Console.WriteLine(processInfo.Name);
    Console.WriteLine(processInfo.Target);
    Console.WriteLine(processInfo.Value);
}
Parse the Value or Data property to get the individual values, such as type and href.
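One way to do that parsing, sketched here under assumptions, is to wrap the instruction's data in a throwaway element and read it as ordinary attributes. The helper name (GetPseudoAttribute) and the wrapper element name "pi" are invented for illustration, and this only works when the data is written as well-formed name="value" pairs, as it is for xml-stylesheet.

```csharp
using System.Xml;
using System.Xml.Linq;

public static class StylesheetInstruction
{
    // Hypothetical helper: returns the value of a pseudo-attribute such as
    // "href" or "type", or null when the instruction does not contain it.
    public static string GetPseudoAttribute(XmlProcessingInstruction instruction, string name)
    {
        // Data is e.g.: type="text/xsl" href="Sample.xsl"
        // Wrapping it in a dummy element lets the XML parser split the pairs.
        XElement wrapper = XElement.Parse("<pi " + instruction.Data + " />");
        XAttribute attribute = wrapper.Attribute(name);
        return attribute == null ? null : attribute.Value;
    }
}
```

With the sample file from the question, GetPseudoAttribute(processInfo, "href") would give the stylesheet path.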
How about letting the compiler do more of the work for you (this needs using System.Linq):
XmlDocument Doc = new XmlDocument();
Doc.Load(openFileDialog1.FileName);
XmlProcessingInstruction StyleReference =
    Doc.OfType<XmlProcessingInstruction>().Where(x => x.Name == "xml-stylesheet").FirstOrDefault();

What's the proper way to transform with XSL without HTML encoding my final output?

So, I am working with .NET. I have an XSL file and an XslTransform object in C# that reads in the XSL file and transforms a piece of XML data (manufactured in-house) into HTML.
I notice that my final output has < and > automatically encoded into &lt; and &gt;. Is there any way I can prevent that from happening? Sometimes I need to bold or italicize my text, but it gets unintentionally sanitized.
Your XSL file should:
declare an output method of html
exclude from the result all namespace prefixes used in the XSLT
i.e.
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt"
    exclude-result-prefixes="xsl msxsl">
  <xsl:output method="html" indent="no" omit-xml-declaration="yes"/>
  <!-- lots -->
</xsl:stylesheet>
And you should ideally use the overloads that accept either a TextWriter or a Stream (not an XmlWriter), i.e. something like:
StringBuilder sb = new StringBuilder();
using (XmlReader reader = XmlReader.Create(source))
using (TextWriter writer = new StringWriter(sb))
{
    XslCompiledTransform xslt = new XslCompiledTransform();
    xslt.Load("Foo.xslt"); // in reality, you'd want to cache this
    xslt.Transform(reader, options.XsltOptions, writer);
}
string html = sb.ToString();
In the XSLT, if you really want standalone < / > characters (i.e. you want the output to be malformed for some reason), then you need to disable output escaping:
<xsl:text disable-output-escaping="yes">
  Your malformed text here
</xsl:text>
However, in general it is correct to escape the characters.
I have used this in the past to transform XML documents into HTML strings, which is what you need.
public static string TransformXMLDocFromFileHTMLString(string orgXML, string transformFilePath)
{
    System.Xml.XmlDocument orgDoc = new System.Xml.XmlDocument();
    orgDoc.LoadXml(orgXML);
    XmlNode transNode = orgDoc.SelectSingleNode("/");
    System.Text.StringBuilder sb = new System.Text.StringBuilder();
    XmlWriterSettings settings = new XmlWriterSettings();
    settings.ConformanceLevel = ConformanceLevel.Auto;
    System.Xml.Xsl.XslCompiledTransform trans = new System.Xml.Xsl.XslCompiledTransform();
    trans.Load(transformFilePath);
    // Dispose the writer so everything is flushed into sb before it is read
    using (XmlWriter writer = XmlWriter.Create(sb, settings))
    {
        trans.Transform(transNode, writer);
    }
    return sb.ToString();
}
(XslTransform is deprecated, according to MSDN. They recommend you switch to XslCompiledTransform.)
Can you post an example of the input/output?
