Its been a while since i've needed to do this so i was looking at the old school methods of writing XMLDocument from code down to a File.
In my application i am writing alot to an XMLdocument with new elements and values and periodically saving it down to disc and also reading from the file and depending on the data i am doing things.
I am using methods like File.Exists(...) _xmldoc.LoadFile(..) etc...
Im wondering probably now a days there are better methods for this with regards
Parsing the XML
Checking its format for saving down
rather then the data being saved down being treated as text but as XML
maybe what i am doing is fine but its been a while and wondered if there are other methods :)
thanks
Well there's LINQ to XML which is a really nice XML API introduced in .NET 3.5. I don't think the existing XMLDocument API has changed much, beyond some nicer ways of creating XmlReaders and XmlWriters (XmlReader.Create/XmlWriter.Create).
I'm not sure what you really mean by your second and third bullets though. What exactly are you doing in code which feels awkward?
Have you looked at the Save method of your XmlDocument? It will save whatever is in your XmlDocument as a valid formatted file.
If your program is able to use the XmlDocument class, the XmlDocument class will be able to save your file. You won't need to worry about validating before saving, and you can give it whatever file extension you want. As to your third point... an XML file is really just a text file. It won't matter how the OS sees it.
I was a big fan of XmlDocument due to its facility to use but recently I got a huge memory problem with that class so I started to use XmlReader and XmlWriter.
XmlReader can be a little bit tricky to use if your Xml file is complex because you read the Xml file sequentially. In that case, the method ReadSubTree of XmlReader can be very useful because this method returns only the xml tree under the current node so you send the new xmlreader to a function to parse the subnode content and once it is done, you continue to the next node.
XmlReader Example:
string xmlcontent = "<BigXml/>";
using(StringReader strContent = new StringReader(xmlcontent))
{
using (XmlReader reader = XmlReader.Create(strContent))
{
while (reader.Read())
{
if (reader.Name == "SomeName" && reader.NodeType == XmlNodeType.Element)
{
//Send the XmlReader created by ReadSubTree to a function to read it.
ReadSubContentOfSomeName(reader.ReadSubtree());
}
}
}
}
XmlWriter Example:
StringBuilder builder = new StringBuilder();
using (XmlWriter writer = XmlWriter.Create(builder))
{
writer.WriteStartDocument();
writer.WriteStartElement("BigXml");
writer.WriteAttributeString("someAttribute", "42");
writer.WriteString("Some Inner Text");
//Write nodes under BigXml
writer.WriteStartElement("SomeName");
writer.WriteEndElement();
writer.WriteEndElement();
writer.WriteEndDocument();
}
Related
I'm trying to create a SEPA XML with XmlWriter. The created XML has to look like this example:
http://www.ebics.de/fileadmin/unsecured/anlage3/anlage3_pain008/pain_ex/pain.008.003.02.xml
This is my code so far:
public void generateSepaXml()
{
XmlWriterSettings settings = new XmlWriterSettings();
settings.Indent = true;
using (XmlWriter writer = XmlWriter.Create("C:\\Users\\Sybren\\Documents\\test.xml",settings))
{
String messageId = "Message ID";
writer.WriteStartDocument();
writer.WriteStartElement("Document"); //Document start
writer.WriteAttributeString("xsi",#"schemaLocation=""urn:iso:std:iso:20022:tech:xsd:pain.008.003.02 pain.008.003.02.xsd");
writer.WriteStartElement("CstmrDrctDbtInitn"); // CstmrDrctDbtInitn tag start
writer.WriteStartElement("GrpHdr"); //GrpHeader tag start
writer.WriteStartElement("MsgId",messageId); //Message tag start
writer.WriteEndElement(); //Message tag end
writer.WriteEndElement(); //GrpHdr end
writer.WriteEndElement(); //CstmrDrctDbtInitn tag end
writer.WriteEndElement(); //Document end
writer.WriteEndDocument();
}
}
The result of my code looks like this:
How can set the text in the Document tag the same to the Document tag in my example?
And for the message tag xmlns is added when I open the xml in IE. The xmlns tag isn't visible in the result image above (in Firefox). How to remove this tag? And how can I set the text in the MsgId the same to the MsgId in the linked example xml? Maybe XMLWriter isn't the best option in my case? If so what is another better option?
Ofcourse this is only a small part of the XML but if know how it works , I think I can do the rest myself.
Just a well-meant advise (since I have some experience in the area of financial data-exchange, especially with SEPA)... I would not try to implement that shit using such a low-level XML API; there´re better ways to create complex XML documents than assembling them piece-by-piece with XmlWriter.
Since SEPA documents can become quite big, achieving it the way you´re trying will probably result in unmaintainable and error-prone spaghetti code. Instead, I would recommend to investigate into a model-driven approach (one for each SEPA use-case to implement) and then use a generator (consider using an intermediate XML format that can be transformed using XSLT) or a text-processor to produce the final output...
I have an XML file. I want to convert it to JSON with C#. However, the XML file is over 20 GB.
I have tried to read XML with XmlReader, then append every node to a JSON file. I wrote the following code:
var path = #"c:\result.json";
TextWriter tw = new StreamWriter(path, true, Encoding.UTF8);
tw.Write("{\"A\":");
using (XmlTextReader xmlTextReader = new XmlTextReader("c:\\muslum.xml"))
{
while (xmlTextReader.Read())
{
if (xmlTextReader.Name == "A")
{
var xmlDoc = new XmlDocument();
var v = xmlTextReader.ReadInnerXml();
string json = Newtonsoft.Json.JsonConvert.SerializeXmlNode(xmlDoc, Newtonsoft.Json.Formatting.None, true);
tw.Write(json);
}
}
}
tw.Write("}");
tw.Close();
This code not working. I am getting error while converting json. Is there any best way to perform the conversion?
I would do it the following way
generate classes out of xsd schema using xsd.exe
open file and read top level ( i e your document level ) tags one by one (with XmlTextReader or XmlReader)
serialize each tag into object using generated classes
deserialize resulting object to json and save to whatever
consider saving in batches of 1000-2000 tags
you are right about serialize/deserialize being slow. still doing work in several threads, preferably using TPL will give you good speed. Also consider using json.net serializer, it is really a lot faster than standard ones ( it is standard for web.api though)
I can put some code snippets in the morning if you need them.
We are processing big ( 1-10gigs) files this manner in order to save data to sql server database.
Working with C# Visual Studio 2008, MVC1.
I'm creating an xml file by fetching one from a WebService and adding some nodes to it. Now I wanted to deserialize it to a class which is the model used to strongtyped the View.
First of all, I'm facing problems to achieve that without storing the xml in the filesystem cause I don't know how this serialize and deserialize work. I guess there's a way and it's a matter of time.
But, searching for the previous in the web I came accross LINQ to XML and now I doubt whether is better to use it.
The xml would be formed by some clients details, and basically I will use all of them.
Any hint?
Thanks!!
You can save a XElement to and from a MemoryStream (no need to save it to a file stream)
MemoryStream ms = new MemoryStream();
XmlWriter xw = XmlWriter.Create(ms);
document.Save(xw);
xw.Flush();
Then if you reset the position back to 0 you can deserialize it using the DataContractSerializer.
ms.Position = 0;
DataContractSerializer serializer = new DataContractSerializer(typeof(Model));
Model model = (model) serializer.ReadObject(ms);
There are other options for how serialization works, so if this is not what you have, let me know what you are using and I will help.
try this:
XmlSerializer xmls = new XmlSerializer(typeof(XElement));
FileStream FStream;
try
{
FStream = new FileStream(doctorsPath, FileMode.Open);
_Doctors = (XElement)xmls.Deserialize(FStream); FStream.Close();
FStream = new FileStream(patientsPath, FileMode.Open);
_Patients = (XElement)xmls.Deserialize(FStream)
FStream.Close();
FStream = new FileStream(treatmentsPath, FileMode.Open);
_Treatments = (XElement)xmls.Deserialize(FStream);
FStream.Close();
}
catch
{ }
This will load all of the XML files into our XElement variables. The try – catch block is a form of exception handling that ensures that if one of the functions in the try block throws an exception, the program will jump to the catch section where nothing will happen. When working with files, especially reading files, it is a good idea to work with try – catch.
LINQ to XML is an excellent feature. You can always rely on that. You don't need to write or read or data from file. You can specify either string or stream to the XDocument
There are enough ways to load an XML element to the XDocument object. See the appropriate Load functions. Once you load the content, you can easily add/remove the elements and later you can save to disk if you want.
I am writing a network server in C# .NET 4.0. There is a network TCP/IP connection over which I can receive complete XML elements. They arrive regularly and I need to process them immediately. Each XML element is a complete XML document in itself, so it has an opening element, several sub-nodes and a closing element. There is no single root element for the entire stream. So when I open the connection, what I get is like this:
<status>
<x>123</x>
<y>456</y>
</status>
Then some time later it continues:
<status>
<x>234</x>
<y>567</y>
</status>
And so on. I need a way to read the complete XML string until a status element is complete. I don't want to do that with plain text reading methods because I don't know in what formatting the data arrives. I can in no way wait until the entire stream is finished, as is often described elsewhere. I have tried using the XmlReader class but its documentation is weird, the methods don't work out, the first element is lost and after sending the second element, an XmlException occurs because there are two root elements.
Try this:
var settings = new XmlReaderSettings
{
ConformanceLevel = ConformanceLevel.Fragment
};
using (var reader = XmlReader.Create(stream, settings))
{
while (!reader.EOF)
{
reader.MoveToContent();
var doc = XDocument.Load(reader.ReadSubtree());
Console.WriteLine("X={0}, Y={1}",
(int)doc.Root.Element("x"),
(int)doc.Root.Element("y"));
reader.ReadEndElement();
}
}
If you change the "conformance level" to "fragment", it might work with the XmlReader.
This is a (slightly modified) example from MSDN:
XmlReaderSettings settings = new XmlReaderSettings();
settings.ConformanceLevel = ConformanceLevel.Fragment;
XmlReader reader = XmlReader.Create(streamOfXmlFragments, settings);
You could use XElement.Load which is meant more for streaming of Xml Element fragments that is new in .net 3.5 and also supports reading directly from a stream.
Have a look at System.Xml.Linq
I think that you may well still have to add some control logic so as to partition the messages you are receiving, but you may as well give it a go.
I'm not sure there's anything built-in that does that.
I'd open a string builder, fill it until I see a </status> tag, and then parse it using the ordinary XmlDocument.
Not substantially different from dtb's solution, but linqier
static IEnumerable<XDocument> GetDocs(Stream xmlStream)
{
var xmlSettings = new XmlReaderSettings() { ConformanceLevel = ConformanceLevel.Fragment };
using (var xmlReader = XmlReader.Create(xmlStream, xmlSettings))
{
var xmlPathNav = new XPathDocument(xmlReader).CreateNavigator();
foreach (var selectee in xmlPathNav.Select("/*").OfType<XPathNavigator>())
yield return XDocument.Load(selectee.ReadSubtree());
}
}
I ran into a similar problem in PowerShell, but the asker's question was in C#, so I've attempted to translate it (and verified that it works). Here is where I found the clue that got me over the last little bumps (". . .The way the XPathDocument does its magic is by creating a “transparent” root node, and holding the fragments from it. I say it’s transparent because your XPath queries can use the root node axis and still get properly resolved to the fragments. . .")
The fragments of XML I'm working with happen to be smallish. If you had bigger chunks, you'd probably want to look into XStreamingElement - it can add a lot of complexity but also greatly decrease memory usage when dealing with large volumes of XML.
In other words, is there a faster, more concise way of writing the following code:
//Create an object for performing XSTL transformations
XslCompiledTransform xslt = new XslCompiledTransform();
xslt.Load(HttpContext.Current.Server.MapPath("/xslt/" + xsltfile.Value), new XsltSettings(true, false), new XmlUrlResolver());
//Create a XmlReader object to read the XML we want to format
//XmlReader needs an input stream (StringReader)
StringReader sr = new StringReader(node.OuterXml);
XmlReader xr = XmlReader.Create(sr);
//Create a StringWriter object to capture the output from the XslCompiledTransform object
StringWriter sw = new StringWriter();
//Perform the transformation
xslt.Transform(xr, null, sw);
//Retrieve the transformed XML from the StringWriter object
string transformedXml = sw.ToString();
UPDATE (thanks for all the answers so far!):
Sorry for my vagueness: by "faster" and more "concise" I mean, am I including any unnecessary steps? Also, I would love a more "readable" solution if someone has one. I use this code in a small part of a web application I'm developing, and I'm about to move it to a large part of the application, so I want to make sure it's as neat as can be before I make the move.
Also, I get the XML from a static class (in a separate data access class library) which communicates with a database. I also manipulate the transformed XML string before shipping it off to a web page. I'm not sure if the input/response streams are still viable in this case.
One more thing: the XML and the XSLT supplied may change (users of the application can make changes to both), so I think I would be forced to compile each time.
Here's code I did for my ASP.NET, which is very similar to yours:
XDocument xDoc = XDocument.Load("output.xml");
XDocument transformedDoc = new XDocument();
using (XmlWriter writer = transformedDoc.CreateWriter())
{
XslCompiledTransform transform = new XslCompiledTransform();
transform.Load(XmlReader.Create(new StreamReader("books.xslt")));
transform.Transform(xDoc.CreateReader(), writer);
}
// now just output transformedDoc
If you have a large XSLT you can save the overhead of compiling it at runtime by compiling the XSLT into a .NET assembly when you build your project (e.g. as a post-build step). The compiler to do this is called xsltc.exe and is part of Visual Studio 2008.
In order to load such a pre-compiled XSLT you will need .NET Framework 2.0 SP1 or later installed on your server (the feature was introduced with SP1).
For an example check Anton Lapounov's blog article:
XSLTC — Compile XSLT to .NET Assembly
If pre-compiling the XSLT is not an option you should consider caching the XslCompiledTransform after it is loaded so that you don't have to compile it everytime you want to execute the transform.
Don't have time to do a full example, but some notes:
XML is not the same as System.String. Get it from the class library as XDocument or XmlDocument; finish with it as XDocument or XmlDocument.
You can use the ASP.NET Cache to store the compiled XSL, with a cache dependency on when the .XSLT file changes.
Don't convert the XML to a string then back to XML. Use node.CreateNavigator().ReadSubTree().
Similarly, use XPathNavigator.AppendChild to get an XmlWriter that will write into an XML Document.
Since you mention ASP.NET, the question is whether you can use the response stream directly for your transform output and whether you can use the input stream directly if it is a POST...
I'd rewrite the code like this:
string path = HttpContext.Current.Server.MapPath("/xslt/" + xsltfile.Value);
XmlReader reader = CreateXmlReader(node.OuterXml);
string transformedXml = Transform(path, reader);
private XmlReader CreateXmlReader(string text)
{
StringReader reader = new StringReader(text);
return XmlReader.Create(reader);
}
private string Transform(string xsltPath, XmlReader source)
{
XsltCompiledTransform transformer = new XsltCompiledTransform();
transformer.Load(
xsltPath,
new XsltSettings(true, false),
new XmlUrlResolver());
StringWriter writer = new StringWriter();
transformer.Transform(source, null, writer);
return writer.ToString();
}
The reason why I'd rewrite the code like this is because each block of code now has one and only one purpose. This makes it easier to read and understand. Additionally, the code requires less comments because a lot of information can be inferred from the names of the function and its parameters.