C# Reproducing RSS feed - c#

I have made a program that scans RSS feeds. The same program also creates feeds from the items it has crawled. The generated feeds are not identical to the source feeds, but the items must be: the program copies them, so it is essential that what comes out is the same as what came in.
Now, there are occurrences where the input feeds contain elements with names like this:
<dc:creator>tomatoes</dc:creator>
Now, when I scan this it works perfectly. The element gets saved to the database and everything is jolly good.
When I try to write it out again to an RSS feed, using these lines of code (plus a bunch of foreach loops, ifs and so on)
StringBuilder sb = new StringBuilder();
XmlWriterSettings settings = new XmlWriterSettings();
settings.Indent = true;
settings.IndentChars = " ";
settings.NewLineOnAttributes = true;
XmlWriter feedWriter = XmlWriter.Create(sb, settings);
And this line for each element:
feedWriter.WriteElementString(keyAndValue[0], keyAndValue[1]);
I get this error message if I hit the example element above:
Invalid name character in 'dc:creator'. The ':' character, hexadecimal value 0x3A, cannot be included in a name.
Now, I have found a lot of articles where this error is mentioned. In almost all of them the asker is told that this is not correct XML and that they should stop writing the ':'. I, however, can't.
I found one example where you could use another overloaded method of XmlWriter, this one:
feedWriter.WriteElementString(prefixAndKey[0],prefixAndKey[1],"Namespace",keyAndValue[1]);
However this causes the element to look like this:
<dc:creator xmlns:something="NameSpace">tomatoes</dc:creator>
This is, as you all know, not the same as the one above, because it contains the xmlns bit.
I also tried another 'hack', which works as follows:
StringBuilder sb = new StringBuilder();
StringWriter stringWriter = new StringWriter(sb);
XmlTextWriter xmlTextWriter = new XmlTextWriter(stringWriter);
and
feedWriter.WriteElementString(keyAndValue[0], keyAndValue[1]);
This built and did not return errors, but when I opened it in Firefox, it displayed 0 items.
I then took a closer look at the feed I was getting these elements from, and it contained an rss element like this:
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
I am currently trying to replicate this.
Is there a reason why this might work? If so, why?
Is there an easier way to do this?
Do I have to add an xmlns:dc or xmlns:itunes or whatever declaration for all the different kinds of prefixes that are out there?
I need a simple and secure way of dealing with this, no matter what comes in the input RSS feeds.

A quick snippet with XDocument:
XNamespace dc = @"http://purl.org/dc/elements/1.1/";
XElement doc = new XElement("items",
    // Declare the dc prefix once, on the root element
    new XAttribute(XNamespace.Xmlns + "dc", dc),
    new XElement("item",
        new XElement("title", "test"),
        // dc + "creator" is the element "creator" in the dc namespace
        new XElement(dc + "creator", "tomatoes")));
Gives
<items xmlns:dc="http://purl.org/dc/elements/1.1/">
<item>
<title>test</title>
<dc:creator>tomatoes</dc:creator>
</item>
</items>
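The same idea should also work if you stay with XmlWriter, as in the question. What follows is only a sketch: the class name, the Build helper and the prefix-to-URI dictionary are made up for illustration, and it assumes the namespace declarations were captured from the root element of the source feed while scanning. The prefix is declared once on the rss element, and each prefixed element is then written with the four-argument WriteElementString overload, so the writer has no reason to repeat the xmlns attribute on every element.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Xml;

public static class RssNamespaceSketch
{
    // Hypothetical helper: "namespaces" maps prefix -> URI (assumed to be saved
    // from the source feed), "elements" holds the stored name/value pairs.
    public static string Build(IDictionary<string, string> namespaces,
                               IEnumerable<KeyValuePair<string, string>> elements)
    {
        var sb = new StringBuilder();
        var settings = new XmlWriterSettings { Indent = true, OmitXmlDeclaration = true };
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            writer.WriteStartElement("rss");
            writer.WriteAttributeString("version", "2.0");
            // Declare every known prefix once, on the root element.
            foreach (var ns in namespaces)
                writer.WriteAttributeString("xmlns", ns.Key, null, ns.Value);

            writer.WriteStartElement("channel");
            writer.WriteStartElement("item");
            foreach (var pair in elements)
            {
                string[] parts = pair.Key.Split(':');
                if (parts.Length == 2)
                    // prefix, local name, namespace URI, value
                    writer.WriteElementString(parts[0], parts[1], namespaces[parts[0]], pair.Value);
                else
                    writer.WriteElementString(pair.Key, pair.Value);
            }
            writer.WriteEndElement(); // item
            writer.WriteEndElement(); // channel
            writer.WriteEndElement(); // rss
        }
        return sb.ToString();
    }

    public static void Main()
    {
        var namespaces = new Dictionary<string, string>
        {
            { "dc", "http://purl.org/dc/elements/1.1/" }
        };
        var elements = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("title", "test"),
            new KeyValuePair<string, string>("dc:creator", "tomatoes")
        };
        Console.WriteLine(Build(namespaces, elements));
    }
}
```

Every prefix that appears in an element name does need a declaration somewhere above it, so copying the declarations from the source feed's root element is probably the least fragile way to cover dc, itunes and whatever else turns up.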

Related

XML adding "ns0" prefix to new XML doc

I am creating a somewhat complex XML file and I need to include the "ns0" prefix to each XmlElement.
Here are the opening lines of code:
var asnFile = new XmlDocument();
var dec = asnFile.CreateXmlDeclaration("1.0", "UTF-8",null);
asnFile.AppendChild(dec);
var advancedShippingNoticesNode = asnFile.CreateElement("AdvancedShippingNotices");
var advancedShippingNoticesNodeAttr = asnFile.CreateAttribute("xmlns");
advancedShippingNoticesNodeAttr.Value = "http://www.testschema.com/schema/AdvancedShippingNotices.xsd";
advancedShippingNoticesNode.Attributes.Append(advancedShippingNoticesNodeAttr);
asnFile.AppendChild(advancedShippingNoticesNode);
var asnIdNode = asnFile.CreateElement("ASNID");
asnIdNode.InnerText = "TestASN";
advancedShippingNoticesNode.AppendChild(asnIdNode);
I have tried adding a prefix in the following way but the prefix never shows up when opening the saved XML file.
advancedShippingNoticesNode.Prefix = "ns0";
I read here that I'm not able to add a prefix, but since I am creating the XmlDocument on the fly and not loading it from an existing file, I feel like this doesn't apply to my case.
I did try the sample solution in the question/answer linked above, but this XmlDocument has so much nesting that it's hard for me to translate that solution into a working solution for myself. I also feel like that is far too complex just to add a prefix.
Is there a simple way to add a prefix to a new XmlDocument?
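One possible approach, sketched under the assumption that every element should live in the schema namespace from the question: a prefix is only shorthand for a namespace URI, so instead of setting Prefix afterwards, each element can be created with the CreateElement(prefix, localName, namespaceURI) overload. The class name PrefixSketch and the Build method are made up for this example.

```csharp
using System;
using System.Xml;

public static class PrefixSketch
{
    public static XmlDocument Build()
    {
        // Namespace URI taken from the question; "ns0" is the wanted prefix.
        const string ns = "http://www.testschema.com/schema/AdvancedShippingNotices.xsd";

        var asnFile = new XmlDocument();
        asnFile.AppendChild(asnFile.CreateXmlDeclaration("1.0", "UTF-8", null));

        // Create the element with prefix, local name and namespace in one call.
        XmlElement root = asnFile.CreateElement("ns0", "AdvancedShippingNotices", ns);
        asnFile.AppendChild(root);

        // Child elements need the same prefix and namespace to be written as ns0:...
        XmlElement asnId = asnFile.CreateElement("ns0", "ASNID", ns);
        asnId.InnerText = "TestASN";
        root.AppendChild(asnId);

        return asnFile;
    }

    public static void Main()
    {
        Console.WriteLine(Build().OuterXml);
    }
}
```

With this, the manual xmlns attribute from the question should no longer be needed, since the xmlns:ns0 declaration is emitted for the root when the document is serialized.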

Getting specific XML Node from XmlWriter C#

I would like to know if there is any chance of getting specific XML node from XmlWriter.
I've got a method, which exports employees to XML document and has a structure, something like:
<header>
<agency>
...
<employees>
<employee>
....
</employee>
<employee>
...
</employee>
</employees>
...
In a for loop in this method, each employee is created and filled with elements. The whole method for exporting this XML has already been written.
Now, at the end of each iteration, I would like to create a string that contains the employee node, but I don't know how to get it from the writer that is being filled with data in the loop.
I've searched for a solution, but I didn't find anything that fits my case. I've considered something like a MemoryStream, but I think you need to pass the stream when creating the XmlWriter, which in my case first writes the header node etc.
Could you please help me with this? I don't want to create another writer or reader and write the nodes twice. I would like to find a better solution first.
Thank you!
EDIT: I'm pasting some of the code as you want:
XmlWriterSettings settings = new XmlWriterSettings();
settings.Indent = true;
settings.IndentChars = ("\t");
settings.OmitXmlDeclaration = false;
XmlWriter xmlWriter = XmlWriter.Create(filenameXml, settings);
xmlWriter.WriteStartElement("message");
xmlWriter.WriteStartElement("data");
xmlWriter.WriteStartElement("header");
xmlWriter.WriteElementString("agency", agencyNumber);
xmlWriter.WriteElementString("date", DateTime.Now.ToShortDateString());
xmlWriter.WriteEndElement();
xmlWriter.WriteStartElement("employees");
//Loading of employees from DB and some other work
for (int i = 0; i < employeesDs.RowCount(); i++)
{
xmlWriter.WriteStartElement("employee");
//A lot of work - xmlWriter.WriteElementString so many times or WriteStartElement (address, numbers, dates etc)
xmlWriter.WriteEndElement(); // employee
//HERE I would like to add new method which stores whole Employee element created in this iteration i to the string
//After that I need to store this string to DB to VARCHAR MAX, but I don't know how to get the string of the Employee
//from the writer in this part of the code
}
//another work - closing and flushing xmlwriter, disposing etc and exporting whole xml document
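One way this could be done without writing each node twice by hand, sketched here with made-up names (EmployeeExportSketch, WriteEmployee and the name/city fields are placeholders for the real employee data): build each employee as an XElement, send it to the existing XmlWriter with WriteTo, and take the string for the database from the same object. It does mean the body of the loop creates XElement children instead of calling WriteElementString directly.

```csharp
using System;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class EmployeeExportSketch
{
    // Hypothetical helper: builds one employee, writes it to the shared writer
    // and returns the same node as a string (e.g. for a VARCHAR(MAX) column).
    public static string WriteEmployee(XmlWriter xmlWriter, string name, string city)
    {
        var employee = new XElement("employee",
            new XElement("name", name),
            new XElement("address",
                new XElement("city", city)));

        employee.WriteTo(xmlWriter);                             // goes into the big document
        return employee.ToString(SaveOptions.DisableFormatting); // goes into the database
    }

    public static void Main()
    {
        var sb = new StringBuilder();
        var settings = new XmlWriterSettings { Indent = true, IndentChars = "\t" };
        using (XmlWriter xmlWriter = XmlWriter.Create(sb, settings))
        {
            xmlWriter.WriteStartElement("employees");
            string stored = WriteEmployee(xmlWriter, "Alice", "Prague");
            Console.WriteLine(stored);
            xmlWriter.WriteEndElement(); // employees
        }
        Console.WriteLine(sb.ToString());
    }
}
```

If the existing WriteElementString calls have to stay untouched, an alternative would be a second XmlWriter over a StringBuilder for each employee, whose output is then copied into the main writer, but that is closer to the "second writer" the question wanted to avoid.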

XmlWriter create SEPA XML

I'm trying to create a SEPA XML with XmlWriter. The created XML has to look like this example:
http://www.ebics.de/fileadmin/unsecured/anlage3/anlage3_pain008/pain_ex/pain.008.003.02.xml
This is my code so far:
public void generateSepaXml()
{
XmlWriterSettings settings = new XmlWriterSettings();
settings.Indent = true;
using (XmlWriter writer = XmlWriter.Create("C:\\Users\\Sybren\\Documents\\test.xml",settings))
{
String messageId = "Message ID";
writer.WriteStartDocument();
writer.WriteStartElement("Document"); //Document start
writer.WriteAttributeString("xsi",@"schemaLocation=""urn:iso:std:iso:20022:tech:xsd:pain.008.003.02 pain.008.003.02.xsd");
writer.WriteStartElement("CstmrDrctDbtInitn"); // CstmrDrctDbtInitn tag start
writer.WriteStartElement("GrpHdr"); //GrpHeader tag start
writer.WriteStartElement("MsgId",messageId); //Message tag start
writer.WriteEndElement(); //Message tag end
writer.WriteEndElement(); //GrpHdr end
writer.WriteEndElement(); //CstmrDrctDbtInitn tag end
writer.WriteEndElement(); //Document end
writer.WriteEndDocument();
}
}
The result of my code looks like this:
How can I set the attributes on the Document tag so that they match the Document tag in my example?
And for the message tag an xmlns is added when I open the XML in IE. The xmlns attribute isn't visible in the result image above (in Firefox). How do I remove it? And how can I set the text in MsgId the same as in the MsgId of the linked example XML? Maybe XmlWriter isn't the best option in my case? If so, what would be a better option?
Of course this is only a small part of the XML, but if I know how it works, I think I can do the rest myself.
Just a piece of well-meant advice (since I have some experience in the area of financial data exchange, especially with SEPA)... I would not try to implement this using such a low-level XML API; there are better ways to create complex XML documents than assembling them piece by piece with XmlWriter.
Since SEPA documents can become quite big, doing it the way you're trying will probably result in unmaintainable and error-prone spaghetti code. Instead, I would recommend looking into a model-driven approach (one model for each SEPA use case to implement) and then using a generator (consider an intermediate XML format that can be transformed using XSLT) or a text processor to produce the final output...
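For the literal questions about the Document attributes and the stray xmlns on MsgId, a small sketch may still help, whatever approach is chosen for the full document. Two things appear to go wrong in the posted code: WriteStartElement("MsgId", messageId) uses the two-argument overload, whose second argument is a namespace URI rather than the element text, and the xsi attribute is written without a namespace. The class and method names below are made up, the XML declaration is omitted because the sketch writes to a string, and only the namespace URIs are real (the pain.008.003.02 one from the question plus the standard XMLSchema-instance one).

```csharp
using System;
using System.Text;
using System.Xml;

public static class SepaHeaderSketch
{
    const string Pain = "urn:iso:std:iso:20022:tech:xsd:pain.008.003.02";
    const string Xsi = "http://www.w3.org/2001/XMLSchema-instance";

    public static string Build(string messageId)
    {
        var sb = new StringBuilder();
        var settings = new XmlWriterSettings { Indent = true, OmitXmlDeclaration = true };
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            // Default namespace for Document and everything below it.
            writer.WriteStartElement("Document", Pain);
            // Declare the xsi prefix, then write xsi:schemaLocation in that namespace.
            writer.WriteAttributeString("xmlns", "xsi", null, Xsi);
            writer.WriteAttributeString("xsi", "schemaLocation", Xsi,
                Pain + " pain.008.003.02.xsd");

            // Children are written in the same namespace so no xmlns="" appears.
            writer.WriteStartElement("CstmrDrctDbtInitn", Pain);
            writer.WriteStartElement("GrpHdr", Pain);
            // local name, namespace URI, text value
            writer.WriteElementString("MsgId", Pain, messageId);
            writer.WriteEndElement(); // GrpHdr
            writer.WriteEndElement(); // CstmrDrctDbtInitn
            writer.WriteEndElement(); // Document
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Build("Message ID"));
    }
}
```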

Read entire elements from an XML network stream

I am writing a network server in C# .NET 4.0. There is a network TCP/IP connection over which I can receive complete XML elements. They arrive regularly and I need to process them immediately. Each XML element is a complete XML document in itself, so it has an opening element, several sub-nodes and a closing element. There is no single root element for the entire stream. So when I open the connection, what I get is like this:
<status>
<x>123</x>
<y>456</y>
</status>
Then some time later it continues:
<status>
<x>234</x>
<y>567</y>
</status>
And so on. I need a way to read the complete XML string until a status element is complete. I don't want to do that with plain text reading methods because I don't know in what formatting the data arrives. I can in no way wait until the entire stream is finished, as is often described elsewhere. I have tried using the XmlReader class but its documentation is weird, the methods don't work out, the first element is lost and after sending the second element, an XmlException occurs because there are two root elements.
Try this:
var settings = new XmlReaderSettings
{
ConformanceLevel = ConformanceLevel.Fragment
};
using (var reader = XmlReader.Create(stream, settings))
{
while (!reader.EOF)
{
reader.MoveToContent();
var doc = XDocument.Load(reader.ReadSubtree());
Console.WriteLine("X={0}, Y={1}",
(int)doc.Root.Element("x"),
(int)doc.Root.Element("y"));
reader.ReadEndElement();
}
}
If you change the "conformance level" to "fragment", it might work with the XmlReader.
This is a (slightly modified) example from MSDN:
XmlReaderSettings settings = new XmlReaderSettings();
settings.ConformanceLevel = ConformanceLevel.Fragment;
XmlReader reader = XmlReader.Create(streamOfXmlFragments, settings);
You could use XElement.Load from LINQ to XML (new in .NET 3.5), which is geared more towards loading XML element fragments and can also read directly from a stream.
Have a look at System.Xml.Linq
I think that you may well still have to add some control logic so as to partition the messages you are receiving, but you may as well give it a go.
I'm not sure there's anything built-in that does that.
I'd open a string builder, fill it until I see a </status> tag, and then parse it using the ordinary XmlDocument.
Not substantially different from dtb's solution, but linqier
static IEnumerable<XDocument> GetDocs(Stream xmlStream)
{
var xmlSettings = new XmlReaderSettings() { ConformanceLevel = ConformanceLevel.Fragment };
using (var xmlReader = XmlReader.Create(xmlStream, xmlSettings))
{
var xmlPathNav = new XPathDocument(xmlReader).CreateNavigator();
foreach (var selectee in xmlPathNav.Select("/*").OfType<XPathNavigator>())
yield return XDocument.Load(selectee.ReadSubtree());
}
}
I ran into a similar problem in PowerShell, but the asker's question was in C#, so I've attempted to translate it (and verified that it works). Here is where I found the clue that got me over the last little bumps (". . .The way the XPathDocument does its magic is by creating a “transparent” root node, and holding the fragments from it. I say it’s transparent because your XPath queries can use the root node axis and still get properly resolved to the fragments. . .")
The fragments of XML I'm working with happen to be smallish. If you had bigger chunks, you'd probably want to look into XStreamingElement - it can add a lot of complexity but also greatly decrease memory usage when dealing with large volumes of XML.
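Another variant of the fragment-reader idea, given here only as a sketch: XNode.ReadFrom reads one complete element from the reader and leaves it positioned after the closing tag, so each status element can be handed over as soon as it has been parsed. The class and method names are invented, and the sample uses a StringReader instead of a real connection; on a live network stream the reader may still block while it waits for enough data to fill its buffer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class FragmentReaderSketch
{
    // Yields each top-level element of a stream of XML fragments, one at a time.
    public static IEnumerable<XElement> ReadElements(TextReader input)
    {
        var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
        using (XmlReader reader = XmlReader.Create(input, settings))
        {
            // MoveToContent skips whitespace between fragments and
            // returns XmlNodeType.None at the end of the input.
            while (reader.MoveToContent() == XmlNodeType.Element)
            {
                // Reads the whole element and advances past its end tag.
                yield return (XElement)XNode.ReadFrom(reader);
            }
        }
    }

    public static void Main()
    {
        string sample =
            "<status><x>123</x><y>456</y></status>\n" +
            "<status><x>234</x><y>567</y></status>";

        foreach (XElement status in ReadElements(new StringReader(sample)))
        {
            Console.WriteLine("X={0}, Y={1}",
                (int)status.Element("x"),
                (int)status.Element("y"));
        }
    }
}
```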

Is there a quick way to format an XmlDocument for display in C#?

I want to output my InnerXml property for display in a web page. I would like to see indentation of the various tags. Is there an easy way to do this?
Here's a little class that I put together some time ago to do exactly this.
It assumes that you're working with the XML in string format.
public static class FormatXML
{
public static string FormatXMLString(string sUnformattedXML)
{
XmlDocument xd = new XmlDocument();
xd.LoadXml(sUnformattedXML);
StringBuilder sb = new StringBuilder();
StringWriter sw = new StringWriter(sb);
XmlTextWriter xtw = null;
try
{
xtw = new XmlTextWriter(sw);
xtw.Formatting = Formatting.Indented;
xd.WriteTo(xtw);
}
finally
{
if(xtw!=null)
xtw.Close();
}
return sb.ToString();
}
}
You should be able to do this with code formatters. You would have to html encode the xml into the page first.
Google has a nice prettifyer that is capable of visualizing XML as well as several programming languages.
Basically, put your XML into a pre tag like this:
<pre class="prettyprint">
<link href="prettify.css" type="text/css" rel="stylesheet" />
<script type="text/javascript" src="prettify.js"></script>
</pre>
Use the XML Web Server Control to display the content of an xml document on a web page.
EDIT: You should pass the entire XmlDocument to the Document property of the XML Web Server Control to display it. You don't need to use the InnerXml property.
If indentation is your only concern and you can afford to launch an external process, you can process the XML file with the HTML Tidy console tool (~100K).
The code is:
tidy --input-xml y --output-xhtml y --indent "1" $(FilePath)
Then you can display the indented string on a web page once you have escaped the special characters.
It would also be easy to write a recursive function that produces such output: iterate over the nodes starting from the root and recurse into each child node, passing the current indentation as a parameter to each recursive call.
Check out the free Actipro CodeHighlighter for ASP.NET - it can neatly display XML and other formats.
Or are you more interested in actually formatting your XML? Then have a look at the XmlTextWriter - you can set its Formatting property (indented or not) and the indentation level, and then write out your XML to e.g. a MemoryStream and read it back from there into a string for display.
Marc
Use an XmlWriter created with XmlWriterSettings that have indentation enabled (or an XmlTextWriter with its Formatting property set to Indented). You can use a StringWriter as "temporary storage" if you want to write the resulting string onto screen.
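A minimal sketch of that last suggestion, with invented class and method names: the document is saved through an XmlWriter that has Indent switched on, a StringWriter collects the result, and the text is HTML-encoded before it goes into a pre tag. The XML declaration is omitted here because a StringWriter would otherwise report utf-16.

```csharp
using System;
using System.IO;
using System.Net;
using System.Xml;

public static class IndentSketch
{
    // Returns the document as an indented string.
    public static string Indent(XmlDocument doc)
    {
        var settings = new XmlWriterSettings
        {
            Indent = true,
            IndentChars = "  ",
            OmitXmlDeclaration = true
        };
        using (var sw = new StringWriter())
        {
            using (XmlWriter writer = XmlWriter.Create(sw, settings))
            {
                doc.Save(writer);
            }
            return sw.ToString();
        }
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<a><b>1</b><c><d>2</d></c></a>");

        // Encode the angle brackets so the browser shows the markup as text.
        string html = "<pre>" + WebUtility.HtmlEncode(Indent(doc)) + "</pre>";
        Console.WriteLine(html);
    }
}
```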
