Finding the namespace from an XML stream in C#

I have an app that continuously receives an XML stream and uses it to process some information. So far I had only one namespace for all the streams, and I handled it easily like this:
doc = new XPathDocument(ds + "/probe");
navigator = doc.CreateNavigator();
ns = new XmlNamespaceManager(navigator.NameTable);
ns.AddNamespace("m", "urn:namsp.org:namSpDev:1.1");
nodes = navigator.Select("//m:DataItem", ns);
while (nodes.MoveNext())
{
node = nodes.Current;
}
But now I have a problem. There is another stream that has the namespace
"urn:namsp.org:namSpDev:1.2"
So in my application I have to check the stream to see which namespace it uses, and only then can I add the namespace using
ns.AddNamespace("m", "urn:namsp.org:namSpDev:1.1");
How should I do this?
I tried calling doc.ToString() and using .Contains() to check whether either namespace is present, but it doesn't work.

These links may be useful:
Detecting Xml namespace fast
Parsing XML with elements containing colon / namespace
How to Select XML Nodes with XML Namespaces from an XmlDocument?

What I finally did was retrieve the XML stream and convert it into a string. Then, using
string.Contains("xmlns")
I split the tag and used the tag identifier to get the value of the namespace. This works for me because the namespaces in the streams I use do not differ much.
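A less fragile alternative to string searching, sketched under the assumption that the namespace of interest is the one on the document's root element (the sample streams are not shown, so this is a guess): read NamespaceURI from the root element and register that value under the "m" prefix. The class and method names below are made up for illustration.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;

public static class NamespaceProbe
{
    // Returns the namespace URI of the root element, or "" if it has none.
    public static string GetRootNamespace(XPathNavigator navigator)
    {
        XPathNavigator root = navigator.Clone();
        root.MoveToRoot();
        // Moving by node type skips comments and processing instructions
        // that may precede the root element.
        return root.MoveToChild(XPathNodeType.Element) ? root.NamespaceURI : string.Empty;
    }

    // Counts DataItem elements in whatever namespace the root element uses.
    // Assumes the root element actually declares a namespace.
    public static int CountDataItems(string xml)
    {
        XPathDocument doc = new XPathDocument(new StringReader(xml));
        XPathNavigator navigator = doc.CreateNavigator();
        XmlNamespaceManager ns = new XmlNamespaceManager(navigator.NameTable);
        ns.AddNamespace("m", GetRootNamespace(navigator));
        return navigator.Select("//m:DataItem", ns).Count;
    }
}
```

With this, the same "//m:DataItem" query should work for both the 1.1 and the 1.2 streams without checking the version by hand.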

Related

Generate XML in C# Using Specific XML Namespace

Target XML I am trying to achieve:
<Row><Data ss:Type="String">value</Data></Row>
Output I am currently getting:
<Row xmlns=""><Data Type="Number">0</Data></Row>
I am trying to create that target XML code using the System.Xml library, but the namespace prefix is not showing up when I create new elements. Snippet of the code that is generating the above output:
XmlElement eRow = xDoc.CreateElement("Row");
XmlElement eData = xDoc.CreateElement("Data");
XmlAttribute xAt = xDoc.CreateAttribute("ss", "Type", null);
xAt.Value = "Number";
eData.Attributes.Append(xAt);
eData.InnerText = "0";
eRow.AppendChild(eData);
I am trying to append this XML to a file that already exists. I have loaded the file as
XmlDocument xTemp = new XmlDocument();
xTemp.Load(templatePath);
and there are already namespaces in DocumentElement.Attributes that have already declared the prefix that I want to use: <Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">. Essentially, I am trying to get the "ss" prefix to show up before "Type" like in the target I provided above. Additionally, the output is displaying xmlns="" as an attribute in the "Row" tag, something that I never added. I assume these are both issues with the namespace not being declared, but as mentioned above, it should already be declared in the original document that I loaded.
How can I generate the target XML code I want?
You created the attribute xAt without specifying a namespace URI, which is equivalent to an empty namespace URI (see the corresponding MSDN documentation), so the "ss" prefix is dropped. The Row element is likewise created without a namespace, which is most likely why you get <Row xmlns="">.
You need to specify the exact namespace URI for it to work as you expect.
Let me illustrate using the namespace URI you gave in your question (the illustration is very similar to your initial code, with a few small differences that you can easily adapt).
String namespaceUri = "urn:schemas-microsoft-com:office:spreadsheet";
XmlDocument xDoc = new XmlDocument();
XmlElement workbook = xDoc.CreateElement("ss", "Workbook", namespaceUri);
XmlElement rows = xDoc.CreateElement("Rows");
At this step I can assume that I have an XmlDocument similar to what you have after initially loading your file. My XmlDocument has the workbook node as its DocumentElement, it uses the given prefix and namespace uri.
Now we can create the attribute:
var attribute = xDoc.CreateAttribute("ss", "Type", "urn:schemas-microsoft-com:office:spreadsheet");
attribute.Value = "String";
The namespace URI must be specified correctly; otherwise the attribute won't be rendered correctly. When this attribute is used, since the namespace it refers to is already declared on the enclosing element (workbook), it does not need to be declared again, and the framework automatically omits the redundant namespace declaration.
Now we can go ahead and create the Row and data elements and add the attribute to the collection of attributes of the Data element.
XmlElement eRow = xDoc.CreateElement("Row");
XmlElement eData = xDoc.CreateElement("Data");
eData.Attributes.Append(attribute);
eData.InnerText = "value";
eRow.AppendChild(eData);
rows.AppendChild(eRow);
workbook.AppendChild(rows);
xDoc.AppendChild(workbook);
We can then display the document, for example with:
Console.WriteLine(xDoc.OuterXml);
Result:
<ss:Workbook xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"><Rows><Row><Data ss:Type="String">value</Data></Row></Rows></ss:Workbook>
I hope this helps.
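For the template described in the question, where Workbook declares the spreadsheet URI as the default namespace as well as under "ss", a minimal sketch might look like the following. It assumes the new elements should live in that same default namespace; the Workbook string here is a stand-in for the loaded template file, and the class and method names are hypothetical.

```csharp
using System;
using System.Xml;

public static class SpreadsheetRows
{
    public const string Ns = "urn:schemas-microsoft-com:office:spreadsheet";

    // Builds a Row/Data pair inside a stand-in Workbook and returns the document XML.
    public static string BuildWorkbookXml()
    {
        XmlDocument doc = new XmlDocument();
        // Stand-in for xTemp.Load(templatePath) from the question.
        doc.LoadXml("<Workbook xmlns=\"" + Ns + "\" xmlns:ss=\"" + Ns + "\" />");

        // Creating the elements in the parent's namespace avoids xmlns="".
        XmlElement row = doc.CreateElement("Row", Ns);
        XmlElement data = doc.CreateElement("Data", Ns);

        // Prefix, local name, and namespace URI are all given explicitly.
        XmlAttribute type = doc.CreateAttribute("ss", "Type", Ns);
        type.Value = "String";
        data.Attributes.Append(type);
        data.InnerText = "value";

        row.AppendChild(data);
        doc.DocumentElement.AppendChild(row);
        return doc.OuterXml;
    }
}
```

The key point is that CreateElement("Row") and CreateElement("Row", Ns) produce elements with different names as far as XML is concerned, even though both print as Row.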

How can I ignore the namespace in an XML document in c#?

I'm trying to open a XML file in c#, find a node by attribute name, which is working fine and then displaying the name of an XML attribute in the same node.
My code is simple (as I pinched it from other sources!) and works on my test XML doc. However, when I try it with an actual file it doesn't work. I've been pulling my hair out (not that I have much left) and have discovered it's because of the xmlns attribute in the actual files I'm using. The path to the namespace does not exist.
My code is as follows:
XmlDocument doc = new XmlDocument();
doc.Load(@"c:\deroschedule\test.sym");
var orient = doc.SelectSingleNode("//Attr[@name='Orientation']/@value");
the above code works perfectly when xmlns is not included in the file. However, when xmlns is included the orient variable is null. The xmlns path doesn't exist, when i try to navigate to it in a browser I get a 404 error.
Not sure what a xml namespace is to be honest, but I have thousands of these files and can't manually edit them. Is there an easy way to get C# to overlook the namespace and just pretend it's not there? I've tried with Xpath, but that just blew my mind!
Ok I figured it out for myself. Thought I would post the answer here even though there are thousands of other answers apparently.
Where I went wrong was misunderstanding what the namespace actually does. I had to use XmlNamespaceManager to declare the same namespace as in the XML file, and then use that namespace prefix in the query.
XmlDocument doc = new XmlDocument();
doc.Load(@"C:\deroschedule\test6.sym");
XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable);
ns.AddNamespace("ma", "http://www.yournamespaceinfohere.com/");
var orient = doc.SelectSingleNode("//ma:attr[@name='Orientation']/@value", ns);
Now my next challenge is to try and read the bmp from the xml file, should be easy, right?!
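If the goal really is to match elements regardless of their namespace, as the question asks, one commonly used option is the XPath local-name() function. The sketch below assumes the element is called Attr and carries name and value attributes, as in the question; the class and method names are illustrative only.

```csharp
using System;
using System.Xml;

public static class LocalNameQuery
{
    // Returns the value attribute of the Attr element named "Orientation",
    // matching the element by local name only, or null if there is none.
    public static string FindOrientation(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNode node = doc.SelectSingleNode(
            "//*[local-name()='Attr'][@name='Orientation']/@value");
        return node == null ? null : node.Value;
    }
}
```

This trades precision for convenience: an Attr element from any namespace will match, which is fine for files with a single vocabulary but can produce false matches in mixed documents.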

Fastest way to add new node to end of an xml?

I have a large xml file (approx. 10 MB) in following simple structure:
<Errors>
<Error>.......</Error>
<Error>.......</Error>
<Error>.......</Error>
<Error>.......</Error>
<Error>.......</Error>
</Errors>
I need to add a new <Error> node at the end, just before the </Errors> tag. What is the fastest way to achieve this in .NET?
You need to use the XML inclusion technique.
Your error.xml (doesn't change, just a stub. Used by XML parsers to read):
<?xml version="1.0"?>
<!DOCTYPE logfile [
<!ENTITY logrows
SYSTEM "errorrows.txt">
]>
<Errors>
&logrows;
</Errors>
Your errorrows.txt file (changes, the xml parser doesn't understand it):
<Error>....</Error>
<Error>....</Error>
<Error>....</Error>
Then, to add an entry to errorrows.txt:
using (StreamWriter sw = File.AppendText("errorrows.txt"))
{
XmlTextWriter xtw = new XmlTextWriter(sw);
xtw.WriteStartElement("Error");
// ... write error message here
xtw.Close();
}
Or you can even use the .NET 3.5 XElement, and append the text to the StreamWriter:
using (StreamWriter sw = File.AppendText("errorrows.txt"))
{
XElement element = new XElement("Error");
// ... write error message here
sw.WriteLine(element.ToString());
}
See also Microsoft's article Efficient Techniques for Modifying Large XML Files
First, I would disqualify System.Xml.XmlDocument because it is a DOM which requires parsing and building the entire tree in memory before it can be appended to. This means your 10 MB of text will be more than 10 MB in memory. This means it is "memory intensive" and "time consuming".
Second, I would disqualify System.Xml.XmlReader because it requires parsing the entire file first before you can get to the point of when you can append to it. You would have to copy the XmlReader into an XmlWriter since you can't modify it. This requires duplicating your XML in memory first before you can append to it.
A faster alternative to XmlDocument and XmlReader would be string manipulation (which has its own memory issues):
string xml = @"<Errors><error />...<error /></Errors>";
int idx = xml.LastIndexOf("</Errors>");
xml = xml.Substring(0, idx) + "<error>new error</error></Errors>";
Chop off the end tag, add in the new error, and add the end tag back.
I suppose you could go crazy with this and truncate your file by 9 characters and append to it. Wouldn't have to read in the file and would let the OS optimize page loading (only would have to load in the last block or something).
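using (System.IO.FileStream fs = System.IO.File.Open("log.xml", System.IO.FileMode.Open, System.IO.FileAccess.ReadWrite))
{
// Position the stream just before the closing tag (assumes the file ends exactly with "</Errors>").
fs.Seek(-"</Errors>".Length, System.IO.SeekOrigin.End);
// FileStream.Write takes bytes, not a string.
byte[] bytes = System.Text.Encoding.UTF8.GetBytes("<error>new error</error></Errors>");
fs.Write(bytes, 0, bytes.Length);
}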
That will hit a problem if your file is empty or contains only "<Errors></Errors>", both of which can easily be handled by checking the length.
The fastest way would probably be a direct file access.
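using (FileStream stream = File.Open("my.log", FileMode.Open, FileAccess.ReadWrite))
using (StreamWriter file = new StreamWriter(stream))
{
// File.AppendText opens the file in append mode, which does not allow seeking backwards,
// so the file is opened for read/write instead.
stream.Seek(-"</Errors>".Length, SeekOrigin.End);
file.Write(" <Error>New error message.</Error></Errors>");
}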
But you lose all the nice XML features and may easily corrupt the file.
I would use XmlDocument or XDocument to Load your file and then manipulate it accordingly.
I would then look at the possibility of caching this XmlDocument in memory so that you can access the file quickly.
What do you need the speed for? Do you have a performance bottleneck already or are you expecting one?
How is your XML file represented in code? Do you use the System.Xml classes? In that case you could use XmlDocument.AppendChild.
Try this out:
var doc = new XmlDocument();
doc.LoadXml("<Errors><error>This is my first error</error></Errors>");
XmlNode root = doc.DocumentElement;
//Create a new node.
XmlElement elem = doc.CreateElement("error");
elem.InnerText = "This is my error";
//Add the node to the document.
if (root != null) root.AppendChild(elem);
doc.Save(Console.Out);
Console.ReadLine();
Here's how to do it in C, .NET should be similar.
The trick is to simply jump to the end of the file, skip back over the closing tag, append the new error line, and write a new closing tag.
#include <stdio.h>
#include <string.h>
#include <errno.h>
int main(int argc, char** argv) {
FILE *f;
// Open the file
f = fopen("log.xml", "r+");
// Small buffer to determine length of \n (1 on Unix, 2 on PC)
// You could always simply hard code this if you don't plan on
// porting to Unix.
char nlbuf[10];
sprintf(nlbuf, "\n");
// How long is our end tag?
long offset = strlen("</Errors>");
// Add in an \n char.
offset += strlen(nlbuf);
// Seek to the END OF FILE, and then GO BACK the end tag and newline
// so we use a NEGATIVE offset.
fseek(f, offset * -1, SEEK_END);
// Print out your new error line
fprintf(f, "<Error>New error line</Error>\n");
// Print out new ending tag.
fprintf(f, "</Errors>\n");
// Close and you're done
fclose(f);
}
The quickest method is likely to be reading in the file using an XmlReader and simply replicating each node read to a new stream using an XmlWriter. When you encounter the closing </Errors> tag, you just need to output your additional <Error> element before continuing the 'read and duplicate' cycle. This approach is inevitably going to be harder than reading the entire document into the DOM (XmlDocument class), but for large XML files it is much quicker. Admittedly, using StreamReader/StreamWriter would be somewhat faster still, but pretty horrible to work with in code.
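A rough sketch of that read-and-duplicate cycle, assuming the root element is Errors with Error children as in the question. The class and method names are invented for the example, and for brevity it drops anything that comes before the root element (such as comments).

```csharp
using System;
using System.IO;
using System.Xml;

public static class ErrorAppender
{
    // Copies the document from input to output, adding one more
    // <Error> element just before the root element is closed.
    public static void AppendError(TextReader input, TextWriter output, string message)
    {
        using (XmlReader reader = XmlReader.Create(input))
        using (XmlWriter writer = XmlWriter.Create(output))
        {
            reader.MoveToContent(); // positions the reader on the root element
            writer.WriteStartElement(reader.Prefix, reader.LocalName, reader.NamespaceURI);
            writer.WriteAttributes(reader, false);

            if (!reader.IsEmptyElement)
            {
                reader.Read(); // step inside the root element
                // WriteNode copies the current node (with its children) and
                // advances the reader, so the loop stops at the root's end tag.
                while (!reader.EOF && reader.NodeType != XmlNodeType.EndElement)
                {
                    writer.WriteNode(reader, false);
                }
            }

            writer.WriteElementString("Error", message);
            writer.WriteEndElement();
        }
    }
}
```

In practice the output would go to a temporary file that replaces the original once the copy succeeds, since the same file cannot safely be read and rewritten at once.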
Using string-based techniques (like seeking to the end of the file and then moving backwards the length of the closing tag) is vulnerable to unexpected but perfectly legal variations in document structure.
The document could end with any amount of whitespace, to pick the likeliest problem you'll encounter. It could also end with any number of comments or processing instructions. And what happens if the top-level element isn't named Error?
And here's a situation that using string manipulation fails utterly to detect:
<Error xmlns="not_your_namespace">
...
</Error>
If you use an XmlReader to process the XML, while it may not be as fast as seeking to EOF, it will also allow you to handle all of these possible exception conditions.
I attempted to use the code other answers had suggested, but ran into an issue where calling .Length on my strings sometimes did not give the same value as the number of bytes in the string, so I was inconsistently losing characters. I modified it to use the byte count instead.
var endTag = "</Errors>";
var nodeText = GetNodeText();
using (FileStream file = File.Open("my.log", FileMode.Open, FileAccess.ReadWrite))
{
// Seek back by the byte length of the end tag, not its character count.
file.Seek(-Encoding.UTF8.GetByteCount(endTag), SeekOrigin.End);
byte[] nodeBytes = Encoding.UTF8.GetBytes(nodeText);
byte[] endTagBytes = Encoding.UTF8.GetBytes(endTag);
file.Write(nodeBytes, 0, nodeBytes.Length);
file.Write(endTagBytes, 0, endTagBytes.Length);
}

UTF-8 encoding issue

I am trying to fetch data from rss feed (feed location is http://www.bgsvetionik.com/rss/ ) in c# win form. Take a look at the following code:
public static XmlDocument FromUri(string uri)
{
XmlDocument xmlDoc;
WebClient webClient = new WebClient();
using (Stream rssStream = webClient.OpenRead(uri))
{
XmlTextReader reader = new XmlTextReader(rssStream);
xmlDoc = new XmlDocument();
xmlDoc.XmlResolver = null;
xmlDoc.Load(reader);
}
return xmlDoc;
}
Although xmlDoc.InnerXml contains an XML declaration with UTF-8 encoding, I get an escaped character entity (for example &#353;) instead of š, and so on.
How can I solve it?
The feed's data is incorrect. The entity for š is inside a CDATA section, so it isn't being treated as an entity by the XML parser.
If you look at the source XML, you'll find that there's a mixture of entities and "raw" characters, e.g. čišćenja in the middle of the first title.
If you need to correct that, you'll have to do it yourself with a Replace call - the XML parser is doing exactly what it's meant to.
EDIT: For the replacement, you could get hold of all the HTML entities and replace them one by one, or just find out which ones are actually being used. Then do:
string text = element.Value.Replace("&#353;", "š")
.Replace(...);
Of course, this means that anything which is actually correctly escaped and should really be that text will get accidentally replaced... but such is the problem with broken data :(
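Instead of chaining one Replace call per entity, a possible shortcut is to let the framework decode all of them at once. The sketch below uses System.Net.WebUtility.HtmlDecode, which assumes .NET 4.0 or later (on older frameworks HttpUtility.HtmlDecode from System.Web plays the same role); the wrapper class is hypothetical, and the same caveat about accidentally decoding correctly escaped text applies.

```csharp
using System;
using System.Net;

public static class FeedTextFixer
{
    // Decodes HTML character references (numeric and named) left
    // undecoded inside CDATA sections of the feed.
    public static string DecodeEntities(string text)
    {
        return WebUtility.HtmlDecode(text);
    }
}
```

For example, FeedTextFixer.DecodeEntities(element.Value) would be applied to each title or description after the document has been loaded.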

Read in an XML String with Namespaces for Use in an XSL Transformation

In an ASP.NET 2.0 website, I have a string representing some well-formed XML. I am currently creating an XmlDocument object with it and running an XSL transformation for display in a Web form. Everything was operating fine until the XML input started to contain namespaces.
How can I read in this string and allow namespaces?
I've included the current code below. The string source comes from an HTML encoded node in a WordPress RSS feed.
XPathNavigator myNav= myPost.CreateNavigator();
XmlNamespaceManager myManager = new XmlNamespaceManager(myNav.NameTable);
myManager.AddNamespace("content", "http://purl.org/rss/1.0/modules/content/");
string myPost = HttpUtility.HtmlDecode("<post>" +
myNav.SelectSingleNode("//item[1]/content:encoded", myManager).InnerXml +
"</post>");
XmlDocument myDocument = new XmlDocument();
myDocument.LoadXml(myPost.ToString());
The error is on the last line:
"System.Xml.XmlException: 'w' is an undeclared namespace. Line 12, position 201. at System.Xml.XmlTextReaderImpl.Throw(Exception e) ..."
Your code looks right.
The problem is probably in the xml document you're trying to load.
It must have elements with a "w" prefix, without having that prefix declared in the XML document.
For example, you should have:
<test xmlns:w="http://...">
<w:elementInWNamespace />
</test>
(your document is probably missing the xmlns:w="http://")
Gut feeling: one of the namespaces declared in //content:encoded is being dropped (probably because you're using the literal .InnerXml property).
What does the 'w' namespace evaluate to in the myNav DOM? You'll want to add xmlns:w= to your post node. There will probably be others too.
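One way to supply the missing declaration without editing the string is to hand the parser a namespace context up front. This is only a sketch: the real URI for the "w" prefix is not given in the question, so "urn:example:w" below is a placeholder, and the class and method names are made up.

```csharp
using System;
using System.IO;
using System.Xml;

public static class FragmentLoader
{
    // Loads XML whose "w" prefix is not declared in the text itself by
    // pre-declaring it in the parser context. wNamespaceUri is whatever
    // URI the source feed binds to "w" (unknown here).
    public static XmlDocument Load(string xml, string wNamespaceUri)
    {
        NameTable nameTable = new NameTable();
        XmlNamespaceManager manager = new XmlNamespaceManager(nameTable);
        manager.AddNamespace("w", wNamespaceUri);
        XmlParserContext context = new XmlParserContext(nameTable, manager, null, XmlSpace.None);

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), new XmlReaderSettings(), context))
        {
            XmlDocument document = new XmlDocument();
            document.Load(reader);
            return document;
        }
    }
}
```

Any other undeclared prefixes in the decoded post would need an AddNamespace call of their own.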
