I want to save the following string in an XML File:
<text><![CDATA[<p>what is my pet name</p>]]></text>
When I am saving it, it looks like:
<text>&lt;![CDATA[&lt;p&gt;what is my pet name&lt;/p&gt;]]&gt;</text>
I have tried the File.WriteAllText() and XmlDocument.Save() methods but didn't get the expected output.
Basically, everywhere other than the opening and closing tags in the XML, < is replaced by &lt; and > is replaced by &gt;.
What is happening is that the XML parser is encoding your string. When you try to access the string later, it can be decoded again at that time.
What I suggest is that you either load the text into a new XmlDocument with XmlDocument.LoadXml(string s) and then import it into your current document, or leave it encoded.
You should not try to both use an XML parser, and manually add text at the same time.
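A minimal sketch of the LoadXml-then-import route, assuming the text you want to add is itself well-formed XML. The class name ImportExample, the method name Merge, and the <root/> target used below are made up for illustration:

```csharp
using System;
using System.Xml;

public static class ImportExample
{
    // Parses fragmentXml on its own, then imports its root element into targetXml.
    public static string Merge(string fragmentXml, string targetXml)
    {
        var fragment = new XmlDocument();
        fragment.LoadXml(fragmentXml);   // parsed as markup, so CDATA stays CDATA

        var target = new XmlDocument();
        target.LoadXml(targetXml);

        // A node belongs to the document that created it, so it must be imported first.
        XmlNode imported = target.ImportNode(fragment.DocumentElement, true);
        target.DocumentElement.AppendChild(imported);
        return target.OuterXml;
    }

    public static void Main()
    {
        Console.WriteLine(Merge(
            "<text><![CDATA[<p>what is my pet name</p>]]></text>",
            "<root/>"));
    }
}
```

Because the string goes through the parser instead of being assigned as text, nothing in it is re-escaped.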
I guess you add the CDATA manually and the XML writing mechanism correctly escapes your CDATA because it treats it as text content. Instead explicitly add a CDATA section with just the contents.
If you are using the old XML API (System.Xml), then use this method to create the CDATA section: http://msdn.microsoft.com/en-us/library/system.xml.xmldocument.createcdatasection
Then append the node to the element just like in the example in the link.
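Roughly, that could look like the sketch below. The element name text comes from the question; the class name CDataExample and the method name Build are just placeholders:

```csharp
using System;
using System.Xml;

public static class CDataExample
{
    // Builds a <text> element whose content is a CDATA section holding html.
    public static string Build(string html)
    {
        var doc = new XmlDocument();
        XmlElement text = doc.CreateElement("text");
        doc.AppendChild(text);

        // Pass only the payload; the <![CDATA[ and ]]> delimiters are added on output.
        text.AppendChild(doc.CreateCDataSection(html));
        return doc.OuterXml;
    }

    public static void Main()
    {
        // Prints: <text><![CDATA[<p>what is my pet name</p>]]></text>
        Console.WriteLine(Build("<p>what is my pet name</p>"));
    }
}
```

The important part is that the string passed to CreateCDataSection does not contain the CDATA markers itself.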
XML is being written correctly.
XML has special characters that are reserved for commands, just like C# reserves words like "if" and "string".
XML is encoding your string for storage. What you need to do is when you retrieve your string, run it through a similar decode process.
Use this: HttpServerUtility.HtmlDecode(encodedString)
Reference:
Decode XML returned by a webservice (< and > are replaced with &lt; and &gt;)?
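To illustrate the encode-on-write, decode-on-read behaviour: if you read the value back through the XML API rather than as raw file text, the decoding should happen for you. A small sketch (RoundTrip, Serialize and ReadBack are hypothetical names):

```csharp
using System;
using System.Xml;

public static class RoundTrip
{
    // Stores value as the text of a <text> element; the serializer escapes it.
    public static string Serialize(string value)
    {
        var doc = new XmlDocument();
        doc.LoadXml("<text/>");
        doc.DocumentElement.InnerText = value;
        return doc.OuterXml;
    }

    // Parses the XML again; InnerText returns the unescaped string.
    public static string ReadBack(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement.InnerText;
    }

    public static void Main()
    {
        string xml = Serialize("<p>what is my pet name</p>");
        Console.WriteLine(xml);            // escaped form, as stored
        Console.WriteLine(ReadBack(xml));  // original string again
    }
}
```

A separate decode call is only needed if you pull the escaped text out of the file by some means other than an XML parser.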
I have looked at most of the questions about parsing XML into SQL with special characters and could not find anything relevant that didn't involve having control over the XML output itself.
I understand that the way to do this would be to make sure all special characters are escaped. The issue I have is that I do not have control over the XML that gets generated until after the fact. The output I could have could be something like the below. I need to find a way to replace all the special characters within the element content without touching the characters that are valid for the XML. This could be done using a CLR or in straight-up SQL; I will even consider other options.
<?xml version="1.0" ?>
<A>
<B>this is my test <myemail#gmail.com</B>
<B>>>>this is another test<<<</B>
</A>
You are probably looking for something similar to HtmlEncode() of the contents. Loop through your XML structure and encode the fields you need to prior to writing to the DB, and perform the HtmlDecode() on the read from the DB.
https://msdn.microsoft.com/en-us/library/w3te6wfz%28v=vs.110%29.aspx
If you are sure the XML element names are valid, then the solution could be using regular expressions to parse the XML as text and substitute the & with &amp;, the > with &gt; and the < with &lt;.
Have a look at "regular expression to find special character & between xml tags", for example.
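A rough sketch of the regex idea, under some strong assumptions: you know the name of the leaf element (B in the sample), it has no attributes or child elements, and its content has not already been escaped. LeafEscaper and its methods are invented names:

```csharp
using System;
using System.Text.RegularExpressions;

public static class LeafEscaper
{
    // Escapes the three characters that break element content. & must go first.
    static string EscapeText(string s) =>
        s.Replace("&", "&amp;").Replace("<", "&lt;").Replace(">", "&gt;");

    // Escapes everything between <name> and the next </name>, leaving the tags alone.
    public static string EscapeLeaf(string xml, string name)
    {
        string n = Regex.Escape(name);
        var pattern = new Regex("<" + n + ">(.*?)</" + n + ">", RegexOptions.Singleline);
        return pattern.Replace(xml,
            m => "<" + name + ">" + EscapeText(m.Groups[1].Value) + "</" + name + ">");
    }

    public static void Main()
    {
        string broken = "<A><B>this is my test <myemail#gmail.com</B><B>>>>this is another test<<<</B></A>";
        Console.WriteLine(EscapeLeaf(broken, "B"));
    }
}
```

If any of those assumptions does not hold (for example the content could contain the literal text </B>), this will produce wrong results, so treat it as a cleanup step for a known, narrow kind of damage.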
I am writing some code to send an XML document to a Servlet. For one of the XML tag fields, I need to fill it with a string that is retrieved from an external file.
I have found a couple of external files that contain some < and > characters. The servlet will not accept this XML document in this case.
If I remove the < and > characters from the XML tag field, the XML document is sent correctly.
As I am going to be using 1000s of external files, I am sure there will be other occurrences of "illegal" characters. Is there an XML encode or similar function that can be used to format a string such that it can be stored in an XML tag with no errors?
I have tried HTML encode, but this does not work. Is there an equivalent action for XML?
If you really want to build your own XML strings, put your external text in a CDATA section. You just need to make sure that the end sequence (which is ]]>) is not in the external file. If you find it, you have to encode it or replace it with some other string first. So:
<![CDATA[*your external stuff containing < and > here*]]>
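One common way to deal with a ]]> inside the data is to end the CDATA section between the ]] and the >, then start a new one. A sketch in C# (the question does not say which language is in use, and CDataWrapper / Wrap are made-up names):

```csharp
using System;
using System.Xml;

public static class CDataWrapper
{
    // Wraps raw in CDATA, splitting any "]]>" across two adjacent sections.
    public static string Wrap(string raw) =>
        "<![CDATA[" + raw.Replace("]]>", "]]]]><![CDATA[>") + "]]>";

    public static void Main()
    {
        string external = "if (a[b[0]]>1 && x<y) {}";
        string xml = "<field>" + Wrap(external) + "</field>";

        // Parse it back to check the parser sees the original text.
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        Console.WriteLine(doc.DocumentElement.InnerText == external);
    }
}
```

A parser reads the adjacent sections as one run of text, so the receiving side should not need to know the split happened.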
I have an XML Document where it contains data with < character.
<Tunings>
<Notes>Norm <150 mg/dl</Notes>
</Tunings>
The code I am using is:
StreamReader objReader = new StreamReader(strFile);
string strData = objReader.ReadToEnd();
XmlDocument doc = new XmlDocument();
// Here I want to strip those characters from "strData"
doc.LoadXml(strData);
So it gives error:
Name cannot begin with the '1' character, hexadecimal value 0x31.
So is there a way to strip those characters from the XML before the LoadXml call?
If this is only occurring in the <Notes> section, I'd recommend you modify the creation of the XML file to use a CDATA tag to contain the text in Notes, like this:
<Notes><![CDATA[Norm <150 mg/dl]]></Notes>
The CDATA tag tells XML parsers to not parse the characters between the <![CDATA[ and ]]>. This allows you to have characters in your XML that would otherwise break the parsing.
You can use the CDATA tag for any situation where you know (or have reasonable expectations) of special characters in that data.
Trying to handle special characters at parsing time (without the CDATA) will be more labor intensive (and frustrating) than simply fixing the creation of the XML in the first place, IMO. Plus, "Norm <150 mg/dl" is not the same thing as "Norm 150 mg/dl", and that distinction might be important for whoever needs that information.
As the comments state, you do not have an XML document. If you know that the only way these documents deviate from legal XML is as in your example, you could run the file through a regular expression and replace <(?=\d) with &lt;. The lookahead finds a < that is immediately followed by a digit and encodes it without consuming the digit.
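A sketch of that pre-processing step, assuming a < directly followed by a digit is the only defect in the input. It uses a lookahead so the digit itself is left in place. NotesFixer and Repair are hypothetical names:

```csharp
using System;
using System.Text.RegularExpressions;
using System.Xml;

public static class NotesFixer
{
    // Encodes every '<' that is immediately followed by a digit.
    public static string Repair(string strData) =>
        Regex.Replace(strData, @"<(?=\d)", "&lt;");

    public static void Main()
    {
        string strData = "<Tunings><Notes>Norm <150 mg/dl</Notes></Tunings>";

        var doc = new XmlDocument();
        doc.LoadXml(Repair(strData));   // no longer throws on "<1"

        // InnerText decodes the entity, so the original meaning is preserved.
        Console.WriteLine(doc.SelectSingleNode("/Tunings/Notes").InnerText);
    }
}
```

Unlike stripping the character, this keeps "Norm <150 mg/dl" intact when the value is read back.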
I need to create an XML file which is to be converted to an Excel file (.xls), and this means that the XML has a lot of meta info in it. It's easy to write all the contents into the XML file as a text file.
var sw = new FileInfo(tempReportFilePath).CreateText();
sw.WriteLine("meta info and other tags");
However, this method does not escape characters, and when the data contains '<' or '>' or '&' etc. the XML is rendered invalid and the .xls file does not open. I can easily do a replace ('<' with '&lt;' and so on), but for performance reasons, this method is not suitable.
The other alternative is to use xml text writer, but with a ton of meta info, it will mean writing a lot of tags in code. With sw.WriteLine('stuff'), I could simply put parts of meta info in one tag (as a string) and write them to file. Using xslt, the problem I faced was that tags required spaces. For example, for tabular data, the top row fields could have spaces.
How do I go about creating a well-formed XML file with a lot of meta info, where the characters ('<', '>' etc.) are escaped?
Uri.EscapeDataString(string stringToEscape);
XDocument tutorials.
Why not create the xls in the first place? There is a nice library to do so:
http://npoi.codeplex.com/
I used the WriteRaw method for writing the meta info tags. For the other data, which was required to be escaped, I used WriteString method.
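For anyone wanting to see the shape of that, here is a minimal sketch. The element names Report, Meta and Cell are invented for the example, as are the class and method names; the real meta info would be whatever the spreadsheet format requires:

```csharp
using System;
using System.Text;
using System.Xml;

public static class ReportWriter
{
    // metaXml is trusted, already well-formed markup; cellText is untrusted data.
    public static string Write(string metaXml, string cellText)
    {
        var sb = new StringBuilder();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };

        using (XmlWriter w = XmlWriter.Create(sb, settings))
        {
            w.WriteStartElement("Report");
            w.WriteRaw(metaXml);      // written as-is, no escaping
            w.WriteStartElement("Cell");
            w.WriteString(cellText);  // special characters are escaped here
            w.WriteEndElement();
            w.WriteEndElement();
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Write("<Meta version=\"1\"/>", "a < b & c"));
    }
}
```

Note that WriteRaw does no checking at all, so anything passed to it must already be valid markup.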
I am using XDocument to switch a value in an xml document.
In the new value I need to use the character '&' (ampersand)
but after XDocument.Save() the XML has &amp; instead!
I tried using encoding and stuff… nothing worked
XDocument is doing exactly what it's supposed to do.
& is invalid XML. (it's an unfinished character/entity reference)
& means "Start of an entity" in XML, so if you want to include an & as data you must express it as an entity: &amp; (or use it in a CDATA block).
What you describe is normal behaviour and the XML would break otherwise.
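A short sketch showing that the escaped form only exists in the serialized file, and that reading the value back through XDocument gives the plain & again. The config/name structure and the AmpersandDemo class are made up for the example:

```csharp
using System;
using System.Xml.Linq;

public static class AmpersandDemo
{
    // Sets the <name> value and returns the serialized document.
    public static string SetName(string xml, string newValue)
    {
        XDocument doc = XDocument.Parse(xml);
        doc.Root.Element("name").Value = newValue;   // assign the raw string
        return doc.ToString(SaveOptions.DisableFormatting);
    }

    // Reads the <name> value back; entities are decoded by the parser.
    public static string GetName(string xml) =>
        XDocument.Parse(xml).Root.Element("name").Value;

    public static void Main()
    {
        string saved = SetName("<config><name>old</name></config>", "Tom & Jerry");
        Console.WriteLine(saved);           // contains Tom &amp; Jerry
        Console.WriteLine(GetName(saved));  // Tom & Jerry
    }
}
```

So any consumer that reads the file with an XML parser should see the ampersand you intended.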
There are two options. Option one is to ensure proper XML encoding/decoding of all your content in the XML document. Remember that HTML and XML encoding/decoding are slightly different.
Option two is to use base64 encoding on whatever content in the xml that might contain invalid elements.
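A minimal sketch of the base64 option, assuming the content is text and that UTF-8 is an acceptable byte encoding for it (Base64Field is a hypothetical helper):

```csharp
using System;
using System.Text;

public static class Base64Field
{
    // Base64 output uses only letters, digits, '+', '/' and '=', none of which need XML escaping.
    public static string Encode(string content) =>
        Convert.ToBase64String(Encoding.UTF8.GetBytes(content));

    public static string Decode(string encoded) =>
        Encoding.UTF8.GetString(Convert.FromBase64String(encoded));

    public static void Main()
    {
        string encoded = Encode("a < b && c > d");
        Console.WriteLine("<value>" + encoded + "</value>");
        Console.WriteLine(Decode(encoded));
    }
}
```

The trade-off is that the stored value is no longer human-readable and grows by about a third.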
Is your output file app.config supposed to be an XML file?
If it is, then the & must be escaped as &amp;.
If it isn't, then you should be using the text output method instead of the xml output method: use <xsl:output method='text'/>.
PS: this question appears to be a duplicate of How can I add an ampersand for a value in a ASP.net/C# app config file value