I have an & character in one of the xml nodes as below.
<dependents>9 & 5</dependents>
When I try to load the file as below, it gives the error "An error occurred while parsing EntityName.". Is it possible to escape this character and load the file successfully? Thank you.
m_InputXMLDoc = new XmlDocument();
if (System.IO.File.Exists(InputFile))
{
m_InputXMLDoc.Load(InputFile);
}
Your XML is invalid.
You need to change the & to &amp;.
Use a CDATA section
<dependents><![CDATA[9 & 5]]></dependents>
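Both suggestions above can be checked with a small sketch. The class and method names here (AmpersandDemo, TryReadRoot) are hypothetical and only serve the illustration; the sketch assumes the XML is available as a string rather than a file.

```csharp
using System;
using System.Xml;

public static class AmpersandDemo
{
    // Returns the text of the root element, or the parser's
    // error message if the input is not well-formed XML.
    public static string TryReadRoot(string xml)
    {
        var doc = new XmlDocument();
        try
        {
            doc.LoadXml(xml);
            return doc.DocumentElement.InnerText;
        }
        catch (XmlException ex)
        {
            return "XmlException: " + ex.Message;
        }
    }
}
```

Calling TryReadRoot with `<dependents>9 &amp; 5</dependents>` or with the CDATA version should return the text `9 & 5`, while the original `<dependents>9 & 5</dependents>` should come back as a string starting with "XmlException:".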
Related
I am trying to read webpage text by using Xml Document:
XmlDocument document = new XmlDocument();
string site = "https://emailhunter.co/search/a-bs.com";
document.Load(site);
string allText = document.InnerText;
This is the exception I get:
An unhandled exception of type 'System.Xml.XmlException' occurred in System.Xml.dll
Additional information: The ';' character, hexadecimal value 0x3B, cannot be included in a name. Line 5, position 383.
I really don't understand what's wrong here. If you can give me some tips, I would really appreciate it.
The page is HTML, not XML, so XmlDocument cannot parse it. You can use the Html Agility Pack, as described in this post: What is the best way to parse html in C#?
I got this error while parsing a string into an XDocument after editing and saving it. Can anyone help me locate the error position (Line 1, position 10475)? How can I find that position?
System.Xml.XmlException: Unexpected XML declaration. The XML
declaration must be the first node in the document, and no white space
characters are allowed to appear before it. Line 1, position 10475.
if (storage.FileExists("APPSDATA.xml"))
{
    // Dispose the reader (and the underlying stream) even if parsing throws
    using (var reader = new StreamReader(new IsolatedStorageFileStream("APPSDATA.xml", FileMode.Open, storage)))
    {
        string xml = reader.ReadToEnd();
        var xdoc = XDocument.Parse(xml); // error here
    }
}
The XML is big; this is just a part of it:
<?xml version="1.0" encoding="UTF-8"?>
<Ungdungs>
<Ungdung>
<Name>HERE City Lens</Name>
<Id>b0a0ac22-cf9e-45ba-8120-815450e2fd71</Id>
<Path>/Icon/herecitylens.png</Path>
<Version>1.0.0.0</Version>
<Category>HERE</Category>
<Date>Uknown</Date>
</Ungdung>
<Ungdung>
<Name>HERE Transit</Name>
<Id>adfdad16-b54a-4ec3-b11e-66bd691be4e6</Id>
<Path>/Icon/heretransit.png</Path>
<Version>1.0.0.0</Version>
<Category>HERE</Category>
<Date>Uknown</Date>
</Ungdung>
Make sure your <?xml tag is the first thing in the document (and that it doesn't have anything before that, this includes whitespace).
You can have <?xml only once per document, so if you have a large chunk of XML and you have this tag repeated somewhere down the lines your document won't be valid.
In my case this was related to the byte order mark (BOM). I opened the file in Notepad++, selected the encoding "Encode in UTF-8 without BOM", and was then able to see the annoying character and delete it.
This error might occur if you previously saved the XML file with the boolean 'append = true', which writes a second document (and a second XML declaration) after the first.
Make it 'false' and it should work.
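To answer the "how can I get that position" part: XmlException exposes LineNumber and LinePosition (both 1-based), so the text around the failure can be cut out of the string that was parsed. This is only a sketch; XmlErrorLocator, DescribeError and the context parameter are hypothetical names, and it assumes the whole document is already in memory as a string.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class XmlErrorLocator
{
    // Returns null when the XML parses; otherwise the parser message
    // followed by the text surrounding the reported error position.
    public static string DescribeError(string xml, int context = 40)
    {
        try
        {
            XDocument.Parse(xml);
            return null;
        }
        catch (XmlException ex)
        {
            // Pick the reported line, clamping in case the numbers are out of range
            string[] lines = xml.Split('\n');
            int lineIndex = Math.Min(Math.Max(ex.LineNumber - 1, 0), lines.Length - 1);
            string line = lines[lineIndex];

            // Cut a window of up to 2 * context characters around the position
            int pos = Math.Min(Math.Max(ex.LinePosition - 1, 0), line.Length);
            int start = Math.Max(pos - context, 0);
            int length = Math.Min(context * 2, line.Length - start);
            return ex.Message + " Near: " + line.Substring(start, length);
        }
    }
}
```

For a file that was appended to itself, the excerpt would be expected to show the end of the first document followed by the second `<?xml` declaration, which makes the cause easy to spot.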
I have found that there exists "&" in my code that's why error is showing
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml(dsExport.Tables[0].Rows[i]["SubmissionData"].ToString());
The "&" is in the SubmissionData column. How can I remove or escape the special characters so that the error doesn't occur again?
Thanks in advance
Replace your "&" with "&amp;"
& is not an illegal XML character. This is not your problem. You need to log the XML that you receive, before you ask anyone about your problem. You probably need to
HttpUtility.HtmlDecode(yourformdata)
But I smell SQL injection a long way.
Try:
XmlDocument xmlDoc = new XmlDocument();
string str = dsExport.Tables[0].Rows[i]["SubmissionData"].ToString();
str = System.Web.HttpUtility.HtmlDecode(str);
xmlDoc.LoadXml(str);
Sorry I am replying too late but I hope it will help some other guys. This issue is because of the encoding of special characters in XML. Please find the below link which may help you https://support.google.com/checkout/sell/answer/70649?hl=en
Thanks,
Vijay Sherekar
HttpContext.Current.Response.ContentType = "text/xml";
HttpContext.Current.Response.ContentEncoding = Encoding.UTF8;
HttpPostedFile file = HttpContext.Current.Request.Files[0];
// If file.InputStream contains an "&", an exception is thrown
XDocument doc = XDocument.Load(XmlReader.Create(file.InputStream));
HttpContext.Current.Response.Write(doc);
Is there any way to replace & with &amp; before generating the XML document? My current code crashes whenever the file contains a &.
Thanks
Your code will only crash if it's not valid XML. For example, this should be fine:
<foo>A &amp; B</foo>
If you've actually got
<foo>A & B</foo>
Then you haven't got an XML file. You may have something which looks a bit like XML, but it isn't really valid XML.
The best approach here isn't to transform the data on the fly - it's to fix the source of the data so that it's real XML. There's really no excuse for anything producing invalid XML in this day and age.
Additionally, there's no reason to use XmlReader.Create here - just use
XDocument doc = XDocument.Load(file.InputStream);
use HttpEncoder.HtmlEncode()
http://msdn.microsoft.com/en-us/library/system.web.util.httpencoder.aspx
You can use "&amp;" to escape "&".
In an XML document, some characters should be escaped:
& ---- &amp;
< ---- &lt;
> ---- &gt;
" ---- &quot;
' ---- &apos;
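If you generate the XML yourself, you usually don't need to apply this table by hand. A minimal sketch, assuming the text is available as a plain string: SecurityElement.Escape replaces the five characters above, and XElement escapes text content when it is serialized. EscapeDemo, EscapeText and BuildElement are hypothetical names used only for this illustration.

```csharp
using System.Security;
using System.Xml.Linq;

public static class EscapeDemo
{
    // Escapes &, <, >, " and ' in a raw string so it can be placed in XML.
    public static string EscapeText(string raw)
    {
        return SecurityElement.Escape(raw);
    }

    // Lets LINQ to XML do the escaping when the element is written out.
    public static string BuildElement(string name, string raw)
    {
        return new XElement(name, raw).ToString();
    }
}
```

For example, EscapeDemo.BuildElement("dependents", "9 & 5") returns `<dependents>9 &amp; 5</dependents>`, which loads without the EntityName error.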
I'm using c# to interact with a database that has an exposed REST API. The table that I'm interested in contains forum posts, some of which themselves contain xml.
Whenever my result set contains a post that has xml, my application throws an error as follows:
Exception Details: System.Xml.XmlException: '>' is an unexpected token. The expected token is '"' or '''. Line 1, position 62.
And this is the line that fails:
Line 44: ds.ReadXml(xmlData);
And this is the code I'm using:
var webClient = new WebClient();
string searchString = searchValue.Text;
string requestUrl = "http://myserver/restapi.ashx/search.xml?pagesize=4&pageindex=0&query=";
requestUrl += searchString;
XmlReaderSettings settings = new XmlReaderSettings();
settings.ProhibitDtd = false;
XmlReader xmlData = XmlReader.Create(webClient.OpenRead(requestUrl),settings);
DataSet ds = new DataSet();
ds.ReadXml(xmlData);
Repeater1.DataSource = ds.Tables[1];
Repeater1.DataBind();
And this is the type of XML record that it's choking on (the stuff in the node is causing the problem):
<SearchResults PageSize="1" PageIndex="0" TotalCount="342">
<SearchResult>
<ContentId>994</ContentId>
<Title>Help Files: What are they written in?</Title>
<Url>http://myserver/linktest.aspx</Url>
<Date>2008-10-16T16:18:00+01:00</Date><ContentType>post</ContentType>
<Body><div class="ForumPostBodyArea"> <div class="ForumPostContentText"> <p>Can anyone see anything obviously wrong with this xml, when its fired to CRM Its creating 13 null records.</p> <p><?xml version="1.0" encoding="UTF-8"?><soap:Envelope xmlns:typens="http://tempuri.org/type" soap:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/" xmlns:wsdlns="http://tempuri.org/wsdl/" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Header><SessionHeader><sessionId xsi:type="xsd:long">18208442035524</sessionId></SessionHeader></soap:Header><soap:Body><typens:add><entityname xsi:type="xsd:string">lead</entityname><records xsi:nil="true" xsi:type="typens:ewarebase" /><status xsi:type="xsd:string">PreRegistration</status><requester xsi:type="xsd:string">Mimnagh</requester><personfirstname xsi:type="xsd:string">Sean</personfirstname><personlastname xsi:type="xsd:string">Test2</personlastname><personsalutation xsi:type="xsd:string">Mr</personsalutation><details xsi:type="xsd:string">test project details</details><description xsi:type="xsd:string">test description details</description><comments xsi:type="xsd:string">test project comments</comments><personemail xsi:type="xsd:string">smimnagh#mac.com</personemail><personphonenumber xsi:type="xsd:string">12334566777</personphonenumber><type xsi:type="xsd:string">PreReg</type><companyname xsi:type="xsd:string">Site Client</companyname></typens:add></soap:Body></soap:Envelope></p> <p>Many thanks</p> </div> </div>
</Body>
<Tags>
<Tag>xml</Tag>
</Tags>
<IndexedAt>2010-07-08T11:53:46.848+01:00</IndexedAt>
</SearchResult>
</SearchResults>
Is there something that I can do with the xmlreader to make it ignore whatever's causing the problem?
Please note that I can't change the XML prior to consuming it - so if it's malformed then I wonder if there's a way to ignore or modify that particular record without generating an error?
Thanks!
It looks like some of your quotes need escaping in the contents of some of your elements. Try using
&quot;
for quote marks that aren't wrapping attribute values.
UPDATE:
Because the data you want to read isn't strictly XML (it's nearly XML), your best bet is one of the following:
1. Have you or your boss, if you have one, scream at the third party because they're not sending you well-formed XML.
2. Perform some horrible hack to try to convert whatever you might get into XML.
If you have to go with point 2, the simplest thing that pops into my head is to read the characters of the 'XML' counting in and out of angle brackets. If you find any " characters and you're not within any angle brackets, replace the " with
&quot;
But note that doing that is a complete last resort.
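The counting approach described above could be sketched roughly as follows. QuoteEscaper and EscapeQuotesOutsideTags are hypothetical names. The sketch assumes angle brackets only ever delimit tags, so it will misbehave on comments, CDATA sections, or attribute values that contain < or >, and whether it clears the parse error depends on what is actually malformed in the feed.

```csharp
using System.Text;

public static class QuoteEscaper
{
    // Replaces every double quote that is not inside <...> with &quot;.
    public static string EscapeQuotesOutsideTags(string input)
    {
        var sb = new StringBuilder(input.Length);
        int depth = 0; // how many unclosed '<' we are currently inside

        foreach (char c in input)
        {
            if (c == '<')
            {
                depth++;
            }
            else if (c == '>' && depth > 0)
            {
                depth--;
            }

            if (c == '"' && depth == 0)
            {
                sb.Append("&quot;");
            }
            else
            {
                sb.Append(c);
            }
        }

        return sb.ToString();
    }
}
```

The result would then be passed to XmlReader.Create through a StringReader instead of reading the response stream directly.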
The Content of your <Body> tag is not well formed. XML is very strict with the syntax of data. Either embed a CDATA section into your XML or escape the string properly.