I'm using C# to interact with a database that has an exposed REST API. The table that I'm interested in contains forum posts, some of which themselves contain XML.
Whenever my result set contains a post that has XML in it, my application throws an error as follows:
Exception Details: System.Xml.XmlException: '>' is an unexpected token. The expected token is '"' or '''. Line 1, position 62.
And this is the line that fails:
Line 44: ds.ReadXml(xmlData);
And this is the code I'm using:
var webClient = new WebClient();
string searchString = searchValue.Text;
string requestUrl = "http://myserver/restapi.ashx/search.xml?pagesize=4&pageindex=0&query=";
requestUrl += searchString;
XmlReaderSettings settings = new XmlReaderSettings();
settings.ProhibitDtd = false;
XmlReader xmlData = XmlReader.Create(webClient.OpenRead(requestUrl),settings);
DataSet ds = new DataSet();
ds.ReadXml(xmlData);
Repeater1.DataSource = ds.Tables[1];
Repeater1.DataBind();
And this is the type of XML record that it's choking on (the content of the <Body> node is causing the problem):
<SearchResults PageSize="1" PageIndex="0" TotalCount="342">
<SearchResult>
<ContentId>994</ContentId>
<Title>Help Files: What are they written in?</Title>
<Url>http://myserver/linktest.aspx</Url>
<Date>2008-10-16T16:18:00+01:00</Date><ContentType>post</ContentType>
<Body><div class="ForumPostBodyArea"> <div class="ForumPostContentText"> <p>Can anyone see anything obviously wrong with this xml, when its fired to CRM Its creating 13 null records.</p> <p><?xml version="1.0" encoding="UTF-8"?><soap:Envelope xmlns:typens="http://tempuri.org/type" soap:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/" xmlns:wsdlns="http://tempuri.org/wsdl/" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Header><SessionHeader><sessionId xsi:type="xsd:long">18208442035524</sessionId></SessionHeader></soap:Header><soap:Body><typens:add><entityname xsi:type="xsd:string">lead</entityname><records xsi:nil="true" xsi:type="typens:ewarebase" /><status xsi:type="xsd:string">PreRegistration</status><requester xsi:type="xsd:string">Mimnagh</requester><personfirstname xsi:type="xsd:string">Sean</personfirstname><personlastname xsi:type="xsd:string">Test2</personlastname><personsalutation xsi:type="xsd:string">Mr</personsalutation><details xsi:type="xsd:string">test project details</details><description xsi:type="xsd:string">test description details</description><comments xsi:type="xsd:string">test project comments</comments><personemail xsi:type="xsd:string">smimnagh#mac.com</personemail><personphonenumber xsi:type="xsd:string">12334566777</personphonenumber><type xsi:type="xsd:string">PreReg</type><companyname xsi:type="xsd:string">Site Client</companyname></typens:add></soap:Body></soap:Envelope></p> <p>Many thanks</p> </div> </div>
</Body>
<Tags>
<Tag>xml</Tag>
</Tags>
<IndexedAt>2010-07-08T11:53:46.848+01:00</IndexedAt>
</SearchResult>
</SearchResults>
Is there something that I can do with the xmlreader to make it ignore whatever's causing the problem?
Please note that I can't change the XML prior to consuming it - so if it's malformed then I wonder if there's a way to ignore or modify that particular record without generating an error?
Thanks!
It looks like some of the quotes in the contents of your elements need escaping. Try using &quot; for quote marks that aren't wrapping attribute values.
UPDATE:
Because the data you want to read isn't strictly XML (it's nearly XML), your best bet is one of the following:
1. You or your boss, if you have one, scream at the third party because they're not sending you well-formed XML.
2. Perform some horrible hack to try to convert whatever you might get into XML.
If you have to go with point 2, the simplest thing that pops into my head is to read the characters of the 'XML', counting in and out of angle brackets. If you find any " characters while you're not within any angle brackets, replace the " with &quot;.
But note that doing that is a complete last resort.
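The bracket-counting hack described above could be sketched roughly as follows. The class and method names are made up for illustration, and the sketch assumes that every '<' opens a tag and every '>' closes one, so it does not cope with comments, CDATA sections, or '>' characters inside attribute values.

```csharp
using System.Text;

public static class NearXmlRepair
{
    // Replaces double quotes found outside of angle brackets with &quot;.
    // Quotes between '<' and '>' are assumed to delimit attribute values
    // and are copied through unchanged.
    public static string EscapeQuotesOutsideTags(string input)
    {
        var output = new StringBuilder(input.Length);
        int depth = 0; // greater than 0 while we are between '<' and '>'

        foreach (char c in input)
        {
            if (c == '<')
            {
                depth++;
            }
            else if (c == '>' && depth > 0)
            {
                depth--;
            }

            if (c == '"' && depth == 0)
            {
                output.Append("&quot;");
            }
            else
            {
                output.Append(c);
            }
        }

        return output.ToString();
    }
}
```

You would run the downloaded text through EscapeQuotesOutsideTags before handing it to the XML reader; as noted, this is a last-resort sketch rather than a robust repair.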
The content of your <Body> tag is not well-formed. XML is very strict about syntax. Either embed the content in a CDATA section or escape the string properly.
Related
I got this error while parsing a string into an XDocument after editing and saving it. Can anyone help me locate the error position (Line 1, position 10475)? How can I find that position?
System.Xml.XmlException: Unexpected XML declaration. The XML
declaration must be the first node in the document, and no white space
characters are allowed to appear before it. Line 1, position 10475.
if (storage.FileExists("APPSDATA.xml"))
{
    var reader = new StreamReader(new IsolatedStorageFileStream("APPSDATA.xml", FileMode.Open, storage));
    string xml = reader.ReadToEnd();
    var xdoc = XDocument.Parse(xml); // error here
    reader.Close();
}
The XML is big; this is just a part of it:
<?xml version="1.0" encoding="UTF-8"?>
<Ungdungs>
<Ungdung>
<Name>HERE City Lens</Name>
<Id>b0a0ac22-cf9e-45ba-8120-815450e2fd71</Id>
<Path>/Icon/herecitylens.png</Path>
<Version>1.0.0.0</Version>
<Category>HERE</Category>
<Date>Uknown</Date>
</Ungdung>
<Ungdung>
<Name>HERE Transit</Name>
<Id>adfdad16-b54a-4ec3-b11e-66bd691be4e6</Id>
<Path>/Icon/heretransit.png</Path>
<Version>1.0.0.0</Version>
<Category>HERE</Category>
<Date>Uknown</Date>
</Ungdung>
Make sure the <?xml declaration is the very first thing in the document, with nothing before it, not even whitespace.
The <?xml declaration may appear only once per document, so if you have a large chunk of XML and the declaration is repeated somewhere further down, the document won't be well-formed.
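To see what actually sits at the reported position, one option is to catch the XmlException and use its LineNumber and LinePosition properties (both 1-based) to cut out the surrounding text. The sketch below is only an illustration: the class name, method name and the radius parameter are assumptions, and splitting on '\n' is an approximation of how the parser counts lines.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class XmlErrorLocator
{
    // Returns null if the text parses; otherwise returns the part of the
    // offending line within `radius` characters of the reported position.
    public static string DescribeParseError(string xml, int radius)
    {
        try
        {
            XDocument.Parse(xml);
            return null;
        }
        catch (XmlException ex)
        {
            // LineNumber and LinePosition are 1-based; clamp them to be safe.
            string[] lines = xml.Split('\n');
            int lineIndex = Math.Max(0, Math.Min(ex.LineNumber - 1, lines.Length - 1));
            string line = lines[lineIndex];
            int column = Math.Max(0, Math.Min(ex.LinePosition - 1, line.Length));
            int start = Math.Max(0, column - radius);
            int end = Math.Min(line.Length, column + radius);
            return line.Substring(start, end - start);
        }
    }
}
```

Calling DescribeParseError(xml, 40) on the stored string should show whether a second <?xml declaration sits around position 10475.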
In my case this was related to the byte order mark (BOM). I opened the file in Notepad++, selected the encoding "Encode in UTF-8 without BOM", and was then able to see the annoying character and delete it.
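If the BOM ends up in the string itself, it can also be dropped in code before parsing. This is a minimal sketch under the assumption that the BOM shows up as a single leading U+FEFF character; the helper names are hypothetical.

```csharp
using System.Xml.Linq;

public static class BomHelper
{
    // Removes a single leading U+FEFF (byte order mark) character, if present.
    public static string StripLeadingBom(string text)
    {
        if (!string.IsNullOrEmpty(text) && text[0] == '\uFEFF')
        {
            return text.Substring(1);
        }
        return text;
    }

    // Parses the XML after removing a leading BOM character.
    public static XDocument ParseIgnoringBom(string xml)
    {
        return XDocument.Parse(StripLeadingBom(xml));
    }
}
```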
This error might occur if you previously saved the XML file with the boolean 'append = true', because each save then appends a whole new document, declaration included, after the old one.
Make it 'false' and it should work.
On my web app (ASP.net 4,C#) I use FOR XML PATH('') to convert Data from SQL Server to XML,
and use the following lines to feed it to XSLT:
MemoryStream stream = new MemoryStream(UTF8Encoding.UTF8.GetBytes(xml));
XPathDocument document = new XPathDocument(stream);
StringWriter writer = new StringWriter();
XslCompiledTransform transform = new XslCompiledTransform();
transform.Load(xsltPath);
transform.Transform(document, null, writer);
return writer.ToString();
Now when I feed in messages from my forum, in sunny-day scenarios there should be no problem at all, and there isn't.
When a user decides to use special characters like < > in their messages, though, we have the rainy day.
I get an error, which differs from message to message depending on what they write.
I have already tried disable-output-escaping="yes".
Needless to say, I want the users to be able to use some tags like
<a href... or <font ...>
Below is an example of one of the messages that causes the issue:
setting-->about phone----< software update
Any possible solutions?
You need to encode such special characters. As far as XML is concerned, there are 5 of them:
> - &gt;
< - &lt;
& - &amp;
" - &quot;
' - &apos;
You need to encode these in the user input.
An alternative is to place all user-generated content within <![CDATA[ ]]> sections, which effectively achieves the same.
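On the C# side, both options can be sketched as below. SecurityElement.Escape replaces the five characters listed above, while building the element through XElement (optionally with XCData) leaves the escaping to the serializer. The class name, method names and the "message" element name are assumptions for illustration, not part of the original setup.

```csharp
using System.Security;
using System.Xml.Linq;

public static class XmlTextEscaping
{
    // Escapes <, >, &, " and ' so the text can be concatenated into XML.
    public static string EscapeForXml(string userText)
    {
        return SecurityElement.Escape(userText);
    }

    // Builds the element through the API; escaping happens on serialization.
    public static string BuildMessageElement(string userText)
    {
        return new XElement("message", userText).ToString();
    }

    // Wraps the text in a CDATA section instead of escaping it.
    // Assumes the text does not itself contain "]]>".
    public static string BuildCDataElement(string userText)
    {
        return new XElement("message", new XCData(userText)).ToString();
    }
}
```

Either way, the message from the question ("setting-->about phone----< software update") should then survive a round trip through the parser unchanged.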
I am querying a SOAP-based service and want to analyze the XML it returns. However, when I try to load the XML into an XDocument in order to query the data, I get an 'illegal characters in path' error. Below is the XML returned from the service. I simply want to get the list of competitions and put them into a List I have set up. The XML does load into an XmlDocument, though, so it must be correctly formatted?
Any advice on the best way to do this and get round the error would be greatly appreciated.
<?xml version="1.0" ?>
- <gsmrs version="2.0" sport="soccer" lang="en" last_generated="2010-08-27 20:40:05">
- <method method_id="3" name="get_competitions">
<parameter name="area_id" value="1" />
<parameter name="authorized" value="yes" />
<parameter name="lang" value="en" />
</method>
<competition competition_id="11" name="2. Bundesliga" soccertype="default" teamtype="default" display_order="20" type="club" area_id="80" last_updated="2010-08-27 19:53:14" area_name="Germany" countrycode="DEU" />
</gsmrs>
Here is my code, I need to be able to query the data in an XDoc:
string theXml = myGSM.get_competitions("", "", 1, "en", "yes");
XmlDocument myDoc = new XmlDocument();
myDoc.LoadXml(theXml);
XDocument xDoc = XDocument.Load(myDoc.InnerXml);
You don't show your source code, however I guess what you are doing is this:
string xml = ... retrieve ...;
XmlDocument doc = new XmlDocument();
doc.Load(xml); // error thrown here
The Load method expects a file name, not the XML itself. To load actual XML from a string, just use the LoadXml method:
... same code ...
doc.LoadXml(xml);
Similarly, with XDocument the Load(string) method expects a file name, not the XML itself. XDocument has no LoadXml method; one way of loading the XML from a string is to wrap it in a StringReader (XDocument.Parse(xml) also works):
string xml = ... retrieve ...;
XDocument doc;
using (StringReader s = new StringReader(xml))
{
doc = XDocument.Load(s);
}
As a matter of fact when developing anything, it's a very good idea to pay attention to the semantics (meaning) of parameters not just their types. When the type of a parameter is a string it doesn't mean one can feed in just anything that is a string.
Also, with respect to your updated question, it makes no sense to use XmlDocument and XDocument at the same time. Choose one or the other.
Following up on Ondrej Tucny's answer :
If you would like to use an XML string instead, you can use an XElement and call its Parse method (for your needs, either XElement or XDocument would do).
For example:
string theXML = "... get something xml-ish ...";
XElement xEle = XElement.Parse(theXML);
// do something with your XElement
The XElement's Parse method lets you pass in an XML string, while the Load method needs a file name.
Why not
XDocument.Parse(theXml);
I assume this will be the right solution
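Putting this together with the goal stated in the question (a list of competitions), a minimal sketch might look like this. It assumes the competitions appear as competition elements carrying a name attribute, as in the sample XML; the class and method names are made up.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class CompetitionParser
{
    // Parses the XML string and collects the "name" attribute of every
    // <competition> element, skipping elements that have no such attribute.
    public static List<string> GetCompetitionNames(string theXml)
    {
        XDocument doc = XDocument.Parse(theXml);
        return doc.Descendants("competition")
                  .Select(c => (string)c.Attribute("name"))
                  .Where(name => name != null)
                  .ToList();
    }
}
```

With this, the intermediate XmlDocument is no longer needed: the string returned by the service goes straight into GetCompetitionNames.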
If this is really your output it is illegal XML because of the minus characters ('-'). I suspect that you have cut and pasted this from a browser such as IE. You must show the exact XML from a text editor, not a browser.
I need to parse an XML string (.NET, C#) which, unfortunately, is not well formed. The XML stream that I am getting back is:
<fOpen>true</fOpen>
<ixBugParent>0</ixBugParent>
<sLatestTextSummary></sLatestTextSummary>
<sProject>Vantive</sProject>
<ixArea>9</ixArea>
I have tried using an XmlReader, but it crashes because it thinks, and rightfully so, that there is more than one root element whenever it tries to parse the stream.
Is there something I can do about this? I can't change the XML, because I have no control over the code that sends it back.
Any help, would be appreciated.
Thanks and Regards
Gagan Janjua
I think you can use the XmlParserContext in one of the XmlTextReader overloads to specify that the node type is an XmlNodeType.Element, similar to this example from MSDN (http://msdn.microsoft.com/en-us/library/cakk7ha0.aspx):
XmlTextReader tr = new XmlTextReader(@"<element1> abc </element1>
<element2> qrt </element2>
<?pi asldfjsd ?>
<!-- comment -->", XmlNodeType.Element, null);
while(tr.Read()) {
Console.WriteLine("NodeType: {0} NodeName: {1}", tr.NodeType, tr.Name);
}
What you are getting back is a well-formed XML fragment but, as you pointed out, not a well-formed XML document. Can you
1. wrap a top-level element around the returned elements, or
2. reference the returned XML fragment as an external entity from within a shell XML document, and pass the shell document to the XML reader?
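The first option, wrapping a top-level element around the fragment, could be sketched like this. The wrapper name "root" and the helper names are arbitrary choices for illustration, and the sketch assumes the fragment does not start with its own <?xml declaration.

```csharp
using System.Xml.Linq;

public static class FragmentWrapper
{
    // Wraps the fragment in a single artificial root element so that it
    // becomes a well-formed document, then parses it.
    public static XElement ParseFragment(string fragment)
    {
        return XElement.Parse("<root>" + fragment + "</root>");
    }
}
```

The individual values are then available as children of the wrapper, for example ParseFragment(xml).Element("sProject").Value.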
In an ASP.NET 2.0 website, I have a string representing some well-formed XML. I am currently creating an XmlDocument object with it and running an XSL transformation for display in a Web form. Everything was operating fine until the XML input started to contain namespaces.
How can I read in this string and allow namespaces?
I've included the current code below. The string source comes from an HTML encoded node in a WordPress RSS feed.
XPathNavigator myNav= myPost.CreateNavigator();
XmlNamespaceManager myManager = new XmlNamespaceManager(myNav.NameTable);
myManager.AddNamespace("content", "http://purl.org/rss/1.0/modules/content/");
string myPost = HttpUtility.HtmlDecode("<post>" +
myNav.SelectSingleNode("//item[1]/content:encoded", myManager).InnerXml +
"</post>");
XmlDocument myDocument = new XmlDocument();
myDocument.LoadXml(myPost.ToString());
The error is on the last line:
"System.Xml.XmlException: 'w' is an undeclared namespace. Line 12, position 201. at System.Xml.XmlTextReaderImpl.Throw(Exception e) ..."
Your code looks right.
The problem is probably in the xml document you're trying to load.
It must have elements with a "w" prefix, without having that prefix declared in the XML document
For example, you should have:
<test xmlns:w="http://...">
<w:elementInWNamespace />
</test>
(your document is probably missing the xmlns:w="http://")
Gut feel: one of the namespaces declared in //content:encoded is being dropped (probably because you're using the literal .InnerXml property).
What does the 'w' namespace evaluate to in the myNav DOM? You'll want to add xmlns:w= to your post node. There will probably be others too.
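Declaring the prefix on the wrapping post node could look roughly like the sketch below. The namespace URI is not given in the question, so it is passed in as a parameter here; the class and method names are hypothetical, and the sketch assumes the URI contains no characters that need escaping inside an attribute value.

```csharp
using System.Xml;

public static class PostLoader
{
    // Wraps the decoded markup in a <post> element that declares the "w"
    // prefix, so elements such as <w:something> resolve when loading.
    // wNamespaceUri is whatever URI the feed binds to "w" (an assumption here).
    public static XmlDocument LoadPost(string innerXml, string wNamespaceUri)
    {
        string wrapped = "<post xmlns:w=\"" + wNamespaceUri + "\">" + innerXml + "</post>";
        var doc = new XmlDocument();
        doc.LoadXml(wrapped);
        return doc;
    }
}
```

If other prefixes turn up with the same error, they would need to be declared on the wrapper in the same way.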