Human-readable XSD validation errors in C#

I want to validate an XML document against an XSD and generate user-readable errors that include, for example, the text of the XSD documentation tag.
I just want to know whether C# provides this in an easy, elegant and painless way. Otherwise I'll parse the error message and look up the node in the XSD myself.

XML Schema itself doesn't provide a way to do what you want!
XML Schema is not meant to communicate to humans in a human manner. Forget about it.
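That said, the .NET validator does report a line, a column and a message for each problem, which is a reasonable starting point for friendlier output. A minimal sketch, assuming the schema and the document are available as strings; the class and method names are made up for illustration, and it does not look up the xs:documentation text:

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class ReadableValidation
{
    // Validates xml against xsd and returns one message per problem,
    // prefixed with the line and column reported by the validator.
    public static List<string> Validate(string xml, string xsd)
    {
        var errors = new List<string>();
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(xsd)));
        settings.ValidationEventHandler += (sender, e) =>
            errors.Add($"Line {e.Exception.LineNumber}, column {e.Exception.LinePosition}: {e.Message}");

        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            // Reading to the end drives the validation; errors arrive via the handler.
            while (reader.Read()) { }
        }
        return errors;
    }
}
```

Calling ReadableValidation.Validate(xml, xsd) returns an empty list for a valid document. Mapping each message back to an annotation in the XSD would still have to be done by hand.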

Related

How do I detect this kind of XML error in C# code?

<?xml version="1.0" encoding="utf-8"?>
<xml>
<a>str1234</a>xxxx
</xml>
I have this XML file. As you can see, there's "xxxx" after the closing tag of the "a" element.
I tried the XmlDocument.Load() method, but it didn't throw any exceptions.
I then generated an XSD file from this XML and validated the XML against the generated XSD.
However, that didn't report any errors either.

It is important to understand the difference between valid and well-formed XML.
Commenters have sloppily said that your XML is valid. They actually should not make such a statement without a schema against which to assess validity. They should be saying that your XML is well-formed.
You seem to be concerned that the xxxx text as a sibling of an a element is not well-formed, but it is perfectly well-formed XML. It might also be valid, if the parent element, xml, is defined by a schema to allow mixed content.
"I tried to generate a xsd file from this xml, then validate this xml with the generated xsd."
Well, if you used a tool to generate an XSD from an XML document instance, and the XSD said the XML was valid, then the tool is working as designed.
"But when my code loading this kind of .config file, it got stuck."
Just like being well-formed doesn't guarantee validity, it also doesn't guarantee that it meets the needs of any given consuming XML application. A configuration file has rules, perhaps expressed in an XSD, that the XML must follow. These rules are in addition to being well-formed (the rules that a parser requires in order to parse XML).
See also How to validate xml code file though .NET? + How would I do it if I use XML serialization?
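To make the distinction concrete, here is a rough sketch that checks the document from the question both ways. The schema is one I made up for illustration (it is not the one your tool generated), and the class and method names are hypothetical:

```csharp
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class WellFormedVersusValid
{
    // The document from the question, without the XML declaration.
    public const string Xml = "<xml><a>str1234</a>xxxx</xml>";

    // Illustrative schema for <xml>; 'mixed' decides whether text may sit next to child elements.
    public static string Schema(bool mixed) =>
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" +
        "<xs:element name='xml'><xs:complexType mixed='" + (mixed ? "true" : "false") + "'>" +
        "<xs:sequence><xs:element name='a' type='xs:string'/></xs:sequence>" +
        "</xs:complexType></xs:element></xs:schema>";

    // Well-formedness: the parser alone decides, no schema involved.
    public static bool IsWellFormed(string xml)
    {
        try { new XmlDocument().LoadXml(xml); return true; }
        catch (XmlException) { return false; }
    }

    // Validity: counts the problems the validator reports against the given schema.
    public static int CountValidationErrors(string xml, string xsd)
    {
        int count = 0;
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(xsd)));
        settings.ValidationEventHandler += (sender, e) => count++;
        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }
        }
        return count;
    }
}
```

IsWellFormed(Xml) is true either way. CountValidationErrors(Xml, Schema(true)) reports nothing, while Schema(false) makes the stray text a validation error. A tool that infers a schema from this instance will most likely produce the mixed variant, which is why your generated XSD accepted the file.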

Is CDATA required to validate/deserialize against a schema if a string element contains valid XML

I am hosting a C# WCF SOAP service that has a call containing the following element:
<element name="SomeXmlElement" type="xsd:string" minOccurs="0"/>
The WSDL in question is provided by the client.
The content of this element is valid XML which in general will conform to a different XSD, but for our purposes it is arbitrary valid XML.
If the data is passed "raw", which is the way the client prefers to send it, SomeXmlElement is null after being deserialized:
<SomeXmlElement><SomeArbitraryXml/></SomeXmlElement>
If I have them wrap it in CDATA it works correctly, but the client complains that they don't have to do that for other implementations, and that it causes compatibility issues:
<SomeXmlElement><![CDATA[<SomeArbitraryXml/>]]></SomeXmlElement>
My understanding is that there are only a few ways to make this deserialize correctly:
1) Wrap the content in CDATA (nested CDATA, ugh).
2) Change the schema to use a complex type instead of string, where the complex type references the other XSD schema.
3) Use xs:any in the schema (what would this deserialize as?).
The customer insists that this is just a deficiency in my code/.Net and that this should deserialize/process fine in the raw format.
Rolling my own deserializer would be possible, or just loading the message into a DOM and accessing the InnerXml property or whatnot, but that's a lot of work to override default expected behavior, imo.
Thoughts? Suggestions? Am I interpreting the XML specs correctly? Are there any choices that don't require schema changes or rewriting lots of WCF default behavior?

Your client has no right to complain. They're publishing an interface and then telling you out-of-band to ignore parts of the interface specification.
If they want to allow arbitrary XML under SomeXmlElement, then they should use xsd:any.
If they want to restrict the XML under SomeXmlElement to that given by another XSD, then they should import or include the other XSD and explicitly reference the allowed elements.
But they should not specify that SomeXmlElement contains an xsd:string and then expect its content model to really be XML. You're the one who has the right to complain.
That their implementations are 10 years old or Java based is irrelevant. XML and XSD specifications go back that far and work well in Java.
So, besides looking for validation here, you probably want advice beyond telling your client to fix their broken interface definition...
Consider rewriting their XSD to be what they really mean, and hold yourself and your code to a higher standard (an actual standard, that is). Anything else would be a hack upon a hack and make you an accessory to their crime.
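As for what xs:any would deserialize as: with XmlSerializer, a wildcard is usually mapped to XmlElement members marked with [XmlAnyElement]. A small sketch, assuming plain XmlSerializer rather than your actual WCF contract; Request and AnyContent are hypothetical stand-ins for whatever types the corrected schema would generate:

```csharp
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical complex type whose content model is a wildcard (xs:any).
public class AnyContent
{
    [XmlAnyElement]
    public XmlElement[] Any { get; set; }
}

// Hypothetical message type; only the element from the question is modelled.
public class Request
{
    public AnyContent SomeXmlElement { get; set; }
}

public static class AnyElementDemo
{
    // Deserializes a Request and leaves the arbitrary child XML available as DOM nodes.
    public static Request Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(Request));
        using (var reader = new StringReader(xml))
        {
            return (Request)serializer.Deserialize(reader);
        }
    }
}
```

Given a Request whose SomeXmlElement contains a SomeArbitraryXml child, Parse(...).SomeXmlElement.Any[0] is that child as an XmlElement, and its OuterXml gives the raw markup back. No CDATA is needed, and the interface definition says what it means.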

Do I need to write my own validator for xml validation against xsd schema?

I'm trying to support my users in creating an xml based on an xsd (xml schema).
So I show possible elements and the user can add it to an xml.
However, I have trouble determining the possible elements and validating that what the user adds is correct. How do I check complex elements?
Let's say we have a sequence element. How am I going to check that the user adds an element at the right place?
Let's say we have a choice element. How am I going to check that an element from the other particle has been added already?
I can validate the XML against the schema in C#, and the errors it returns could (maybe) be shown to the user, but I can't use them in my code: the format is unsuitable for that and they don't contain enough detail.
Do I need to write my own validator (and implement all the w3c specs)??
Thanks!

You shouldn't need to implement your own validator. XmlSchemaValidator will actually give you a good amount of information. See the answer to my own similar question here: XML Schemas -- List allowed attributes/tags at position in XML
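A rough sketch of the idea, assuming a schema with no target namespace and a root element without required attributes; the class and method names are hypothetical. It pushes the opening tag of the root element through the validator and then asks which elements the content model allows next:

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class ExpectedElements
{
    // Returns the names of the elements the schema allows as the first child of rootName.
    public static List<string> AfterOpening(string xsd, string rootName)
    {
        var nameTable = new NameTable();
        var schemas = new XmlSchemaSet(nameTable);
        schemas.Add(null, XmlReader.Create(new StringReader(xsd)));
        schemas.Compile();

        var validator = new XmlSchemaValidator(
            nameTable, schemas, new XmlNamespaceManager(nameTable), XmlSchemaValidationFlags.None);
        validator.Initialize();
        validator.ValidateElement(rootName, "", null);   // push the opening tag
        validator.ValidateEndOfAttributes(null);         // no attributes supplied

        var names = new List<string>();
        foreach (XmlSchemaParticle particle in validator.GetExpectedParticles())
        {
            // Wildcards (xs:any) also come back as particles; this sketch only lists named elements.
            if (particle is XmlSchemaElement element)
            {
                names.Add(element.QualifiedName.Name);
            }
        }
        return names;
    }
}
```

In an editor you would keep feeding the validator the elements the user has already added (ValidateElement, ValidateEndElement and so on) and call GetExpectedParticles at the insertion point. GetExpectedAttributes does the same job for attributes.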

What is the best way to read and write cXML documents in C#?

I know this is a vague open ended question. I'm hoping to get some general direction.
I need to add cXML punchout to an ASP.NET C# site / application. This is replacing something that I wrote years ago in ColdFusion.
I'm a reasonably experienced C# developer but I haven't done much with XML. There seems to be lots of different options for processing XML in .NET.
Here's the open-ended question: assuming that I have an XML document in some form, e.g. a file or a string, what is the best way to read it into my code? I want to get the data and then query databases etc. The cXML document size and our traffic volumes are easily small enough that loading a cXML document into memory is not a problem.
Should I:
1) Manually build classes based on the dtd and use the XML Serializer?
2) Use a tool to generate classes. There are sample cXML files downloadable from Ariba.com.
I tried xsd.exe to generate an XSD and then xsd.exe /c to generate classes. When I try to deserialize I get errors because there seems to be "confusion" about whether some elements should be single values or arrays.
I tried the CodeXS online tool, but that gives errors in its log and errors if I try to deserialize a sample document.
3) Create a dataset and ReadXml()?
4) Create a typed dataset and ReadXml()?
5) Use Linq to XML. I often use Linq to Objects so I'm familiar with Linq in general but I'm struggling to see what it gives me in this situation.
6) Some other means.
I guess I need to improve my understanding of XML in general but even so ... am I missing some obvious way of doing this? In the old ColdFusion site I found a free component ("tag") which basically ignored any schema and read the XML into a "structure" which is essentially a series of nested hash tables which was then easy to read in code. That was probably quite sloppy but it worked.
I also need to generate XML files from my C# objects. Maybe Linq to XML will be good for that. I could start with a default "template" document and manipulate it before saving.
Thanks for any pointers ...

If you need to generate arbitrary XML in an exact format, you should generate it manually using LINQ-to-XML.
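Something along these lines, as a sketch only. The element and attribute names are my reading of what a cXML envelope looks like, so check them against the cXML DTD, and CxmlBuilder is a made-up name:

```csharp
using System.Xml.Linq;

public static class CxmlBuilder
{
    // Builds a bare cXML envelope; the caller fills in Header and Request afterwards.
    public static XDocument BuildSkeleton(string payloadId, string timestamp)
    {
        return new XDocument(
            new XDeclaration("1.0", "utf-8", null),
            new XElement("cXML",
                new XAttribute("payloadID", payloadId),
                new XAttribute("timestamp", timestamp),
                new XElement("Header"),
                new XElement("Request")));
    }
}
```

The same API reads documents too: XDocument.Parse(text).Root.Element("Header") gets you a node you can query with ordinary LINQ, which is fairly close to the nested-hash-table approach you had in ColdFusion.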

What would be the best way to validate XML?

I've been looking at XML serialization for C# and it looks interesting. I was reading this tutorial
http://www.switchonthecode.com/tutorials/csharp-tutorial-xml-serialization
and of course you can deserialize it back to a list of objects. So I am wondering: would it be better to deserialize it back to a list of objects and then go through each object and validate it, or to validate it using a schema, then deserialize it and do stuff with it?
http://support.microsoft.com/kb/307379
Thanks

I guess it would depend a bit on what you want to validate, and for what purpose. If it is intended for interop with other systems, then validating via xsd is a reasonable idea, not least because you can use xsd.exe to write your classes for you from the xsd (you can also generate xsd from xml or dll, but it isn't as accurate). Likewise, you can use XmlReader (appropriately configured) to check against the xsd.
If you just want valid .NET objects, I'd be tempted to leave the serialized form as an implementation detail, and write some C# validation code - perhaps implementing IDataErrorInfo, or using data-annotations.
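For the data-annotations route, a minimal sketch; Person and its rules are invented purely for illustration:

```csharp
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

// Hypothetical object that has already been deserialized from the XML.
public class Person
{
    [Required]
    public string Name { get; set; }

    [Range(0, 150)]
    public int Age { get; set; }
}

public static class ObjectValidation
{
    // Runs every validation attribute on the instance and returns the failures.
    public static List<ValidationResult> Check(object instance)
    {
        var results = new List<ValidationResult>();
        var context = new ValidationContext(instance, null, null);
        Validator.TryValidateObject(instance, context, results, true);
        return results;
    }
}
```

Each ValidationResult carries an ErrorMessage and the member names involved, so the rules live with the class rather than in a separate schema.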

You can create an XmlValidatingReader and pass that into your serializer. That way you can read the file in one pass and validate it at the same time.
I believe the same technique will work even if you are using hand rolled XML classes (for extremely large XML files) so you might find it worth a look.
Edit:
Sorry, I just reread some of my code: XmlValidatingReader is obsolete. You can do what you need with XmlReader.
See XmlReader Settings
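Roughly like this, as a sketch that assumes the schema is held in a string; Item and ValidatingDeserializer are hypothetical names:

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical type being deserialized.
public class Item
{
    public string Name { get; set; }
    public int Quantity { get; set; }
}

public static class ValidatingDeserializer
{
    // Deserializes and validates in a single pass; schema problems are appended to errors.
    public static T Deserialize<T>(string xml, string xsd, List<string> errors)
    {
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(xsd)));
        settings.ValidationEventHandler += (sender, e) => errors.Add(e.Message);

        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            // The serializer pulls nodes from the validating reader, so validation happens as it reads.
            return (T)new XmlSerializer(typeof(T)).Deserialize(reader);
        }
    }
}
```

If you would rather stop at the first problem, leave the handler off and the reader throws an XmlSchemaValidationException instead.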

For speed I would do it in C#, however for completeness you might want to do it using an XSD. The issue with that is you have to learn the verbose and cumbersome XSD syntax, which from experience takes a lot of trial and error, is time consuming and holds not a lot of reward for serialization. Particularly with constants where you have to map them in C# and also in the XSD.
You'll always be writing the XML as C#. Anything not known when read back in is simply ignored. If you aren't editing the XML with a text editor you can guarantee that it will come back in the right way, in which case XSD is definitely not needed.

If you validate the XML, you can only prove that it's structurally correct. An attempt to deserialize from the XML will tell you the same thing.
Typically business objects can implement business logic/rules/conditions that go beyond a valid schema. That type of knowledge should stay with the business objects themselves, rather than being duplicated in some sort of external validation routine (otherwise, if you change a business rule, you have to update the validator at the same time).
