Currently we have an XML Schema, and the code reads the XML file, validates it against the schema, and saves it to the database. In future there will be schema changes. How can the code handle them without being rewritten for each new schema?
Thanks,
Let me give an example
<Products>
<product id="1">
<name> ABC </name>
<desc> good one </desc>
</product>
</Products>
XPath mapping configuration:
Table    Column  XPath
Product  id      //Products/product/@id
Product  name    //Products/product/name
Product  desc    //Products/product/desc
Now the C# code reads id, name and desc and generates an insert statement based on the mapping configuration.
If the schema changes and a new element is added, say price, we would add price to the mapping, so the newly generated insert statement would include price.
Will this work?
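To make the idea concrete, here is a minimal sketch of a mapping-driven insert generator. It is written in Python rather than C# purely for illustration, and the names PRODUCT_MAPPING, read_value and build_inserts are hypothetical. It also assumes the mapping stores locations relative to each product element, with a leading "@" marking an attribute, because id is an attribute in the sample document.

```python
import xml.etree.ElementTree as ET

SAMPLE_XML = """<Products>
  <product id="1">
    <name> ABC </name>
    <desc> good one </desc>
  </product>
</Products>"""

# Hypothetical mapping: column name -> location relative to each <product>.
# A leading "@" means "read this attribute" (id is an attribute in the sample).
PRODUCT_MAPPING = {"id": "@id", "name": "name", "desc": "desc"}


def read_value(row, location):
    """Return the attribute or child-element text named by `location`, or None."""
    if location.startswith("@"):
        return row.get(location[1:])
    text = row.findtext(location)
    return text.strip() if text is not None else None


def build_inserts(xml_text, table, row_path, mapping):
    """Build one parameterized (sql, params) pair per row element."""
    root = ET.fromstring(xml_text)
    columns = list(mapping)
    # Identifiers are quoted because "desc" is a reserved word in SQL.
    column_list = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    sql = f'INSERT INTO "{table}" ({column_list}) VALUES ({placeholders})'
    return [
        (sql, tuple(read_value(row, mapping[c]) for c in columns))
        for row in root.findall(row_path)
    ]


statements = build_inserts(SAMPLE_XML, "Product", "product", PRODUCT_MAPPING)
```

Under these assumptions, adding price means adding one entry to the mapping. Rows that lack the new element simply yield None for that column, so the target column has to be nullable or have a default.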
I hate parsing XML and loading it into objects by hand, so I would suggest the following approach.
Create a C# class that represents the XML data you are talking about. Serialize that C# class, and voilà, you have an XML schema that is strongly typed. Also, if in the future you need additional schema changes, simply modify the C# class and reserialize, and you're all set.
This also removes the need to parse the XML document (assuming you are utilizing it within the CLR), as you can simply reference the C# class and you can deserialize it back into memory without any parsing.
The way of handling something like this that immediately comes to mind would be to have a known good skeletal XML schema with no data in it, have the code parse and learn that schema, and then have it run on whatever arbitrary input you give it. When the XML schema changes, simply have a trusted user/admin go in and change the known good skeleton.
You should make sure your database can handle these changes without any extra prodding, and you should most definitely have at least a few tests that run regularly and throw off alerts if a problem is detected. One of the most dangerous elements in 'low maintenance' processes like these is that they often fail quietly and there's no way to tell they're broken!
I'm a little afraid I'm not getting your whole question because you added a bunch of tags that aren't obviously in your question, but hopefully this helps.
If the location for the XML data changes, unless you want to abstract the crap out of your XML file (include metadata in the document describing where to find things) you're out of luck. If your data elements will always be in the same place, all you have to do is keep your XSD file as a separate file, and change it when necessary to validate the document.
Related
The project I'm working on is an Extranet. I need to call a webservice in this project that communicates with the database. This works as an APPserver.
The procedures between the APPserver and the database are written in Progress. The output that I receive from the webservice is an object that contains XML.
Is it possible to convert the XML file to objects? For example, I have a node
<user>
<uid></uid>
<lastname></lastname>
<firstname></firstname>
</user>
Can this user node convert to a User entity?
The complexity is much higher once relationships are involved. What the XML will look like, I can't really say at this time.
Are there any other possible frameworks / languages I could use, so they simplify this process?
What will happen with the structure of the relationships and how to handle them?
This example is from an old version of .NET, but it is still relevant. Use XML deserialization to load objects based on an XML format. You can have nested classes. Just decorate all classes/properties as necessary to create the proper format when the object is serialized, and you'll be able to deserialize XML into objects back at the webservice.
http://www.codeproject.com/Articles/4491/Load-and-save-objects-to-XML-using-serialization
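The linked article covers the .NET way of doing this. Purely to illustrate the idea of mapping the user node to a User entity, here is a small language-neutral sketch in Python. The User class, the from_xml and to_xml helpers, and the sample values are all made up for the example, and it assumes element names match field names one to one.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields


@dataclass
class User:
    uid: str = ""
    lastname: str = ""
    firstname: str = ""


def from_xml(cls, element):
    """Fill a dataclass from child elements whose tags match its field names."""
    values = {}
    for f in fields(cls):
        text = element.findtext(f.name)
        if text is not None:  # missing elements keep the field's default
            values[f.name] = text.strip()
    return cls(**values)


def to_xml(obj, tag):
    """Write each dataclass field back out as a child element of <tag>."""
    element = ET.Element(tag)
    for f in fields(obj):
        ET.SubElement(element, f.name).text = str(getattr(obj, f.name))
    return element


# Made-up sample values, only to exercise the round trip.
source = "<user><uid>42</uid><lastname>Doe</lastname><firstname>Jane</firstname></user>"
user = from_xml(User, ET.fromstring(source))
round_trip = ET.tostring(to_xml(user, "user"), encoding="unicode")
```

Relationships would need nested classes and a recursive version of the same helpers, which is the part a serializer library handles for you.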
I've been tasked with importing a set of XML files, transforming them, uploading them to an SQL database, and then re-transforming them to a different XML format.
The XML files are rather large, and some of them a little complex, so I'm unsure of the best way to do this. I'd of course like to automate this process somehow - and was actually hoping there'd be some kind of Entity Framework-esque solution to this.
I'm quite new to handling and dealing with XML in .NET, so I don't really know what my options are. I've read about XSLT, but that seems to me, to be a "language" I need to learn first, making it kind of not a solution for me.
Just to set a bit of context, the final solution actually needs to import new/updated versions of the XML on a weekly basis, uploading the new data to sql, and re-exporting as the other XML-format.
If anyone could give me any ideas as to how to proceed, I'd be much obliged.
My first instinct was to use something like XSD2DB or XML SPY to first create the database structure, but I don't really see how I'm supposed to proceed from there either.
I'm quite blank in fact :)
XSLT is a language used by XML processors to transform an XML document in one format into an XML document in another format. XSLT would be your choice if you didn't also need to store the data in a database.
Tools like XSD2DB or XML SPY will create a database schema for you, but the quality of that schema will depend heavily on the quality of the XML document and the XSD (do you have an XSD, or are you going to generate it from sample XML?). The generated database will probably not be very useful for EF.
If you have an XSD, you can use the xsd.exe tool shipped with Visual Studio to generate classes representing the data of your XML files in .NET code. You will then be able to use XmlSerializer to deserialize the XML document into the generated classes. One problem is that some XSD constructs, such as choice, are modeled in .NET code in a very ugly way. Another problem can be performance if your XML files are really huge, because deserialization must read all data at once. The last problem is again EF: classes generated from the XSD will most probably not be usable as entities, and you will not be able to map them.
So you have two options. Either use EF, in which case you will have to analyze the XSD, create custom entities and a mapping to a database you design yourself, and fill your classes from XmlReader (best performance), XmlDocument, or XDocument. Or use a tool that helps you create classes or a database from the XML, in which case use direct SQL to work with the database.
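To show why a forward-only reader helps with large files, here is a rough sketch of the streaming style, using Python's ElementTree.iterparse as a stand-in for XmlReader. The items and item tag names and the iter_records helper are hypothetical, and the sketch assumes each record is a flat element with simple text children.

```python
import io
import xml.etree.ElementTree as ET


def iter_records(source, record_tag):
    """Yield one dict per <record_tag> element as soon as it has been read."""
    for _event, element in ET.iterparse(source, events=("end",)):
        if element.tag == record_tag:
            yield {child.tag: (child.text or "").strip() for child in element}
            # Free the record's children. The emptied element itself stays
            # attached to the root, so memory still grows slowly on huge files.
            element.clear()


sample = io.BytesIO(
    b"<items><item><name>A</name></item><item><name>B</name></item></items>"
)
records = list(iter_records(sample, "item"))
```

Each yielded dict can be turned into an entity or a batch insert immediately, so the whole document never has to be materialized as objects.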
The reverse operation will again require a custom approach. You will have data represented either by your custom EF entities or by some autogenerated classes, and you will have to transform them to the new format. You can again use xsd.exe to get classes for the new format and write custom .NET code that fills the new classes from the old ones (and use XmlSerializer to persist the new structure to XML), or you can use XmlWriter, XDocument, or XmlDocument to build the target XML document directly.
Data migration in any form is not an easy task, and there is no ready-to-use solution. For really huge data sets you can use tools like SQL Server Integration Services, where you interact with XML and SQL directly and process data in batches.
Have a look at SQLXML 4.0. It does exactly what you want (for the upload part).
I want to use the powerful DataContractSerializer to read and write data to an XML file.
As I understand it, though, DataContractSerializer can only read or write an entire structure or list of structures.
My use case is described below. I cannot figure out how to get good performance with this API.
I have a structure named "Information" and a List<Information> with an unpredictable number of elements.
The user may update or add elements in this list very often.
On every operation (add or update), I must serialize all the elements in the list to the same XML file.
So I write the same data to the XML again even when it has not been modified. That does not make sense, but I cannot find any way to avoid it.
Due to the tombstoning mechanism, I must save all the information within 10 seconds.
I'm worried about performance and about making the UI lag.
Is there a workaround that lets me partially update or add a single item in the XML file with DataContractSerializer?
DataContractSerializer can be used to serialize selected items. What you need is a scheme for identifying changed data and a way to serialize it efficiently. For example, one way could be:
1. Start by serializing the entire list of structures to a file.
2. Whenever an object is added to, updated in, or removed from the list, create a diff object that identifies the kind of change and the object changed. Then serialize this object to XML and append the XML to the file.
3. When reading the file, apply similar logic: first read the list, then apply the diffs one after another.
Because you want to append to the file continuously, you shouldn't have a root element in it. In other words, the file with the diff info will not be a valid XML document. It will contain a series of XML fragments. To read it, you have to wrap these fragments in an XML declaration and a root element.
You may use a background task to write the entire list periodically and generate a valid XML file. At that point you can discard your diff file. The idea is to mimic a transactional system: one data structure holds the serialized/saved info, and another holds the changes (akin to a transaction log).
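The snapshot-plus-diff-log scheme can be sketched roughly as follows. This is Python with ElementTree standing in for DataContractSerializer, and the change element, its kind and key attributes, and the helper names are all assumptions made for the example. The list is modeled as a dict keyed by a hypothetical item key.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def append_change(path, kind, key, value=""):
    """Append one change as a standalone XML fragment (the file has no root)."""
    change = ET.Element("change", {"kind": kind, "key": key})
    change.text = value
    with open(path, "a", encoding="utf-8") as log:
        log.write(ET.tostring(change, encoding="unicode") + "\n")


def replay(snapshot, path):
    """Apply the logged changes, in order, to a copy of the snapshot dict."""
    state = dict(snapshot)
    if not os.path.exists(path):
        return state
    with open(path, encoding="utf-8") as log:
        # Wrap the fragments in a root element so they parse as one document.
        root = ET.fromstring("<changes>" + log.read() + "</changes>")
    for change in root:
        if change.get("kind") == "remove":
            state.pop(change.get("key"), None)
        else:  # "add" and "update" both store the new value
            state[change.get("key")] = change.text or ""
    return state


log_path = os.path.join(tempfile.mkdtemp(), "changes.log")
append_change(log_path, "add", "b", "two")
append_change(log_path, "update", "a", "ONE")
append_change(log_path, "remove", "c")
current = replay({"a": "one", "c": "three"}, log_path)
```

Each save then costs one small append instead of a rewrite of the whole list. Compaction would write the replayed state as the new snapshot and delete the log.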
If performance is a concern, then consider using something other than DataContractSerializer.
There is a good comparison of the options at
http://blogs.claritycon.com/kevinmarshall/2010/11/03/wp7-serialization-comparison/
If the size of the list is a concern, you could try breaking it into smaller lists. The most appropriate way to do this will depend on the data in your list and the typical usage/edit/addition patterns.
Depending on the frequency with which the data is changed you could try saving it whenever it is changed. This would remove the need to save it in the time available for deactivation.
Possible Duplicate:
Programmatically Create XML File From XSD
XML instance generation from XML schema (xsd)
How to generate sample XML documents from their DTD or XSD?
Here's the scenario: I've created an application that hooks into a commercial CRM product using their web service API, which unfortunately has a different schema for every installation, based on how the users create their custom fields. This schema can also be modified at any time. This application will be installed at the customer location, and will need to function even when they change their field structure.
In order to insert or update a record, I first call their Project.GetSchema() method, which returns the XSD file based on the current set of fields, and then I can call the Project.AddProject() method, passing in an XML file containing the project data.
My question is: What's the best way to generate the XML from the XSD file at runtime? I need to be able to check for the existence of fields, and fill them out only if they exist (for instance, if the customer deleted or renamed some fields).
I really don't want to have the application attempting to recompile classes on the fly using xsd.exe. There simply must be a better way.
[update] My current solution, that I'm working on, is to basically parse out the XSD file myself, since the majority of the schema is going to be the same for each installation. It's just an ugly solution, and I was hoping there was a better way. The biggest problem I have is that their schema uses xsd:sequence, so putting things in a different order always breaks validation.
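For what it's worth, the parse-it-yourself approach can stay fairly small if the schema is flat. The sketch below (Python, for illustration only) reads the element names from an xs:sequence and emits only the fields that exist, in schema order, which sidesteps the ordering problem. The sample XSD is a hypothetical, heavily simplified stand-in for what GetSchema() returns, and the helper names are made up. Nested types, refs, and imports are not handled.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical, heavily simplified schema; the real one will be more complex.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Project">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Name" type="xs:string"/>
        <xs:element name="Budget" type="xs:decimal" minOccurs="0"/>
        <xs:element name="Owner" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def sequence_fields(xsd_text, root_name):
    """Return the child element names of `root_name` in xs:sequence order."""
    schema = ET.fromstring(xsd_text)
    for element in schema.findall(f"{XS}element"):
        if element.get("name") == root_name:
            path = f"{XS}complexType/{XS}sequence/{XS}element"
            return [child.get("name") for child in element.findall(path)]
    raise KeyError(root_name)


def build_document(xsd_text, root_name, values):
    """Emit only fields the schema declares, in the order the schema lists them."""
    root = ET.Element(root_name)
    for name in sequence_fields(xsd_text, root_name):
        if name in values:
            ET.SubElement(root, name).text = str(values[name])
    return ET.tostring(root, encoding="unicode")


xml_out = build_document(
    SAMPLE_XSD,
    "Project",
    {"Owner": "Pat", "Name": "Demo", "Region": "dropped: not in schema"},
)
```

Fields the customer has deleted or renamed are silently skipped here. A real implementation would probably want to log them, and to check minOccurs before omitting a required element.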
If I have thousands of hierarchical records to take from database and generate xml, what will be the best way to do it with a good performance and less CPU utilization?
You can output XML directly from SQL Server 2005 using FOR XML. The results of a query are returned as an XML document. It must be used with one of the three modes RAW, AUTO, or EXPLICIT (SQL Server 2005 also adds a PATH mode):
RAW: Each row in the result set becomes an XML element with a generic identifier as the element tag.
AUTO: Results are returned in a simple nested XML tree. An element is generated for each table that has at least one column listed in the SELECT clause.
EXPLICIT: Specifies the shape of the resulting XML tree explicitly. The query must be written in a particular way so that additional information about the nesting is specified.
The following options are also available:
XMLDATA: Returns the schema, but does not add the root element to the result.
ELEMENTS: Specifies that the columns are returned as child elements of the table element. If not specified, they are mapped as attributes.
XMLSCHEMA: Generates an inline XSD schema at the same time.
XSINIL: Handles null values in records (used together with ELEMENTS).
You can also return data in binary form.
You might want to have a look at MSDN for XML support in SQL Server 2005, for technologies such as XQuery, the XML data type, etc.
That depends. If your application and database servers are on separate machines, then you need to specify which CPU you want to reduce the load on. If your database is already heavily loaded, you might be better off doing the XML transform on your application server. Otherwise, go ahead and use SQL Server's FOR XML capabilities.
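If the transform does move to the application server, the nesting can be built in a single pass over a flat result set. The following is only a sketch in Python. The (id, parent_id, name) row shape, the records and record element names, and the assumption that parents arrive before their children (for example via ORDER BY on a depth or path column) are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat result set: (id, parent_id, name), parents before children.
ROWS = [(1, None, "root"), (2, 1, "child-a"), (3, 1, "child-b"), (4, 2, "leaf")]


def rows_to_xml(rows):
    """Attach each row under its parent in a single pass over the rows."""
    top = ET.Element("records")
    by_id = {}  # row id -> element already created for that row
    for row_id, parent_id, name in rows:
        parent = top if parent_id is None else by_id[parent_id]
        by_id[row_id] = ET.SubElement(
            parent, "record", {"id": str(row_id), "name": name}
        )
    return ET.tostring(top, encoding="unicode")


tree_xml = rows_to_xml(ROWS)
```

The work is linear in the number of rows, at the price of keeping one dictionary entry per record in memory.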
Oracle has tools for that, so I guess SQL Server does too, but you'll need a schema. Personally, for small sets I use a PHP script I have lying around, but big jobs that need customization are another story.