Mapping XML to Unrelated Objects

I'm designing a process to get XML files from our client and load them to our database, creating an order on our side.
The snag (and isn't there always one?) is that the client's XML really doesn't resemble the business objects we use to load data to our database.
So I have to design a way to get the format they specify into our custom objects.
I'm considering creating "on the fly" custom objects FROM their XML and then coming up with a "map" to translate their objects into ours. That's where my head is at right now.
Essentially, I don't want to write another data-load process that supports their data; I just want to get their data into our format.
I know this is basically a design question so I'm just throwing out my idea to see if it rings true with anyone else. Or if someone has done this and has a suggestion, I'm very open to hearing it. Thanks!

From your tags, C# and XML, I would generate an event upon file reception (OS level) that triggers the small app you will have to make. Structure-wise, I would go with CompanyName.Object1.
Read up on XDocument for parsing and whatnot, along with XElement and its Attributes.
Bottom line, it looks like a CRM kind of implementation, and from my implementation experience the parsing of incoming data is the longest part of the process. You'll have to be thorough with your clients and have them write specific...
<Nodes name="SpecificName">
Nodes = LocalName
name = Attribute("name")
Good luck.
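
To make the LocalName / Attribute("name") mapping above concrete, here is a minimal sketch using XDocument. The NodeReader class and its Describe method are made up for illustration; only XDocument, XElement.Name.LocalName and XElement.Attribute come from System.Xml.Linq.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class NodeReader
{
    // Hypothetical helper: returns "LocalName:name" for every element
    // that carries a "name" attribute.
    public static string[] Describe(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Descendants()
                  .Where(e => e.Attribute("name") != null)
                  .Select(e => e.Name.LocalName + ":" + (string)e.Attribute("name"))
                  .ToArray();
    }
}
```

For a document containing <Nodes name="SpecificName" />, this should yield the single entry "Nodes:SpecificName".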

Related

How to properly save application data for later use

OK, so I am working on a C# Windows Forms application that uses different types of structures to hold data and display it to the user. I want to use a save dialog box to let the user save the information (i.e. configuration, state). The only way I can think of to do this is to write a routine that goes through the structures and writes the corresponding elements to a text file. On loading, a matching routine would read the data back.
This is of course a dumb way to do it, I'll admit. Anything I've done in school was only writing to text files. Are there other ways to produce a formatted file to save to and load from?
I've been looking at serialization to save objects to files. I am not too sure how all this works though. Help.

To save your application settings, I think these links will help you:
http://msdn.microsoft.com/en-us/library/aa730869%28VS.80%29.aspx
http://www.thescarms.com/dotnet/AppSettings.aspx
and
How to use settings in Visual C#

My 'Old School' way of doing this has always been to save settings during the program execution to a database (providing that you take the time to ensure you're not hammering the database with updates / inserts).
If my application needs to be more efficient AND I need to easily be able to recall the saved settings I serialize to XML using System.Xml.Serialization (from memory). XML serialization is human readable which is helpful (but not the most efficient in terms of processing time).
If I need even more efficiency, I go the whole way and serialize to binary.
I'd suggest reading and understanding http://msdn.microsoft.com/en-us/library/Vstudio/ms233843.aspx in its entirety before coming back here. Once you have read it you'll be far better equipped to decide which way you want to take your application.
In my experience there aren't that many DUMB ways to solve problems; however, there is almost always a better way to solve them given enough time and research.
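
As a rough sketch of the XML serialization route mentioned above: the AppState and StateStore types below are hypothetical stand-ins for your own structures, and only XmlSerializer, StringWriter and StringReader are real framework types. XmlSerializer needs a public type with a parameterless constructor and public members.

```csharp
using System.IO;
using System.Xml.Serialization;

// Hypothetical state class; replace with whatever your forms actually hold.
public class AppState
{
    public string UserName { get; set; }
    public int WindowWidth { get; set; }
}

public static class StateStore
{
    // Serializes the state to a human-readable XML string.
    public static string ToXml(AppState state)
    {
        var serializer = new XmlSerializer(typeof(AppState));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, state);
            return writer.ToString();
        }
    }

    // Rebuilds the state object from XML produced by ToXml.
    public static AppState FromXml(string xml)
    {
        var serializer = new XmlSerializer(typeof(AppState));
        using (var reader = new StringReader(xml))
        {
            return (AppState)serializer.Deserialize(reader);
        }
    }
}
```

In a Windows Forms app you would probably write the string to the path chosen in the save dialog (for example with File.WriteAllText) and read it back with File.ReadAllText on load.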

Data structure for hierarchical members in C#

I'm trying to read data from a WSDL file and I'm stuck, because the hierarchy can be a big tree and I don't know what kind of data structure to use for the inputs and outputs. An input can be an object, and that object can point to a couple of simple inputs and a second object... this could go on and on. So I don't know what to use. Maybe a tree, maybe indexes. What is the best practice, and can you give a small example of how the data could be managed?
P.S. I'm developing an automated test generation tool, which will use WSDL files for generation.

Your best bet is to use good old classes. The first thing to do is to use a utility like svcutil.exe (a code generator tool) to create the client code from the WSDL. From this you will get an idea of how deep the tree is going to be.
Once you have Object View of the structure then start creating Classes and apply OOP design patterns. This will help with at least two things:
Avoiding code duplication and
When you start constructing your object in the code it will give you idea which node comes under which parent etc.
Hope this helps.
Another thing to consider is using some sort of object serialization mechanism. Serialization will help you a great deal when converting complex tree-like data from XML to objects and vice versa.
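
A minimal sketch of what "good old classes" could look like for a hierarchy of unknown depth: one node type that holds child nodes of the same type. MemberNode and its members are hypothetical names, not something svcutil.exe generates.

```csharp
using System.Collections.Generic;

// Hypothetical node: one input/output member, either simple (no children)
// or complex (children are its nested members).
public class MemberNode
{
    public string Name { get; }
    public string TypeName { get; }
    public List<MemberNode> Children { get; } = new List<MemberNode>();

    public MemberNode(string name, string typeName)
    {
        Name = name;
        TypeName = typeName;
    }

    // Adds a child and returns this node so calls can be chained.
    public MemberNode Add(MemberNode child)
    {
        Children.Add(child);
        return this;
    }

    // Depth-first walk that yields the dotted path of every leaf member.
    public IEnumerable<string> LeafPaths(string prefix = "")
    {
        string path = prefix.Length == 0 ? Name : prefix + "." + Name;
        if (Children.Count == 0)
        {
            yield return path;
            yield break;
        }
        foreach (MemberNode child in Children)
            foreach (string leaf in child.LeafPaths(path))
                yield return leaf;
    }
}
```

A test generator could then walk LeafPaths() to find every simple value it has to supply, however deep the nesting goes.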

WSDL is based on XML, which is already a tree structure. I'm not sure why you want to read it into objects first; just use LINQ to XML to read the WSDL directly.
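
Reading the WSDL directly could look roughly like this. It is a sketch that assumes a WSDL 1.1 document (namespace http://schemas.xmlsoap.org/wsdl/); WsdlReader and OperationNames are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class WsdlReader
{
    // WSDL 1.1 namespace; a WSDL 2.0 document would need a different one.
    private static readonly XNamespace Wsdl = "http://schemas.xmlsoap.org/wsdl/";

    // Lists the operation names declared under every portType.
    public static List<string> OperationNames(string wsdlXml)
    {
        return XDocument.Parse(wsdlXml)
            .Descendants(Wsdl + "portType")
            .Elements(Wsdl + "operation")
            .Select(op => (string)op.Attribute("name"))
            .ToList();
    }
}
```

From each operation element you can keep following the input and output message references in the same way, without building an object model first.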

ACORD Standard for Insurance. Has anybody dealt with this mess?

We need to implement a WCF Webservice using the ACORD Standard.
However, I don't know where to start with this since this standard is HUMONGOUS and very convoluted. A total chaos to my eyes.
I am trying to use WSCF.Blue to extract the classes from the multiple XSDs I have, but so far all I get is a bunch of crap: a .cs file with 50,000+ lines of code that freezes my VS2010 all the time.
Has anybody already walked through the Valley of Death (ACORD Standard) and made it? I would really appreciate some help.

I wrote an ACORD-to-C# class library converter which was then used in several large commercial insurance products. It featured a very nice mapping of all of the ACORD XML into concise, extendable C# classes. So I know from whence you come!
Once you dig into it, it's not so bad, but I maintain the average coder will not 'get it' for about 3-4 months if they work at it full time (assuming anything but inquiry-style messages). The real problem comes when trying to do mapping from a backend database and to/from another ACORD WS. All of the carriers, vendors, and agencies have custom rules.
My best suggestion is to find working code examples (I have tons if you need them) and maybe even a vendor or carrier who will let you hook up to an ACORD WS in a test environment.

It sounds like you are heading down the right path but are lost in the forest.
The ACORD Standard is huge and intentionally so, as it provides support for hundreds of different messages. Just as you do not download all of Wikipedia to get just a few articles, you do not need all of the classes in the ACORD Standard to support an implementation of a few messages. If you know what messages you need to support then you can generate a subset of the full XSD that will be quite manageable.
As mentioned in Hugh's response, for any one message only a fraction of the full XSD is used. How you go about doing that will depend on the specifics of your project. If you are looking for ideas on how to generate a subset of the full XSD, try reaching out to the ACORD staff for help at PCS#acord.org. They should be able to offer you some help in getting started.

I have worked with the ACORD PCS exposure reporting standards and yes, it was a nightmare. I have also worked with other large standards like FpML and SportsML.
You need to work out exactly which types from the schema are needed. How you do this is up to you, but the VS schema viewer should be able to handle it. If not, try XmlSpy or just go through it by hand if you have to. Make sure you have a good BA to hand...
Chances are you will find that you can meet your requirements by using around 1% of the types available in the standard.
What you'll probably find is that you can express the core objects with a very minimal set of values, as most nodes will be minOccurs=0 or nillable.
Then you can use the /element switch on xsd.exe to generate the code for just the types you need.
As one commenter says there is no easy pill to swallow here. The irony is that standards are supposed to make everyone's lives easier.

If you are looking to read/write ACORD documents using .NET, I just stumbled across the "IVC Software Factory for ACORD Standards" on CodePlex at http://ivc.codeplex.com.
From the limited documentation it appears as if this library can convert objects to ACORD XML documents, and vice-versa. The source code comes with different "providers" i.e. different ACORD transaction types, like 103 or 121.
Hope this helps.

I would recommend not creating a model for the entire standard. One could skip serializing into a model altogether, load the XML into an XDocument/XElement, use LINQ to query it, and update the DOM using LINQ to XML. So one is not loading the XML into a strongly typed model, but just loading the XML. There is no model, just an XML document.
From there, one can pick the data off of the XML as needed.
Using this approach, the code will be ugly and have little context since XElements will be passed everywhere, and there will be tons of magic strings of XPaths to query and define elements, but it can work. Also, everything is a string so there will be utility conversion methods to convert to numbers, date times, etc.
From my perspective, I have modeled part of ACORD into an object model using the XmlSerializer, but it's well over 500 classes. The model was not generated from the XSD or anything else, but crafted manually, and it took some time. Tooling will produce monster unusable classes (as you have mentioned) and/or flat out crash. As an example, I tried to load the XSD into Stylus Studio and it crashed several times.
So your best bet, if you're strapped for time, is loading into an XDocument as opposed to trying to map out everything in a model. I know that sucks, but ACORD in general is basically a huge hot mess of data.
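
A small sketch of the "no model, just XPath strings plus conversion helpers" approach described above. The element names (Message, Policy, Premium, EffectiveDt) are invented for illustration and are not taken from the ACORD schema; real ACORD documents will also usually need namespace handling. PolicyReader is a hypothetical helper class.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

public static class PolicyReader
{
    // Hypothetical paths: the "magic strings" live in one place at least.
    // The explicit XElement casts return null when the element is missing.
    public static decimal? Premium(XDocument doc)
    {
        return (decimal?)doc.XPathSelectElement("/Message/Policy/Premium");
    }

    public static DateTime? EffectiveDate(XDocument doc)
    {
        return (DateTime?)doc.XPathSelectElement("/Message/Policy/EffectiveDt");
    }
}
```

Keeping every path and conversion in a helper like this does not make the code pretty, but it stops XElements and XPath strings from leaking through the whole codebase.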

Should I use XML to store configuration settings in my C#.Net application?

My question relates to the performance implications of reading application configuration data from an XML file.
I am building an application that lists information from a database and needs to know how to display the lists, depending on the types of data returned.
This is difficult to explain, but basically I would like to have an XML config file that lists the types and describes how to display them. This will allow me to change the display methods without re-compiling the application.
My question is really around performance. Given that my application will need to use this data many times during each page load...
Should I be reading directly from the XML file and parse it each time I need it?
Or should I cache the XML object and parse it each time I need it?
Or should I parse the XML once, generate some sort of object and cache that object?
My guess is option 3, but I'm basically fishing for best practice around this.
Thanks.

There is already a convention for this, called the App.config file.
It is XML, and Visual Studio has tooling support for it.
My suggestion is: Don't reinvent the wheel, if you can help it.
Now, given that your format is too complex for that, you probably want to go with option 3, but load it lazily.
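
Option 3 with lazy loading might look something like the sketch below. The config format (a root element with type elements carrying name and display attributes) and the DisplayConfig class are assumptions for illustration only; Lazy<T> is the framework type doing the caching.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public sealed class DisplayConfig
{
    private readonly Lazy<IReadOnlyDictionary<string, string>> _byType;

    // Exposed only so the "parsed once" behaviour can be observed.
    public int ParseCount { get; private set; }

    // loadXml is called on first use, not at construction time.
    public DisplayConfig(Func<string> loadXml)
    {
        _byType = new Lazy<IReadOnlyDictionary<string, string>>(() =>
        {
            ParseCount++;
            return XDocument.Parse(loadXml())
                .Root.Elements("type")
                .ToDictionary(e => (string)e.Attribute("name"),
                              e => (string)e.Attribute("display"));
        });
    }

    // Returns the display method for a type, or null if it is not configured.
    public string DisplayFor(string typeName)
    {
        string display;
        return _byType.Value.TryGetValue(typeName, out display) ? display : null;
    }
}
```

You would typically keep one DisplayConfig instance for the lifetime of the application (for example in a static field), so the XML is read and parsed at most once no matter how many page loads query it.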

What is the best approach to generalize and aggregate XML dumps in C#?

Here is the business part of the issue:
Several different companies send an XML dump of the information to be processed.
The information sent by the companies is similar ... not exactly the same.
Several more companies will soon be enlisted and will start sending information.
Now, the technical part of the problem is that I want to write a generic solution in C# to accommodate this information for processing. I would be transforming the XML in my C# class(es) to fit into my database model.
Is there any pattern or solution for this issue to be handled generically without needing to change my solution in case of addition of many companies later?
What would be the best approach to write my parser/transformer?

This is how I have done something similar in the past.
As long as each company has its own fixed format which they use for their XML dump:
1. Have a specific XSLT for each company.
2. Have a way of indicating which dump is sourced from where (maybe a different DUMP folder for each company).
3. In your program, based on 2, select 1 and apply it to the DUMP.
4. All the XSLTs will transform the XML to your one standard database schema.
5. Save this to your DB.
6. Each new company addition is at most a new XSLT.
In cases where the schema is very similar, the XSLTs can simply be reused, with specific changes made to them.
Drawback to this approach: debugging XSLTs can be a bit more painful if you do not have the right tools. However, a LOT of XML editors (e.g. XML Spy) have excellent XSLT debugging capabilities.
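
Applying a per-company stylesheet from C# can be sketched with XslCompiledTransform as below. DumpNormalizer is a hypothetical helper, and it takes the stylesheet and dump as strings only to keep the example self-contained; in practice you would probably load both from files.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class DumpNormalizer
{
    // Applies one company's XSLT to that company's dump and returns
    // the XML in the common format.
    public static string Apply(string xsltText, string inputXml)
    {
        var transform = new XslCompiledTransform();
        using (XmlReader xsltReader = XmlReader.Create(new StringReader(xsltText)))
        {
            transform.Load(xsltReader);
        }

        using (XmlReader input = XmlReader.Create(new StringReader(inputXml)))
        using (var output = new StringWriter())
        {
            transform.Transform(input, null, output);
            return output.ToString();
        }
    }
}
```

Selecting which stylesheet to pass in (by folder, file name or a header in the dump) stays outside this helper, so adding a company means adding a stylesheet rather than changing code.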

Sounds to me like you are just asking for a design pattern (or set of patterns) that you could use to do this in a generic, future-proof manner, right?
Ideally, some of the attributes that you probably want are:
Each "transformer" is decoupled from one another.
You can easily add new "transformers" without having to rewrite your main "driver" routine.
You don't need to recompile / redeploy your entire solution every time you modify a transformer, or at least add a new one.
Each "transformer" should ideally implement a common interface that your driver routine knows about - call it IXmlTransformer. The responsibility of this interface is to take in an XML file and to return whatever object model / dataset you use to save to the database. Each of your transformers would implement this interface. For common logic that is shared by all transformers you could either create a base class that they all inherit from, or (my preferred choice) have a set of helper methods which you can call from any of them.
I would start by using a Factory to create each "transformer" from your main driver routine. The factory could use reflection to interrogate all the assemblies it can see, or use something like MEF, which could do a lot of the work for you. Your driver logic should use the factory to create all the transformers and store them.
Then you need some logic and mechanism to "lookup" each XML file received to a given Transformer - perhaps each XML file has a header that you could use to identify or something similar. Again, you want to keep these decoupled from your main logic so that you can easily add new transformers without modification of the driver routine. You could e.g. supply the XML file to each transformer and ask it "can you transform this file", and it is up to each transformer to "take responsibility" for a given file.
Every time your driver routine gets a new XML file, it looks up the appropriate transformer, and runs it through; the result gets sent to the DB processing area. If no transformer can be found, you dump the file in a directory for interrogation later.
I would recommend reading a book like Agile Principles, Patterns and Practices by Robert Martin (http://www.amazon.co.uk/Agile-Principles-Patterns-Practices-C/dp/0131857258), which gives good examples of appropriate design patterns for situations like yours e.g. Factory and DIP etc.
Hope that helps!
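
To give a rough idea of the shape: IXmlTransformer is the name suggested above, while Order, CompanyATransformer, Driver and the CanTransform / Transform / Process members are hypothetical. The factory (reflection or MEF) is left out; the driver simply receives the transformers it should use.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical target model standing in for whatever you save to the database.
public class Order
{
    public string Id { get; set; }
}

public interface IXmlTransformer
{
    // "Can you transform this file?"
    bool CanTransform(XDocument doc);
    Order Transform(XDocument doc);
}

// One transformer per company; this one claims files whose root is PurchaseOrder.
public class CompanyATransformer : IXmlTransformer
{
    public bool CanTransform(XDocument doc)
    {
        return doc.Root != null && doc.Root.Name.LocalName == "PurchaseOrder";
    }

    public Order Transform(XDocument doc)
    {
        return new Order { Id = (string)doc.Root.Attribute("number") };
    }
}

public class Driver
{
    private readonly IReadOnlyList<IXmlTransformer> _transformers;

    public Driver(IEnumerable<IXmlTransformer> transformers)
    {
        _transformers = transformers.ToList();
    }

    // Returns null when no transformer claims the file, so the caller can
    // set the file aside for later inspection.
    public Order Process(XDocument doc)
    {
        IXmlTransformer match = _transformers.FirstOrDefault(t => t.CanTransform(doc));
        return match == null ? null : match.Transform(doc);
    }
}
```

The driver never names a concrete transformer, so supporting a new company means adding one class that implements the interface and letting the factory discover it.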

The solution proposed by InSane is likely the most straightforward and definitely the most XML-friendly approach.
If you are looking to write your own code to convert the different data formats, then implement multiple reader entities, each of which reads data from one distinct format and transforms it to a unified format. Your main code then works with these entities in a uniform way, e.g. by saving their output to the database.
Search for ETL (Extract-Transform-Load) to get more information - What model/pattern should I use for handling multiple data sources? , http://en.wikipedia.org/wiki/Extract,_transform,_load

Using XSLT, as proposed in the currently most upvoted answer, is just moving the problem from C# to XSLT.
You are still changing the pieces that process the XML, and you are still exposed to how well or poorly that code is structured, whether it is written in C# or as rules in the XSLT.
Regardless of whether you keep it in C# or go with XSLT for those bits, the key is to separate out the transformation of the XML you receive from the various companies into a single common format, whether that's an intermediate XML or a set of classes into which you load the data you are processing.
Whatever you do, avoid getting clever and trying to define your own generic transformation layer. If that's what you want, do use XSLT, since that's what it's for. If you go with C#, keep it simple, with a transformation class for each company that implements the simplest possible interface.
If you go the C# way, keep any reuse between the transformations to composition; don't even think of using inheritance for it ... this is one of the areas where it gets very ugly quickly if you go that way.

Have you considered BizTalk server?

Just playing the fence here and offering another solution for other readers.
The easiest way to get the data into your models within C# is to use XSLT to convert each company's data into a serialized form of your models. These are the basic steps I would take:
Create a complete model of all your data and use XmlSerializer to write out the model.
Create an XSLT that takes Company A's data and converts it into a valid serialized xml model of your data. Use the previously created XML file as a reference.
Use Deserialize on the new XML you just created. You will now have a reference to your model object containing all the data from the company.
