I'm trying to read data from a WSDL file and I'm stuck, because there can be a big hierarchical tree and I don't know what kind of data structure to use to hold the inputs and outputs: an input can be an object, and that object can point to a couple of simple inputs plus a second object... and this can go on and on. So I don't know what to use. Maybe a tree, maybe indexes. What is the best practice, and can you give a small example of how the data could be handled?
P.S. I'm developing an automated test generation tool, which will use WSDL files for the generation.
Your best bet is to use good old classes. The first thing to do is to use a utility like svcutil.exe (the code generator tool) to create the client code from the WSDL. From this you will get an idea of how deep the tree is going to be.
Once you have an object view of the structure, start creating classes and apply OOP design patterns. This will help with at least two things:
Avoiding code duplication and
When you start constructing your objects in code it will give you an idea of which node comes under which parent, etc. (see the small sketch below).
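For example, here is a minimal sketch of the kind of classes this could end up as; all names are hypothetical and just illustrate the recursive parent/child shape:

using System.Collections.Generic;

// A hypothetical, recursive model of a WSDL message part: a parameter is either
// a simple value or a complex object that points to further child parameters.
public class WsdlParameter
{
    public string Name { get; set; }
    public string XsdType { get; set; }   // e.g. "xs:string" or the name of a complex type
    public List<WsdlParameter> Children { get; } = new List<WsdlParameter>();
    public bool IsComplex => Children.Count > 0;
}

public class WsdlOperation
{
    public string Name { get; set; }
    public List<WsdlParameter> Inputs { get; } = new List<WsdlParameter>();
    public List<WsdlParameter> Outputs { get; } = new List<WsdlParameter>();
}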
Hope this helps.
Another thing to consider is using some sort of object serialization mechanism. Serialization will help you a great deal when dealing with complex tree-like data, going from XML to objects and vice versa.
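For instance, a minimal round-trip sketch with XmlSerializer; the Order/OrderLine classes and the file path are hypothetical, just to show the idea:

using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public class OrderLine                  // hypothetical child node in the tree
{
    public string Item { get; set; }
    public int Quantity { get; set; }
}

public class Order                      // hypothetical root type to persist
{
    public string Customer { get; set; }
    public List<OrderLine> Lines { get; set; } = new List<OrderLine>();
}

public static class XmlPersistence
{
    public static void Save(Order order, string path)
    {
        var serializer = new XmlSerializer(typeof(Order));
        using (var stream = File.Create(path))
            serializer.Serialize(stream, order);              // object tree -> XML
    }

    public static Order Load(string path)
    {
        var serializer = new XmlSerializer(typeof(Order));
        using (var stream = File.OpenRead(path))
            return (Order)serializer.Deserialize(stream);     // XML -> object tree
    }
}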
WSDL is based on XML, which already is a tree structure. Not sure why you want to read it into objects first -- just use Linq to XML to read the WSDL directly.
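For example, a rough sketch that lists operation names straight from the WSDL with LINQ to XML (assuming a WSDL 1.1 document; the file name is hypothetical):

using System;
using System.Linq;
using System.Xml.Linq;

class WsdlPeek
{
    static void Main()
    {
        XNamespace wsdl = "http://schemas.xmlsoap.org/wsdl/";
        var doc = XDocument.Load("Service.wsdl");   // hypothetical file name

        // Each <wsdl:portType>/<wsdl:operation> element describes one operation.
        var operations = doc.Descendants(wsdl + "portType")
                            .Descendants(wsdl + "operation")
                            .Select(op => (string)op.Attribute("name"));

        foreach (var name in operations)
            Console.WriteLine(name);
    }
}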
I'm designing a process to get XML files from our client and load them to our database, creating an order on our side.
The snag is (and isn't there always one?) that the client's XML really doesn't resemble the business objects we use to load data into our database.
So I have to design a way to get the format they specify into our custom objects.
I'm considering creating "on the fly" custom objects FROM their XML and then coming up with a "map" to translate their objects into ours. That's where my head is at right now.
Essentially I don't want to write another data-load process that supports their data, I just want to get their data into our format.
I know this is basically a design question so I'm just throwing out my idea to see if it rings true with anyone else. Or if someone has done this and has a suggestion, I'm very open to hearing it. Thanks!
From your tags, c# and xml, I would generate an event upon file reception (at the OS level) that triggers the small app you will have to write. Structure-wise, I would go with CompanyName.Object1.
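For the file-reception trigger, a FileSystemWatcher is the usual choice; here's a minimal sketch (the drop folder path is hypothetical):

using System;
using System.IO;

class Watcher
{
    static void Main()
    {
        var watcher = new FileSystemWatcher(@"C:\Drops\ClientXml", "*.xml");            // hypothetical drop folder
        watcher.Created += (sender, e) => Console.WriteLine("New file: " + e.FullPath);  // kick off your processing here
        watcher.EnableRaisingEvents = true;

        Console.WriteLine("Watching for incoming XML files; press Enter to quit.");
        Console.ReadLine();
    }
}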
Read up on XDocument for parsing and what not. XElement and its Attributes.
Bottom line, it looks like a CRM kind of implementation, and from my implementation experience the longest part of the process is parsing the incoming data. You'll have to be thorough with your clients and have them write to a specific format, e.g.:
<Nodes name="SpecificName">
Nodes = LocalName
name = Attribute("name")
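Reading that shape with XDocument might look roughly like this (element and attribute names follow the snippet above; the file name is hypothetical):

using System;
using System.Xml.Linq;

class IncomingReader
{
    static void Main()
    {
        var doc = XDocument.Load("incoming.xml");            // hypothetical file name

        foreach (var node in doc.Descendants("Nodes"))       // Nodes = the element's local name
        {
            string name = (string)node.Attribute("name");    // name = its "name" attribute
            Console.WriteLine(name);
        }
    }
}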
Good luck.
I was hoping someone could tell me which is the more efficient and/or correct way to retrieve some data.
I have some XML files coming from a 3rd party and their attached DTDs. So I've converted the DTD into a C# Class so I can deserialize the XML into the classes. I now need to map that data to match the way my data structures are set up.
The question ultimately is: should I use reflection or LINQ? The format of the XML is somewhat generic by design, where things are held in Items [Array] or Item [Object].
I've done the following:
TheirClass theirClass = theirMessage.Items.Where(n => n.GetType() == typeof(TheirClass)).First() as TheirClass;
MyObject.Param1 = ConversionHelperClass.Convert(theirClass.Obj1);
MyObject.Param2 = ConversionHelperClass.Convert(theirClass.Obj2);
I can also do some stuff with Reflection where I pass in the names of the Classes and Attributes I'm trying to snag.
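For reference, the reflection-based variant I have in mind would be roughly like this, with property names passed as strings (everything here is hypothetical):

using System.Reflection;

static class ReflectionMapper
{
    // Copies a value from a named property on the source to a named property on the target.
    // Because the names are strings, typos only surface at runtime.
    public static void CopyProperty(object source, string sourceProp, object target, string targetProp)
    {
        PropertyInfo from = source.GetType().GetProperty(sourceProp);
        PropertyInfo to = target.GetType().GetProperty(targetProp);
        to.SetValue(target, from.GetValue(source));
    }
}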
Trying to do things the right way here.
As a general rule I'd suggest avoiding reflection unless it is absolutely necessary! It introduces a performance overhead AND means that you miss out on all of the lovely compile time checks that the compiler team have worked so hard to give us.
LINQ to Objects essentially queries against an in-memory data set, so it can be very fast.
If your ultimate goal is to parse information from an xml document, I'd suggest checking out the XDocument class. It provides a very nice abstraction for querying xml documents.
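For what it's worth, the LINQ-to-objects version of the lookup in the question can be written without the explicit type check (names taken from the question):

var theirClass = theirMessage.Items.OfType<TheirClass>().First();
MyObject.Param1 = ConversionHelperClass.Convert(theirClass.Obj1);
MyObject.Param2 = ConversionHelperClass.Convert(theirClass.Obj2);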
I'm looking for a simple solution to serialize and store objects that contain configuration, application state and data. It's a simple application and it's not a lot of data. Speed is no issue. I want it to be in-process. I want it to be easier to edit in a text editor than XML.
I can't find any document database for .NET that can handle it in-process.
Simply serializing to XML I'm not sure I want to do, because it's... XML.
Serializing to JSON seems very JavaScript specific, and I won't use this data in JavaScript.
I figure there are very neat ways to do this, but at the moment I'm leaning toward JSON despite its JavaScript inclination.
Just because "JSON" it's an acronym for JavaScript Object Notation, has no relevance on if it fits your needs or not as a data format. JSON is lightweight, text based, easily human readable / editable and it's a language agnostic format despite the name.
I'd definitely lean toward using it, as it sounds pretty ideal for your situation.
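A minimal sketch with Json.NET (Newtonsoft.Json), assuming a simple settings class (the class and file names are hypothetical); System.Text.Json would work just as well:

using System.IO;
using Newtonsoft.Json;

public class AppSettings                 // hypothetical settings/state class
{
    public string Theme { get; set; }
    public int WindowWidth { get; set; }
}

public static class SettingsStore
{
    public static void Save(AppSettings settings, string path) =>
        File.WriteAllText(path, JsonConvert.SerializeObject(settings, Formatting.Indented));  // human-editable, indented JSON

    public static AppSettings Load(string path) =>
        JsonConvert.DeserializeObject<AppSettings>(File.ReadAllText(path));
}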
I will give a couple of choices:
Binary serialization: it depends on the content of your objects; if you have a complicated dependency tree it can create problems on serializing. It's also not very flexible, as the standard binary serialization provided by Microsoft stores type information too. That means if you save a type in a binary file and a month later decide to reorganize your code and, say, move the same class to another namespace, deserialization from the previously saved binary file will fail, because the type is no longer the same. There are several workarounds for that, but I personally try to avoid this kind of serialization as much as I can.
ORM mapping and storing it in a small database. SQLite is an awesome choice for this kind of stuff, as it is small (a single file) and a database with full ACID support. You need a mapper, or you have to implement the mapping yourself (a rough sketch follows).
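If you went the SQLite route, the in-process usage is roughly this (using the Microsoft.Data.Sqlite package; the mapper layer is omitted, and the table and file names are hypothetical):

using Microsoft.Data.Sqlite;

class ConfigDb
{
    static void Main()
    {
        using (var connection = new SqliteConnection("Data Source=app-state.db"))   // a single local file
        {
            connection.Open();

            var create = connection.CreateCommand();
            create.CommandText = "CREATE TABLE IF NOT EXISTS Settings (Key TEXT PRIMARY KEY, Value TEXT)";
            create.ExecuteNonQuery();

            var upsert = connection.CreateCommand();
            upsert.CommandText = "INSERT OR REPLACE INTO Settings (Key, Value) VALUES ($key, $value)";
            upsert.Parameters.AddWithValue("$key", "Theme");
            upsert.Parameters.AddWithValue("$value", "Dark");
            upsert.ExecuteNonQuery();
        }
    }
}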
I'm sure that you will get some other choice from the folks in a couple of minutes.
So choice is up to you.
Good luck.
Here is the business part of the issue:
Several different companies send an XML dump of the information to be processed.
The information sent by the companies is similar ... not exactly the same.
Several more companies would soon be enlisted and would start sending information.
Now, the technical part of the problem: I want to write a generic solution in C# to accommodate this information for processing. I would be transforming the XML in my C# class(es) to fit into my database model.
Is there any pattern or solution for handling this generically, so that I don't need to change my solution when many more companies are added later?
What would be the best approach to write my parser/transformer?
This is how I have done something similar in the past.
As long as each company has its own fixed format which they use for their XML dump:
1. Have a specific XSLT for each company.
2. Have a way of indicating which dump is sourced from where (maybe a different DUMP folder for each company).
3. In your program, based on 2, select the XSLT from 1 and apply it to the dump.
4. All the XSLTs will transform the XML to your one standard database schema.
5. Save this to your DB.
6. Each new company addition is at most a new XSLT.
In cases where the schema is very similar, the XSLT's can be just re-used and then specific changes made to them.
Drawback to this approach: debugging XSLTs can be a bit more painful if you do not have the right tools. However, a lot of XML editors (e.g. XML Spy) have excellent XSLT debugging capabilities.
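A rough sketch of steps 3 and 4 with XslCompiledTransform (the folder and file names are hypothetical):

using System.IO;
using System.Xml.Xsl;

class DumpProcessor
{
    static void Main()
    {
        // Steps 2/3: the dump folder identifies the company, which selects its XSLT.
        var xslt = new XslCompiledTransform();
        xslt.Load(@"C:\Xslt\CompanyA.xslt");                 // one stylesheet per company

        foreach (var dumpFile in Directory.GetFiles(@"C:\Dumps\CompanyA", "*.xml"))
        {
            // Step 4: transform into the one standard schema, ready for the DB load.
            string standardized = Path.ChangeExtension(dumpFile, ".standard.xml");
            xslt.Transform(dumpFile, standardized);
        }
    }
}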
Sounds to me like you are just asking for a design pattern (or set of patterns) that you could use to do this in a generic, future-proof manner, right?
Ideally, some of the attributes that you probably want:
Each "transformer" is decoupled from one another.
You can easily add new "transformers" without having to rewrite your main "driver" routine.
You don't need to recompile / redeploy your entire solution every time you modify a transformer, or at least add a new one.
Each "transformer" should ideally implement a common interface that your driver routine knows about - call it IXmlTransformer. The responsibility of this interface is to take in an XML file and to return whatever object model / dataset that you use to save to the database. Each of your transformers would implement this interface. For common logic that is shared by all transformers you could either create a based class that all inherit from, or (my preferred choice) have a set of helper methods which you can call from any of them.
I would start by using a Factory to create each "transformer" from your main driver routine. The factory could use reflection to interrogate all the assemblies it can see, or something like MEF could do a lot of the work for you. Your driver logic should use the factory to create all the transformers and store them.
Then you need some logic and mechanism to "look up" the right transformer for each XML file received - perhaps each XML file has a header that you could use to identify it, or something similar. Again, you want to keep this decoupled from your main logic so that you can easily add new transformers without modifying the driver routine. You could e.g. supply the XML file to each transformer and ask it "can you transform this file?", so that it is up to each transformer to "take responsibility" for a given file.
Every time your driver routine gets a new XML file, it looks up the appropriate transformer, and runs it through; the result gets sent to the DB processing area. If no transformer can be found, you dump the file in a directory for interrogation later.
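A bare-bones sketch of the IXmlTransformer idea described above; the order model and company names are hypothetical, and the factory here is a hard-coded list standing in for the reflection/MEF version:

using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class OrderModel { /* hypothetical internal model that gets saved to the DB */ }

public interface IXmlTransformer
{
    bool CanTransform(XDocument xml);        // "can you transform this file?"
    OrderModel Transform(XDocument xml);     // produce the object model for the DB processing area
}

public class CompanyATransformer : IXmlTransformer
{
    public bool CanTransform(XDocument xml) => xml.Root?.Name.LocalName == "CompanyAOrder";
    public OrderModel Transform(XDocument xml) => new OrderModel();   // real mapping omitted
}

public static class TransformerFactory
{
    // In a fuller version this list would be built via reflection or MEF.
    private static readonly List<IXmlTransformer> Transformers =
        new List<IXmlTransformer> { new CompanyATransformer() };

    public static IXmlTransformer FindFor(XDocument xml) =>
        Transformers.FirstOrDefault(t => t.CanTransform(xml));    // null means "dump the file for later interrogation"
}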
I would recommend reading a book like Agile Principles, Patterns and Practices by Robert Martin (http://www.amazon.co.uk/Agile-Principles-Patterns-Practices-C/dp/0131857258), which gives good examples of appropriate design patterns for situations like yours e.g. Factory and DIP etc.
Hope that helps!
The solution proposed by InSane is likely the most straightforward and definitely the most XML-friendly approach.
If you are looking to write your own code to convert the different data formats, then implement multiple reader entities, each of which reads data from one distinct format and transforms it into a unified format; your main code would then work with these entities in a unified way, i.e. by saving to the database.
Search for ETL (Extract-Transform-Load) to get more information - What model/pattern should I use for handling multiple data sources? , http://en.wikipedia.org/wiki/Extract,_transform,_load
Using XSLT, as proposed in the currently most upvoted answer, just moves the problem from C# to XSLT.
You are still changing the pieces that process the XML, and you are still exposed to how well or poorly the code is structured, whether it is C# or rules in the XSLT.
Regardless of whether you keep it in C# or go with XSLT for those bits, the key is to separate the transformation of the XML you receive from the various companies into a single common format, whether that's an intermediate XML or a set of classes into which you load the data you are processing.
Whatever you do, avoid getting clever and trying to define your own generic transformation layer; if that's what you want, do use XSLT, since that's what it's for. If you go with C#, keep it simple with a transformation class for each company that implements the simplest possible interface.
On the C# route, keep any reuse you may have between the transformations to composition; don't even think of inheritance for this ... it is one of the areas where it gets very ugly very quickly if you go that way.
Have you considered BizTalk server?
Just playing the fence here and offering another solution for other readers.
The easiest way to get the data into your models within C# is to use XSLT to convert each company's data into a serialized form of your models. These are the basic steps I would take:
Create a complete model of all your data and use XmlSerializer to write out the model.
Create an XSLT that takes Company A's data and converts it into a valid serialized xml model of your data. Use the previously created XML file as a reference.
Use Deserialize on the new XML you just created. You will now have a reference to your model object containing all the data from the company.
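Step 3 then boils down to something like this; MyDataModel and the file name are hypothetical, and the XSLT application itself can be done with XslCompiledTransform as in the earlier answer:

using System.IO;
using System.Xml.Serialization;

// After the company-specific XSLT has produced XML in your own schema,
// deserialize it straight into your model.
var serializer = new XmlSerializer(typeof(MyDataModel));
using (var stream = File.OpenRead("TransformedCompanyA.xml"))
{
    var model = (MyDataModel)serializer.Deserialize(stream);
    // model now holds the company's data in your own object model, ready to save.
}

public class MyDataModel { /* hypothetical: the complete model from step 1 */ }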
I've been looking at XML serialization for C# and it looks interesting. I was reading this tutorial:
http://www.switchonthecode.com/tutorials/csharp-tutorial-xml-serialization
and of course you can deserialize it back to a list of objects. So I am wondering: would it be better to deserialize it back to a list of objects and then go through each object and validate it, or to validate it against a schema first and then deserialize it and do stuff with it?
http://support.microsoft.com/kb/307379
Thanks
I guess it would depend a bit on what you want to validate, and for what purpose. If it is intended for interop with other systems, then validating via XSD is a reasonable idea, not least because you can use xsd.exe to write your classes for you from the XSD (you can also generate an XSD from XML or a DLL, but it isn't as accurate). Likewise, you can use XmlReader (appropriately configured) to check against the XSD.
If you just want valid .NET objects, I'd be tempted to leave the serialized form as an implementation detail, and write some C# validation code - perhaps implementing IDataErrorInfo, or using data-annotations.
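A small sketch of the data-annotations route (the Order class and its rules are hypothetical):

using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

public class Order                        // hypothetical deserialized type
{
    [Required]
    public string Customer { get; set; }

    [Range(1, 1000)]
    public int Quantity { get; set; }
}

public static class OrderValidator
{
    // Returns an empty collection when the object satisfies all of its annotations.
    public static ICollection<ValidationResult> Validate(Order order)
    {
        var results = new List<ValidationResult>();
        Validator.TryValidateObject(order, new ValidationContext(order), results, validateAllProperties: true);
        return results;
    }
}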
You can create an XmlValidatingReader and pass that into your serializer. That way you can read the file in one pass and validate it at the same time.
I believe the same technique will work even if you are using hand rolled XML classes (for extremely large XML files) so you might find it worth a look.
Edit:
Sorry, I just reread some of my code: XmlValidatingReader is obsolete. You can do what you need with XmlReader.
See XmlReaderSettings.
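For example, a sketch of validating and deserializing in one pass with XmlReaderSettings (the schema, file and class names are hypothetical):

using System.Xml;
using System.Xml.Serialization;

var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
settings.Schemas.Add(null, "orders.xsd");                        // hypothetical schema file
settings.ValidationEventHandler += (s, e) => throw e.Exception;  // fail fast on any schema violation

var serializer = new XmlSerializer(typeof(Order));
using (var reader = XmlReader.Create("orders.xml", settings))
{
    var order = (Order)serializer.Deserialize(reader);           // validated and deserialized in one pass
}

public class Order { /* hypothetical type, e.g. generated from the schema with xsd.exe */ }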
For speed I would do it in C#; however, for completeness you might want to do it using an XSD. The issue with that is you have to learn the verbose and cumbersome XSD syntax, which from experience takes a lot of trial and error, is time-consuming, and holds little reward for serialization - particularly with constants, which you have to map in C# and also in the XSD.
You'll always be writing the XML as C#. Anything not known when read back in is simply ignored. If you aren't editing the XML with a text editor you can guarantee that it will come back in the right way, in which case XSD is definitely not needed.
If you validate the XML, you can only prove that it's structurally correct. An attempt to deserialize from the XML will tell you the same thing.
Typically business objects can implement business logic/rules/conditions that go beyond a valid schema. That type of knowledge should stay with the business objects themselves, rather than being duplicated in some sort of external validation routine (otherwise, if you change a business rule, you have to update the validator at the same time).