We're looking for a transformation library or engine that can read any input: EDIFACT files, CSV, XML, and so on, that is, files (or web service results) containing data that must be transformed into a known business object structure. The data should be transformed into an existing business object using custom rules. XSLT is both too complex (to learn) and too simple (not enough features).
Can anybody recommend a C# library or engine? I have seen Altova MapForce, but I would like something I can send out to dozens of people who will build and design their own transformations, without having to pay for dozens of Altova licenses.
If you think that XSLT is too difficult for you, you could try LINQ to XML for parsing XML files. It is integrated in the .NET Framework, and you can use C# (or VB 9.0, which is even better here because of its XML literals) instead of learning another language. You can integrate it with the existing application without much effort and without the paradigm mismatch between the language and the file handling that occurs with XSLT.
Microsoft LINQ to XML
Sure, it's not a framework or library for parsing arbitrary files, but neither is XSLT, so...
XSLT is not going to work for EDI and CSV. If you want a completely generic transformation engine, you might have to shell out some cash. I have used Symphonia for dealing with EDI, and it worked, but it is not free.
The thing is the problem you are describing sounds "enterprisey" (I am sure nobody uses EDI for fun), so there's no open source/free tooling for dealing with this stuff.
I wouldn't be so quick to dismiss XSLT as being too complex or as lacking the features you require.
There are plenty of books/websites out there that describe everything you need to know about XSLT. Yes, there is a bit of a learning curve but it doesn't take much to get into it, and there's always a great community like stackoverflow to turn to if you need help ;-)
As for the lack of features, you can always extend XSLT and call .NET assemblies from the XSLT using the XsltArgumentList.AddExtensionObject() method, which would give you the power you need.
MSDN has a great example of using this here
It's true that the MapForce and BizTalk applications make creating XSLT very easy, but they also cost a bit. Also, depending on your user base (assuming non-developers), I think you'll find that these applications have their own learning curves and are often too feature-rich for what you need.
I'd recommend that you consider building and distributing your own custom mapping tool specific to your users' needs.
Also if you need a library to assist with file conversions I'd recommend FileHelpers at SourceForge
DataDirect Technologies has a product that does exactly this.
At http://www.xmlconverters.com/ there is a library called XmlConverters which converts EDI to XML and vice-versa. There are also converters for CSV, JSON, and other formats.
The libraries are available as 100% .net managed code, and a parallel port in 100% Java.
The .net side supports XmlReader and XmlWriter, while the Java side supports SAX, StAX and DOM. Both also support stream and reader/writer I/O.
DataDirect also has an XQuery engine optimized for merging relational data with EDI and XML, but it is Java only.
Microsoft BizTalk Server does a very good job of this.
I'm currently programming an application that uses WPF.
I'm planning to load the GUI dynamically via XAML, based upon a given XML.
As I see it, I have two choices:
Evaluate the XML myself with XPath and create the GUI elements myself.
Generate XAML through an XSLT transformation and load that file.
So, the question is, which way is more suitable? Or is there no difference and it's just a question of which way I prefer more?
XSLT sounds like a bad choice:
As soon as things get a bit harder, you start hacking around. On top of that, the .NET Framework ships an XSLT version that is older than the latest one, which means you have far fewer capabilities available unless you bring in a third-party library for XSL transformations.
It forces developers to learn a new technology that you can easily avoid. Imagine a new developer with no XSLT experience taking over your work. I imagine the code will be hard to read even for experienced developers.
With XML, it's pretty straightforward. However, XPath can also become quite a mess if you start nesting and nesting.
Define an XML format, use XML-to-object deserialization, and start building the UI from the objects. Don't bother with XPath. Use XmlSerializer for "parsing".
Is it possible to serialize a class/object in C# and deserialize the same in Java? I want to serialize the class itself, not any XML/JSON data. Please clarify.
Thanks
I see 3 options here. I suggest option 1, Protobufs.
1. Look into Google's ProtoBufs
Or some equivalent. Here's the Java version. Here's a C# port.
Protobufs are meant for this sort of language interop. The format is binary, small, fast, and language agnostic.
It also has backwards compatibility, so if you change the serialized objects in the future, you can still read the old ones. This is transparent to you as well, as long as you write your code with the understanding that newer fields may be missing when you deserialize old objects. This is a huge advantage!
2. Implement one language's default serialization in the other
You can try implementing the Java serialization logic in C#, or the C# serialization routines in Java. I don't suggest this, as it will be more difficult, more verbose, almost certainly slower (you're writing new code), and will net you the same result.
3. Write your serialization routines by hand
This will certainly be fast, but tedious, more error prone, harder to maintain, less flexible...
Here are some benchmarks for libraries like ProtoBufs. These should aid you in selecting the best one for your use case.
We did this a while ago, and it worked after a lot of tinkering. It really depends on byte encoding: I think Java uses one byte order and C# uses another (little endian vs. big endian), so you will need to implement a deserializer that takes this effect into account. Hope this helps.
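To illustrate the byte-order point: Java's `DataOutputStream` writes multi-byte integers big-endian, while .NET's `BinaryWriter` writes them little-endian. Here is a minimal sketch of the Java side, assuming the four bytes came from a .NET writer. The class and method names are made up for this example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class EndianSketch {
    // Interprets 4 bytes as a little-endian int (the order .NET's BinaryWriter uses).
    static int readLittleEndianInt(byte[] bytes) {
        return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    // Interprets the same 4 bytes in big-endian order (Java's default).
    static int readBigEndianInt(byte[] bytes) {
        return ByteBuffer.wrap(bytes).order(ByteOrder.BIG_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        byte[] one = {0x01, 0x00, 0x00, 0x00}; // the value 1 in little-endian layout
        System.out.println(readLittleEndianInt(one)); // prints 1
        System.out.println(readBigEndianInt(one));    // prints 16777216
    }
}
```

Reading with the wrong byte order does not fail, it just silently yields a different number, which is why this kind of bug takes a lot of tinkering to find.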
As others have suggested, your options are going to be external serialization libraries (Google Protobuf, Apache Thrift, etc.), or simply using something built-in that's slower and less efficient bandwidth-wise (JSON, XML, etc.). You could also write your own, but believe me, it's a maintenance nightmare.
Not using native serialization. The built-in defaults are tied to the binary representation of the data types, which are different for the different VMs. The purpose of XML, JSON, and similar technologies is precisely to provide a format that's generic and can be moved between differing systems. For what it's worth, the overhead in serializing to JSON is usually small, and there's a lot of benefit to being able to read the serialized objects manually, so I'd recommend JSON unless you have a very specific reason why you can't.
Consider OMG's standard CORBA IIOP.
While you may not need the full-on "remote object" support of CORBA, IIOP is the underlying binary protocol for moving language-neutral objects (such as an object value parameter) across the wire.
For Java: Java EE EJBs are based on IIOP, there is RMI-IIOP, and there are various support libraries. The IDL-to-Java compiler is delivered with the JDK.
For C# IIOP & integration with Java EE, see IIOP.NET
You can also consider BSON, which is used by MongoDB.
If it is OK for your C#/Java programs to communicate with a MongoDB database, you could store your objects there and read them with the appropriate driver.
Regarding BSON itself, see BSON and Data Interchange at the MongoDB blog.
The XML field seems filled with jargon (well, to new XML users it's jargon): DTD, DOM, and SGML, just to name a few.
I've read up on what an XML document is, and what makes a document valid. What I need are the next steps, or how to actually use an XML document. For the .Net platform there seems to be a plethora of ways to traverse an XML document, xpath, XMLReader (from System.Xml), datasets, and even the lowly streamreader.
What is the best approach? Where can I find more "advanced beginner" material? Most of the material I find is about differences in XML parsing approaches (like performance, more advanced stuff that assumes one has XML experience), or explaining XML in general terms for non-programmers (how it's platform independent, human readable, etc.)
Thanks!
Also, for specifics, I'm using C# (so .NET). I've tinkered around with XML in VBA, but I've run into the same problems. The practical application here is getting an iOS application to dump info into a SQL Server database.
Download LINQPad and its samples. It has quite a large library of LINQ to XML examples that you might find very useful.
http://www.linqpad.net/
It's hard to do this without some idea of the problem you want to solve.
You need to make a decision whether you want to process the XML using procedural languages like C#, or declarative languages like XSLT and XQuery. For many tasks, the declarative languages will make your life much easier, but there is more of a learning curve, and a lot depends on where you are coming from in terms of previous experience. Generally working at the C# level is appropriate if your application is 10% XML processing and 90% other things, while XSLT/XQuery are more appropriate if it's 90% XML manipulation and 10% other things.
Learn the two primary (early) methods of processing an XML document: SAX and DOM. Then learn how to use one of the new "pull" parsers.
Without learning how to parse XML, you are in danger of designing XML that poorly supports the task(s) at hand.
Recommended reading, even if you are working in C#
Java & XML, 2nd Edition (O'Reilly)
Java & XML Data Binding (O'Reilly)
SAX and DOM are universal enough that the language differences between C# and Java are not the hardest part of using XML effectively. Perhaps there are C# equivalents of the above, if so then use them.
As far as the "best" means of using XML is concerned, it depends heavily on the task at hand. There's no "best" way of using a text document, either! If you are processing very large streams, SAX works great until you need to cross-reference. DOM is great for "whole document in memory" processing, but due to its nature it suffers when the documents get "too big".
The "right" solution is to tailor your XML to exploit the strengths of the means by which it will be processed and transformed into useful work, while avoiding the pitfalls that accompany the chosen processing methodology. That's pretty vague, but there's more than one way to skin this proverbial cat.
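A rough sketch of the difference, using the parsers that ship with the JDK (the sample document and the class name are invented for illustration). The DOM version loads the whole tree and then queries it, while the SAX version just counts elements as the events stream past and never holds the document in memory.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

final class SaxVsDomSketch {
    static final String SAMPLE = "<orders><order id=\"1\"/><order id=\"2\"/></orders>";

    // DOM: build the full in-memory tree first, then ask it questions.
    static int countWithDom(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getElementsByTagName(tag).getLength();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // SAX: react to start-element events; nothing is kept unless we keep it.
    static int countWithSax(String xml, String tag) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if (qName.equals(tag)) {
                    count[0]++;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(countWithDom(SAMPLE, "order")); // prints 2
        System.out.println(countWithSax(SAMPLE, "order")); // prints 2
    }
}
```

In .NET the rough counterparts are XmlDocument (or XDocument) for the in-memory style and XmlReader for the streaming style, although XmlReader is a pull parser rather than an event-driven one.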
For a little night project I would like to write a validation component that could be used in a .NET application to do the usual, tedious validation of objects, input parameters, and post-conditions.
My first idea was to put all this validation setup logic into an XML configuration file and provide a fluent interface for the people who would like to have it in code.
Because I would like to deliver something that is actually usable I thought about providing a specialized DSL (domain specific language). The question is what tools should I use to do this?
I thought about parsing it by hand using regex. But personally I would like to have something more...usable.
So what would you suggest?
It sounds like you're talking about implementing one of .Net 4.0's features, code contracts.
So I guess my recommended tool would be VS.Net 2010.
If you're looking specifically at a DSL, have a look at the ANTLR project. We've used it at my company quite successfully in the past.
The thing with DSLs is that they are rarely effective in isolation. To be useful for writing real software, you really need to be able to embed the DSL inside a host language. Compare, for example, the way LINQ works vs. just straight SQL. Another good example is the XML literal feature in VB. Both let you write real code in a general-purpose programming language and interweave it with simpler declarative DSL code.
The result is something much more powerful than standalone SQL or a simple XML editor.
The downside to this, unfortunately, is that neither C# nor VB offers any metaprogramming features, so the only way to do that for mainstream .NET devs is to build your own language. If this is something you are doing just for fun, you might be able to modify the Mono C# compiler to add the features you are interested in to the language. Another alternative might be to try Ruby. It has a flexible syntax which lets you get away with a lot of craziness. Personally, however, I would prefer the hacked C# approach.
Might want to check out Building Domain Specific Languages in Boo (Boo is a CLR language, concepts should carry over to C#). An example project is Simple State Machine.
Take a look at Oslo from Microsoft, not sure if it does what you want, but I know that you can build DSL parsers and specify grammars and have it generate class libraries that can parse data based on the grammar.
More details of the "M" language and how to use it are here
Why go so far? Start off using generics and a fluent interface. Start simple and work it through some production. If the friction is too high dealing with a fluent interface, then look at using a DSL.
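As a starting point, something along these lines is usually enough. This is only a sketch, written in Java here, though C# generics and lambdas allow the same shape. The `FluentValidator` class and its `of`/`must`/`validate` methods are names I made up for the example, not an existing library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical fluent validator: rules are chained and evaluated in order.
final class FluentValidator<T> {
    private final List<Function<T, String>> checks = new ArrayList<>();

    static <T> FluentValidator<T> of(Class<T> type) {
        return new FluentValidator<>();
    }

    // Registers a rule; the message is reported when the predicate is false.
    FluentValidator<T> must(Predicate<T> rule, String message) {
        checks.add(value -> rule.test(value) ? null : message);
        return this; // returning this is what makes the interface chainable
    }

    // Runs every rule and collects the messages of those that failed.
    List<String> validate(T value) {
        List<String> errors = new ArrayList<>();
        for (Function<T, String> check : checks) {
            String error = check.apply(value);
            if (error != null) {
                errors.add(error);
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        FluentValidator<String> name = FluentValidator.of(String.class)
                .must(s -> s != null && !s.isEmpty(), "name is required")
                .must(s -> s != null && s.length() >= 5, "name must be at least 5 characters")
                .must(s -> s == null || s.length() <= 50, "name must be at most 50 characters");
        System.out.println(name.validate("Bob"));
    }
}
```

If the chained calls start to feel too noisy for the people writing rules, that is the friction signal for moving to a real DSL.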
You can try DSL Tools for Visual Studio.
You should look at the Irony project at http://irony.codeplex.com/
I'm about to embark on a new project in which we need to re-use validation rules (preferably XML-based) on both the client and the server.
We would setup a service to provide the XML validation configuration data to the client side.
The following is not meant to be inflammatory in any way.
The Enterprise Library does support configuring object validation in XML, but Java developers would not have access to a Java reader for this XML format.
There is also Spring.NET validation, but again I think this may be tied too much to .NET. Is the Spring.NET validation suite a straight port of the Java Spring Framework one, i.e. without changes to the XML config?
Are there any other validation frameworks that can be used in both .NET and Java?
The project will be fully SOA and the validation is one of the last things I have to figure out.
EDIT:
To clarify: the validation needs to occur within the language that the receiving client is using, i.e. if the client to the web service is Java, then the validation rules would be read into Java and applied within Java, so that error conditions could be reported to the UI for the user to rectify. Equally, a .NET client would be able to read them in and provide the same functionality.
I don't want to validate within the XML itself. The XML will be a set of rules, e.g. Customer.Name is a required field, must be at least 5 chars, and can be at most 50 chars long.
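To make that concrete, here is roughly what I have in mind, as a sketch only. The XML rule format and its attribute names (`required`, `minLength`, `maxLength`) are made up for this example, and the Java reader below is just one possible client-side interpretation. A .NET client would need an equivalent reader for the same XML.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class RuleSketch {
    // Hypothetical rule format served to clients; not taken from any framework.
    static final String RULES =
            "<rules><field name=\"Customer.Name\" required=\"true\" "
            + "minLength=\"5\" maxLength=\"50\"/></rules>";

    // Applies the rules for one field; assumes all three attributes are present.
    static List<String> validate(String rulesXml, String fieldName, String value) {
        List<String> errors = new ArrayList<>();
        NodeList fields;
        try {
            fields = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(rulesXml.getBytes(StandardCharsets.UTF_8)))
                    .getElementsByTagName("field");
        } catch (Exception e) {
            throw new IllegalArgumentException("could not read rules", e);
        }
        for (int i = 0; i < fields.getLength(); i++) {
            Element rule = (Element) fields.item(i);
            if (!rule.getAttribute("name").equals(fieldName)) {
                continue;
            }
            if (value == null || value.isEmpty()) {
                if (Boolean.parseBoolean(rule.getAttribute("required"))) {
                    errors.add(fieldName + " is required");
                }
                continue; // length checks make no sense for an empty value
            }
            int min = Integer.parseInt(rule.getAttribute("minLength"));
            int max = Integer.parseInt(rule.getAttribute("maxLength"));
            if (value.length() < min) {
                errors.add(fieldName + " must be at least " + min + " characters");
            }
            if (value.length() > max) {
                errors.add(fieldName + " must be at most " + max + " characters");
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(validate(RULES, "Customer.Name", "Bob"));
    }
}
```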
Thanks
Pete
Have a look at DROOLS.
There are .Net and Java versions of the rules engine.
Java Link and .Net Link
I've not used the libraries, so I cannot comment on how "seamlessly" one set of rules could be used in both environments.
How about writing the validation in a scripting language that can be run both on the JVM and on .NET?
Scripting languages would be ideal for this kind of logic so maybe:
Ruby - http://www.ironruby.net/ and http://www.jruby.org/
or Perl.
This approach would allow you to use the exact same code for validation and then call it from Java or .NET.
Using JRuby wouldn't be much of a performance overhead, and it can integrate very closely with Java. I have less experience with IronRuby, but from what I've read, once the code has been loaded and is running, the performance is OK and it can be integrated well into .NET code - see: http://www.ironruby.net/Documentation/.NET/Hosting
Not to take away from my answer, but regardless of how you do this, it will involve introducing a new technology with all the associated overheads (dev environment etc.). A better approach may be to just do it in .NET and Java separately, but maintain a very extensive test suite of examples to ensure that the two validations remain in sync.
Not sure what sort of validation you are trying to accomplish. If your business objects are going to be serialized in XML form, then aside from schema validation, you can add further business rules and checks using Schematron.
Schematron is an ISO standard and provides a way of encoding business rules, restrictions and validation that is not possible in XML Schema.
The Schematron differs in basic concept from other schema languages in that it is not based on grammars but on finding tree patterns in the parsed document. This approach allows many kinds of structures to be represented which are inconvenient and difficult in grammar-based schema languages. If you know XPath or the XSLT expression language, you can start to use The Schematron immediately.
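To give a feel for the "tree pattern" idea without a Schematron processor, here is a sketch that evaluates a single XPath assertion against a document, using the XPath support built into the JDK. This is not Schematron itself, only the underlying idea. The class name, the sample document shape, and the assertion are assumptions made for the example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class XPathRuleSketch {
    // One business rule expressed as an XPath 1.0 boolean expression.
    static final String NAME_RULE =
            "string-length(/customer/name) >= 5 and string-length(/customer/name) <= 50";

    // Returns true when the assertion holds for the given document.
    static boolean holds(String xml, String xpathAssertion) {
        try {
            return (Boolean) XPathFactory.newInstance().newXPath().evaluate(
                    xpathAssertion,
                    new InputSource(new StringReader(xml)),
                    XPathConstants.BOOLEAN);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not evaluate rule", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(holds("<customer><name>Peter</name></customer>", NAME_RULE)); // true
        System.out.println(holds("<customer><name>Bob</name></customer>", NAME_RULE));   // false
    }
}
```

Because the rule is plain XPath, the same expression string can in principle be evaluated on the .NET side as well, which is what makes this style attractive for rules shared across platforms.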