I'm currently programming an application that uses WPF, and I'm planning to load the GUI dynamically via XAML based on a given XML file.
As I see it, I have two choices:
Evaluate the XML myself with XPath and create the GUI elements in code.
Generate XAML through an XSLT transformation and load the resulting file.
So, the question is: which way is more suitable? Or is there no real difference, and it's just a matter of which way I prefer?
XSLT sounds like a bad choice:
As soon as things get a bit harder, you start hacking around. Plus, the .NET Framework only supports an XSLT version that is older than the current one, meaning you have far fewer capabilities available unless you start using a third-party library for XSL transformations.
It forces developers to learn a new technology which you can easily avoid. Imagine a new developer taking over your work with no experience in XSLT. I imagine the code will be hard to read even for experienced developers.
With plain XML, it's pretty straightforward. However, XPath can also become quite a mess once you start nesting and nesting.
Define an XML format, use XML-to-object deserialization, and build the UI from the objects. Don't bother with XPath; use XmlSerializer for the "parsing".
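To illustrate, here is a minimal sketch of the XML-to-object approach with XmlSerializer. The layout format (LayoutDef, Button, Label, Width) is entirely made up for the example; you would design your own schema.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical layout description; replace with your own XML format.
public class ButtonDef
{
    [XmlAttribute] public string Label { get; set; }
    [XmlAttribute] public int Width { get; set; }
}

public class LayoutDef
{
    [XmlElement("Button")] public ButtonDef[] Buttons { get; set; }
}

public static class LayoutLoader
{
    // Deserialize the layout XML straight into objects; no XPath needed.
    public static LayoutDef Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(LayoutDef));
        using (var reader = new StringReader(xml))
            return (LayoutDef)serializer.Deserialize(reader);
    }
}
```

From there, building the WPF controls is a plain loop over the objects, e.g. creating a `new Button { Content = b.Label, Width = b.Width }` for each ButtonDef.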
Related
In a particular segment of a system I'm working on, we generate PDF and HTML files using XSLT (for email, print, and display). The business model being printed is in code (C#).
When designing the schema, I made special considerations for the requirements of the printed documents, since XSLT is much more difficult (possibly just for me?) to work with than C#. For example, I generate aggregate values and tables from the business model for display in the document. These decisions don't translate well to other areas where similar XML might be used.
I'm now facing the problem of others using the XML as well, and therefore breaking separation of concerns (SoC).
I'm leaning towards taking a snapshot of the XML they originally latched on to and giving them a new method. I personally don't see a problem with this (even in the face of DRY), but others might have a hard time understanding the trade-off. Is my reasoning flawed? Is there a better approach?
I know it's a personal thing, but my choice would always be to put as much of the logic as possible in the XSLT code rather than the C# code - the opposite of what you are doing. It means you're working in a higher-level declarative language, and one that is expressly designed for manipulating XML. There is a learning curve, of course, but at the top of the learning curve you will find sunlit uplands. And don't allow yourself to be put off by the limitations of XSLT 1.0: 2.0 leaves all those problems behind; you just have to be prepared to ditch Microsoft and use third-party technology (Microsoft stopped doing anything new in the XML space about a decade ago, but that doesn't mean you have to stay stuck in the past).
I'm working with some .NET services that may have to process very large XML documents, and I need to ensure that all processing is done in a streaming/pipelining fashion. I'm already using the XmlReader and XmlWriter classes. My question is, what is the best way to programmatically provide a filter into the reader and writer (either, depending upon the flow)?
(I am not looking for XSLT. I already do a lot with XSLT, and many of the things I'm looking to do are outside the scope of XSLT - or at least, implementing within XSLT would not be ideal.)
In Java and SAX, this would best be handled through an XMLFilterImpl. I don't see that .NET provides anything similar for working with an XmlReader. I did find this blog post, "On creating custom XmlReaders/XmlWriters in .NET 2.0, Part 2", which includes the following (I've fixed the first link, which was broken in the original post):
Here is the idea - have an utility wrapper class, which wraps
XmlReader/XmlWriter and does nothing else. Then derive from this class
and override methods you are interested in. These utility wrappers are
called XmlWrapingReader and XmlWrapingWriter. They are part of
System.Xml namespace, but unfortunately they are internal ones -
Microsoft XML team has considered making them public, but in the
Whidbey release rush decided to postpone this issue. Ok, happily these
classes being pure wrappers have no logic whatsoever so anybody who
needs them can indeed create them in a 10 minutes. But to save you
that 10 minutes I post these wrappers here. I will include
XmlWrapingReader and XmlWrapingWriter into the next Mvp.Xml library
release.
These two classes (XmlWrappingReader and XmlWrappingWriter) from the Mvp.Xml library are currently meeting my needs nicely. (As an added bonus, it's a free, open-source library, BSD licensed.) However, due to the stale status of this project, I do have some concerns about including these classes in a contracted, commercial development project that will be handed off. The last release of Mvp.Xml was 4.5 years ago, in July 2007. Additionally, there is this comment from a "project coordinator" in response to this project discussion:
Anyway, this is not really a supported project anymore. All devs moved
out. But it's open source, you are on your own.
I've also found a SAX equivalent for .NET, SAXDotNet, but it doesn't seem to be in any better shape, with its last release being in 2006.
I'm well aware that a stale project doesn't necessarily mean that it is any less useable, and will be moving forward with the 2 wrapper classes from the Mvp.Xml library - at least for now.
Are there any better alternatives that I should be considering? (Again, any solution must not require the entire XML to exist in-memory at any one time - whether as a DOM, a string, or otherwise.) Are there any other libraries available (preferably something from a more active project), or maybe something within the LINQ features that would meet these requirements?
Personally I find that writing a pipeline of filters works much better with a push model than a pull model, although both are possible. With a pull model, a filter that needs to generate multiple output events in response to a single input event is quite tricky to program, though of course it can be done by keeping track of the state. So I think that looking for a SAX-like approach makes sense.
I would look again at SaxDotNet or equivalents. Be prepared to look at the source code and bend it to your needs; consider contributing back your improvements. Intrinsically the job it is doing is very simple: a loop that reads events from the (pull) input and writes events to the (push) output. In fact, it's so simple that perhaps the reason it hasn't changed since 2006 is that it doesn't need to.
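The "loop that reads events from the (pull) input and writes events to the (push) output" can be sketched in a few lines. This is only an illustrative filter (it drops comments and handles just the common node types), not a complete XmlReader wrapper:

```csharp
using System;
using System.Xml;

public static class XmlFilter
{
    // Streams nodes from a pull reader to a push writer, dropping
    // comments along the way. Memory use stays constant regardless
    // of document size, since no tree is ever built.
    public static void CopyWithoutComments(XmlReader reader, XmlWriter writer)
    {
        while (reader.Read())
        {
            switch (reader.NodeType)
            {
                case XmlNodeType.Element:
                {
                    bool isEmpty = reader.IsEmptyElement; // capture before moving to attributes
                    writer.WriteStartElement(reader.Prefix, reader.LocalName, reader.NamespaceURI);
                    writer.WriteAttributes(reader, false);
                    if (isEmpty)
                        writer.WriteEndElement();
                    break;
                }
                case XmlNodeType.Text:
                    writer.WriteString(reader.Value);
                    break;
                case XmlNodeType.EndElement:
                    writer.WriteFullEndElement();
                    break;
                case XmlNodeType.Comment:
                    break; // the filter: skip this event
                // extend with CDATA, PIs, whitespace, etc. as needed
            }
        }
    }
}
```

Chaining several such loops gives you a crude pipeline; each stage decides per event whether to pass it through, change it, or drop it.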
The XML field seems filled with jargon (well, to new XML users it's jargon): DTD, DOM, and SGML, just to name a few.
I've read up on what an XML document is, and what makes a document valid. What I need are the next steps, or how to actually use an XML document. For the .NET platform there seems to be a plethora of ways to traverse an XML document: XPath, XmlReader (from System.Xml), DataSets, and even the lowly StreamReader.
What is the best approach? Where can I find more "advanced beginner" material? Most of the material I find is about differences in XML parsing approaches (like performance, more advanced stuff that assumes one has XML experience), or explaining XML in general terms for non-programmers (how it's platform independent, human readable, etc.)
Thanks!
Also, for specifics, I'm using C# (so .NET). I've tinkered around with XML in VBA, but I've run into the same problems. The practical application here is getting an iOS application to dump info into a SQL Server database.
Download LINQPad and its samples. It has quite a large library of examples of LINQ to XML that you might find very useful.
http://www.linqpad.net/
It's hard to do this without some idea of the problem you want to solve.
You need to make a decision whether you want to process the XML using procedural languages like C#, or declarative languages like XSLT and XQuery. For many tasks, the declarative languages will make your life much easier, but there is more of a learning curve, and a lot depends on where you are coming from in terms of previous experience. Generally working at the C# level is appropriate if your application is 10% XML processing and 90% other things, while XSLT/XQuery are more appropriate if it's 90% XML manipulation and 10% other things.
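If you're coming from C#, LINQ to XML is the usual entry point for the procedural route. A small sketch (the element and attribute names here are invented for the example):

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class PeopleQuery
{
    // Parse the document once, then query it with ordinary LINQ.
    // "person", "name", and "age" are hypothetical names; adjust
    // them to whatever your actual schema uses.
    public static string[] NamesOver(string xml, int minAge)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Descendants("person")
                  .Where(p => (int)p.Attribute("age") > minAge)
                  .Select(p => (string)p.Attribute("name"))
                  .ToArray();
    }
}
```

The same query in XSLT or XQuery would be a template or FLWOR expression; which feels more natural depends, as above, on how much of your application is XML processing.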
Learn the two primary (early) methods of processing an XML document: SAX and DOM. Then learn how to use one of the new "pull" parsers.
Without learning how to parse XML, you are in danger of designing XML that poorly supports the task(s) at hand.
Recommended reading, even if you are working in C#
Java & XML, 2nd Edition (O'Reilly)
Java & XML Data Binding (O'Reilly)
SAX and DOM are universal enough that the language differences between C# and Java are not the hardest part of using XML effectively. Perhaps there are C# equivalents of the above, if so then use them.
As far as the "best" means of using XML is concerned, it depends heavily on the task at hand. There's no "best" way of using a text document, either! If you are processing very large streams, SAX works great until you need to cross-reference. DOM is great for "whole document in memory" processing, but due to its nature it suffers when the documents get "too big".
The "right" solution is to tailor your XML to exploit the strengths of the means by which it will be processed and transformed into useful work, while avoiding the pitfalls that accompany the chosen processing methodology. That's pretty vague, but there's more than one way to skin this proverbial cat.
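In .NET, the "pull" parser mentioned above is XmlReader. A minimal streaming example, counting hypothetical <item> elements without ever building a DOM:

```csharp
using System.IO;
using System.Xml;

public static class PullDemo
{
    // Streams through the document node by node; memory use is
    // constant, so arbitrarily large input is fine. "item" is just
    // an example element name.
    public static int CountItems(TextReader input)
    {
        int count = 0;
        using (var reader = XmlReader.Create(input))
            while (reader.Read())
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "item")
                    count++;
        return count;
    }
}
```

Unlike SAX, you ask for the next node when you want it (pull) instead of registering callbacks (push), which usually makes the control flow easier to follow for beginners.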
.Net is a huge framework with some functionality that appears to target beginners or becomes problematic if much customization is involved. So what functionality available in the .Net framework do you feel professional developers should avoid and why?
For example, .Net has a wizard for common user management functions. Is using this functionality considered appropriate for professional use or a beginner only?
One component/feature/class, etc., per answer, please, so votes are specific to a single item.
Typed DataSets
ASP.NET *View Controls
ASP.NET *DataSource Controls
MS Ajax
jQuery and other JS frameworks like Prototype are a more lightweight and flexible alternative. The MS Ajax controls may seem great initially, until you really need custom behaviour outside the scope of the controls.
Microsoft themselves have recognised this to some extent, in that jQuery will be bundled with upcoming versions of Visual Studio, with IntelliSense support.
I think generally most controls/features that do a lot of work "behind the scenes" can cause a lot of trouble. No problem using a GridView if that layout is exactly what you want - but it very rarely is, and a Repeater is probably a better choice. UpdatePanels can save you lots of work if you want an ajaxy feel to your site, but compared with a jQuery AJAX call they - sorry to say so - suck. The user wizard you mention can be really useful during development, but if membership functionality is required in the project it should be built as an integrated part of it.
So in summary: professional programmers should do the job themselves and write code that specifically satisfies their clients' needs, and only take in ready-made parts of the .NET Framework when that is in fact exactly what they need.
Thread.Abort
Here is an excellent article by Ian Griffiths about Why Thread.Abort is Evil and some better alternatives.
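One of the alternatives that article recommends is cooperative cancellation: the worker checks a flag and exits cleanly, instead of having an exception injected at an arbitrary point the way Thread.Abort does. A minimal sketch using CancellationToken (available from .NET 4 onward):

```csharp
using System.Threading;

public static class Worker
{
    // Cooperative cancellation: the loop polls the token and exits
    // cleanly at a safe point, leaving no state half-mutated.
    // The 1000-iteration cap just keeps the example bounded.
    public static int CountUntilCancelled(CancellationToken token)
    {
        int i = 0;
        while (!token.IsCancellationRequested && i < 1000)
            i++;
        return i;
    }
}
```

The caller owns a CancellationTokenSource and calls Cancel() on it; the worker never gets torn down mid-operation, which is exactly the hazard Thread.Abort introduces.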
Remoting is generally a good one to avoid, at least if you're targeting 3.0 or above and can therefore easily host messaging endpoints in-process.
Linq To XML
XmlDocument/XPath is easier to use; if you want strong typing to parse your document, use xsd.exe or Xsd2Code.
EDIT
Which one do you prefer?
IEnumerable<XElement> partNos =
    from item in purchaseOrder.Descendants("Item")
    where (int)item.Element("Quantity") *
          (decimal)item.Element("USPrice") > 100
    orderby (string)item.Element("PartNumber")
    select item;
or, with XmlDocument and XPath
var nodes = myDocument.SelectNodes("//Item[USPrice * Quantity > 100]");
.Net is a huge framework with some functionality that appears to target beginners or becomes problematic if much customization is involved.
It's the "appears to target beginners" that's the real problem.
Typed data sets are a great example. VS provides a nice simple UI to functionality that should be used only by rank beginners building extremely simple demo applications and experienced professionals who understand every nuance of the ADO.NET object model and what typed data sets are actually doing. Nobody in between those two poles should touch it, because there's a good way to learn every nuance of the ADO.NET object model and typed data sets aren't it.
Or take LINQ. It's seductively easy to write LINQ code without having a good understanding of IEnumerable<T>. But it's not so easy to write maintainable LINQ code without that knowledge.
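A concrete example of the IEnumerable<T> subtlety meant here is deferred execution. The query below does no work when it is built, and re-runs from scratch every time it is enumerated - a classic source of hidden double work in "seductively easy" LINQ code (the counter is just instrumentation for the example):

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    public static int Evaluations; // counts how many times the lambda runs

    public static IEnumerable<int> Doubled(IEnumerable<int> source)
    {
        // Select is lazy: nothing executes until the result is
        // enumerated, and it re-executes on EVERY enumeration
        // unless the caller materializes it with ToList/ToArray.
        return source.Select(x => { Evaluations++; return x * 2; });
    }
}
```

Enumerating the returned query twice runs the projection twice over every element; a maintainer who does not know this will happily pass such queries around and pay the cost repeatedly.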
You can think of .NET like an onion with many layers. For example, the .NET Compact Framework is a subset of the full .NET Framework. Further, there are "extra" layers on top of .NET in the form of "Extensions", which are optional installs for new features that have not yet been made part of .NET proper. An example of this would be when Microsoft released the ASP.NET 3.5 Extensions, which have since been rolled into .NET 3.5 SP1.
Another way to think of .NET is as a set of "libraries" you can use. For example, there is a set of routines to support regular expressions. If you want or need regular expressions, you use those functions; if not, you can simply ignore them. Similarly, there are functions for things like trigonometry or security.
So I guess it really boils down to what do you need for your application? If you are doing scientific programming you may well want the trig functions. A graphical app will require functions that a console application would not. Web apps probably do not need to use the clipboard functions etc.
I really don't think there are any bad APIs in .NET, just programmers who use them in inappropriate ways.
There is lots to avoid in the WinForms library.
Avoid DataBinding to most standard WinForms controls. There are many bugs in that area which will lead to lots of head scratching. Or at least that has been my experience. NumericUpDown is a good example of this buggy mess.
Also avoid the standard WinForms controls when dealing with large datasets. They do a lot of data copying and can't deal well with large datasets.
Avoid ListView in "Virtual" mode as it is full of bugs.
In general I just recommend staying away from WinForms. If you have the option, go for WPF, or at least buy a good, well-supported (and hopefully less buggy) third-party forms library.
We're looking for a transformation library or engine which can read any input (EDIFACT files, CSV, XML, stuff like that - so files, or web service results, that contain data which must be transformed to a known business-object structure). This data should be transformed to an existing business object using custom rules. XSLT is both too complex (to learn) and too simple (not enough features).
Can anybody recommend a C# library or engine? I have seen Altova MapForce but would like something I can send out to dozens of people who will build / design their own transformations without having to pay dozens of Altova licenses.
If you think that XSLT is too difficult for you, I think you can try LINQ to XML for parsing XML files. It is integrated into the .NET Framework, and you can use C# (or VB.NET 9.0, which is even better because of its XML literals) instead of learning another language. You can integrate it with the existing application without much effort and without the paradigm mismatch between the language and the file handling that occurs with XSLT.
Microsoft LINQ to XML
Sure, it's not a framework or library for parsing files, but neither is XSLT, so...
XSLT is not going to work for EDI and CSV. If you want a completely generic transformation engine, you might have to shell out some cash. I have used Symphonia for dealing with EDI, and it worked, but it is not free.
The thing is, the problem you are describing sounds "enterprisey" (I am sure nobody uses EDI for fun), so there's no open-source/free tooling for dealing with this stuff.
I wouldn't be so quick to dismiss XSLT as being too complex or as not containing the features you require.
There are plenty of books and websites out there that describe everything you need to know about XSLT. Yes, there is a bit of a learning curve, but it doesn't take much to get into it, and there's always a great community like Stack Overflow to turn to if you need help ;-)
As for the lack of features, you can always extend XSLT and call .NET assemblies from it using the XsltArgumentList.AddExtensionObject() method, which would give you the power you need.
MSDN has a great example of using this here
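A minimal sketch of the extension-object mechanism, in case the MSDN link goes stale. The helper class, its Shout method, and the urn:my-helpers namespace are all made up for the example; the stylesheet binds that namespace and calls the method like a built-in function:

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public class StringHelpers
{
    // Public methods on the extension object become callable
    // functions in the stylesheet's bound namespace.
    public string Shout(string s) => s.ToUpperInvariant();
}

public static class XsltDemo
{
    public static string Transform(string xsltText, string inputXml)
    {
        var xslt = new XslCompiledTransform();
        xslt.Load(XmlReader.Create(new StringReader(xsltText)));

        // Bind the helper object to a namespace URI of our choosing;
        // the stylesheet must declare the same URI (xmlns:h="urn:my-helpers").
        var args = new XsltArgumentList();
        args.AddExtensionObject("urn:my-helpers", new StringHelpers());

        var output = new StringWriter();
        xslt.Transform(XmlReader.Create(new StringReader(inputXml)), args, output);
        return output.ToString();
    }
}
```

Inside the stylesheet you would then write something like select="h:Shout(string(/msg))", handing the heavy lifting back to C# while XSLT keeps driving the structure of the output.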
It's true that the MapForce and BizTalk applications make creating XSLT very easy, but they also cost a bit. Also, depending on your user base (assuming non-developers), I think you'll find that these applications have their own learning curves and are often too feature-rich for what you need.
I'd recommend considering building and distributing your own custom mapping tool specific to your users' needs.
Also, if you need a library to assist with file conversions, I'd recommend FileHelpers on SourceForge.
DataDirect Technologies has a product that does exactly this.
At http://www.xmlconverters.com/ there is a library called XmlConverters which converts EDI to XML and vice-versa. There are also converters for CSV, JSON, and other formats.
The libraries are available as 100% .NET managed code, with a parallel port in 100% Java.
The .net side supports XmlReader and XmlWriter, while the Java side supports SAX, StAX and DOM. Both also support stream and reader/writer I/O.
DataDirect also has an XQuery engine optimized for merging relational data with EDI and XML, but it is Java only.
Microsoft BizTalk Server does a very good job of this.