I am writing a parser to parse incoming text files. I have it to where it will parse everything accurately.
I have an option for it to output to text - this was done to check the accuracy of the parsing. I am currently implementing an option to write to a spreadsheet but it doesn't output everything yet.
I have a request to output as static HTML. Is it worth outputting to XML and then generating HTML from that?
I see .NET has the XslCompiledTransform class, which looks like it would do what I need. Is using the XML designer in VS and writing the XSLT file easier than hand-coding all of the HTML output? I know Excel will import XML files, but it is a little messy and I don't get the formatting options I can get if I generate the .xls file directly.
I would give you a qualified No.
It is generally not worth building XML then running it through an XSLT transformation to build HTML.
That said, I might consider such an option if I wanted to easily swap out transformations, such as if this is an app used by multiple clients and the generated HTML would be client dependent. Even then I'd investigate using a simple tokenized HTML template in which I just plugged in the data I wanted. However, if the transformation was sufficiently complex then, yes, I'd go the XSLT route.
The reason for the No is that the conversion adds a level of complexity that is usually not worth the time involved.
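As a rough illustration of the tokenized-template idea, here is a minimal sketch. The `{{Key}}` token syntax and the `HtmlTemplate.Fill` helper are assumptions made for this example, not part of any library; only `WebUtility.HtmlEncode` is a real .NET call.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

public static class HtmlTemplate
{
    // Replaces each {{Key}} token with its HTML-encoded value.
    // The {{...}} syntax is an assumption made for this sketch.
    // Replacement is sequential, so a value that itself contains a token
    // would be expanded again; good enough for trusted, simple data.
    public static string Fill(string template, IDictionary<string, string> values)
    {
        string result = template;
        foreach (KeyValuePair<string, string> pair in values)
        {
            string token = "{{" + pair.Key + "}}";
            result = result.Replace(token, WebUtility.HtmlEncode(pair.Value));
        }
        return result;
    }

    public static void Main()
    {
        string template = "<html><body><h1>{{Title}}</h1><p>{{Body}}</p></body></html>";
        var values = new Dictionary<string, string>
        {
            { "Title", "Parse results" },
            { "Body", "3 < 5 & counting" }
        };
        Console.WriteLine(Fill(template, values));
    }
}
```

Swapping the output per client then means swapping the template string (or file), with no XSLT involved.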
I'm using XML configuration files with custom XML tags in a small open-source JavaScript project.
The framework takes xml config files written by the user.
Because simple XML is a little bit too much for some users and it feels like handcoding, I want to build a little editor.
The XML files are really small and simple: a text (maybe with HTML formatting) and two other text parameters (simple tags), one of which has an additional binary option.
I thought this use case happens regularly and perhaps someone has coded a framework or template for building such an editor. I could really save some time.
I googled for that, but found nothing.
The language the template/framework uses isn't important, but I would prefer Java, C# or Python so it runs on many platforms.
Does anybody know a tool, framework or template I could use?
I'm beginning to work on a project which has some extensive XML XSLT processing to render output HTML.
Some changes need to be made to the XSLT and I need some tool that can help me modify it without having to run the solution every time. Something that can help me visualize the changes I'm making to the rendered HTML.
I've found Stylus Studio, but I would prefer a freeware tool.
It's not freeware, but Altova XMLSpy is a pretty powerful XML IDE. It offers an XSLT debugger where you can step through your conversion, as well as generate output (HTML in your case) from a sample XML document with the XSLT document you are working on.
I have a requirement to hand-code a text file from data residing in a SQL table. Just wondering if there are any best practices here. Should I write it as an XmlDocument first and transform it using XSL, or just use StreamWriter and skip the transformation altogether? The generated text file will be in EDIFACT format, so the layout is very specific.
The normal thing to do is just write the EDIFACT data directly.
Creating it as an XMLDocument and transforming it to EDIFACT might be useful if there's a library already available to do the transformation. I say this because there's a lot of language support for XML output.
I can't see how XSL will help you here, but I've never had to output EDIFACT data.
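For what writing the data directly might look like, here is a minimal sketch. It assumes the default EDIFACT service characters (the ones in the `UNA:+.? '` service string); `EdifactWriter`, `Escape` and `WriteSegment` are hypothetical helpers, and the NAD segment content is illustrative only, not a validated message.

```csharp
using System;
using System.IO;
using System.Text;

public static class EdifactWriter
{
    // Default EDIFACT service characters (assumed; a UNA segment can override them).
    const char ComponentSeparator = ':';
    const char ElementSeparator = '+';
    const char ReleaseCharacter = '?';
    const char SegmentTerminator = '\'';

    // Prefixes every service character found in the data with the release character.
    public static string Escape(string value)
    {
        var sb = new StringBuilder();
        foreach (char c in value)
        {
            if (c == ComponentSeparator || c == ElementSeparator ||
                c == ReleaseCharacter || c == SegmentTerminator)
            {
                sb.Append(ReleaseCharacter);
            }
            sb.Append(c);
        }
        return sb.ToString();
    }

    // Writes one segment: the tag, then each data element; an element is an array of components.
    public static void WriteSegment(TextWriter writer, string tag, params string[][] elements)
    {
        writer.Write(tag);
        foreach (string[] components in elements)
        {
            writer.Write(ElementSeparator);
            for (int i = 0; i < components.Length; i++)
            {
                if (i > 0) writer.Write(ComponentSeparator);
                writer.Write(Escape(components[i]));
            }
        }
        writer.Write(SegmentTerminator);
    }

    public static void Main()
    {
        // Any TextWriter works here, including a StreamWriter over the output file.
        var output = new StringWriter();
        WriteSegment(output, "NAD", new[] { "BY" }, new[] { "12345", "", "9" });
        Console.WriteLine(output.ToString()); // prints NAD+BY+12345::9'
    }
}
```

Keeping the escaping in one place is the main point: the rest of the program only decides which segments to emit and in what order.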
http://www.stylusstudio.com/edi/XML_to_EDIFACT.html
This URL has an example XSLT for translating XML to EDIFACT which might solve your problem.
I've been tasked with converting some text log files from a test reporting tool that I've inherited. The tool is a compiled C# (.NET 3.5) application.
I want to parse and convert a group of logically connected log files to a single XML report file, which is not a problem. The System.Xml classes are easy enough to use.
However, I also want to create a more "readable" file to accompany each report. I've chosen HTML, and because I like standardization, I'd prefer to do it in proper XHTML.
My question is how should I go about creating the HTML files along with the XML reports? My initial thought is to build the XML file, then to use LINQ and a simple StreamWriter to build an HTML file within my C# code. I could also use XSLT as opposed to LINQ to make the C# code easier. But since I have to compile this anyway, I don't like the idea of adding more files to the installation/distribution.
Is using LINQ going to cause me any problems as opposed to XSLT? Are there any nice HTML writing libraries for .NET that conform to XHTML? Since I have everything parsed from the log files in working memory, is there an easy way to create both files at the same time easily?
I'd create an xslt transform and just run that against the XML. Linq really isn't designed to transform XML of one schema (e.g., your report) to another (e.g., xhtml). You could brute force it, but xslt is an elegant way to do it.
I would actually recommend using an XSL transform, since you already have the XML doc. If you write a good XSL transform you will get very good results.
http://www.w3schools.com/xsl/xsl_transformation.asp
small snippet:
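using System.IO;
using System.Text;
using System.Web;
using System.Xml;
using System.Xml.Xsl;
static string TransformToHtml(XmlDocument xmlDoc, string xslPath)
{
    XslCompiledTransform xsl = new XslCompiledTransform();
    // Server.MapPath applies to ASP.NET; in a desktop app, pass a file path to Load directly.
    xsl.Load(HttpContext.Current.Server.MapPath(xslPath));
    StringBuilder sb = new StringBuilder();
    using (TextWriter tw = new StringWriter(sb))
    {
        // Where the magic happens
        xsl.Transform(xmlDoc, null, tw);
    }
    // The transformed text, which you could save to a file
    return sb.ToString();
}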
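If shipping a separate .xslt file with the application is a concern, as the question mentions, one option is to keep the stylesheet as a string constant (or an embedded resource) and load it through an XmlReader. The sketch below assumes a hypothetical report format with a `report` root and `test` elements carrying `name` and `result` attributes; only the `System.Xml` and `System.Xml.Xsl` calls are real API.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class ReportTransformer
{
    // The stylesheet lives in the source code, so no extra file is distributed.
    // Element and attribute names (report, test, name, result) are hypothetical.
    const string Stylesheet = @"<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='xml' indent='yes' omit-xml-declaration='yes'/>
  <xsl:template match='/report'>
    <html xmlns='http://www.w3.org/1999/xhtml'>
      <head><title>Test report</title></head>
      <body>
        <ul>
          <xsl:for-each select='test'>
            <li><xsl:value-of select='@name'/>: <xsl:value-of select='@result'/></li>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>";

    public static string ToXhtml(XmlDocument report)
    {
        var xslt = new XslCompiledTransform();
        using (XmlReader stylesheetReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            xslt.Load(stylesheetReader);
        }
        using (var output = new StringWriter())
        {
            xslt.Transform(report, null, output);
            return output.ToString();
        }
    }

    public static void Main()
    {
        var report = new XmlDocument();
        report.LoadXml("<report><test name='login' result='pass'/></report>");
        Console.WriteLine(ToXhtml(report));
    }
}
```

The same in-memory XmlDocument can be saved as the XML report and passed to `ToXhtml`, which covers producing both files in one pass.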
One nice thing with using the XSLT is that you can put a processing instruction at the top of the XML that tells the user's browser how to generate the report itself:
<?xml-stylesheet type="text/xsl" href="Your_XSLT.xslt"?>
That way you don't have to have a separate step to generate the report. The user just opens the XML file directly in the browser, but they see the generated report instead.
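Since the report is already built with the System.Xml classes, the processing instruction can be added in code rather than by hand. A minimal sketch, where `StylesheetLink.Attach` is a hypothetical helper and the report content is made up:

```csharp
using System;
using System.Xml;

public static class StylesheetLink
{
    // Inserts <?xml-stylesheet type="text/xsl" href="..."?> before the root element.
    // Assumes the document already has a root element.
    public static void Attach(XmlDocument doc, string xsltHref)
    {
        XmlProcessingInstruction pi = doc.CreateProcessingInstruction(
            "xml-stylesheet",
            "type=\"text/xsl\" href=\"" + xsltHref + "\"");
        doc.InsertBefore(pi, doc.DocumentElement);
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<report><test name=\"login\" result=\"pass\"/></report>");
        Attach(doc, "Your_XSLT.xslt");
        Console.WriteLine(doc.OuterXml);
    }
}
```

The href is resolved relative to the XML file, so the stylesheet has to be distributed next to the report for the browser to find it.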
What is the best way to convert between HTML, XML, and XSL-FO in C#?
I already have the HTML (piped in from FCKEditor) and I'd like to print a PDF (I have an XSL->PDF converter). I just can't seem to find a library that will convert from HTML into anything XSL friendly.
A year or two back, I had to generate pdfs from a C++/C# program. In the end I settled on launching Apache's Java FOP as a separate process to do the conversion. The experience with xsl-fo was not a pleasant one. At the time, there didn't appear to be a single tool that had implemented xsl-fo completely. Tools tended to pick a subset of the specification and hack away at that. Given the sprawling complexity of xsl-fo, I'm starting to wonder if there will ever be a full implementation.
FOP tended to be buggy and considerable time was spent working around issues. XSLT and XPaths were difficult to learn. It took a few weeks before I was seeing past the verbosity and could quickly get things done. I don't think I ever quite got my head around xsl-fo though. It makes the html and css model look like a child's toy. Luckily, the pdfs generate, and don't have too many problems. :-)
Anyway, the task at hand: generating pdfs from xhtml output from FCKEditor.
I just can't seem to find a library that will convert from HTML into anything XSL friendly.
Heh. Yeah, that's 'cos there isn't one, and there probably won't be an html to xsl-fo converter that's any good. Such a converter has a few things against it: complexity of browsers and complexity of xsl-fo. For such a converter to deal with an average html document, it needs the guts of a web browser: the layout, css support, probably even JavaScript. Then it has to take the rendered page, and figure out what xsl-fo is needed to get something which looks similar, and fits within the paged constraints of xsl-fo.
It's like the problem with making a word viewer: without reimplementing a lot of word, it sucks most of the time because it doesn't look the same.
So... what can you do? Well, having a small subset of html to work with is a good start. Hopefully the output from FCKEditor is xhtml, as getting html into xml is a world of pain in itself (which tidy can be useful for). Next, unless some poor soul has already made an FCKEditor xhtml -> xsl-fo xslt for your xsl-fo implementation, you'll have to make one. That involves learning xsl-fo, xslt and xpath. In my experience it'll take a few weeks and will be a cobbled together solution.
To get started with xsl-fo I found the following links useful:
XSL-FO Tutorial
XSL Standard
Apache FOP Compliance Page
XSL-FO: Ready for Prime Time? outlines the problem xsl-fo tries to solve
For three quick intros see a, b and c
So what's all this xsl-fo, xslt stuff and all the other things? The XSL-FO: Ready for Prime Time? lays it out as:
The Extensible Stylesheet Language Family (XSL): XSL is a family of recommendations for defining XML document transformation and presentation. It consists of three parts:
XSL Transformations (XSLT), a language for transforming XML
The XML Path Language (XPath), an expression language used by XSLT to access or refer to parts of an XML document. (XPath is also used by the XML Linking specification)
XSL Formatting Objects (XSL-FO), an XML vocabulary for specifying formatting semantics
My advice? Run. Find another way. Find another solution. Generate LaTeX files, and convert them into pdfs. Generate something else. Make word documents and print them using PDFCreator. Generate images. Control Firefox to print pages as pdfs. Find a way to avoid needing pdfs at all. Anything, as long as it isn't fighting html, xsl-fo, FOP, xslt and xpath.
PS: Let me know if you need any help. :-)
I'd first try XSLT. When you're talking about formatting XML documents (and that's pretty much what you're talking about), that's the tool designed to do it.
From Wiki:
"The general idea behind XSL-FO's use is that the user writes a document, not in FO, but in an XML language. XHTML, DocBook, and TEI are all possibilities, but it could be any XML language. Then, the user obtains an XSLT transform, either by writing one themselves or by finding one for the document type in question. This XSLT transform converts the XML into XSL-FO."
You need an XSLT transform for HTML to XSL-FO. Not sure where to get one, but apparently the concept isn't alien.
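To give a feel for what such a transform involves, here is a deliberately tiny sketch that maps a handful of XHTML elements to XSL-FO. It is an assumption-laden illustration, not a usable converter: it only handles `p`, `b`/`strong` and `i`/`em`, expects namespaced XHTML without a DOCTYPE, and the A4 page size is an arbitrary choice. `XhtmlToFo` is a hypothetical name.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class XhtmlToFo
{
    // Unmatched elements fall through to XSLT's built-in rules, which keep
    // their text and drop the markup. Text outside <p> would land directly
    // in fo:flow, which a formatter may reject.
    const string Stylesheet = @"<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
    xmlns:fo='http://www.w3.org/1999/XSL/Format'
    xmlns:h='http://www.w3.org/1999/xhtml'
    exclude-result-prefixes='h'>
  <xsl:strip-space elements='h:html h:body'/>
  <xsl:template match='/'>
    <fo:root>
      <fo:layout-master-set>
        <fo:simple-page-master master-name='page' page-height='297mm' page-width='210mm'>
          <fo:region-body/>
        </fo:simple-page-master>
      </fo:layout-master-set>
      <fo:page-sequence master-reference='page'>
        <fo:flow flow-name='xsl-region-body'>
          <xsl:apply-templates select='//h:body'/>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>
  </xsl:template>
  <xsl:template match='h:p'>
    <fo:block space-after='6pt'><xsl:apply-templates/></fo:block>
  </xsl:template>
  <xsl:template match='h:b | h:strong'>
    <fo:inline font-weight='bold'><xsl:apply-templates/></fo:inline>
  </xsl:template>
  <xsl:template match='h:i | h:em'>
    <fo:inline font-style='italic'><xsl:apply-templates/></fo:inline>
  </xsl:template>
</xsl:stylesheet>";

    public static string ToFo(string xhtml)
    {
        var xslt = new XslCompiledTransform();
        using (XmlReader stylesheetReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            xslt.Load(stylesheetReader);
        }
        using (XmlReader input = XmlReader.Create(new StringReader(xhtml)))
        using (var output = new StringWriter())
        {
            xslt.Transform(input, null, output);
            return output.ToString();
        }
    }

    public static void Main()
    {
        string xhtml = "<html xmlns='http://www.w3.org/1999/xhtml'><body>" +
                       "<p>Hello <b>world</b></p></body></html>";
        Console.WriteLine(ToFo(xhtml));
    }
}
```

Every additional HTML feature (lists, tables, images, inline styles) needs its own template, which is where the weeks of effort described in the other answers go.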
Very informative exchange here. I have created a web application using ASP.NET and C#.NET for my IT contract business. One of the major goals of the web app is to generate customized resumes in various formats. I store my resume content in a SQL Server database and build the XML mostly raw in a C# method. I used XSLT to convert to HTML and, with a little awkwardness, have finally got a basic presentable resume. My next goal is to get a printable version of the resume. I got a book on XML from the library and touched up the XSLT a little. Then I came to the XSL-FO chapter. That's when the iceberg hit. I wanted to take on the challenge of having a PDF option that would be a menu choice and do a transform from XSLT to XSL-FO to PDF. Thing is, all the book recommendations had references to commercial products. It is just not worth the money, as PDF is not necessary. I looked at Altova XMLSpy on a 30-day trial basis, but as soon as I tried my first transform of an XSL-FO example file I got a message stating that I needed to download more software. That download was taking forever from their site, so I gave up and removed the software. Free versions of the commercial software from other vendors do not have the transform option. After reading the notes here I have decided to avoid XSL-FO myself. I am going to try getting an MS Word version now, and if my clients want to convert it to PDF they can pay for the PDF creation version from Adobe.
This is a dead question, but I would like to add for future readers that the current incarnation of FCKEditor (CKEditor now) is better at producing high-quality XHTML (even a user-definable set of tags is possible).
I have gotten around similar issues by not using XSL-FO at all, and instead using an (X)HTML to PDF converter that renders the PDF from your source without XSL transforms. I validate the produced XHTML and fix the rare issues with HtmlAgilityPack, which gets you a long way past the complexities of non-semantic HTML. There are many converters to choose from; my choice is wkhtmltopdf. (If money is not an issue, PrinceXML is a superior alternative. I would love to use it, but it's simply too expensive.)
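For reference, calling wkhtmltopdf from C# can be as simple as launching it as a separate process. A minimal sketch, assuming the wkhtmltopdf executable is installed and on the PATH; `PdfExport` and the file names are hypothetical, and the only command-line form used is the basic `wkhtmltopdf <input> <output>`.

```csharp
using System;
using System.Diagnostics;

public static class PdfExport
{
    // Builds the command line: wkhtmltopdf "<input>" "<output>".
    // Assumes wkhtmltopdf is on the PATH; otherwise set FileName to its full path.
    public static ProcessStartInfo BuildStartInfo(string inputHtmlPath, string outputPdfPath)
    {
        return new ProcessStartInfo
        {
            FileName = "wkhtmltopdf",
            Arguments = "\"" + inputHtmlPath + "\" \"" + outputPdfPath + "\"",
            UseShellExecute = false,
            CreateNoWindow = true
        };
    }

    // Runs the converter and returns its exit code (0 is the usual success value).
    public static int Convert(string inputHtmlPath, string outputPdfPath)
    {
        using (Process process = Process.Start(BuildStartInfo(inputHtmlPath, outputPdfPath)))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }

    public static void Main()
    {
        // Only prints the command line; call Convert to actually run the tool.
        ProcessStartInfo info = BuildStartInfo("resume.html", "resume.pdf");
        Console.WriteLine(info.FileName + " " + info.Arguments);
    }
}
```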