I have a small problem with validating XML, specifically XSLT.
I have an XSLT stylesheet that transforms an XML data source into an XSL-FO document.
Something like this:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<xsl:template match="/">
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns="http://www.w3.org/1999/xhtml">
<fo:layout-master-set>
<fo:simple-page-master margin-top="25mm" margin-bottom="25mm" margin-left="25mm" margin-right="25mm" page-width="210mm" page-height="297mm" master-name="simplePageLayout">
<fo:region-body region-name="xsl-region-body" column-gap="0.25in" />
<fo:region-before region-name="xsl-region-before" display-align="after" extent="0.1mm" padding-top="0pt" padding-left="0.4in" padding-right="0.4in" padding-bottom="0pt" />
<fo:region-after region-name="xsl-region-after" display-align="before" extent="0.4in" padding-top="4pt" padding-left="0.4in" padding-right="0.4in" padding-bottom="0pt" />
</fo:simple-page-master>
<fo:page-sequence-master master-name="default-sequence">
<fo:repeatable-page-master-reference master-reference="simplePageLayout" />
</fo:page-sequence-master>
</fo:layout-master-set>
<fo:page-sequence master-reference="default-sequence">
<fo:flow flow-name="xsl-region-body">
<fo:block font-family="Segoe UI" color="#000000" font-size="9pt" />
</fo:flow>
</fo:page-sequence>
</fo:root>
</xsl:template>
</xsl:stylesheet>
What I want to do is validate the XSL-FO elements I have written while ignoring the xsl tags. Is that possible?
For now I use DTD validation (I also have an XSD schema) to validate the FO, but it reports an error on every xsl tag.
Summary:
Is it possible to validate only fo elements against the schema, ignoring xsl tags, and how should I do it? Maybe a code snippet in C#, or a hint how to modify documents?
What you want to do the schema validation against is the output after you do the transform, not against the XSLT document. When you run the XSLT transformation against the input XML, the resulting output shouldn't contain any of the XSL tags.
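A minimal sketch of that approach in C#, assuming the stylesheet, input and schema live in files named report.xslt, data.xml and fo.xsd (all three names are hypothetical, as is the choice to buffer the result in memory). It runs the transform first and then validates only the result, which by then contains no xsl elements:

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Xsl;

class ValidateFoOutput
{
    static void Main()
    {
        // Hypothetical file names; replace with your own paths.
        var xslt = new XslCompiledTransform();
        xslt.Load("report.xslt");

        // Step 1: run the transform and keep the XSL-FO result in memory.
        var buffer = new StringWriter();
        using (XmlWriter writer = XmlWriter.Create(buffer, xslt.OutputSettings))
        {
            xslt.Transform("data.xml", writer);
        }

        // Step 2: validate the result (not the stylesheet) against the FO schema.
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
        settings.Schemas.Add("http://www.w3.org/1999/XSL/Format", "fo.xsd");
        settings.ValidationEventHandler += (sender, e) =>
            Console.WriteLine("{0}: {1}", e.Severity, e.Message);

        using (XmlReader reader = XmlReader.Create(new StringReader(buffer.ToString()), settings))
        {
            // Reading to the end triggers validation; problems go to the handler above.
            while (reader.Read()) { }
        }
    }
}
```

Note that this only checks the output produced for one particular input document, so a stylesheet branch that the input never exercises is not covered.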
I am using an XSLT file to transform an XML file to another XML file and then creating this XML file locally. I get this error:
System.InvalidOperationException: 'Token Text in state Start would result in an invalid XML document. Make sure that the ConformanceLevel setting is set to ConformanceLevel.Fragment or ConformanceLevel.Auto if you want to write an XML fragment. '
The XSLT file was debugged in Visual Studio and it appears to work correctly, but I don't understand this error. What does it mean and how can it be fixed?
This is my XML:
<?xml version="1.0" encoding="utf-8"?>
<In xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="take.xsd">
<Submit ID="1234">
<Values>
<Code>34</Code>
<Source>27</Source>
</Values>
<Information>
<Number>55</Number>
<Date>2018-05-20</Date>
<IsFile>1</IsFile>
<Location></Location>
<Files>
<File>
<Name>Red.pdf</Name>
<Type>COLOR</Type>
</File>
<File>
<Name>picture.pdf</Name>
<Type>IMAGE</Type>
</File>
</Files>
</Information>
</Submit>
</In>
My XSLT code:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl">
<xsl:output method="xml" indent="yes"/>
<!-- identity template - copies all elements and its children and attributes -->
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*" />
</xsl:copy>
</xsl:template>
<xsl:template match="/In">
<!-- Remove the 'In' element -->
<xsl:apply-templates select="node()"/>
</xsl:template>
<xsl:template match="Submit">
<!-- Create the 'Q' element and its sub-elements -->
<Q xmlns:tns="Q" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://schema.xsd" Source="{Values/Source}" Notification="true">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:apply-templates select="Values" />
<xsl:apply-templates select="Information" />
<xsl:apply-templates select="Information/Files" />
</xsl:copy>
</Q>
</xsl:template>
<xsl:template match="Information">
<!-- Create the 'Data' sub-element without all of its children -->
<xsl:copy>
<xsl:copy-of select="Number"/>
<xsl:copy-of select="Date"/>
<xsl:copy-of select="IsFile"/>
<xsl:copy-of select="Location"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
And this is the C# code used to transform the file:
XslCompiledTransform xslt = new XslCompiledTransform();
xslt.Load(@"D:\\Main\XLSTFiles\Test.xslt");
string xmlPath = @"D:\Documents\Test2.xml";
using (XmlWriter w = XmlWriter.Create(@"D:\Documents\NewFile.xml"))
{
xslt.Transform(xmlPath, w);
}
Also, is there a way to produce the new XML file with proper indentation? It seems to create each node after the last one is closed and on the custom template it just appends each item one after another.
It's an amazingly unhelpful message, isn't it? But I think I can decipher it for you.
The XSLT processor is producing its output by writing events such as start-document, start-element, output-text to an XML Writer.
If you want to produce a well-formed XML document, then you can't have any text before the start of the first element. The message is saying that if the last thing you did is to issue start-document, then the next thing isn't allowed to be text, because the document would be ill-formed (it says invalid, but it means ill-formed).
Now, XSLT stylesheets are allowed to produce "well-formed fragments" rather than only being allowed to write "well-formed documents". Actually, the term used in the XML spec is "well-formed external general parsed entity", but that's a bit of a mouthful, so everyone calls them "fragments" because that's what DOM calls them, and there's no point using correct terminology in error messages if no-one understands it. The difference is that a fragment can contain multiple elements and text nodes at the top level, for example this <b>really</b> is a <i>well-formed</i> fragment. The problem is that the destination to which you write the XSLT output might not handle fragments, and in this particular case, the XML Writer can handle a fragment only if it's configured to do so.
I suspect you didn't actually intend to produce a fragment, and you need to fix your XSLT code so it outputs a well-formed document.
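To see the behaviour in isolation, here is a small sketch that does by hand what the XSLT processor is doing: it writes a whitespace text node before the first element. The class and method names are made up for illustration. With ConformanceLevel.Document the writer should refuse the text with an InvalidOperationException like the one in the question, while ConformanceLevel.Fragment should accept it:

```csharp
using System;
using System.Text;
using System.Xml;

class ConformanceDemo
{
    // Writes "text, then an element" at the top level and reports what happened.
    static string Write(ConformanceLevel level)
    {
        var sb = new StringBuilder();
        var settings = new XmlWriterSettings
        {
            ConformanceLevel = level,
            OmitXmlDeclaration = true
        };
        try
        {
            using (XmlWriter w = XmlWriter.Create(sb, settings))
            {
                w.WriteString("\n  ");      // text before the first element
                w.WriteStartElement("Q");
                w.WriteEndElement();
            }
            return level + ": accepted";
        }
        catch (InvalidOperationException ex)
        {
            return level + ": rejected (" + ex.Message + ")";
        }
    }

    static void Main()
    {
        Console.WriteLine(Write(ConformanceLevel.Document));
        Console.WriteLine(Write(ConformanceLevel.Fragment));
    }
}
```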
To expand on Michael Kay's excellent answer (as this was too long to write in comments), for your particular input XML the issue is with whitespace. In the template matching /In you do this...
<xsl:template match="/In">
<!-- Remove the 'In' element -->
<xsl:apply-templates select="node()"/>
</xsl:template>
But by selecting node() you are selecting the whitespace nodes before and after the child Submit, so you end up with a text node before your root Q element causing the error.
So, what you could do in this case, is simply strip out the whitespace from your XML by adding this to your XSLT
<xsl:strip-space elements="*" />
Alternatively, you could do this to select only elements, as opposed to other nodes (although this would omit comments and processing instructions)
<xsl:apply-templates select="*" />
However, if you have multiple Submit elements in your XML, you then get multiple Q elements in your output, which will be a fragment, as there would not be a single root element. If this is what you really intend, then you should make the following change to your C#...
using (XmlWriter w = XmlWriter.Create(@"C:\Users\tcase.BGT\Documents\NewFile.xml", xslt.OutputSettings ))
The ConformanceLevel in xslt.OutputSettings is ConformanceLevel.Auto, which I think allows fragments. Passing these settings will also solve your indentation problem, as the writer will then use the settings from your xsl:output.
My Schema.xsd file is located in the same directory as the .xsl file. In the .xsl file I would like to generate a link to Schema.xsd in the generated output. The generated output is located in different directories. Currently I do it like this:
<xsl:template match="/">
<root version="1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="../../../Schema.xsd">
<!-- . . . -->
However this forces the generated output to be located 3 levels under the directory of Schema.xsd. I would like to generate an absolute path to the schema in the output, so the output could be located anywhere.
Update. I use XSLT 1.0 (XslCompiledTransform implementation in .NET Framework 4.5).
XSLT 2.0 Solution
Use the XPath 2.0 function, resolve-uri():
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"
omit-xml-declaration="yes"
encoding="UTF-8"/>
<xsl:template match="/">
<root version="1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="{concat(resolve-uri('.'), 'Schema.xsd')}">
</root>
</xsl:template>
</xsl:stylesheet>
Yields, without parameter passing and regardless of the input XML:
<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
version="1.0"
xsi:noNamespaceSchemaLocation="file:/c:/path/to/XSLT/file/Schema.xsd"/>
This is a sketch of how to do it (also see Passing parameters to XSLT Stylesheet via .NET).
In your C# code you need to define and use a parameter list:
XsltArgumentList argsList = new XsltArgumentList();
argsList.AddParam("SchemaLocation","","<SOME_PATH_TO_XSD_FILE>");
XslCompiledTransform transform = new XslCompiledTransform();
transform.Load("<SOME_PATH_TO_XSLT_FILE>");
using (StreamWriter sw = new StreamWriter("<SOME_PATH_TO_OUTPUT_XML>"))
{
transform.Transform("<SOME_PATH_TO_INPUT_XML>", argsList, sw);
}
Your XSLT could be enhanced like this:
...
<xsl:param name="SchemaLocation"/> <!-- this more or less at the top of your XSLT! -->
...
<xsl:template match="/">
<root version="1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="{$SchemaLocation}">
...
...
</xsl:template>
....
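To get the absolute location the question asks for, the parameter value can be computed in C# from the stylesheet's own path. This is only a sketch under assumptions: the file names transform.xslt, input.xml and output.xml are hypothetical, and Schema.xsd is assumed to sit next to the stylesheet as described in the question:

```csharp
using System;
using System.IO;
using System.Xml.Xsl;

class SchemaLocationDemo
{
    static void Main()
    {
        // Hypothetical paths; Schema.xsd is assumed to be in the stylesheet's directory.
        string xsltPath = Path.GetFullPath("transform.xslt");
        string schemaPath = Path.Combine(Path.GetDirectoryName(xsltPath), "Schema.xsd");

        // Turn the absolute file path into an absolute file: URI.
        string schemaUri = new Uri(schemaPath).AbsoluteUri;

        var argsList = new XsltArgumentList();
        argsList.AddParam("SchemaLocation", "", schemaUri);

        var transform = new XslCompiledTransform();
        transform.Load(xsltPath);
        using (var sw = new StreamWriter("output.xml"))
        {
            transform.Transform("input.xml", argsList, sw);
        }
    }
}
```

Because the URI is absolute, the generated output can be moved to any directory on the same machine without breaking the schema reference.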
I am applying an XSLT file xsltUri to an XML file TargetXmlFileName using the XslCompiledTransform class:
XslCompiledTransform xslTransform = new XslCompiledTransform(false);
xslTransform.Load(xsltUri);
using (var outStream = new MemoryStream())
{
var writer = new StreamWriter(outStream, new UTF8Encoding());
using (var reader = new XmlTextReader(TargetXmlFileName)
{
WhitespaceHandling = WhitespaceHandling.All,
DtdProcessing = DtdProcessing.Ignore
})
{
xslTransform.Transform(reader, xsltArguments, writer);
}
outStream.Position = 0;
using (FileStream outFile = new FileStream(outputFileName, FileMode.Create))
{
outStream.CopyTo(outFile);
}
}
Input XML:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<element
id="1"
attr1="value11"
attr2="value12"/>
<element id="2" attr1="value21" attr2="value22"/>
</root>
Input XSL:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="//element[@id='2']/@attr1">
<xsl:attribute name="attr1">
<xsl:value-of select="'newvalue21'"/>
</xsl:attribute>
</xsl:template>
</xsl:stylesheet>
Actual output XML:
<?xml version="1.0" encoding="utf-8"?><root>
<element id="1" attr1="value11" attr2="value12" />
<element id="2" attr1="newvalue21" attr2="value22" />
</root>
Desired output XML:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<element
id="1"
attr1="value11"
attr2="value12"/>
<element id="2" attr1="newvalue21" attr2="value22"/>
</root>
Question: How can I preserve the whitespace (particularly, line breaks) of the input XML file within the "element" tags in the output XML file? I have experimented with different options, but nothing worked for this case.
Thanks for any hints!
This has nothing to do with XSLT. The whitespace you're referring to does not exist in the XML document model, and it cannot be made significant to a conformant XML processor, even with xml:space="preserve". There is no place for it in the DOM, and it will be skipped by the reader; as such there is no way to copy it to the writer. You would have to emit the XML with custom code (in other words, not with an XmlWriter).
The internal formatting of a tag (whitespace between attributes) is completely ephemeral in XML.
1. As far as XML documents are concerned, it does not exist.
2. As far as XML parsers are concerned, it is ignored, because 1). The only exception is that whitespace is illegal immediately after a <.
3. As far as XML serializers are concerned, they can do what they want, because 1) and 2). Most (if not all) will use a single space character to separate attributes from each other.
So...
Don't try to build an application that depends on the source code layout of XML.
Since this kind of source code layout in XML is technically irrelevant… get over your OCD. ;)
I have an XML file, I need to extract values from it, and put them in another XML file.
Questions:
Another person is creating the "schema" for the resulting XML file. Is there something that person can give me that will automate the inserting of the values? Do I even need to extract anything from the XML, or can something like a XSLT just do all the transformation?
Is there a problem with the XML structure below? I tried using xsd2code to generate objects, but nothing loads when I use the LoadFromFile method. I read an article that wasn't very specific but said that "nested parents" cause problems for XSD.exe and xsd2code.
<Section>
<Form id="1"...>
<Control id="12523"..> <!-- Some have this, some don't -->
<Property name="Color">Red</Property>
<Property name="Size">Large</Property>
</Control>
</Form>
<Form id="2"...>
<Property name="Color">Blue</Property>
<Property name="Size">Large</Property>
</Form>
<Form id="3"...>
<Property name="Color">Red</Property>
<Property name="Size">Small</Property>
</Form>
</Section>
Thank you for any guidance!
XSLT is the tool for XML transformation.
As far as your XML goes, in a lot of applications you should replace this:
<Property name="Color">Red</Property>
with:
<Color>Red</Color>
Some reasons:
If you want to write a schema that restricts an element's content in some way (e.g. to one of a list of values), the element must be identifiable by its name; you can't write one schema for a Property element whose name attribute equals "Color" and another schema for a Property element whose name attribute equals "Size".
It's easier to write XPath predicates if your element names are meaningful. For instance, Form[Color = 'Red'] is a lot easier to write (and read) than Form[Property[@name='Color' and .='Red']]
The above is also true if you're writing LINQ queries against the XML, in pretty much the same manner. Compare Element.Descendants("Color") with Element.Descendants("Property").Where(x => (string)x.Attribute("name") == "Color").
There are applications where it's appropriate to have generically-named elements, too; the above argument's not definitive. But if you're going to do that, you should have good reasons.
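The LINQ to XML comparison above can be tried out with a short sketch like this one. The two Form fragments are cut-down versions of the question's data, and the class name is made up for illustration:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class PropertyQueryDemo
{
    static void Main()
    {
        // Generic element names: the meaning lives in the name attribute.
        XElement generic = XElement.Parse(
            "<Form id='2'><Property name='Color'>Blue</Property>" +
            "<Property name='Size'>Large</Property></Form>");

        // Meaningful element names: the meaning lives in the element name.
        XElement named = XElement.Parse(
            "<Form id='2'><Color>Blue</Color><Size>Large</Size></Form>");

        // Generic form: filter Property elements by attribute value.
        string colorFromGeneric = generic.Elements("Property")
            .Where(p => (string)p.Attribute("name") == "Color")
            .Select(p => p.Value)
            .FirstOrDefault();

        // Named form: ask for the element directly.
        string colorFromNamed = (string)named.Element("Color");

        Console.WriteLine(colorFromGeneric); // Blue
        Console.WriteLine(colorFromNamed);   // Blue
    }
}
```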
XSLT is the best way to transform XML from one schema to another. That's exactly what it was built to do. http://w3schools.com/xsl/default.asp is an excellent XSLT tutorial. All you really need is the schema, or a few examples of the target XML, to write your XSLT file.
Also, your xml looks fine/well-formed to me.
XSLT is probably the solution if you just want to transform it, but if you need to do anything with the values in code then LINQ to Xml will make your task much easier.
I'd use XSLT for this, here's a small example to get you started.
Copy this sample code to an empty c# project:
static void Main(string[] args) {
const string xmlPath = "source.xml";
const string xslPath = "transform.xsl";
const string outPath = "out.xml";
try {
//load the Xml doc
var xmlDoc = new XPathDocument(xmlPath);
//load the Xsl
var xslDoc = new XslCompiledTransform();
xslDoc.Load(xslPath);
// create the output file
using (var outDoc = new XmlTextWriter(outPath, null)) {
//do the actual transform of Xml
xslDoc.Transform(xmlDoc, null, outDoc);
}
}
catch (Exception e) {
Console.WriteLine("Exception: {0}", e.ToString());
}
}
Write your example xml code above into source.xml file and put the following xsl code into transform.xsl file:
<?xml version="1.0" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output indent="yes" method="xml" />
<xsl:template match="/">
<xsl:apply-templates />
</xsl:template>
<xsl:template match="Section">
<OtherSection>
<xsl:apply-templates />
</OtherSection>
</xsl:template>
<xsl:template match="Form">
<OtherForm>
<xsl:attribute name="id">
<xsl:value-of select="@id" />
</xsl:attribute>
<xsl:apply-templates />
</OtherForm>
</xsl:template>
<xsl:template match="Control">
<OtherControl>
<!-- converts id attribute to an id tag -->
<id>
<xsl:value-of select="@id" />
</id>
<xsl:apply-templates />
</OtherControl>
</xsl:template>
<xsl:template match="Property">
<OtherProperty>
<!-- converts name attribute to an id attribute -->
<xsl:attribute name="id">
<xsl:value-of select="@name" />
</xsl:attribute>
<xsl:value-of select="."/>
</OtherProperty>
</xsl:template>
</xsl:stylesheet>
The resulting out.xml should give you an idea of how the xsl is doing the work and hopefully get you started.
For more info on XSLT look up the tutorial on W3Schools.
I'm using C# to translate an XML file to HTML with the use of XSLT.
I use an Extension object to render my own code:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl"
xmlns:demo="urn:serverTime"
>
<xsl:output method="html" indent="yes"/>
<xsl:template match="/">
<xsl:value-of select="demo:printTime()"/>
</xsl:template>
and in my C#:
XsltArgumentList myList = new XsltArgumentList();
myList.AddExtensionObject("urn:serverTime", new ServerTime());
transform.Transform(document, myList, writer);
This works perfectly. However, I would like to create my own custom tags like:
<demo:printTime />
This doesn't work: the tag is printed to the output without being rendered. How can I make this work so I can use my own tags?
You can't do this. XSLT does not support "custom tags".
If you want to print out anything that is not a literal value, then it must be the result of a function call, wrapped in <xsl:value-of/>.
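For reference, a self-contained sketch of the function-call approach with an extension object. The ServerTime class body and the inline stylesheet are assumptions made for illustration, since the question does not show them; the important detail is that the namespace URI bound to the demo prefix in the stylesheet is the same string passed to AddExtensionObject:

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical extension class; the question's real ServerTime is not shown.
public class ServerTime
{
    public string printTime()
    {
        return DateTime.Now.ToString("HH:mm:ss");
    }
}

class ExtensionObjectDemo
{
    // The demo prefix is bound to urn:serverTime, matching AddExtensionObject below.
    const string Stylesheet = @"<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
    xmlns:demo='urn:serverTime' exclude-result-prefixes='demo'>
  <xsl:output method='html'/>
  <xsl:template match='/'>
    <p><xsl:value-of select='demo:printTime()'/></p>
  </xsl:template>
</xsl:stylesheet>";

    static void Main()
    {
        var transform = new XslCompiledTransform();
        using (XmlReader xsl = XmlReader.Create(new StringReader(Stylesheet)))
        {
            transform.Load(xsl);
        }

        var args = new XsltArgumentList();
        args.AddExtensionObject("urn:serverTime", new ServerTime());

        // Any well-formed input will do; the template only matches the root.
        using (XmlReader input = XmlReader.Create(new StringReader("<doc/>")))
        {
            transform.Transform(input, args, Console.Out);
        }
    }
}
```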