Subject: [xsl] XSLT 1.0 serializer for XML
From: Hermann Stamm-Wilbrandt <STAMMW@xxxxxxxxxx>
Date: Thu, 19 Aug 2010 16:26:27 +0200

Hello,

in another thread Florent posted [1] a link to an XML serializer written in pure XSLT [2]. That serializer is written in XSLT 2.0 and cannot be used in browsers, which support only XSLT 1.0.

With Michael's help I got the differentiation of the 6 XML node types (text, comment, processing-instruction, element, attribute and namespace) right [3] and was able to output "readable" XML, even for attributes and namespaces. For another tool I modified that to generate HTML output, and on reading Florent's posting I realized that it already did XML serialization.

I extracted the serializer, which you can find below and under [4]. There is also an XML file online [5] demonstrating the new features: it serializes and displays its own content, the demo XSLT and serialize.xsl, with some comments and links. Try it out!

To verify correct behavior I used serialize-test.xsl [6] and compared its output, displayed in a browser, with the output of <xsl:copy-of select="/"/>. This was really useful for handling characters from CDATA sections that need to be escaped [7].

Question 1:
While '<' and '&' must be escaped, '>' need not be. But the output of <xsl:copy-of select="/"/> escapes '>' too. This is why template escapeLtGtAmp escapes all three: to match the copy-of behavior. Why does xsl:copy-of escape '>'?

Question 2:
The displayed output looks quite nice in Firefox, Chrome, Safari and Opera (Firefox does not support the namespace:: axis, so it cannot handle or display namespaces). Why does the serialized XML displayed by IE6 and IE8 look completely different from all the other browsers (ugly)?

Question 3:
Is it correct that a stylesheet cannot access CDATA sections? (I think the parser removes them.)

Question 4:
Is it correct that a stylesheet cannot access the "original" attribute values (including e.g. newlines) but only the result of Attribute-Value Normalization [8]?

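Questions 3 and 4 can be probed with a small experiment. The stylesheet below is only a sketch and is not part of serialize.xsl: the file name probe.xsl, the sample document in its header comment and the output labels are assumptions made for illustration. Under the XPath 1.0 data model one would expect it to report "false" for the newline test and a single text node, but it is worth confirming this in each processor or browser of interest.

```xml
<?xml version="1.0"?>
<!-- probe.xsl (hypothetical name). Apply it to a sample document such as:

       <doc a="one
       two"><![CDATA[1 < 2]]> and more</doc>
-->
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/doc">
    <!-- Question 4: does the attribute value still contain a newline?
         The character reference in the select attribute survives
         normalization, so the literal being searched for is a real newline. -->
    <xsl:text>newline in @a: </xsl:text>
    <xsl:value-of select="contains(@a, '&#10;')"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Question 3: is the CDATA section visible as a separate node,
         or merged with the adjacent character data? -->
    <xsl:text>text nodes: </xsl:text>
    <xsl:value-of select="count(text())"/>
    <xsl:text>&#10;</xsl:text>

    <xsl:text>string value: </xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

The probe uses method="text" so that the result is not escaped again by the output stage, which keeps what the stylesheet sees separate from what the serializer adds.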
[1] http://www.biglist.com/lists/lists.mulberrytech.com/xsl-list/archives/201008/msg00181.html
[2] http://code.google.com/p/xlibs/source/browse/serial/trunk
[3] http://www.biglist.com/lists/lists.mulberrytech.com/xsl-list/archives/201008/msg00161.html
[4] http://stamm-wilbrandt.de/en/xsl-list/serialize/serialize.xsl
[5] http://stamm-wilbrandt.de/en/xsl-list/serialize/serialize-demo.xml
[6] http://stamm-wilbrandt.de/en/xsl-list/serialize/serialize-test.xsl
[7] http://www.w3.org/TR/REC-xml/#syntax
[8] http://www.w3.org/TR/REC-xml/#AVNormalize


<!--
  XSLT 1.0 serializer for XML

  remarks:
  - generates output nearly identical to <xsl:copy-of select="/"/>
  - all attributes before namespace declarations
  - attribute values might be different because of AVN
  - since stylesheet does not have access to CDATA sections it has to
    use template escapeLtGtAmp to ensure correct escaping;
    overhead of 1 x call-template + 3 x contains() for text output
    not containing any of < , > and &
  - because of "Attribute-Value Normalization" no newlines in attribute
    values; this might change visual presentation as can be seen in
    first <xsl:when>'s test attribute
  - entity references like and " in the XML file are not accessible by
    the stylesheet and are displayed as non-Entity

  serialize.xsl:      XML serializer
  serialize-demo.xml: demonstration file (open in browser)
  serialize-demo.xsl: referenced demonstration
  copy-of.xsl:        for comparison with serialize-test.xsl output
  serialize-test.xsl: for comparison with copy-of.xsl output;
                      view output in browser for comparing
-->
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>

  <!-- entry point: namespace nodes are written as xmlns declarations,
       every other node is handed to the mode="output" templates -->
  <xsl:template name="doOutput">
    <xsl:choose>
      <xsl:when test="count(. | ../namespace::*) != count(../namespace::*)">
        <xsl:apply-templates select="." mode="output"/>
      </xsl:when>
      <xsl:otherwise>
        <!-- substring(':',1 div boolean(name())) yields ':' for a
             prefixed namespace and '' for the default namespace -->
        <xsl:value-of select=
  "concat('xmlns', substring(':',1 div boolean(name())), name(),'=&quot;',.,'&quot;')"
        />
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xsl:template match="@*" mode="output">
    <xsl:value-of select="concat(' ',name(),'=&quot;',.,'&quot;')"/>
  </xsl:template>

  <xsl:template match="node()" mode="output">
<!--
    for xsl:copy-of behavior of xsltproc;
    Saxon, xalan and DataPower XSLT processors do not do this.

    <xsl:if test="(.=/) and (preceding::comment()|preceding::processing-instruction())">
      <xsl:text> </xsl:text>
    </xsl:if>
-->
    <xsl:value-of select="concat('&lt;',name())"/>

    <xsl:apply-templates select="@*" mode="output"/>

    <!-- only namespaces not already in scope on the parent element -->
    <xsl:for-each select="namespace::*">
      <xsl:if test="not(.=../../namespace::*) and name()!='xml'">
        <xsl:value-of select=
  "concat(' xmlns', substring(':',1 div boolean(name())), name(),'=&quot;',.,'&quot;')"
        />
      </xsl:if>
    </xsl:for-each>

    <xsl:choose>
      <xsl:when test="*|text()|comment()|processing-instruction()">
        <xsl:text>&gt;</xsl:text>

        <xsl:apply-templates
          select="*|text()|comment()|processing-instruction()"
          mode="output"/>

        <xsl:value-of select="concat('&lt;/',name(),'&gt;')"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:text>/&gt;</xsl:text>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xsl:template match="comment()" mode="output">
    <xsl:value-of select="concat('&lt;!--',.,'--&gt;')"/>
  </xsl:template>

  <xsl:template match="processing-instruction()" mode="output">
    <xsl:value-of select="concat('&lt;?',name(),' ',.,'?&gt;')"/>
  </xsl:template>

  <xsl:template match="text()" mode="output">
    <!-- overhead: 1 x call-template + 3 x contains() for text without CDATA -->
    <xsl:call-template name="escapeLtGtAmp">
      <xsl:with-param name="str" select="."/>
    </xsl:call-template>
  </xsl:template>

  <!-- escapes the first of '<', '>' and '&' found in $str,
       then recurses on the remainder of the string -->
  <xsl:template name="escapeLtGtAmp">
    <xsl:param name="str"/>

    <xsl:choose>
      <xsl:when test="contains($str,'&lt;') or
                      contains($str,'&gt;') or
                      contains($str,'&amp;')">
        <!-- text before the first occurrence of each character
             (the whole string if the character does not occur) -->
        <xsl:variable name="lt"
          select="substring-before(concat($str,'&lt;'),'&lt;')"/>
        <xsl:variable name="gt"
          select="substring-before(concat($str,'&gt;'),'&gt;')"/>
        <xsl:variable name="amp"
          select="substring-before(concat($str,'&amp;'),'&amp;')"/>

        <xsl:choose>
          <xsl:when test="string-length($gt) &gt; string-length($amp)">
            <xsl:choose>
              <xsl:when test="string-length($amp) &gt; string-length($lt)">
                <xsl:value-of
                  select="concat(substring-before($str,'&lt;'),'&amp;lt;')"/>
                <xsl:call-template name="escapeLtGtAmp">
                  <xsl:with-param name="str"
                    select="substring-after($str,'&lt;')"/>
                </xsl:call-template>
              </xsl:when>
              <xsl:otherwise>
                <xsl:value-of
                  select="concat(substring-before($str,'&amp;'),'&amp;amp;')"/>
                <xsl:call-template name="escapeLtGtAmp">
                  <xsl:with-param name="str"
                    select="substring-after($str,'&amp;')"/>
                </xsl:call-template>
              </xsl:otherwise>
            </xsl:choose>
          </xsl:when>
          <xsl:otherwise>
            <xsl:choose>
              <xsl:when test="string-length($gt) &gt; string-length($lt)">
                <xsl:value-of
                  select="concat(substring-before($str,'&lt;'),'&amp;lt;')"/>
                <xsl:call-template name="escapeLtGtAmp">
                  <xsl:with-param name="str"
                    select="substring-after($str,'&lt;')"/>
                </xsl:call-template>
              </xsl:when>
              <xsl:otherwise>
                <xsl:value-of
                  select="concat(substring-before($str,'&gt;'),'&amp;gt;')"/>
                <xsl:call-template name="escapeLtGtAmp">
                  <xsl:with-param name="str"
                    select="substring-after($str,'&gt;')"/>
                </xsl:call-template>
              </xsl:otherwise>
            </xsl:choose>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$str"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>


Best wishes,

Hermann Stamm-Wilbrandt
Developer, XML Compiler, L3
WebSphere DataPower SOA Appliances
----------------------------------------------------------------------
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Martin Jetter
Management: Dirk Wittkopp
Registered office: Boeblingen
Court of registry: Amtsgericht Stuttgart, HRB 243294