Re: [xsl] convert XML to NFC

From: Martin Honnen <Martin.Honnen@xxxxxx>
Date: Wed, 27 Apr 2011 18:15:48 +0200
Kenneth Reid Beesley wrote:

I've got a valid XML file, in UTF-8, and I simply want to create a version of the same file that is guaranteed to be UTF-8 with NFC normalization. Text in elements and in attribute values should all be UTF-8 NFC in the resulting file.

Is there an easy way to do this? For XSLT I normally use Saxon.

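There is http://www.saxonica.com/documentation/functions/intro/normalize-unicode.xml which says it "Converts a string to Unicode normalized form NFC", so you could run your XML input through an XSLT 2.0 stylesheet that transforms text nodes and attribute values, e.g.

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- normalize the content of every text node -->
  <xsl:template match="text()">
    <xsl:value-of select="normalize-unicode(.)"/>
  </xsl:template>
  <!-- recreate each attribute with a normalized value -->
  <xsl:template match="@*">
    <xsl:attribute name="{name()}" namespace="{namespace-uri()}" select="normalize-unicode(.)"/>
  </xsl:template>
  <!-- copy each element, then process its attributes and children -->
  <xsl:template match="*">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- keep comments and processing instructions, which the built-in rules would drop -->
  <xsl:template match="comment() | processing-instruction()">
    <xsl:copy/>
  </xsl:template>
</xsl:stylesheet>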
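
To confirm the result, one option is a second XSLT 2.0 stylesheet that counts the text nodes and attribute values that differ from their normalize-unicode() result. This is only a sketch, untested against your data; the variable name $not-nfc and the message wording are made up for illustration. It uses codepoint-equal() so the comparison does not depend on the default collation.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- text nodes and attributes whose value would change under NFC normalization -->
    <xsl:variable name="not-nfc"
                  select="(//text() | //@*)[not(codepoint-equal(., normalize-unicode(.)))]"/>
    <!-- $not-nfc and the report text are illustrative, not from the original post -->
    <xsl:value-of select="if (empty($not-nfc))
                          then 'All text nodes and attribute values are in NFC.'
                          else concat(count($not-nfc), ' node(s) are not in NFC.')"/>
  </xsl:template>
</xsl:stylesheet>
```

Run against the output of the stylesheet above, it should report that everything is in NFC; run against the original input, it shows how many nodes the transformation would change.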


--

	Martin Honnen
	http://msmvps.com/blogs/martin_honnen/
