RE: [xsl] Entities

Subject: RE: [xsl] Entities
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Tue, 11 Nov 2008 10:26:27 -0000
> I am involved in an XML to HTML to XML conversion using XSLT, and
> entity conversion is a problematic issue. Below are the
> steps for conversion.
> XML INPUT Entities: &#x201C; (left double quote),

Let's start by getting the terminology right. &#x201C; is not an entity; it is
a numeric character reference. An entity reference uses a name, as in &amp;.

> HTML OUTPUT Entities: C"b,E (left double
> quote),  C"b,B (right double quote),

Almost certainly the XSLT processor produced correct output in UTF-8, but you
are viewing it using some kind of software that doesn't know it is reading
UTF-8, and is presenting it wrongly, thinking it to be iso-8859-1.
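
The effect can be reproduced in a few lines. This is only a sketch in Python, not part of the original exchange. It assumes the viewing software decodes the bytes either as iso-8859-1 or as Windows-1252 (a common stand-in for it); the variable names are illustrative.

```python
# U+201C LEFT DOUBLE QUOTATION MARK, as a serializer would write it in UTF-8.
left_quote = "\u201c"
utf8_bytes = left_quote.encode("utf-8")  # three bytes: E2 80 9C

# A viewer that wrongly assumes a single-byte encoding shows one
# character per byte instead of one character for all three bytes.
misread_latin1 = utf8_bytes.decode("iso-8859-1")
misread_cp1252 = utf8_bytes.decode("cp1252")

# Decoding the same bytes as UTF-8 recovers the original character.
correct = utf8_bytes.decode("utf-8")

print(utf8_bytes.hex(" "))
print(len(misread_latin1), repr(misread_latin1))
print(repr(misread_cp1252))
print(correct == left_quote)
```

The bytes on disk are the same in every case; only the decoding assumed by the viewer differs, which is why the fix usually belongs in the viewer or in the declared encoding rather than in the stylesheet.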

> I want the entities to be the same as in the input XML, i.e. &#x201C;
> for left double quote.
>

You can't guarantee the same representation as in the input (the XSLT
processor can't distinguish, for example, between a decimal and a hexadecimal
character reference for the same character). But you can force the XSLT
processor (or rather, its serializer) to output the character as a character
reference by selecting an encoding that does not include the character, for
example <xsl:output encoding="iso-8859-1"/>
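
Both points can be illustrated outside XSLT. The sketch below uses Python's xml.etree.ElementTree purely as a stand-in for an XML parser and serializer; it is an analogy, not the behaviour of any particular XSLT processor. It happens to emit the decimal form; which form a given XSLT serializer picks is up to that serializer.

```python
import xml.etree.ElementTree as ET

# After parsing, a hexadecimal and a decimal character reference are
# the same character; the original spelling is not retained.
from_hex = ET.fromstring("<p>&#x201C;</p>").text
from_dec = ET.fromstring("<p>&#8220;</p>").text
print(from_hex == from_dec == "\u201c")

elem = ET.fromstring("<p>&#x201C;</p>")

# Serializing to UTF-8 writes the character itself as three bytes.
as_utf8 = ET.tostring(elem, encoding="utf-8")

# iso-8859-1 cannot represent U+201C, so the serializer has to fall
# back to a character reference.
as_latin1 = ET.tostring(elem, encoding="iso-8859-1")

print(as_utf8)
print(as_latin1)
```

In the second output the quote appears as a character reference because the chosen encoding has no byte for it, which is the same mechanism that the encoding attribute of xsl:output relies on.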

Michael Kay
http://www.saxonica.com/
