Re: [?UTF-8?]

Subject: Re: [?UTF-8?]
From: Mike Brown <mike@xxxxxxxx>
Date: Wed, 14 Jun 2000 11:15:49 -0600 (MDT)

Mike Kay wrote:
> "&nbsp;" and "&#160;" and the invisible character xA0 in HTML are
> absolutely equivalent, so if your browser renders them differently, get
> another browser.

Sure, the characters are equivalent, but a document that has been output
consists of byte sequences that represent those characters. He didn't
realize it, but he's asking about encodings.

Encodings are a weakness in the HTML spec. Even with HTML 4, there is too
much leeway for a document to not signal its own encoding and for a user
agent to make wild guesses at what encoding was used.

If the encoding is UTF-8, for example, outputting the single byte 0xA0
for a non-breaking space character is wrong (UTF-8 encodes U+00A0 as the
two bytes 0xC2 0xA0), but outputting the ASCII bytes for "&nbsp;" or
"&#160;" would work even if the document were interpreted as being in
some other encoding, so long as that encoding is a superset of ASCII.
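
To make the byte-level difference concrete, here is a minimal sketch in
Python (chosen only for illustration; nothing in the thread uses it). The
helper name is_valid_utf8 and the list of encodings are my own choices.

```python
NBSP = "\u00a0"  # non-breaking space, U+00A0

# The same character becomes different byte sequences in different encodings.
latin1_bytes = NBSP.encode("iso-8859-1")  # one byte: A0
utf8_bytes = NBSP.encode("utf-8")         # two bytes: C2 A0


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A lone A0 byte is a continuation byte with no lead byte before it,
# so it is not a valid UTF-8 sequence.
lone_a0_ok = is_valid_utf8(b"\xa0")

# The character reference is pure ASCII, so it decodes to the same text
# under any encoding that is a superset of ASCII.
reference = b"&#160;"
decoded = {
    enc: reference.decode(enc)
    for enc in ("ascii", "iso-8859-1", "utf-8", "windows-1252")
}
```

The reference form survives because every byte in it is below 0x80, which
all of these encodings map to the same characters.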

The problem, really, is that he is assuming he can use one encoding for
output and can tell his browser to assume a different encoding when it
reads the bytes back in. XSLT processors right now only support a limited
range of encodings. 
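
What goes wrong with such a mismatch can be sketched as follows, again in
Python and purely as an illustration. The sample string is made up, and
the two encodings are just one plausible pairing, not necessarily the
ones involved in the original question.

```python
text = "price:\u00a010"  # made-up sample containing a non-breaking space

# Case 1: bytes written as UTF-8, then read back as ISO-8859-1.
written_utf8 = text.encode("utf-8")
misread = written_utf8.decode("iso-8859-1")
# Each of the two UTF-8 bytes (C2, A0) is taken as its own character,
# so a stray U+00C2 appears in front of the non-breaking space.

# Case 2: bytes written as ISO-8859-1, then read back as UTF-8.
written_latin1 = text.encode("iso-8859-1")
reread = written_latin1.decode("utf-8", errors="replace")
# The lone A0 byte is not valid UTF-8, so the decoder substitutes
# U+FFFD (the replacement character) for it.
```

In both cases the characters were "equivalent" going in; only the
disagreement about the encoding corrupts what the reader sees.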

My advice would be not to change his browser, but to learn how much
control he has over the output encoding with his particular XSLT
processor, and to put the appropriate <meta> element in his document
head to signal that encoding to the browser. His browser should be set
to auto-detect encodings, if it has that option.
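
The idea of keeping the declared encoding and the actual bytes in step
can be sketched like this. It is a Python illustration under my own
assumptions, not what any XSLT processor does internally: render_html is
a hypothetical function, and the page skeleton is deliberately minimal.

```python
def render_html(body: str, encoding: str) -> bytes:
    """Build a small HTML page whose <meta> charset matches its bytes.

    Hypothetical helper for illustration only.
    """
    page = (
        "<html><head>"
        f'<meta http-equiv="Content-Type" '
        f'content="text/html; charset={encoding}">'
        "</head><body>" + body + "</body></html>"
    )
    # xmlcharrefreplace writes any character the target encoding cannot
    # represent as a numeric character reference such as &#160;.
    return page.encode(encoding, errors="xmlcharrefreplace")


ascii_page = render_html("a\u00a0b", "us-ascii")
utf8_page = render_html("a\u00a0b", "utf-8")
```

With "us-ascii" the non-breaking space comes out as the reference
"&#160;"; with "utf-8" it comes out as the bytes C2 A0, and in each case
the <meta> element names the encoding that was really used.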

   - Mike
____________________________________________________________________
Mike J. Brown, software engineer at         My XML/XSL resources:
webb.net in Denver, Colorado, USA           http://www.skew.org/xml/


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

