Re: [xsl] document( URI ) with accented chars fails

Subject: Re: [xsl] document( URI ) with accented chars fails
From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 17 Nov 2020 21:26:23 -0000
The document() function expects a URI, not a filename, and URIs never contain
accented characters.

XSLT 2.0+ has functions to escape special characters using %HH escapes so you
can turn arbitrary filenames into valid URIs.
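
As a minimal sketch of that approach (not taken from the original stylesheet, and assuming an XSLT 2.0 processor), the standard function iri-to-uri() could be applied to each path before it reaches document(). Unlike encode-for-uri(), it leaves "/" alone, so it suits whole relative paths. Passing the filepath node as the second argument is an illustrative choice that keeps relative paths resolved against the source document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output encoding="UTF-8" indent="yes"/>
  <xsl:template match="/">
    <root>
      <xsl:for-each select="/fileslist/filepath">
        <!-- iri-to-uri() turns each non-ASCII character into %HH escapes
             of its UTF-8 bytes; "/" and other URI delimiters are kept. -->
        <xsl:copy-of select="document(iri-to-uri(.), .)/root/el"/>
      </xsl:for-each>
    </root>
  </xsl:template>
</xsl:stylesheet>
```

With this, a path containing "é" would be requested with "%C3%A9" in its place.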

For xsltproc you'll need some processor-specific solution and I can't help you
with that.
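
One possible processor-specific route, offered only as an untested sketch: libxslt ships the EXSLT strings module, whose str:encode-uri() function performs %HH escaping in XSLT 1.0. Whether libxml2 then maps the escaped URI back to the accented filename on disk is an assumption here, not something confirmed in this thread.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:str="http://exslt.org/strings"
                exclude-result-prefixes="str"
                version="1.0">
  <xsl:output encoding="UTF-8" indent="yes"/>
  <xsl:template match="/">
    <root>
      <xsl:for-each select="/fileslist/filepath">
        <!-- The second argument false() leaves reserved characters
             such as "/" unescaped; non-ASCII characters become %HH. -->
        <xsl:copy-of select="document(str:encode-uri(., false()), .)/root/el"/>
      </xsl:for-each>
    </root>
  </xsl:template>
</xsl:stylesheet>
```

It would be run the same way as the original: xsltproc multifiles.xsl files-list.xml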

Michael Kay
Saxonica

> On 17 Nov 2020, at 20:28, Alexandre Hoïde alexandre.hoide@xxxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
>
>  Hello !
>
>  I have a little problem: the URI inside a `document()` [1] is
> ignored when the filename contains accented (UTF-8)
> character(s).
>
>  When applying the XSLT to the source with the two
> sample files below, using the following command
>
> ~~~{Command line}
> $ xsltproc multifiles.xsl files-list.xml
> ~~~
>
> I expect the following result :
>
> ~~~{expected result}
> <?xml version="1.0" encoding="UTF-8"?>
> <root>
>  <el>element 1</el>
>  <el>element 2</el>
>  <el>element 3</el>
> </root>
> ~~~
>
> but I only get the `el`s from the ASCII-only filename.
>
> ~~~{output}
> <?xml version="1.0" encoding="UTF-8"?>
> <root>
>  <el>element 1</el>
> </root>
> ~~~
>
>
> ~~~{multifiles.xsl}
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
>  <xsl:output encoding="UTF-8" indent="yes"/>
>  <xsl:template match="/">
>    <root>
>      <xsl:for-each select="document(/fileslist/filepath)/root/el">
>        <xsl:copy-of select="." />
>      </xsl:for-each>
>    </root>
>  </xsl:template>
> </xsl:stylesheet>
> ~~~
>
> ~~~{files-list.xml}
> <?xml version="1.0" encoding="UTF-8"?>
> <fileslist>
>  <filepath>filename-without-accented-char.xml</filepath>
>  <filepath>filename-with-utf-8-accented-char-é.xml</filepath>
> </fileslist>
> ~~~
> (When I add files to `files-list.xml`, the ones
> containing accented chars are consistently ignored.)
>
> ~~~{filename-without-accented-char.xml}
> <?xml version="1.0" encoding="UTF-8"?>
> <root>
>  <el>element 1</el>
> </root>
> ~~~
>
> ~~~{filename-with-utf-8-accented-char-é.xml}
> <?xml version="1.0" encoding="UTF-8"?>
> <root>
>  <el>element 2</el>
>  <el>element 3</el>
> </root>
> ~~~
>
>  Am I missing something, or is this a libxslt bug?
>
>  Thanks for your time !
>
> Alexandre Hoïde
>
> XSLT Processor Version (under Guix GNU/Linux)
>  XSL version: 1.0
>  Vendor: libxslt
>  version: 1.1.34
>  (libxml2@xxxxxx)
>  Vendor URL: http://xmlsoft.org/XSLT/
>
> 1. https://www.w3.org/TR/xslt-10/#document
