Re: [xsl] document( URI ) with accented chars fails

Subject: Re: [xsl] document( URI ) with accented chars fails
From: "Martynas Jusevičius martynas@xxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 18 Nov 2020 12:16:00 -0000
We use this in bash scripts:

urlencode()
{
    # Percent-encode the first argument (Python 3); with no argument, read stdin.
    python3 -c 'import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1] if len(sys.argv) > 1 else sys.stdin.read()[0:-1]))' "$@"
}
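
For anyone who would rather call this from Python directly, here is a minimal Python 3 sketch of the same idea. The function name `percent_encode` and the sample file name are only illustrations, not something taken from our scripts. `urllib.parse.quote` encodes the string as UTF-8, writes each non-ASCII byte as a %HH escape, and by default leaves `/` untouched:

```python
from urllib.parse import quote


def percent_encode(path: str) -> str:
    """Percent-encode a path so it can be used inside a URI.

    Hypothetical helper for illustration only. Non-ASCII characters are
    encoded as UTF-8 and each byte becomes a %HH escape; '/' is preserved
    because it is quote()'s default `safe` character.
    """
    return quote(path)


if __name__ == "__main__":
    # Sample name chosen for illustration only.
    print(percent_encode("données/été.xml"))
```

Note that this only escapes the path; it does not add a `file://` scheme.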

On Wed, Nov 18, 2020 at 12:32 PM Alexandre Hoïde
alexandre.hoide@xxxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
wrote:
>
> On Tue, Nov 17, 2020 at 10:51:25PM -0000, Alexandre Hoïde
> alexandre.hoide@xxxxxxxxxx wrote:
> > On Tue, Nov 17, 2020 at 09:26:23PM -0000, Michael Kay mike@xxxxxxxxxxxx
> > wrote:
> > > The document() function expects a URI, not a filename, and URIs never
> > > contain accented characters.
> > >
> > > XSLT 2.0+ has functions to escape special characters using %HH escapes
> > > so you can turn arbitrary filenames into valid URIs.
> > >
> > > For xsltproc you'll need some processor-specific solution and I can't
> > > help you with that.
>
> It is not directly XSLT related, but just in case:
>
> EXSLT has a `str:encode-uri`¹ function but, unfortunately,
> `xsltproc` from `libxslt` does not implement it.
>
> So I have now extended the bash script that builds
> fileslist.xml with a small Perl script, which uses the Perl
> module “URI”² and is applied to each file path.
>
> ~~~{filename-to-uri.pl}
> #!/usr/bin/env perl
> use URI::file;
> my $uri = URI::file->new( $ARGV[0] );
> print $uri . "\n";
> ~~~
>
> applied to each file name with:
> ~~~{bash command line}
> $ perl filename-to-uri.pl <the-filename-to-convert-to-uri>
> ~~~
>
> Best regards and thanks again,
> Alexandre Hoïde
>
> 1. http://exslt.org/str/functions/encode-uri/index.html
> 2. https://metacpan.org/pod/URI
>    (on GNU Guix the package is `perl-uri`)
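
If Perl is not at hand, a rough Python 3 counterpart of the filename-to-URI step could look like the sketch below. It assumes a POSIX system with a UTF-8 filesystem encoding; the function name `filename_to_uri` is made up for illustration, and the output is not guaranteed to match `URI::file` in every case (for example, relative names are made absolute against the current directory here).

```python
import sys
from pathlib import Path


def filename_to_uri(filename: str) -> str:
    """Return a file:// URI for a file name (hypothetical helper).

    Path.absolute() prepends the current directory to relative names
    without touching the filesystem; as_uri() then percent-encodes the
    path, so accented characters become %HH escapes.
    """
    return Path(filename).absolute().as_uri()


if __name__ == "__main__":
    # Convert every file name given on the command line, one URI per line.
    for name in sys.argv[1:]:
        print(filename_to_uri(name))
```

Each resulting URI can then be written into fileslist.xml for document() to consume.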
