bug-texinfo
Re: url protection


From: Gavin Smith
Subject: Re: url protection
Date: Fri, 5 Aug 2022 18:29:45 +0100

On Thu, Aug 04, 2022 at 10:08:47PM +0200, Patrice Dumas wrote:
> On Thu, Aug 04, 2022 at 08:30:01PM +0100, Gavin Smith wrote:
> > On 8/3/22, Patrice Dumas <pertusus@free.fr> wrote:
> > >
> > > But that was
> > > not really my question, my question was more on whether we should use the
> > > output encoding to encode string before doing the URI::Escape call, or
> > > always use UTF-8, even if the document encoding is not UTF-8.
> > 
> > Are there browsers in non UTF-8 locales that manage to follow links percent
> > encoded in non UTF-8 encodings? This seems like a very niche case.
> 
> To me the question is not the locales of the browser, but the encoding
> of the HTML file.  If the encoding is ISO latin 1 as in:
> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
> 
> Then it seems to me that the URI::Escape call should be on an ISO latin 1
> encoded string.  But I am not sure.

I don't think so.  Such encoded strings are not recommended by anybody.
I think it's simpler to use the usual URL encoding of either straight
ASCII or percent encoded UTF-8.
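
To make the two options concrete, here is a minimal sketch in Python (not the
Perl URI::Escape code under discussion) showing how the same file name is
percent encoded depending on which byte encoding is applied first.  The file
name is only an illustration.

```python
from urllib.parse import quote

# Illustrative file name containing U+00E4 (a with diaeresis).
name = "\u00e4.html"

# Encode to UTF-8 bytes first, then percent encode: the usual URL form.
utf8_link = quote(name, encoding="utf-8")

# Encode to Latin-1 bytes first, then percent encode: the alternative
# that was asked about.
latin1_link = quote(name, encoding="latin-1")

print(utf8_link)    # %C3%A4.html
print(latin1_link)  # %E4.html
```

Either result is plain ASCII, so the link text itself is unaffected by the
charset declared for the HTML file; only the bytes it names differ.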

I don't see why the encoding of the HTML file itself should make a difference.
I tested it and didn't find the HTML encoding declaration made a difference
to percent encoded links (on Chromium 97).  I've attached the example files.

Regardless of the declaration, the percent encoded bytes were used for the filename.

There was one difference, which shows that percent encoding links is a good
idea.  In test-latin1.html (attached), the unencoded link does not work - in
the file it is "ä.html", but Chromium looks for a file named "ä.html" on
the filesystem, presumably due to decoding it from Latin-1 and then reencoding
to UTF-8.
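
A rough sketch of the presumed behaviour, in Python.  This only models the
guess above (decode from Latin-1, reencode to UTF-8); it is not taken from
Chromium, and the variable names are made up for illustration.

```python
from urllib.parse import quote, unquote_to_bytes

# Bytes of the unencoded link as stored in a Latin-1 HTML file.
raw_in_file = "\u00e4.html".encode("latin-1")       # b'\xe4.html'

# Presumed browser behaviour: decode using the declared charset,
# then reencode to UTF-8 for the filesystem lookup.
decoded = raw_in_file.decode("latin-1")
looked_up = decoded.encode("utf-8")                 # b'\xc3\xa4.html'

# The bytes looked up differ from the bytes written in the file.
print(raw_in_file != looked_up)                     # True

# A percent encoded link is pure ASCII, so it names the same bytes
# whatever charset the HTML file declares.
percent_link = quote("\u00e4.html", encoding="utf-8")
link_bytes = unquote_to_bytes(percent_link)         # b'\xc3\xa4.html'
```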

Mixing encodings like this should be considered unpredictable.

Attachment: test-encoding-links.tar.gz
Description: application/tar-gz

