Re: url protection
From: Per Bothner
Subject: Re: url protection
Date: Wed, 3 Aug 2022 14:36:58 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.11.0
On 8/3/22 13:46, Patrice Dumas wrote:
This is not what we do in general for html/xhtml. For epub we always
emit utf8, as it is mandated by the standard, but for html/xhtml, we
use, in the default case, the input encoding for the output encoding.
I think that is a mistake.
It seems clear that in 2022 all publicly visible html pages (i.e. on a public
web server) should use utf8.
It is also clear that any practical html-reading program can read utf8-encoded
html files (assuming a correct charset declaration), regardless of the local
character encoding, even for local file: urls or an internal web server.
Ergo, always emitting utf8 (with a charset declaration) is safer and very
unlikely to lead to problems, while using a native or input-based encoding is
fragile and dangerous.
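To make the fragility concrete, here is a minimal sketch in Python (chosen only for illustration; it is not texi2any code, and the sample string is an assumption). It serializes the same text under two encodings and shows what a reader sees when the bytes and the assumed charset disagree.

```python
# Illustrative sketch only: the sample text and variable names are assumptions.
text = "café"

utf8_bytes = text.encode("utf-8")      # 'é' becomes the two bytes C3 A9
latin1_bytes = text.encode("latin-1")  # 'é' becomes the single byte E9

# A reader that honors a utf-8 charset declaration recovers the text exactly.
roundtrip_ok = utf8_bytes.decode("utf-8") == text

# utf-8 bytes read as latin-1 decode without error but yield mojibake.
mojibake = utf8_bytes.decode("latin-1")

# latin-1 bytes read as utf-8 are not even valid input.
try:
    latin1_bytes.decode("utf-8")
    mismatch_detected = False
except UnicodeDecodeError:
    mismatch_detected = True
```

The point of the sketch is that the bytes and the declaration must agree; emitting utf8 together with a utf8 declaration removes the dependence on the input or locale encoding.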
The conversion should not have already been done at that point; we still
have character strings in Perl's internal Unicode encoding. But that was
not really my question; my question was more about whether we should use the
output encoding to encode the string before doing the URI::Escape call, or
always use UTF-8, even if the document encoding is not UTF-8.
The question is irrelevant: we should always emit utf8, both in urls and in
the body of html/xhtml files. That should certainly be the default (regardless
of native or input encoding) - and it is almost certainly a waste of time to
support anything else.
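As a rough illustration of why the encoding chosen before percent-escaping matters, here is a sketch using Python's standard urllib.parse rather than Perl's URI::Escape. The helper name escape_url_component and the sample string are hypothetical, introduced only for this example.

```python
from urllib.parse import quote, unquote


def escape_url_component(text, encoding="utf-8"):
    """Encode text to bytes with the given encoding, then percent-escape it.

    Hypothetical helper for illustration; safe="" escapes every reserved
    character, including '/'.
    """
    return quote(text, safe="", encoding=encoding)


utf8_escaped = escape_url_component("café")               # bytes C3 A9 escaped
latin1_escaped = escape_url_component("café", "latin-1")  # byte E9 escaped

# A consumer that assumes utf-8 (the default for unquote) recovers only
# the utf-8 form; the latin-1 form decodes to a replacement character.
utf8_decoded = unquote(utf8_escaped)
latin1_decoded = unquote(latin1_escaped)
```

Under these assumptions the same character produces two different urls depending on the encoding, and only the utf8 form survives a consumer that assumes utf8, which is the behavior the IRI compatibility rules describe.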
Here is another datapoint:
https://en.wikipedia.org/wiki/Internationalized_Resource_Identifier#Compatibility
--
--Per Bothner
per@bothner.com http://per.bothner.com/