POSIX msgfmt and universal-character-name escape sequences
From: Bruno Haible
Subject: POSIX msgfmt and universal-character-name escape sequences
Date: Thu, 23 Jun 2022 08:01:27 +0200
https://posix.rhansen.org/p/gettext_draft
Line 1031
"except that universal-character-name escape sequences need not be supported."
Neither GNU msgfmt nor Solaris msgfmt treats universal-character-name
escape sequences specially. If an msgstr contains e.g. "\\u20AC", the
resulting string in the .mo file is
{ '\\', 'u', '2', '0', 'A', 'C', '\0' }.
Issue: Leaving it undefined whether \u escape sequences are recognized can
lead to mutual incompatibility of msgfmt implementations: Implementations
would differ in their interpretation of the dot-po file.
There is no good reason for leaving it undefined: There is already a
mechanism for specifying an encoding (charset=... in the header), and the
UTF-8 encoding has been in widespread use for more than 10 years.
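
To illustrate the difference, here is a rough sketch in Python. The helper
names are hypothetical and the code does not model how any msgfmt
implementation works; it only compares the bytes of the uninterpreted
six-character sequence with the bytes of a euro sign written directly in a
file whose header declares charset=UTF-8.

```python
# Illustration only: hypothetical helpers, not taken from any msgfmt.

def literal_bytes(text: str) -> bytes:
    """Store ASCII characters as-is, NUL-terminated, with no escape handling."""
    return text.encode("ascii") + b"\0"


def utf8_bytes(text: str) -> bytes:
    """Store text as UTF-8, NUL-terminated (assumes charset=UTF-8 in the header)."""
    return text.encode("utf-8") + b"\0"


# A backslash followed by u20AC: six characters, not one euro sign.
uninterpreted = literal_bytes("\\u20AC")

# The euro sign (U+20AC) written directly in a UTF-8 encoded file.
euro = utf8_bytes("\u20ac")

print(list(uninterpreted))  # [92, 117, 50, 48, 65, 67, 0]
print(list(euro))           # [226, 130, 172, 0]
```

With the charset declared in the header, the translator can write the
character itself, so no escape sequence is needed to represent it.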
Suggestion: Change
"except that universal-character-name escape sequences need not be supported."
to
"except that universal-character-name escape sequences are not supported."