Re: Corrupted multibyte characters in command substitutions fixes may be

bug-bash

From:	Frank Heckenbach
Subject:	Re: Corrupted multibyte characters in command substitutions fixes may be worse than problem.
Date:	Mon, 07 Feb 2022 12:21:09 +0100

> In the case of bash with environment having LC_CTYPE: C.UTF-8 or en_US.UTF-8
> read:
> 0xC3 (len=1) i.e. Ã ('A' w/tilde in a legacy 8-bit latin-compatible charset),
> but invalid if bash processes the environment setting of en_US.UTF-8.
> 
> Should bash process it as legacy input or invalid UTF8?
> Either way, what should it return? a UTF-8 char
> (hex 0xC3 0x83) transcoded from the latin value of A-tilde, or
> keep the binary value the same (return 0x83),
> should it return a warning message?  If it does, should
> it return NUL for the returned value because the input was erroneous?

Assuming Latin-1 when nothing in the environment points to it seems
questionable. It might just as well be a Cyrillic character in
ISO-8859-5 or whatever.
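
To make the ambiguity concrete, here is a minimal Python sketch (not bash's
actual behavior) of what the lone byte 0xC3 means under three charsets. The
helper name decode_or_none is made up for illustration; the charset names are
the standard Python codec names.

```python
# The lone byte from the quoted example.
raw = b"\xc3"

def decode_or_none(data: bytes, charset: str):
    """Return the decoded text, or None if the bytes are invalid in that charset."""
    try:
        return data.decode(charset)
    except UnicodeDecodeError:
        return None

as_utf8 = decode_or_none(raw, "utf-8")            # None: truncated 2-byte sequence
as_latin1 = decode_or_none(raw, "iso-8859-1")     # 'Ã' (U+00C3)
as_cyrillic = decode_or_none(raw, "iso-8859-5")   # 'У' (U+0423)

# Transcoding to UTF-8 under a guessed charset changes the bytes,
# and each guess produces a different result.
latin1_guess = as_latin1.encode("utf-8")          # b'\xc3\x83'
cyrillic_guess = as_cyrillic.encode("utf-8")      # b'\xd0\xa3'
```

Neither guess can be called correct without outside information about the
charset, which the environment here does not provide.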

Email filters were mentioned. Emails may use charsets different from
the current environment -- even several different ones within a mail
(I've sent such mails myself). So if bash were to "fix" input
depending on the environment, even writing a pass-through filter
would require parsing the Content-Type headers and changing the
environment accordingly (or else using an 8-bit clean charset
throughout).
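
The filter case can be sketched in Python as well. This is only an
illustration under assumptions: the message bytes are invented, and
lossy_filter stands in for one possible environment-dependent "fix"
(replacing invalid sequences), not for anything bash is known to do.

```python
import io

def passthrough(src, dst, chunk_size=65536):
    """Copy bytes from src to dst without ever decoding them."""
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)

def lossy_filter(data: bytes, env_charset: str = "utf-8") -> bytes:
    """Hypothetical "fix": decode per the environment, replacing invalid bytes."""
    return data.decode(env_charset, errors="replace").encode(env_charset)

# An invented message with two parts in different charsets.
message = (
    b"Content-Type: text/plain; charset=iso-8859-1\n\n\xc3\n"
    b"Content-Type: text/plain; charset=utf-8\n\n\xc3\x83\n"
)

out = io.BytesIO()
passthrough(io.BytesIO(message), out)
unchanged = out.getvalue() == message        # byte-for-byte identical
damaged = lossy_filter(message) != message   # the Latin-1 part was altered
```

A real filter would pass sys.stdin.buffer and sys.stdout.buffer instead of the
BytesIO objects. The point is that the byte-transparent version needs no
knowledge of any charset, while the decoding version damages the part whose
charset differs from the environment.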

So I don't think bash should change the input (unintentionally as
with the original bug or intentionally as discussed here) unless and
until it needs to do charset-dependent operations.
