Re: [PATCH v2 0/1] Update check-python-tox test for pylint 2.10
From: Daniel P. Berrangé
Subject: Re: [PATCH v2 0/1] Update check-python-tox test for pylint 2.10
Date: Wed, 15 Sep 2021 10:10:34 +0100
User-agent: Mutt/2.0.7 (2021-05-04)
On Wed, Sep 15, 2021 at 01:30:10AM -0400, John Snow wrote:
> V2: It's not safe to use sys.stderr.encoding to determine a "console
> encoding", because that uses the "current" stderr and not a
> hypothetically generic one -- and doing this causes the acceptance tests
> to fail.
>
> Use UTF-8 instead.
>
> Question: What encoding do terminal programs use? Is there an inherent
> encoding to fprintf et al, or does it just push whatever bytes you put
> into it straight into the stdout/stderr pipe?
Programs are expected to output data in the encoding that is set in
the various env variables LC_ALL/LC_CTYPE/LANG.
In traditional end user scenarios this almost always means UTF-8 charset.
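As a rough sketch of that lookup, assuming the usual POSIX precedence (a non-empty LC_ALL wins, then LC_CTYPE, then LANG, else the C locale). The helper names effective_ctype_locale and codeset_of are made up for illustration and are not part of any library:

```python
import os
from typing import Mapping, Optional


def effective_ctype_locale(env: Mapping[str, str] = os.environ) -> str:
    """Return the locale name that governs character encoding.

    Hypothetical helper: mirrors the POSIX precedence for LC_CTYPE,
    where empty values are treated the same as unset ones.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"


def codeset_of(locale_name: str) -> Optional[str]:
    """Extract the codeset from 'language[_territory][.codeset][@modifier]'.

    Returns None when the name carries no explicit codeset (e.g. "C").
    """
    base = locale_name.split("@", 1)[0]  # drop any "@modifier" suffix
    if "." not in base:
        return None
    return base.split(".", 1)[1]


if __name__ == "__main__":
    name = effective_ctype_locale()
    print(name, codeset_of(name))
```

This only parses the variables; what a running interpreter actually concluded can be read with locale.getpreferredencoding(False).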
There are plenty of cases which end up with the C locale though, which
would mean 7-bit ASCII on Linux, though apps are supposed to be 8-bit
clean and allow data with the high bit set to pass through without
interpretation. The latter is what python3 gets very wrong, complaining
if you output 8-bit high data in the C locale.
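To illustrate the complaint, here is a minimal sketch that stands in for a C-locale stdout with an explicit ASCII text stream; the real stream settings depend on the Python version and environment. The helper write_as_ascii is hypothetical. The "surrogateescape" error handler is one way to get 8-bit-clean pass-through of undecodable bytes:

```python
import io


def write_as_ascii(text: str, errors: str = "strict") -> bytes:
    """Write text through an ASCII-encoded text stream and return the bytes.

    Hypothetical helper: the BytesIO buffer plays the role of the
    stdout/stderr pipe, the TextIOWrapper the role of sys.stdout.
    """
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="ascii", errors=errors)
    stream.write(text)
    stream.flush()
    return raw.getvalue()


if __name__ == "__main__":
    # Plain 7-bit text passes through untouched.
    print(write_as_ascii("plain"))

    # A character outside ASCII is rejected under the strict handler.
    try:
        write_as_ascii("caf\u00e9")
    except UnicodeEncodeError as exc:
        print("rejected:", exc.reason)

    # Bytes decoded with surrogateescape round-trip unchanged.
    high = b"caf\xe9".decode("ascii", "surrogateescape")
    print(write_as_ascii(high, errors="surrogateescape"))
```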
There is increasing support for a C.UTF-8 locale to bring it closer to
other locales which are all UTF-8.
On macOS the C locale has been UTF-8 by default for as long as I can remember.
Windows is a whole other world of fun and IIRC isn't UTF-8 by default,
but I don't recall details.
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|