Re: [PATCH v2] decodetree: Open files with encoding='utf-8'
From: Eduardo Habkost
Subject: Re: [PATCH v2] decodetree: Open files with encoding='utf-8'
Date: Fri, 8 Jan 2021 13:58:23 -0500
On Fri, Jan 08, 2021 at 07:09:52PM +0100, Philippe Mathieu-Daudé wrote:
> When decodetree.py was added in commit 568ae7efae7, QEMU was
> using Python 2 which happily reads UTF-8 files in text mode.
> Python 3 requires either UTF-8 locale or an explicit encoding
> passed to open(). Now that Python 3 is required, explicit
> UTF-8 encoding for decodetree source files.
>
> To avoid further problems with the user locale, also explicit
> UTF-8 encoding for the generated C files.
>
> Explicit both input/output are plain text by using the 't' mode.
I believe the 't' is unnecessary, since text mode is already the
default for open(). But it's harmless and makes the intent more
explicit.
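
For illustration, here is a minimal sketch of both points: that 'r' and
'rt' behave the same, and that decoding as ASCII fails the way the
traceback below shows. The file name sample.decode and the helper
read_text are made up for this example; they are not part of
decodetree.py.

```python
import os
import tempfile


def read_text(path, mode, encoding='utf-8'):
    # Hypothetical helper: read a whole file with the given mode and encoding.
    with open(path, mode, encoding=encoding) as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sample.decode')
    # Write raw UTF-8 bytes; 'é' encodes to the two bytes 0xc3 0xa9.
    with open(path, 'wb') as f:
        f.write('# Philippe Mathieu-Daudé\n'.encode('utf-8'))

    # 'r' and 'rt' are the same mode: text is the default.
    same = read_text(path, 'r') == read_text(path, 'rt')

    # Forcing ASCII stands in for a locale whose encoding is ASCII.
    try:
        read_text(path, 'r', encoding='ascii')
        ascii_failed = False
    except UnicodeDecodeError:
        ascii_failed = True

print(same, ascii_failed)
```

Passing encoding='utf-8' explicitly is what makes the result independent
of the user's locale.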
>
> This fixes:
>
> $ /usr/bin/python3 scripts/decodetree.py test.decode
> Traceback (most recent call last):
>   File "scripts/decodetree.py", line 1397, in <module>
>     main()
>   File "scripts/decodetree.py", line 1308, in main
>     parse_file(f, toppat)
>   File "scripts/decodetree.py", line 994, in parse_file
>     for line in f:
>   File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
>     return codecs.ascii_decode(input, self.errors)[0]
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 80:
> ordinal not in range(128)
>
> Reported-by: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
However:
> ---
> v2: utf-8 output too (Peter)
> explicit default text mode.
> ---
> scripts/decodetree.py | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/scripts/decodetree.py b/scripts/decodetree.py
> index 47aa9caf6d1..d3857066cfc 100644
> --- a/scripts/decodetree.py
> +++ b/scripts/decodetree.py
> @@ -1304,7 +1304,7 @@ def main():
>
> for filename in args:
> input_file = filename
> - f = open(filename, 'r')
> + f = open(filename, 'rt', encoding='utf-8')
> parse_file(f, toppat)
> f.close()
>
> @@ -1324,7 +1324,7 @@ def main():
> prop_size(stree)
>
> if output_file:
> - output_fd = open(output_file, 'w')
> + output_fd = open(output_file, 'wt', encoding='utf-8')
> else:
> output_fd = sys.stdout
This will still use the user's locale encoding when the output goes
to sys.stdout. That can be solved with:

    output_fd = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')

(Based on a suggestion from Yonggang Luo)
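
As a rough sketch of how the suggestion above behaves, the wrapper can
be tried against an in-memory binary stream instead of the real stdout.
The helper name utf8_writer is hypothetical and only for illustration;
it is not something the patch defines.

```python
import io


def utf8_writer(binary_stream):
    # Hypothetical helper: wrap a binary stream so that text written to it
    # is always encoded as UTF-8, regardless of the user's locale.
    return io.TextIOWrapper(binary_stream, encoding='utf-8')


# In the script this would be: output_fd = utf8_writer(sys.stdout.buffer)
buf = io.BytesIO()
out = utf8_writer(buf)
out.write('Daudé')
out.flush()  # push the encoded bytes down into the binary stream

encoded = buf.getvalue()
print(encoded)
```

One caveat with this approach: closing the wrapper (or letting it be
garbage collected) also closes the underlying binary stream, so calling
detach() on it first may be needed if stdout is used afterwards. On
Python 3.7 and later, sys.stdout.reconfigure(encoding='utf-8') would be
an alternative, but the traceback above comes from Python 3.6.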
--
Eduardo