Re: [PATCH v2] decodetree: Open files with encoding='utf-8'

qemu-devel

From:	Eduardo Habkost
Subject:	Re: [PATCH v2] decodetree: Open files with encoding='utf-8'
Date:	Fri, 8 Jan 2021 13:58:23 -0500

On Fri, Jan 08, 2021 at 07:09:52PM +0100, Philippe Mathieu-Daudé wrote:
> When decodetree.py was added in commit 568ae7efae7, QEMU was
> using Python 2 which happily reads UTF-8 files in text mode.
> Python 3 requires either UTF-8 locale or an explicit encoding
> passed to open(). Now that Python 3 is required, explicit
> UTF-8 encoding for decodetree source files.
> 
> To avoid further problems with the user locale, also explicit
> UTF-8 encoding for the generated C files.
> 
> Explicit both input/output are plain text by using the 't' mode.

I believe the 't' is unnecessary, since text mode is already the
default for open().  But it's harmless and makes the intent more
explicit.
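
As a quick illustration of that point, here is a minimal sketch (not
part of the patch) that reads the same file with 'r' and with 'rt'.
The temporary file and its contents are made up for the example; both
calls should hand back the same kind of text-mode object and the same
decoded string.

```python
import os
import tempfile

# Create a small UTF-8 file containing one non-ASCII character.
# The file name and contents are invented for this illustration.
with tempfile.NamedTemporaryFile('wb', suffix='.decode', delete=False) as tmp:
    tmp.write('# Daud\u00e9\n'.encode('utf-8'))
    path = tmp.name

try:
    # Mode 'r': text mode is implied.
    with open(path, 'r', encoding='utf-8') as f_r:
        kind_r, text_r = type(f_r).__name__, f_r.read()
    # Mode 'rt': text mode is spelled out.
    with open(path, 'rt', encoding='utf-8') as f_rt:
        kind_rt, text_rt = type(f_rt).__name__, f_rt.read()
finally:
    os.unlink(path)
```

Here kind_r and kind_rt name the same class and text_r equals text_rt,
so the 't' only documents the intent.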

> 
> This fixes:
> 
>   $ /usr/bin/python3 scripts/decodetree.py test.decode
>   Traceback (most recent call last):
>     File "scripts/decodetree.py", line 1397, in <module>
>       main()
>     File "scripts/decodetree.py", line 1308, in main
>       parse_file(f, toppat)
>     File "scripts/decodetree.py", line 994, in parse_file
>       for line in f:
>     File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
>       return codecs.ascii_decode(input, self.errors)[0]
>   UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 80:
>   ordinal not in range(128)
> 
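
For reference, the failure mode in that traceback can be sketched
without decodetree at all: decoding UTF-8 bytes with the ASCII codec
fails on the first byte >= 0x80.  The sample string below is only an
assumption for illustration, not the actual contents of test.decode.

```python
# 'é' (U+00E9) encodes to the two bytes 0xc3 0xa9 in UTF-8.
data = 'Daud\u00e9'.encode('utf-8')

try:
    # What an ASCII locale effectively does when reading the file.
    data.decode('ascii')
    error = None
except UnicodeDecodeError as exc:
    error = exc

# Decoding with an explicit UTF-8 codec succeeds regardless of locale.
decoded = data.decode('utf-8')
```
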
> Reported-by: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>

Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>

However:

> ---
> v2: utf-8 output too (Peter)
>     explicit default text mode.
> ---
>  scripts/decodetree.py | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/scripts/decodetree.py b/scripts/decodetree.py
> index 47aa9caf6d1..d3857066cfc 100644
> --- a/scripts/decodetree.py
> +++ b/scripts/decodetree.py
> @@ -1304,7 +1304,7 @@ def main():
>  
>      for filename in args:
>          input_file = filename
> -        f = open(filename, 'r')
> +        f = open(filename, 'rt', encoding='utf-8')
>          parse_file(f, toppat)
>          f.close()
>  
> @@ -1324,7 +1324,7 @@ def main():
>          prop_size(stree)
>  
>      if output_file:
> -        output_fd = open(output_file, 'w')
> +        output_fd = open(output_file, 'wt', encoding='utf-8')
>      else:
>          output_fd = sys.stdout

This will still use the user's locale encoding when the output goes
to sys.stdout.  That can be solved with:

    output_fd = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')

(Based on a suggestion from Yonggang Luo)
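
A minimal sketch of how that wrapper behaves, under some assumptions:
the helper name utf8_text_stream is hypothetical, and an io.BytesIO
stands in for sys.stdout.buffer so the encoded bytes can be inspected.
In decodetree.py the argument would be sys.stdout.buffer instead.

```python
import io

def utf8_text_stream(binary_stream):
    """Wrap a binary stream so text written to it is encoded as UTF-8.

    Hypothetical helper for illustration; not part of decodetree.py.
    """
    return io.TextIOWrapper(binary_stream, encoding='utf-8')

# Stand-in for sys.stdout.buffer (assumption made for this example).
raw = io.BytesIO()
out = utf8_text_stream(raw)

out.write('/* Daud\u00e9 */\n')
out.flush()  # push the encoded bytes through to the underlying stream

encoded = raw.getvalue()
```

The bytes in encoded are UTF-8 no matter what the locale says.  One
thing to keep in mind is that the wrapper buffers its output, so it
needs a flush() (or close()) before the process exits.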

-- 
Eduardo
