gnumed-devel


Re: [Gnumed-devel] Encoding (viewing) on Mac OS


From: Nicolas Barbier
Subject: Re: [Gnumed-devel] Encoding (viewing) on Mac OS
Date: Wed, 16 Nov 2011 11:18:50 +0100

2011/11/16 Karsten Hilbert <address@hidden>:

> Thanks for the clarification. I was under the impression that
> there IS a way to use the very same byte sequence for both latin1
> and utf8 as long as there's only overlapping characters in the file.
>
> After all, it IS possible to say either of "coding latin1" or
> "coding utf8" at the top of a Python file and have the Python
> interpreter properly read, say, German umlauts within said file (the
> byte sequence does not change, just the declaration) ?

I assume that means your file is actually encoded in UTF-8, and Python
interpreting it as Latin 1 works (as all bytes used for encoding
ÄÖÜäöüß in UTF-8 are valid Latin 1 bytes). I guess Python stores
strings as bytes and never tries to convert, so outputting such
strings in a UTF-8 locale results in the right characters appearing on
the screen, even if Python “internally treated” those bytes as Latin
1. I.e., both UTF-8 encoded files…

# coding=latin1
print("ÄÖÜäöüß")

…and…

# coding=utf8
print("ÄÖÜäöüß")

…yield “ÄÖÜäöüß” (I use a UTF-8 locale).
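The reason the bytes survive can be sketched with explicit codecs. This is only an illustration, assuming Python 3 (where strings are Unicode and the conversions have to be spelled out); it is not what the interpreter in this thread does internally. The variable names are made up for the example.

```python
# Sketch only: emulates "bytes in, bytes out" with explicit codecs (Python 3).
text = "ÄÖÜäöüß"
utf8_bytes = text.encode("utf-8")   # what a UTF-8 encoded source file holds

# Reading those bytes as Latin 1 never fails: every byte value maps to
# some Latin 1 code point, so nothing is rejected.
as_latin1 = utf8_bytes.decode("latin-1")

# Writing the result back out as Latin 1 restores the identical bytes,
# which a UTF-8 terminal then renders as the original characters.
round_tripped = as_latin1.encode("latin-1")

print(round_tripped == utf8_bytes)   # True
print(len(text), len(as_latin1))     # 7 14
```

The second line of output shows the catch: treated as Latin 1, the seven characters become fourteen, so anything that counts or slices characters would see the wrong string even though the output looks right.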

However, the Latin 1 encoded versions of those files yield…

address@hidden:~/bla$ python test.py # The file contains coding=latin1
??????

…and…

address@hidden:~/bla$ python test.py # The file contains coding=utf8
  File "test.py", line 2
SyntaxError: 'utf8' codec can't decode byte 0xc4 in position 7:
invalid continuation byte

I.e., the first version outputs Latin 1 bytes that are not understood
by my terminal (on their own they do not form valid UTF-8 sequences,
so the terminal shows a placeholder instead), while the second version
doesn't work because Python can detect that the source code is not
valid UTF-8.
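Both failures can be reproduced on the byte level. Again a hedged sketch, assuming Python 3 and using made-up variable names; it decodes a bare byte string rather than a source file, so it mirrors the symptoms rather than the exact code path.

```python
# Sketch only: the Latin 1 encoding of the test string, decoded as UTF-8.
latin1_bytes = "ÄÖÜäöüß".encode("latin-1")   # b'\xc4\xd6\xdc\xe4\xf6\xfc\xdf'

# Strict decoding fails at the very first byte, much like the SyntaxError.
try:
    latin1_bytes.decode("utf-8")
    decode_error = None
except UnicodeDecodeError as err:
    decode_error = err

print(decode_error)

# A lenient consumer (such as a terminal) substitutes a placeholder for
# whatever it cannot decode; here that is U+FFFD for every character shown.
shown = latin1_bytes.decode("utf-8", errors="replace")
print(all(ch == "\ufffd" for ch in shown))   # True
```

The error here is reported at position 0 because the byte string starts with 0xc4. In test.py the same byte sits at position 7, right after the seven bytes of `print("`.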

Nicolas

-- 
A. Because it breaks the logical sequence of discussion.
Q. Why is top posting bad?


