Re: guile can't find a chinese named file
From: David Kastrup
Subject: Re: guile can't find a chinese named file
Date: Fri, 17 Feb 2017 10:04:29 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.0.50 (gnu/linux)
Marko Rauhamaa <address@hidden> writes:
> Eli Zaretskii <address@hidden>:
>>> From: Marko Rauhamaa <address@hidden>
>>> Python uses the surrogate hole in the middle of the Unicode range to
>>> represent such stray bytes, but only when naming files.
>>
>> IMO, it makes no sense to limit this to file names, because (a) you
>> don't always know on all levels of the code which string is a file
>> name or a part thereof; and (b) because situations where non-ASCII
>> bytes cannot be properly decoded into Unicode happen with text that is
>> not file names, and users still expect Emacs to silently produce the
>> same byte stream on round-trip operations, e.g., when copying text
>> from one file to another.
>
> Python just barfs:
>
> $ python3 -c "import sys; print(sys.stdin.read(30))" <<<$'\xdd'
> Traceback (most recent call last):
> File "<string>", line 1, in <module>
> File "/usr/lib64/python3.5/codecs.py", line 321, in decode
> (result, consumed) = self._buffer_decode(data, self.errors, final)
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xdd in position \
> 0: invalid continuation byte
>
> The situation is a bit difficult to recover from.
You can load an executable into an Emacs buffer, do a
search-and-replace on UTF-8 strings, and save it again. Assuming that
each replacement string has the same length in bytes as the string it
replaces, and that the string does not appear as part of symbols for
the linker, the executable will likely work fine afterwards.
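
The byte-level effect of such an edit can be sketched in Python. This is
only an illustration of the same-length constraint, not of what Emacs does
internally; the helper name `replace_same_length` and the sample bytes are
made up for the example.

```python
def replace_same_length(data: bytes, old: str, new: str) -> bytes:
    """Replace a UTF-8 string inside arbitrary binary data.

    Hypothetical helper: it refuses replacements whose encoded length
    differs, so every other byte in the data keeps its offset.
    """
    old_bytes = old.encode("utf-8")
    new_bytes = new.encode("utf-8")
    if len(old_bytes) != len(new_bytes):
        raise ValueError("replacement must have the same encoded length")
    return data.replace(old_bytes, new_bytes)


# Made-up stand-in for an executable: text surrounded by bytes that are
# not valid UTF-8.
blob = b"\x7fELF\x00\xdd\xffHello, world\x00\xfe"
patched = replace_same_length(blob, "Hello", "HELLO")
```

Working on bytes sidesteps decoding altogether, which is exactly the
detour outside of string processing that an editor user never has to take.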
I don't think that XEmacs (another Emacs implementation that migrated a
lot more leisurely to multibyte encodings) would stand up to the same
sort of abuse. And probably quite a few text editors would throw in the
towel as well. But once you view Emacs as a text processing platform,
it's a reasonable conclusion that failure is not a good option.
For a general-purpose programming language like Python or Guile, I
should think it at least as important that strings can represent input
accurately, without having to digress outside of string processing and
resort to things like byte arrays.
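
For comparison, Python 3 can be asked to do this for arbitrary input via
the `surrogateescape` error handler, although it has to be requested
explicitly when decoding. A minimal sketch with made-up sample bytes:

```python
# Valid UTF-8 ("café") followed by a stray 0xDD byte and more text.
raw = b"caf\xc3\xa9 \xdd stray"

# surrogateescape maps each undecodable byte 0xNN to the lone
# surrogate U+DCNN instead of raising UnicodeDecodeError.
text = raw.decode("utf-8", errors="surrogateescape")

# Ordinary string processing works on the result.
edited = text.replace("stray", "STRAY")

# Encoding with the same handler turns the surrogates back into the
# original bytes, so untouched data round-trips unchanged.
roundtrip = text.encode("utf-8", errors="surrogateescape")
edited_bytes = edited.encode("utf-8", errors="surrogateescape")
```

Without the handler, the same `decode` call raises the
`UnicodeDecodeError` quoted above.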
--
David Kastrup