Re: guile can't find a chinese named file
From: David Kastrup
Subject: Re: guile can't find a chinese named file
Date: Wed, 15 Feb 2017 10:54:06 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.0.50 (gnu/linux)

<address@hidden> writes:
> On Tue, Feb 14, 2017 at 10:19:14PM +0000, Chris Vine wrote:
>> On Tue, 14 Feb 2017 21:52:01 +0000 (UTC)
>> Mike Gran <address@hidden> wrote:
>> [snip]
>> > > In particular, filenames are *not*, nor can they be mapped to,
>> > > Unicode strings in Linux.
>> >
>> > True. Linux should follow OpenBSD and make all locales UTF-8.
>>
>> Filenames and locales are not necessarily related. When you access a
>> networked file system, you get the filename encoding you are given,
>> which may or may not be the same as the particular locale encoding on
>> your particular machine on one particular day, and may or may not be a
>> unicode encoding. Glib, for example, enables you to set this with the
>> G_FILENAME_ENCODING environmental variable [...]
>
> which is, btw., "just a better approximation", but still wrong: the
> application creating a directory might have been "in" a different
> locale (and thus having a different encoding) than the one creating
> the file within that directory.
>
> Most notably, the whole path might cross several mount points, thus
> the whole path can well have fragments coming from several file systems.
>
> I think the only sane way to see a Linux file system path is the way
> Linux sees it: as a byte string.
>
> Sure, some helper infrastructure to try to make characters of that
> mess will be welcome, but that should be absolutely robust wrt.
> unexpected input (e.g. bad UTF-8) and leave control to the application.
>
> Not easy.

If you tell Emacs that some external entity is in UTF-8, it will
represent all valid UTF-8 sequences as properly decoded characters, and
it has special codes for all bytes not part of valid UTF-8.

As a result, it works with valid UTF-8 perfectly as expected but will
reproduce arbitrary byte streams thrown at it perfectly when decoding as
UTF-8 and then reencoding into UTF-8 again.

Guile is lacking this byte stream reproducibility when
decoding/reencoding. That makes it a whole lot less robust for dealing
with externally provided material.
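
To illustrate the round-trip property, here is a minimal sketch in Python
rather than Emacs Lisp or Guile. It relies on Python's "surrogateescape"
error handler (PEP 383), which I am using only as an analogue of the
special codes mentioned above; it is not Emacs's internal mechanism. The
"replace" handler stands in for any lossy decoder and is not a claim about
what Guile does exactly. The filename bytes are made up for the example.

```python
# Hypothetical filename bytes as the kernel might return them: valid UTF-8
# for two CJK characters, then a stray 0xFF byte that is never valid UTF-8.
raw = b"\xe4\xb8\xad\xe6\x96\x87\xff.txt"

# Lossless: each undecodable byte is mapped to a lone surrogate code point
# (0xFF becomes U+DCFF), and the same handler restores it when encoding.
text = raw.decode("utf-8", errors="surrogateescape")
roundtrip = text.encode("utf-8", errors="surrogateescape")

# Lossy: each undecodable byte is replaced by U+FFFD, so the original byte
# value cannot be recovered when encoding again.
lossy = raw.decode("utf-8", errors="replace").encode("utf-8")

print(roundtrip == raw)  # True: the byte string survives the round trip
print(lossy == raw)      # False: the stray byte was replaced
```

With the lossy variant, the re-encoded name no longer matches the name on
disk, so a subsequent open() on it would look for a different file.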
--
David Kastrup