guile-user

Re: guile can't find a chinese named file


From: Eli Zaretskii
Subject: Re: guile can't find a chinese named file
Date: Wed, 15 Feb 2017 18:59:14 +0200

> Date: Wed, 15 Feb 2017 10:18:32 +0100
> From: <address@hidden>
> 
> > Filenames and locales are not necessarily related.  When you access a
> > networked file system, you get the filename encoding you are given,
> > which may or may not be the same as the particular locale encoding on
> > your particular machine on one particular day, and may or may not be a
> > unicode encoding.  Glib, for example, enables you to set this with the
> > G_FILENAME_ENCODING environmental variable [...]
> 
> which is, btw., "just a better approximation", but still wrong: the
> application creating a directory might have been "in" a different
> locale (and thus having a different encoding) than the one creating
> the file within that directory.
> 
> Most notably, the whole path might cross several mount points, thus
> the whole path can well have fragments coming from several file systems.

A possible solution would be to decode each mount point's part as it
is being resolved.
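
To make that idea concrete, here is a minimal sketch in Python. It is only
an illustration under stated assumptions: the MOUNT_ENCODINGS table, its
entries, and the decode_path helper are hypothetical, and no system I know
of hands you such a table ready-made.

```python
# Hypothetical table: mount point (as bytes) -> encoding assumed for the
# file names stored on the file system mounted there.
MOUNT_ENCODINGS = {
    b"/": "utf-8",
    b"/mnt/legacy": "gbk",
}

def decode_path(raw: bytes, mounts=MOUNT_ENCODINGS) -> str:
    """Decode an absolute byte path one component at a time."""
    prefix = b"/"
    encoding = mounts.get(prefix, "utf-8")
    decoded = []
    for part in (p for p in raw.split(b"/") if p):
        # A name is stored in the directory that contains it, so it is
        # decoded with the encoding in effect *before* descending into it.
        decoded.append(part.decode(encoding, errors="surrogateescape"))
        prefix = prefix.rstrip(b"/") + b"/" + part
        # Crossing a mount point switches the encoding for what follows.
        encoding = mounts.get(prefix, encoding)
    return "/" + "/".join(decoded)
```

With the table above, b"/mnt/legacy/" followed by a GBK-encoded name
decodes with GBK, while everything outside that mount is treated as UTF-8.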

> I think the only sane way to see a Linux file system path is the way
> Linux sees it: as a byte string.

This would lose a lot in 99% of use cases.  You are, in effect,
suggesting a "reverse optimization", whereby the majority of use cases
is punished in favor of a small minority, based on theoretical
intractability.

> Sure, some helper infrastructure to try to make characters of that
> mess will be welcome, but that should be absolutely robust wrt.
> unexpected input (e.g. bad UTF-8) and leave control to the application.

Most applications won't like this burden, because most application
programmers don't know enough about these issues to solve them correctly,
especially for users of other OSes and locales.
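
As an illustration of what such helper infrastructure can look like, here
is a short Python sketch of decoding that survives bad UTF-8 and still
round-trips to the original bytes. It uses the "surrogateescape" error
handler, which is also what Python's os.fsdecode uses on POSIX systems;
the helper names to_text, to_bytes and is_clean are made up for this
example.

```python
def to_text(raw: bytes) -> str:
    """Decode a file name; undecodable bytes become lone surrogates."""
    return raw.decode("utf-8", errors="surrogateescape")

def to_bytes(text: str) -> bytes:
    """Inverse of to_text: escaped surrogates turn back into raw bytes."""
    return text.encode("utf-8", errors="surrogateescape")

def is_clean(text: str) -> bool:
    """True if no byte had to be escaped, i.e. the name was valid UTF-8."""
    return not any(0xDC80 <= ord(ch) <= 0xDCFF for ch in text)

# A Latin-1 encoded name is not valid UTF-8, yet nothing is lost.
bad = b"caf\xe9.txt"
assert to_bytes(to_text(bad)) == bad
```

The application keeps control: it can test is_clean before showing a name
to the user, and it can always get the exact original bytes back.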

> > But if OpenBSD requires all _filenames_ to be in valid UTF-8, that
> > is a bad decision in my view.
> 
> NT has done that too.

Windows can do that because it also transparently translates file
names to the locale's encoding when files are accessed with ANSI APIs.
Without such translation, this kind of decision is unwise, IMO.
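
A rough sketch of that kind of translation, in Python. This is a
simplified model and not the actual Windows conversion: the ansi_view
function is hypothetical, and using "?" for characters the code page
cannot represent is only an approximation of what the system does.

```python
def ansi_view(name: str, codepage: str) -> bytes:
    """Model of how a Unicode file name might look through an ANSI API."""
    # Characters the code page cannot represent are replaced with "?".
    return name.encode(codepage, errors="replace")

# The name survives when the code page covers its characters...
assert ansi_view("中文.txt", "gbk").decode("gbk") == "中文.txt"
# ...and is degraded, not rejected, when it does not.
assert ansi_view("中文.txt", "cp1252") == b"??.txt"
```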


