Re: guile can't find a chinese named file

From: David Kastrup
Subject: Re: guile can't find a chinese named file
Date: Mon, 30 Jan 2017 17:42:07 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1.50 (gnu/linux)

Marko Rauhamaa <address@hidden> writes:

> David Kastrup <address@hidden>:
>> Marko Rauhamaa <address@hidden> writes:
>>> address@hidden (Ludovic Courtès):
>>>> Guile assumes its command-line arguments are UTF-8-encoded and
>>>> decodes them accordingly.
>>> I'm afraid that choice (which Python made, as well) was a bad one
>>> because Linux doesn't guarantee UTF-8 purity.
>> Have you looked at the error messages? They are all perfect UTF-8. As
>> was the command line locale.
> I was responding to Ludovic.
>> Apparently, Guile can open the file just fine, and it sees the command
>> line just fine as encoded in utf-8.
> My problem is when it is not valid UTF-8.
>> So I really, really, really suggest that before people post their
>> theories that they actually bother cross-checking them with Guile.
> Well, execute these commands from bash:
>    $ touch $'\xee'
>    $ touch xyz
>    $ ls -a
>    .  ..  ''$'\356'  xyz

We are not talking about file names not encoded in UTF-8.  It is
well-known that Guile is unable to work with strings in UTF-8 encoding
when their byte pattern is not valid UTF-8.

This is a red herring.  The problem is not that Guile is unable to deal
with badly encoded UTF-8 file names.  The problem is that Guile is
unable to deal with properly encoded UTF-8 file names when it is
supposed to execute them from the command line.
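
To make the distinction between the two cases concrete, here is a rough
shell sketch.  It assumes an iconv binary is installed, and the helper
name is_valid_utf8 is made up for illustration; it is not part of Guile
or of anything else in this thread.  It only classifies byte sequences
and does not reproduce the command line problem itself.

```shell
# Hypothetical helper: succeeds if its argument is a valid UTF-8 byte sequence.
# Assumes iconv is on PATH; iconv exits non-zero on input it cannot decode.
is_valid_utf8() {
    printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# \344\270\255 is the UTF-8 encoding of U+4E2D, a Chinese character,
# so this stands in for a properly encoded Chinese file name.
good=$(printf '\344\270\255.scm')

# \356 (0xEE) is a lone lead byte, the same byte as in the touch example.
bad=$(printf '\356')

for name in "$good" "$bad" xyz; do
    if is_valid_utf8 "$name"; then
        echo "valid UTF-8"
    else
        echo "not valid UTF-8"
    fi
done
```

A Chinese file name typed in a UTF-8 locale falls into the first
category, and that is the case under discussion here.  The $'\xee' file
falls into the second.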

> Then, execute this guile program:
> ========================================================================
> (let ((dir (opendir ".")))
>   (let loop ()
>     (let ((filename (readdir dir)))
>       (if (not (eof-object? filename))
>           (begin
>             (if (access? filename R_OK)
>                 (format #t "~s\n" filename))
>             (loop))))))
> ========================================================================
> It outputs:
>    ".."
>    "."
>    "xyz"
> skipping a file. This is a security risk. Files like these appear easily
> when extracting zip files, for example.

I am surprised this does not just throw a bad encoding exception.

But at any rate, this cannot easily be fixed since Guile uses libraries
for encoding/decoding that cannot deal reproducibly with improper byte
sequences.

The problem here is that Guile cannot even deal with _properly_ encoded
UTF-8 file names on the command line.

David Kastrup
