bug-gnulib

Re: uuencode: multi-bytes char in remote file name contains bytes >0x80


From: John Cowan
Subject: Re: uuencode: multi-bytes char in remote file name contains bytes >0x80
Date: Tue, 5 Jul 2011 13:18:53 -0400
User-agent: Mutt/1.5.18 (2008-05-17)

Eric Blake scripsit:

> When used according to POSIX, the 'decode_pathname' argument (POSIX
> notation, or REMOTEFILE argument in 'uuencode --help' notation) is
> output literally in the resulting output of 'uuencode' on the line
> starting with "begin"; that resulting output is also required by POSIX
> to be a text file.

I grasp that, but the definition of "text file" merely requires that the
bytes be interpretable in *some* locale, not in the actually relevant
locale.  A locale in which the character encoding is 8859-1, for
example, has a character for every byte, which means that every file is,
*as far as this part of the definition goes*, a text file.
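To illustrate the point with a rough sketch: Python's "latin-1" codec (used
here as a stand-in for an 8859-1 locale, which is an assumption about how a
real C locale would behave) maps each of the 256 byte values to a character,
so no byte string can fail to decode, whereas the same bytes can easily be
invalid as UTF-8. The helper name decodes_in is made up for this example.

```python
def decodes_in(data: bytes, encoding: str) -> bool:
    """Return True if data is a valid byte sequence in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Every possible byte value, 0x00 through 0xFF.
all_bytes = bytes(range(256))

# A file name containing bytes above 0x80, stored as Latin-1.
latin1_name = "résumé.txt".encode("latin-1")

print(decodes_in(all_bytes, "latin-1"))    # every byte is a character
print(decodes_in(all_bytes, "utf-8"))      # many byte sequences are invalid
print(decodes_in(latin1_name, "utf-8"))    # 0xE9 followed by 's' is not valid UTF-8
```

This only covers the "bytes are characters" part of the definition; it does
not check any other property a text file is required to have.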

> It also helps to read elsewhere in the POSIX requirements on
> uuencode: "If there are characters in decode_pathname that are not
> in the portable filename character set the results are unspecified."
> Therefore, you _cannot_ use uuencode to pass the name of a file that
> contains non-portable characters and still have output that complies
> with POSIX.

Well, it's good to know about that requirement, since it means POSIX
is irrelevant to this case.  However, your second sentence is false:
if POSIX does not specify the result, then any result is
POSIX-compliant.
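For concreteness, the portable filename character set is the ASCII letters,
the digits, period, underscore, and hyphen. A minimal sketch of a check for
whether a name stays inside that set follows; the function name
is_portable_filename is hypothetical, it looks at a single name rather than
a full pathname (so a slash counts as non-portable here), and it is not
taken from any uuencode implementation.

```python
import string

# Byte values of the portable filename character set: A-Z a-z 0-9 . _ -
PORTABLE_BYTES = frozenset(
    (string.ascii_letters + string.digits + "._-").encode("ascii")
)


def is_portable_filename(name: bytes) -> bool:
    """Return True if name is non-empty and uses only portable characters."""
    return len(name) > 0 and all(b in PORTABLE_BYTES for b in name)


print(is_portable_filename(b"report-2011.txt"))
print(is_portable_filename("résumé.txt".encode("utf-8")))
```

A name that fails this check is one for which the result is unspecified, so
whatever the encoder does with it is a matter of implementation choice.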

> Which means that for our particular implementation of uuencode, if
> we encounter a file name that contains any bytes not already in the
> portable file name set, then we can do whatever we want (error out,
> or output some sort of prefix line that tells knowledgeable uudecode
> implementations that we are about to send an encoded form of a file
> name, output a binary file rather than a text file [by outputting the
> file name as a literal sequence of bytes, even though those bytes
> are not characters in the current locale], or anything else), all as
> an extension to POSIX.  Of course, our goal should be to have the
> out-of-the-box behavior provide the most likely use (that is, it would
> be better if we could just make uuencode work on all possible file
> names, even on the ones where POSIX does not require any particular
> behavior).

Agreed.

-- 
John Cowan    address@hidden    http://ccil.org/~cowan
The whole of Gaul is quartered into three halves.
        --Julius Caesar


