Re: quotearg improvements [was: filenames in error messages]

From: Eric Blake
Subject: Re: quotearg improvements [was: filenames in error messages]
Date: Wed, 13 Feb 2008 20:57:51 -0700

According to Bruno Haible on 2/13/2008 8:13 PM:
| Sorry, but you lost me here. Where did the C trigraphs come into play?

Because the quotearg module _already_ did trigraph quoting (try
ls --quoting-style=c for an example).  The question is whether the new
c_maybe style (or whatever better name we come up with for it), designed for
use in unambiguous error message output, should continue using that
trigraph code or ditch it.  I think the consensus is to ditch it by
default, although it might still be worth leaving the option in the code
to provide it (quotearg, as a module, is useful for more than just error

|> For C strings, the code already outputs \a, \b, \f, \n, \r, \t, \v, \\,
|> \"; and for all other non-printable characters, a 3-digit \nnn octal
| So you want to escape, in an UTF-8 locale, all non-ASCII characters or
| So that a Japanese user, for an error in file をつけた時でも, gets to read

No.  The existing quotearg code was already locale-dependent, and tries
its hardest to recognize valid multibyte sequences as printable.  It only
prints an octal escape for invalid multibyte sequences and/or nonprintable
characters, according to the current locale's notion of printable.
However, in the C locale, the notion of what is printable varies from
machine to machine.  I am often annoyed that on cygwin, where there is no
locale besides C, isprint(0xc0) is false, even though the byte renders in
the terminal as a single-byte printable character (accented A, as if by
iso-8859-1).  To date, I've simply maintained a cygwin-specific patch to
quotearg that treats all characters above 0x80 as printable, even when the
C locale claims otherwise.
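
To make the escaping rule concrete, here is a minimal single-byte sketch of it. It is not the quotearg implementation: c_escape is a hypothetical helper, it ignores multibyte sequences entirely, and it simply trusts isprint() in whatever locale is active, which is exactly where the C-locale fuzziness described above shows up.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, NOT the gnulib quotearg API.  Write a C-style
   escaped copy of the LEN bytes at SRC into DST, which has room for CAP
   bytes.  Return the number of bytes written (excluding the NUL), or
   (size_t) -1 if DST is too small.  */
size_t
c_escape (const char *src, size_t len, char *dst, size_t cap)
{
  size_t out = 0;

  if (cap == 0)
    return (size_t) -1;

  for (size_t i = 0; i < len; i++)
    {
      unsigned char c = (unsigned char) src[i];
      char esc = 0;

      /* The named escapes listed in the quoted text above.  */
      switch (c)
        {
        case '\a': esc = 'a'; break;
        case '\b': esc = 'b'; break;
        case '\f': esc = 'f'; break;
        case '\n': esc = 'n'; break;
        case '\r': esc = 'r'; break;
        case '\t': esc = 't'; break;
        case '\v': esc = 'v'; break;
        case '\\': esc = '\\'; break;
        case '"': esc = '"'; break;
        }

      /* Worst case per input byte: 4 bytes of "\nnn" plus the NUL.  */
      if (out + 5 > cap)
        return (size_t) -1;

      if (esc)
        {
          dst[out++] = '\\';
          dst[out++] = esc;
        }
      else if (isprint (c))
        /* Printable according to the current locale: copy as-is.  */
        dst[out++] = (char) c;
      else
        /* Everything else becomes a 3-digit octal escape.  */
        out += (size_t) sprintf (dst + out, "\\%03o", (unsigned int) c);
    }

  dst[out] = '\0';
  return out;
}
```

Passing the byte as an unsigned char matters: handing a negative char value to isprint() is undefined, which is why the check is written against the value 0xc0 rather than a plain char.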

| This is far, far away from the original goal, and also neglects the principle
| of minimal surprise. I mean, if the goal is to solve ambiguities, then
| add enough escapes to solve ambiguities, but not more than that!

OK - then I think we're settled here - since we are using "" on the
outside of ambiguous strings, we do not need to worry about quoting most
remaining shell special characters.  Space, ?, (), [], {}, |, etc. can all
be output as-is - with no change to the quotearg module.
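
As a rough illustration of that conclusion, the check for whether a name needs the surrounding "" at all could look like the sketch below. The name needs_quotes is hypothetical, and the definition of "ambiguous" used here (any byte that would have to be escaped) is an assumption for the example, not something settled in this thread; empty names and leading or trailing spaces are deliberately not handled.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical predicate, NOT part of quotearg.  Return true if NAME
   contains a byte that would need a backslash escape, in which case the
   caller would wrap the escaped name in "".  The rule is an assumption
   made for this sketch.  */
bool
needs_quotes (const char *name)
{
  for (const unsigned char *p = (const unsigned char *) name; *p; p++)
    if (*p == '"' || *p == '\\' || !isprint (*p))
      return true;

  /* Space, ?, (), [], {}, | and friends are printable, so they do not
     trigger quoting on their own and are emitted as-is.  */
  return false;
}
```

With this rule, a name such as a b?(c)[d]{e}|f is printed unchanged, while a name containing a tab or a double quote is the kind that would get the "" and the escapes.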

--
Don't work too hard, make some time for fun as well!

Eric Blake             address@hidden

