Re: quotearg improvements [was: filenames in error messages]

From: Eric Blake
Subject: Re: quotearg improvements [was: filenames in error messages]
Date: Wed, 13 Feb 2008 18:12:50 -0700

According to Karl Berry on 2/13/2008 5:45 PM:
|     the "c" quoting style now outputs "\"?\"\"?/\""
|     ("?""?/") rather than "\"?\\?/\"" ("?\?/"),
| Sorry, I'm not following this.  What's the original filename?

Consider the original filename of `dir??/file'.  Before my patch, the
c_quoting_style converted it to `"dir?\?/file"', since `??/' is a trigraph
for `\', but that is not a valid C string.  Right now, the output is
`"dir?""?/file"', i.e. two concatenated C strings, so that a C parser
would unambiguously recognize the quoted output, even if it is parsing

|     this assumes that C string concatenation is acceptable in that style
| Then we'll have to say that.  I did not imagine that it would be
| necessary.  Indeed, it seems problematic to me, it means a parsing
| program has to recognize whether the character after the first string
| constant is another string constant or (I guess) a :.  That seems like
| nontrivial complexity to be adding.
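
To make that parsing cost concrete, here is a minimal sketch in C of what a
consumer of such output might have to do: keep reading adjacent string
literals until the next character is something other than a double quote
(for instance the `:' separator).  The name parse_quoted_field_sketch is
hypothetical and not part of quotearg, and as a simplifying assumption it
decodes only the \" and \\ escapes and rejects any other escape.

```c
#include <stddef.h>

/* Hypothetical helper, not part of quotearg.  Decode one or more
   adjacent C string literals at the start of LINE into OUT (capacity
   OUTSIZE).  Only the \" and \\ escapes are understood in this sketch.
   Return a pointer to the first character after the last literal, or
   NULL on malformed input or if OUT is too small.  */
const char *
parse_quoted_field_sketch (const char *line, char *out, size_t outsize)
{
  size_t n = 0;
  const char *p = line;

  if (outsize == 0 || *p != '"')
    return NULL;

  while (*p == '"')             /* another literal follows: keep going */
    {
      p++;                      /* skip the opening quote */
      while (*p != '"')
        {
          if (*p == '\\')
            {
              p++;              /* look at the escaped character */
              if (*p != '"' && *p != '\\')
                return NULL;    /* unsupported escape or truncated input */
            }
          if (*p == '\0' || n + 1 >= outsize)
            return NULL;        /* unterminated literal or no room left */
          out[n++] = *p++;
        }
      p++;                      /* skip the closing quote */
    }

  out[n] = '\0';
  return p;
}
```

Given a line that starts with two adjacent literals followed by
`:12: message', the sketch yields the single decoded name and returns a
pointer to the `:'.  The extra work compared to a single literal is the
outer loop.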

Maybe it's worth adding another flag to the quotearg module: when off (the
default), output trigraphs as-is without any extra quoting (since trigraphs
default to off in gcc); when on, output concatenated C strings wherever
the output would otherwise contain a trigraph.
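
As a rough illustration of that flag, here is a minimal sketch in C.  The
function quote_c_sketch and its split_trigraphs parameter are hypothetical
names chosen for this example, not the quotearg interface, and the sketch
escapes only `"' and `\' so that the trigraph handling stays visible.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of quotearg.  Write a C-style quoted
   copy of SRC into DST (capacity DSTSIZE).  When SPLIT_TRIGRAPHS is
   true, two question marks followed by a trigraph-forming character
   are emitted as two adjacent string literals.  Return the number of
   bytes written, or (size_t) -1 if DST is too small.  */
size_t
quote_c_sketch (char *dst, size_t dstsize, const char *src,
                bool split_trigraphs)
{
  size_t n = 0;
#define PUT(ch) \
  do { if (n + 1 >= dstsize) return (size_t) -1; dst[n++] = (ch); } while (0)

  PUT ('"');
  for (size_t i = 0; src[i] != '\0'; i++)
    {
      char c = src[i];
      if (c == '"' || c == '\\')
        PUT ('\\');             /* backslash-escape quote and backslash */
      else if (split_trigraphs && c == '?' && src[i + 1] == '?'
               && src[i + 2] != '\0'
               && strchr ("=(/)'<!>-", src[i + 2]) != NULL)
        {
          /* Close the literal after the first '?' and reopen it
             before the second, so no trigraph appears in the output.  */
          PUT ('?');
          PUT ('"');
          PUT ('"');
          i++;                  /* the second '?' is emitted below */
        }
      PUT (c);
    }
  PUT ('"');
  dst[n] = '\0';
#undef PUT
  return n;
}
```

With the flag off the name is copied through unchanged between the
quotes; with it on, the `dir??/file' example comes out as two adjacent
literals.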

|     #include "quotearg.h"
|     ...
|     set_quoting_style (NULL, c_maybe_quoting_style);
|     quotearg_colon (string);
| Excellent.
| Can we add something to the .texi about this?

I'll try to spend some time on this.

| Meanwhile, I had sent a proposed simple change to rms for standards.texi
| about this.  No problem with the principle, but he wants to specify the
| exact list of troublesome characters and one escape to use for each, not
| just say "like C string constants".
| I suppose we could always use \OOO, but somehow using \n and the like
| seems like it would be much more readable.  So it'll take me a little
| time to work up that list.  And I'm not sure what effect this new
| wrinkle will have on your code, sorry.

For C strings, the code already outputs \a, \b, \f, \n, \r, \t, \v, \\,
\"; and for all other non-printable characters, a 3-digit \nnn octal
(except for NUL, which is abbreviated to \0 if the next character output
is not a digit, but never a 2-digit octal).  But now you've made me worry
whether we should also quote shell characters.  For example, should it be:

program:question mark?:line: message
program:"question mark?":line: message

since both ' ' and '?' are special to the shell, but not to C strings?
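
For reference, the C-string escaping rules listed above (named escapes,
3-digit octal for other non-printable characters, and the \0 abbreviation
for NUL) can be sketched roughly as follows.  The name escape_c_sketch is
hypothetical and this is not the quotearg code; in particular it assumes
plain ASCII and treats every byte below 0x20 or at or above 0x7f as
non-printable, which ignores locales and multibyte characters.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of quotearg.  Escape the LEN bytes at
   SRC into DST (capacity DSTSIZE), without surrounding quotes.
   Assumes ASCII.  Return the number of bytes written, or (size_t) -1
   if DST is too small.  */
size_t
escape_c_sketch (char *dst, size_t dstsize, const char *src, size_t len)
{
  size_t n = 0;

  if (dstsize == 0)
    return (size_t) -1;

  for (size_t i = 0; i < len; i++)
    {
      unsigned char c = (unsigned char) src[i];
      char esc = 0;
      char tmp[5];              /* backslash + 3 octal digits + NUL */

      switch (c)
        {
        case '\a': esc = 'a'; break;
        case '\b': esc = 'b'; break;
        case '\f': esc = 'f'; break;
        case '\n': esc = 'n'; break;
        case '\r': esc = 'r'; break;
        case '\t': esc = 't'; break;
        case '\v': esc = 'v'; break;
        case '\\': esc = '\\'; break;
        case '"':  esc = '"'; break;
        }

      if (esc)
        snprintf (tmp, sizeof tmp, "\\%c", esc);
      else if (c == '\0'
               && !(i + 1 < len && src[i + 1] >= '0' && src[i + 1] <= '9'))
        snprintf (tmp, sizeof tmp, "\\0");     /* short form is unambiguous */
      else if (c < 0x20 || c >= 0x7f)
        snprintf (tmp, sizeof tmp, "\\%03o", (unsigned int) c);
      else
        snprintf (tmp, sizeof tmp, "%c", c);

      for (const char *p = tmp; *p != '\0'; p++)
        {
          if (n + 1 >= dstsize)
            return (size_t) -1;
          dst[n++] = *p;
        }
    }

  dst[n] = '\0';
  return n;
}
```

The length is passed explicitly because the input may contain NUL bytes.
A NUL followed by a digit falls through to the 3-digit octal form, so the
digit cannot be misread as part of the escape.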

