bug-gnulib

Re: special characters in filenames in error messages


From: Bruno Haible
Subject: Re: special characters in filenames in error messages
Date: Wed, 3 Dec 2008 13:50:20 +0100
User-agent: KMail/1.9.9

Hello Karl,

> new proposal is to use octal escapes \ooo (and
> nothing else).  Specifically:
> 
> 1) if the first character of the source is a ", the source name extends
>    to the next ".  (Including :'s, for example.)
> 2) within that "...", \ooo is recognized.  For instance, \042=" and \134=\,
>    so this weird filename (5 chars):
> a:"\b
>    would be output in an error message as:
> "a:\042\134b":10:some message
> 
> This should be about as easy to parse as it can be, and it follows rms's
> "one character one specification" rule.
> 
> The \ooo convention could be used for any character, but in an 8-bit
> clean environment, I believe it is only *required* for " and \.
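
For reference, here is a minimal Python sketch of the quoted \ooo rule.
The function names quote_octal and split_octal are made up for
illustration, and escaping control characters in addition to " and \ is
my assumption, not part of the proposal:

```python
import re

def quote_octal(name: str) -> str:
    """Quote a filename in "..." using \\ooo octal escapes."""
    out = []
    for ch in name:
        # " and \ must be escaped; control characters are escaped too
        # (an assumption, so that a message stays on one line).
        if ch in '"\\' or ord(ch) < 0x20:
            out.append("\\%03o" % ord(ch))
        else:
            out.append(ch)
    return '"' + "".join(out) + '"'

def split_octal(line: str):
    """Split one message line into (filename, rest of line)."""
    if line.startswith('"'):
        end = line.index('"', 1)  # the name extends to the next "
        raw = line[1:end]
        name = re.sub(r"\\([0-7]{3})",
                      lambda m: chr(int(m.group(1), 8)), raw)
        return name, line[end + 1:]
    name, sep, rest = line.partition(":")
    return name, sep + rest
```

With the 5-character example above, quote_octal returns
"a:\042\134b", and split_octal recovers the original name from the
full message line.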

1) For which filenames should the escaped syntax be used and for which not?

   From the syntax you gave, and assuming a program that parses line by line,
   it is *required* only if the filename

     - contains a ':' or newline, or
     - starts with a '"'.

   Is that the intent?

   Or is the intent that it be used for a wider class of frequently used
   filenames, such as
     - filenames that contain non-ASCII characters? (I hope not, this would
       punish people who use their native script for filenames.)
     - filenames that contain a space? (I hope not, since these are very
       frequent.)

2) Native Windows filenames (which occur in mingw ports of GNU programs)
   often contain a colon and backslashes: 'C:\SUBDIR\0049\FOOBAR.PNG'
   It would be useful, in my opinion, to use a syntax that preserves the
   human-readability and copy&paste-ability of such filenames.

   I.e. I would like to see the above filename escaped as
   "C:\SUBDIR\0049\FOOBAR.PNG". But this is incompatible with the use
   of \ooo as an escape sequence.
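
   To illustrate the incompatibility, here is a small Python sketch.
   The helper decode_octal is hypothetical and merely stands for any
   parser that recognizes backslash followed by three octal digits:

```python
import re

def decode_octal(s: str) -> str:
    """Replace every backslash + three octal digits by that character."""
    return re.sub(r"\\([0-7]{3})",
                  lambda m: chr(int(m.group(1), 8)), s)

path = r"C:\SUBDIR\0049\FOOBAR.PNG"
decoded = decode_octal(path)

# The parser takes "\004" as an escape for the control character with
# value 4, so the directory name "0049" is silently corrupted.
corrupted = "C:\\SUBDIR" + "\x04" + "9\\FOOBAR.PNG"
```

   Here decoded equals corrupted rather than path, so the filename
   could only be written with every backslash replaced by \134, which
   is no longer readable or copy&paste-able.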

   Since the use of % in filenames is less frequent than the backslash
   (on Windows), and since there is already an RFC standard for how to
   escape URIs (<http://www.ietf.org/rfc/rfc2396.txt>), I would vote
   for using the %nn syntax.

   So my proposal is:

   - For parsing:
     - If the first character is a '"', then the escaped syntax is
       in use. The filename is enclosed in "..."; inside,
         - occurrences of '"' and '%' are escaped as %22 and %25,
           respectively,
         - other ASCII characters may be escaped in %nn syntax as well,
           where nn is the hexadecimal notation (case-insensitive)
           of the byte value in the ASCII encoding.
     - Otherwise, the filename ends at the first ':' or end of line.

   - For output:
     The escaped syntax is required if the filename contains a ':' or
     newline, or starts with a '"'. It may also be used for other
     filenames.
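
   A minimal Python sketch of these rules follows. The names
   needs_quoting, quote_percent and split_message are made up for
   illustration. Escaping the newline inside the quotes is my
   assumption, made so that a line-by-line parser still works:

```python
import re

def needs_quoting(name: str) -> bool:
    """Output rule: escaped syntax is required in these cases."""
    return ":" in name or "\n" in name or name.startswith('"')

def quote_percent(name: str) -> str:
    """Return the filename as it would appear in an error message."""
    if not needs_quoting(name):
        return name
    out = []
    for ch in name:
        # '"' -> %22, '%' -> %25; newline -> %0A is an assumption.
        if ch in '"%\n':
            out.append("%%%02X" % ord(ch))
        else:
            out.append(ch)
    return '"' + "".join(out) + '"'

def split_message(line: str):
    """Parsing rule: split one message line into (filename, rest)."""
    if line.startswith('"'):
        end = line.index('"', 1)  # raises ValueError if unterminated
        name = re.sub(r"%([0-9A-Fa-f]{2})",
                      lambda m: chr(int(m.group(1), 16)), line[1:end])
        return name, line[end + 1:]
    name, sep, rest = line.partition(":")
    return name, sep + rest
```

   With this sketch the Windows filename above is output as
   "C:\SUBDIR\0049\FOOBAR.PNG", unchanged apart from the surrounding
   quotes, while a name without ':' such as foo.c is output unquoted.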

Bruno



