bug-gnulib

address@hidden: Re: quotation characters]


From: Karl Berry
Subject: address@hidden: Re: quotation characters]
Date: Thu, 23 Jun 2005 20:21:33 -0400

Here are rms's comments on our draft about the quote character stuff.

1) I don't know how to address his problem with "domain at hand", please help?

2) I hope that if I point out the "preferably", and that gcc is using
   '...', and that the rest of the world thinks ' is the standard, he
   will let the text pass.  Are there other arguments that might
   persuade him?

3) I deleted the sentence.  Draft appended.

Thanks,
k


Date: Sun, 12 Jun 2005 15:57:57 -0400
From: Richard Stallman <address@hidden>
To: address@hidden (Karl Berry)
Subject: Re: quotation characters

    Sticking to the ASCII character set (plain text, 7-bit characters) is
    preferred in GNU source code comments, text documents, and other
    contexts, unless there is good reason to do something else because of
    the domain at hand.

I am not sure what "the domain at hand" means.  Please look for some
other way to say whatever it is.

    In the C locale, GNU programs should stick to plain ASCII for
    quotation characters in messages to users: preferably 0x60 (`) for
    left quotes and 0x27 (') for right quotes.  If using ` is unacceptable
    in your application, other possibilities are using ' for both opening
    and closing, or 0x22 (") for both opening and closing.  It is ok, but
    not required, to use locale-specific quotes in other locales.

    The @uref{http://www.gnu.org/software/gnulib/, Gnulib} @code{quote}
    and @code{quotearg} modules provide a reasonably straightforward way
    to support locale-specific quote characters, as well as taking care of
    other issues, such as quoting a filename that itself contains a quote
    character.  See the Gnulib documentation for usage details.

I thought we were going to tell people to use `quote',
not just mention it as a possibility.

      Latin1 does have paired standalone accents, but it seems
    wrong in principle to abuse them as quotes.

We should not say this is a matter of principle.
It is purely a practical matter.

--

@node Character set
@section Character set
@cindex character set
@cindex encodings
@cindex ASCII characters
@cindex non-ASCII characters

Sticking to the ASCII character set (plain text, 7-bit characters) is
preferred in GNU source code comments, text documents, and other
contexts, unless there is good reason to do something else because of
the domain at hand.

If you need to use non-ASCII characters, for example to represent
names of contributors, you should normally stick with one encoding, as
one cannot in general mix encodings reliably.  


@node Quote characters
@section Quote characters
@cindex quote characters

In the C locale, GNU programs should stick to plain ASCII for
quotation characters in messages to users: preferably 0x60 (`) for
left quotes and 0x27 (') for right quotes.  If using ` is unacceptable
in your application, other possibilities are using ' for both opening
and closing, or 0x22 (") for both opening and closing.  It is ok, but
not required, to use locale-specific quotes in other locales.

The @uref{http://www.gnu.org/software/gnulib/, Gnulib} @code{quote}
and @code{quotearg} modules provide a reasonably straightforward way
to support locale-specific quote characters, as well as taking care of
other issues, such as quoting a filename that itself contains a quote
character.  See the Gnulib documentation for usage details.

In any case, the documentation for your program should clearly specify
how it does quoting, if it differs from the preferred method of ` and
'.  This is especially important if the output of your program is ever
likely to be parsed by another program.

Quotation characters are a difficult area in the computing world at this
time: there are no true left or right quote characters in ASCII, or even
Latin1; the ` character we use was standardized as a grave accent.
Moreover, Latin1 is still not universally usable.

Unicode contains the unambiguous quote characters required, and its
common encoding UTF-8 is upward compatible with ASCII@.  However,
Unicode and UTF-8 are not universally well-supported, either. 

This may change over the next few years, and then we will revisit
this.
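
For illustration only, the three ASCII conventions named in the "Quote
characters" draft above could be sketched in C as follows.  This is a
minimal sketch, not the Gnulib @code{quote} or @code{quotearg}
interface: the names quote_style and quote_ascii are hypothetical, and
the helper does not escape quote characters that occur inside the text
or consult the locale.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names for the three ASCII conventions in the draft. */
enum quote_style {
    QUOTE_GRAVE_APOSTROPHE, /* `text'  (0x60 ... 0x27), the preferred form */
    QUOTE_APOSTROPHE,       /* 'text'  (0x27 on both sides) */
    QUOTE_DOUBLE            /* "text"  (0x22 on both sides) */
};

/* Write TEXT wrapped in the chosen ASCII quotes into BUF (SIZE bytes).
   Return 0 on success, -1 if the result would not fit.
   Quote characters inside TEXT are copied through unchanged. */
static int quote_ascii(char *buf, size_t size, const char *text,
                       enum quote_style style)
{
    char open = '`', close = '\'';
    int n;

    assert(buf != NULL && text != NULL);
    if (style == QUOTE_APOSTROPHE)
        open = '\'';
    else if (style == QUOTE_DOUBLE)
        open = close = '"';

    n = snprintf(buf, size, "%c%s%c", open, text, close);
    return (n >= 0 && (size_t) n < size) ? 0 : -1;
}
```

A caller would pass a buffer and one of the styles, for example
quote_ascii (buf, sizeof buf, name, QUOTE_GRAVE_APOSTROPHE), and check
the return value before printing the message.  A program that wants
locale-specific quotes or safe handling of embedded quote characters
would use the Gnulib modules instead of a helper like this.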



