bug-gettext

Re: Question from Austin Group regarding standardization of msgfmt


From: Bruno Haible
Subject: Re: Question from Austin Group regarding standardization of msgfmt
Date: Sun, 16 Jan 2022 21:37:43 +0100

Hi,

Eric Blake wrote:
> The Austin Group (the standards body in charge of the POSIX document)
> is trying to standardize the gettext(3) family of functions, as well
> as command line tools such as gettext(1) and xgettext(1).  You can
> track the efforts here, and if you have comments, I'm happy to relay
> them back to the Austin Group:
> 
> https://posix.rhansen.org/p/gettext_draft

Thanks for this info.

> At the moment, there is a particular question about GNU msgfmt(1)
> behavior.  The Austin Group has noted the current documented
> behaviors, first with GNU xgettext

You mean GNU msgfmt. GNU xgettext has an option '-c' too, but it
has a completely different meaning.

> having two separate options, -c
> and -v, which are currently orthogonal:
> 
>        -c, --check
>               perform    all    the    checks   implied   by   --check-format,
>               --check-header, --check-domain
> 
>        -v, --verbose
>               increase verbosity level
> 
> and contrasting with Solaris msgfmt
> (https://docs.oracle.com/cd/E36784_01/html/E36870/msgfmt-1.html),
> which has no -c, but documents:
> 
> -v
> --verbose
> 
>     Verbose. Lists duplicate message identifiers if Solaris message
>     catalog files are processed. Message strings are not redefined.
> 
>     If GNU-compatible message files are processed, this option detects
>     and diagnoses input file anomalies which might represent
>     translation errors. The msgid and msgstr strings are studied and
>     compared. It is considered abnormal if one string starts or ends
>     with a newline while the other does not. Also, if the string
>     represents a format string used in a printf-like function, both
>     strings should have the same number of % format specifiers, with
>     matching types. If the flag c-format appears in the special
>     comment '#' for this entry, a check is performed.
> 
> The question on the floor is whether GNU msgfmt would consider
> tweaking behavior so that -v implies -c (that is, turning on verbosity
> now also turns on format checking), so that there is one less option
> letter to standardize, and so that users can just rely on 'msgfmt -v'
> for message checking regardless of GNU or Solaris implementation.
> 
> Or put another way, the Austin Group would like to standardize only:
> 
>      -v    Verbose. If this option is specified, msgfmt shall detect and
>     diagnose input file abnormalities which might represent
>     translation errors. The msgid and msgstr strings shall be
>     compared. It shall be considered abnormal if one string starts or
>     ends with a <newline> while the other does not.  Also, if the flag
>     c-format appears in a "#," comment for this entry, it shall be
>     considered abnormal if the strings do not have the same number of
>     '%' conversion specifiers, or if corresponding conversion
>     specifiers take different argument types (see [xref to
>     fprintf()]). If an abnormality is detected, the exit status shall
>     be non-zero and a diagnostic message shall be output.
> 
> which would still leave -c as a GNU extension, but give users the
> ability to get format checking across both implementations with just
> -v.
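
For reference, the newline rule in the proposed text can be sketched as
below. This is only an illustration of the rule as worded above, not code
from GNU or Solaris msgfmt; the function name is hypothetical, and skipping
untranslated entries (empty msgstr) is an assumption of the sketch.

```python
def newline_mismatch(msgid: str, msgstr: str) -> bool:
    """Return True if one string starts or ends with a newline
    while the other does not."""
    # Assumption of this sketch: an empty msgstr means "untranslated",
    # so there is nothing to compare.
    if not msgstr:
        return False
    starts_differ = msgid.startswith("\n") != msgstr.startswith("\n")
    ends_differ = msgid.endswith("\n") != msgstr.endswith("\n")
    return starts_differ or ends_differ
```

With this rule, msgid "Hello\n" with msgstr "Hallo" counts as abnormal,
while "Hello\n" with "Hallo\n" does not.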

I object, for three reasons:

OBJECTION 1:
   The text that you propose is incompatible with *both* GNU msgfmt
   and Solaris msgfmt.
   Namely,
     - In GNU msgfmt, the option '-v' increases verbosity without diagnosing
       abnormalities, and does *not* have an effect on the exit status.
     - In Solaris msgfmt, the option '-v' increases verbosity through
       diagnostics of abnormalities, and for most such abnormalities does
       *not* have an effect on the exit status either. Only for duplicate
       msgids does it have an effect on the exit status.

But of course, it is good to realize that presenting error-like diagnostics
with no influence on the exit status is not useful in practice. In fact,
both
  - desktop translation tools and
  - web-based translation services
use "msgfmt -c" to test whether the PO file is ready to submit/accept, by
looking at the exit code of this command.
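
A minimal sketch of that usage, assuming the tool simply runs
"msgfmt -c" and accepts the PO file only when the exit code is 0. The
function name and the injectable 'run' parameter are hypothetical and are
there only so the sketch can be exercised without gettext installed.

```python
import subprocess


def po_file_is_acceptable(po_path, run=subprocess.run):
    """Return True if 'msgfmt -c' exits with status 0 for po_path."""
    # -c enables the checks; -o /dev/null discards the compiled .mo file,
    # since only the exit status matters here.
    result = run(["msgfmt", "-c", "-o", "/dev/null", po_path])
    return result.returncode == 0
```

Note that the decision is based on the exit code alone; any diagnostics
that msgfmt prints are for the translator to read, not for the tool to
parse.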

OBJECTION 2:
  Not introducing a '-c' option is pointless, because (as just said)
  this is the main option for checking the validity / soundness of a PO file.
  It is widely used in practice. Use of 'msgfmt' without the option '-c'
  is neither useful nor frequent, because who wants a .mo file that is
  able to crash the application that opens and uses it?

  Suggestion: Add a '-c' option. Describe it in abstract terms. Don't
  describe it as "perform all the checks implied by --check-format,
  --check-header, --check-domain", because we want to be able to add
  different kinds of checks in the future (such as checks on accelerators).

OBJECTION 3:
  Making a '-v' option change the exit status of a utility would be a
  deviation from current practice for existing POSIX utilities.
  The POSIX utilities that have a '-v' option that increases verbosity are:
    https://pubs.opengroup.org/onlinepubs/9699919799/utilities/ar.html
    https://pubs.opengroup.org/onlinepubs/9699919799/utilities/command.html
    https://pubs.opengroup.org/onlinepubs/9699919799/utilities/compress.html
    https://pubs.opengroup.org/onlinepubs/9699919799/utilities/lex.html
    https://pubs.opengroup.org/onlinepubs/9699919799/utilities/od.html
    https://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html
    https://pubs.opengroup.org/onlinepubs/9699919799/utilities/sh.html
    https://pubs.opengroup.org/onlinepubs/9699919799/utilities/uncompress.html
    https://pubs.opengroup.org/onlinepubs/9699919799/utilities/yacc.html
  For none of them does the '-v' option have an effect other than to
  produce more or different output.

In summary, the best course of action is to have two orthogonal options:
  '-v', which increases the verbosity,
  '-c', which diagnoses abnormalities and modifies the exit code accordingly.
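
As a toy model of what "orthogonal" means here (this is not how any
msgfmt implementation is structured; the function, its parameters and the
output strings are all hypothetical): '-v' only ever adds output, and only
'-c' can make the exit status non-zero.

```python
def model_msgfmt(num_messages, abnormalities, verbose=False, check=False):
    """Toy model of the two options; returns (output_lines, exit_status)."""
    output = []
    exit_status = 0
    if verbose:
        # -v: more output, never a different exit status.
        output.append(f"{num_messages} messages processed")
    if check:
        # -c: report abnormalities and fail if there are any.
        output.extend(abnormalities)
        if abnormalities:
            exit_status = 1
    return output, exit_status
```

In this model, the same input with an abnormality yields exit status 0
under '-v' alone and exit status 1 under '-c', with or without '-v'.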

Bruno





