bug-gnulib

Re: Fix wrong character count in argp


From: Bruno Haible
Subject: Re: Fix wrong character count in argp
Date: Sun, 12 Feb 2012 22:59:06 +0100
User-agent: KMail/4.7.4 (Linux/3.1.0-1.2-desktop; KDE/4.7.4; x86_64; ; )

Hi Vladimir,

Thank you for the proposed patch.

> As already reported several years ago

I cannot find it in my archives. Maybe that discussion already contained
some useful thoughts or arguments? Can you please point me to it?

> argp counts bytes even when
> actually what matters is the display length. This patch improves the
> situation by counting only leading and standalone UTF-8 bytes. It
> doesn't handle the double-width characters like Chinese sinograms

A program that needs to consider display length - for example for
line wrapping - should
  1) work with any locale encoding. Don't assume that the locale encoding
     is UTF-8.
  2) work with Chinese ideographs correctly, like it should also work
     with Russian (single-width) letters.

The easiest way to satisfy these two requirements is to base the code on
either
  * the function mbswidth (gnulib module mbswidth) and possibly also mbiter
    or mbuiter, or
  * the gnulib module unilbrk/ulc-width-linebreaks, which contains a complete
    line-breaking algorithm.
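
As a rough illustration of what "display length in any locale encoding" means, here is a minimal sketch in plain POSIX C. It is not the gnulib mbswidth implementation and not the proposed argp change. The helper name display_width is hypothetical, and counting invalid bytes and nonprintable characters as one column each is a simplifying assumption of this sketch. It decodes the string with mbrtowc according to the current locale and sums the wcwidth of each character, so single-width and double-width characters are both handled.

```c
#define _XOPEN_SOURCE 700   /* request the declaration of wcwidth() */
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not part of gnulib or argp):
   return the number of screen columns needed to display the
   NUL-terminated multibyte string S, decoded according to the
   current locale's encoding.
   Assumption of this sketch: an invalid or incomplete byte sequence,
   or a character for which wcwidth() reports no width, counts as
   one column.  */
static int
display_width (const char *s)
{
  assert (s != NULL);

  mbstate_t state;
  memset (&state, 0, sizeof state);
  size_t remaining = strlen (s);
  int width = 0;

  while (remaining > 0)
    {
      wchar_t wc;
      size_t n = mbrtowc (&wc, s, remaining, &state);

      if (n == (size_t) -1 || n == (size_t) -2)
        {
          /* Invalid or truncated sequence: count the byte as one
             column, reset the conversion state, and resynchronize
             at the next byte.  */
          memset (&state, 0, sizeof state);
          width += 1;
          s += 1;
          remaining -= 1;
          continue;
        }
      if (n == 0)
        break;                  /* embedded NUL; cannot occur here */

      int w = wcwidth (wc);     /* 0, 1 or 2 columns; -1 if nonprintable */
      width += (w >= 0 ? w : 1);
      s += n;
      remaining -= n;
    }
  return width;
}
```

A caller would first select the user's locale with setlocale (LC_ALL, "") and then use display_width (text) wherever the code currently uses a byte count for alignment or wrapping. In a UTF-8 locale with the usual width tables, Cyrillic letters should then count as one column each and Chinese ideographs as two, while pure ASCII text gives the same result as strlen.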

Can you rewrite your patch to this effect?

Also, such tricky issues should be checked in the test suite. Can you
please also provide a test program, some input data, and the expected
output for this data? We can then turn it into a gnulib test.

Bruno



