bug#63029: [BUG?] format inconsistency in deciding string widths on diff


From: Eli Zaretskii
Subject: bug#63029: [BUG?] format inconsistency in deciding string widths on different locales
Date: Sun, 23 Apr 2023 17:19:10 +0300

> Date: Sun, 23 Apr 2023 18:23:02 +0800
> From:  Ruijie Yu via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
> 
> I don't quite know yet whether this is a bug in Emacs.  Here are the
> observed results, and note the unicode character:
> 
> --8<---------------cut here---------------start------------->8---
> $ for locale in {en_US,fr_FR,de_DE,zh_CN,ja_JA}.UTF-8; do
>     printf "$locale\t"
>     LANG="$locale" src/emacs -Q -batch \
>                    -eval '(message "%S" (format "%-5.5s" "1234…"))'
> done
> --8<---------------cut here---------------end--------------->8---
> 
> This results in the following output:
> 
> --8<---------------cut here---------------start------------->8---
> en_US.UTF-8   "1234…"
> fr_FR.UTF-8   "1234…"
> de_DE.UTF-8   "1234…"
> zh_CN.UTF-8   "1234 "
> ja_JA.UTF-8   "1234 "
> --8<---------------cut here---------------end--------------->8---
> 
> Notice that in zh_CN and ja_JA, we have a space instead of the expected
> ellipsis character.
> 
> 
> If this is expected behavior, how do we know how "wide" the `format'
> function thinks any given character is?  In other words, why _does_ it
> think "…" should be two-character wide?

This is a kludgey feature: in CJK locales some characters are always
considered double-width.  See code in characters.el that begins with a
comment around line 1140.  The function use-cjk-char-width-table
defined there is invoked (via the setup-function of the language
environment) when the language environment in Emacs is set to one of
those CJK locales.
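
A minimal sketch of how to observe this from a non-CJK session,
assuming the "Chinese-GB" language environment (a name not mentioned
above, picked only for illustration) installs the CJK width table
through its setup-function.  Evaluate it in "emacs -Q -batch" or in a
scratch session, since it changes the language environment:

--8<---------------cut here---------------start------------->8---
;; Sketch only: compare the width of U+2026 before and after
;; switching to a CJK language environment.
(let ((before (char-width ?…)))
  ;; Assumption: the setup-function of "Chinese-GB" calls
  ;; use-cjk-char-width-table, as described above for CJK locales.
  (set-language-environment "Chinese-GB")
  (message "before: %d  after: %d" before (char-width ?…))
  ;; Same form as in the original report.
  (message "%S" (format "%-5.5s" "1234…")))
--8<---------------cut here---------------end--------------->8---

If the assumption holds, the second width should differ from the
first, and the formatted string should match the zh_CN result quoted
above.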

The reason for this is that in CJK fonts these characters are supposed
to be rendered using full-width glyphs.

See also bug#54138 and
https://lists.gnu.org/archive/html/emacs-devel/2022-02/msg00917.html.

> And how do we, the elisp users, get this information?

I don't understand this question.  Please elaborate: what information
do you want to get, besides the width of the characters (which is
accessible via char-width-table)?
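
For illustration, a small sketch of querying those widths from Lisp.
The helper name my-show-widths is hypothetical; char-width,
string-width and char-width-table are the existing facilities, and
the values they report depend on the current language environment:

--8<---------------cut here---------------start------------->8---
(defun my-show-widths (string)
  "Return a list of (CHAR . WIDTH) pairs for STRING.
Hypothetical helper, for illustration only."
  (mapcar (lambda (c) (cons c (char-width c))) string))

;; Per-character widths, as columns on display.
(message "%S" (my-show-widths "1234…"))
;; Total width of the whole string.
(message "%d" (string-width "1234…"))
;; Raw entry in the char-table itself.
(message "%S" (aref char-width-table ?…))
--8<---------------cut here---------------end--------------->8---

Running this under different values of LANG, as in the shell loop
quoted above, should show which characters are treated as
double-width in each locale.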




