bug-gnu-emacs

bug#20258: 24.5; format-time-string miscounting of multibyte characters


From: Lars Ingebrigtsen
Subject: bug#20258: 24.5; format-time-string miscounting of multibyte characters
Date: Mon, 30 Sep 2019 05:09:08 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)

Stefan Kangas <stefan@marxist.se> writes:

>>> As the subject says, format-time-string miscounts multibyte characters.
>>> Simple example with nb_NO.utf8 locale, where ø is two bytes:
>>>
>>> (format-time-string "%6a" (date-to-time "Sat Apr  4 16:14:40 2015"))
>>> "  lø."
>>>
>>> (length (format-time-string "%6a" (date-to-time "Sat Apr  4 16:14:40 2015")))
>>> 5
>>
>> 'length' counts characters, not bytes.  If you need to count bytes,
>> use 'string-bytes' instead:
>>
>>   (string-bytes "  lø.") => 6
>
> I can see no bug here, only a misunderstanding about the length
> function.  I'm therefore closing this bug.  If that's incorrect, please
> reopen this bug report.

But the issue here is that "%6a" should give you a string that's six
characters long, I think?  Admittedly the doc string is vague here:

---
A field width N is an unsigned decimal integer with a leading digit nonzero.
%NX is like %X, but takes up at least N positions.
---

But the natural interpretation of "positions" isn't bytes, I think, and
if it is, then the doc string should say so.

(let ((system-time-locale "nb_NO.UTF-8"))
  (format-time-string "%6a" (date-to-time "Sat Apr  4 16:14:40 2015")))
=> "  lø."

(if you have that locale in /etc/locale.gen.)
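
To make the mismatch concrete, here is a small sketch that measures that
result three ways.  The literal is just the string from above typed in
directly, so no locale is needed to evaluate it:

```elisp
(let ((s "  lø."))
  (list (length s)         ; number of characters
        (string-bytes s)   ; number of bytes in Emacs's internal representation
        (string-width s))) ; number of display columns
;; => (5 6 5)
```

So the result only adds up to six if "positions" are counted in bytes.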

But I seem to remember from previous discussions that this quirk is in
the C strftime function?  And Emacs just calls it?  I haven't checked.
But this means that you can't use format-time-string to line stuff up,
but have to use `format':

(let ((system-time-locale "nb_NO.UTF-8"))
  (format "%6s" (format-time-string
                 "%a" (date-to-time "Sat Apr  4 16:14:40 2015"))))
=> "   lø."
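
If you want the padding to be explicitly in display columns, something
like the following might do.  This is only a sketch: the helper name
`my-pad-left-to-width' is made up for illustration and is not part of
Emacs; it relies only on `string-width' and `make-string'.

```elisp
;; Hypothetical helper, not an Emacs function.
(defun my-pad-left-to-width (string width)
  "Left-pad STRING with spaces until it occupies WIDTH display columns.
Return STRING unchanged if it is already at least that wide."
  (let ((missing (max 0 (- width (string-width string)))))
    (concat (make-string missing ?\s) string)))

;; Format without a field width, then pad the result afterwards.
(let ((system-time-locale "nb_NO.UTF-8"))
  (my-pad-left-to-width
   (format-time-string "%a" (date-to-time "Sat Apr  4 16:14:40 2015"))
   6))
```

With that locale installed, this should line up the same way as the
`format' call above, since the width is measured after the locale's
day name has been produced.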

So I think what WIDTH means should be said explicitly in the doc string.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




