bug-gnu-emacs

bug#48324: 27.2; hexl-mode duplicates the UTF-8 BOM


From: Lars Ingebrigtsen
Subject: bug#48324: 27.2; hexl-mode duplicates the UTF-8 BOM
Date: Sun, 03 Jul 2022 14:07:43 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)

Lars Ingebrigtsen <larsi@gnus.org> writes:

> Hm...  I guess the only reliable solution across all coding systems is
> (like your comment in the code says) to drop the encode-every-char and
> try encoding strings, and then see whether the result is short enough.
> That could be done somewhat efficiently using a binary search.  I'll
> have a go at it...

And while I was at it, I changed it to return complete glyphs, not just
complete code points.
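
For readers following along, the binary-search idea from the quoted
paragraph can be sketched roughly as below.  This is only an
illustration and not the code that was committed: `my-limit-encoded'
is a hypothetical name, it works on characters rather than complete
glyphs, it only handles limiting from the start of the string, and it
assumes the encoded length never shrinks as the prefix grows.

```elisp
;; Hypothetical helper for illustration; not the real `string-limit'.
(defun my-limit-encoded (string max-bytes coding-system)
  "Encode the longest prefix of STRING that fits in MAX-BYTES.
The prefix is encoded as a whole with CODING-SYSTEM, so anything the
coding system prepends (such as a BOM) counts towards the limit.
Assumes the encoded length is non-decreasing in the prefix length."
  (let ((lo 0)                    ; longest prefix length known to fit
        (hi (length string)))     ; longest prefix length still possible
    (while (< lo hi)
      ;; Round the midpoint up so the loop always makes progress.
      (let ((mid (/ (+ lo hi 1) 2)))
        (if (<= (length (encode-coding-string
                         (substring string 0 mid) coding-system))
                max-bytes)
            (setq lo mid)
          (setq hi (1- mid)))))
    (encode-coding-string (substring string 0 lo) coding-system)))
```

Each probe encodes a whole prefix, so the number of encodings grows
with the logarithm of the string length instead of one per character.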

There's a behavioural change, though.  This:

(string-limit "foóá" 6 t 'utf-16)

now returns a string with a BOM, whereas previously it didn't.  I think
that's what callers would want, though (the use case here is really
IRC: you have to limit the maximum encoded length, but I think if
you're talking utf-16, you want the BOM).

But it's debatable.
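
To see how much of the byte budget the BOM takes, the encoded length
can be compared with and without it.  This is a small sketch, assuming
that Emacs's `utf-16' coding system writes a BOM when encoding while
`utf-16be' does not; `my-bom-overhead' is a hypothetical name.

```elisp
;; Hypothetical helper for illustration only.
(defun my-bom-overhead (string)
  "Return how many more bytes STRING needs in `utf-16' than in `utf-16be'.
Assumes `utf-16' prepends a BOM on encoding and `utf-16be' does not."
  (- (length (encode-coding-string string 'utf-16))
     (length (encode-coding-string string 'utf-16be))))

(my-bom-overhead "fo")
```

Whatever that difference is comes out of the limit passed to
`string-limit', so fewer characters fit than with a BOM-less encoding.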

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
