Re: UTF16 encoding adds BOM everywhere?


From: Mark H Weaver
Subject: Re: UTF16 encoding adds BOM everywhere?
Date: Wed, 20 Jul 2022 16:42:53 -0400

Hi,

Jean Abou Samra <jean@abou-samra.fr> wrote:

> With this code:
> 
> (let ((p (open-output-file "x.txt")))
>    (set-port-encoding! p "UTF16")
>    (display "ABC" p)
>    (close-port p))
> 
> the sequence of bytes in the output file x.txt is
> 
> ['FF', 'FE', '41', '0', 'FF', 'FE', '42', '0', 'FF', 'FE', '43', '0']
> 
> FF FE is a little-endian Byte Order Mark (BOM), fine.
> But why is Guile adding it before every character
> instead of just at the beginning of the string?
> Is that expected?

No, this is certainly a bug.  It sounds like the
'at_stream_start_for_bom_write' port flag is not being cleared, as it
should be, after the first character is written.  I suspect that it
worked correctly when I first implemented proper BOM handling in 2013
(commit cdd3d6c9f423d5b95f05193fe3c27d50b56957e9), but the ports code
has seen some major reworking since then.  I guess that BOM handling was
broken somewhere along the way.
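
Until the flag handling is fixed, one possible workaround is to avoid the
automatic BOM altogether. The sketch below assumes the behaviour described in
the Guile manual, namely that BOMs get special treatment only for the
"UTF-8", "UTF-16" and "UTF-32" encodings and not for "UTF-16LE" or
"UTF-16BE". It writes U+FEFF by hand once and then the text. The helper name
`encode-to-bytes` is hypothetical, and an in-memory bytevector port stands in
for the file so that the bytes can be inspected directly.

```scheme
(use-modules (ice-9 binary-ports)   ; open-bytevector-output-port
             (rnrs bytevectors))    ; bytevector->u8-list

;; Hypothetical helper, not part of Guile: write STR to an in-memory port
;; with the given ENCODING and return the resulting bytes as a list of
;; integers.  When EXPLICIT-BOM? is true, U+FEFF is written once, by hand,
;; before the text.
(define (encode-to-bytes str encoding explicit-bom?)
  (call-with-values open-bytevector-output-port
    (lambda (port get-bytes)
      (set-port-encoding! port encoding)
      (when explicit-bom?
        (write-char (integer->char #xFEFF) port))
      (display str port)
      ;; Flush buffered output before collecting the bytes.
      (force-output port)
      (let ((bytes (get-bytes)))
        (close-port port)
        (bytevector->u8-list bytes)))))

;; Workaround: fixed byte order plus one manual BOM.
(define workaround-bytes (encode-to-bytes "ABC" "UTF-16LE" #t))

;; Reproduction: rely on the automatic BOM of plain "UTF-16".
(define automatic-bytes (encode-to-bytes "ABC" "UTF-16" #f))
```

With the assumption above, `workaround-bytes` should contain a single FF FE
followed by 41 00 42 00 43 00, whereas on an affected Guile `automatic-bytes`
would show the BOM repeated before each character, as in the report.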

I would suggest filing a bug report.  I don't have time to look into it,
sorry.  I don't work on Guile anymore.  I only happened to see your
message by chance.

     Regards,
       Mark

-- 
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>.


