[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: UTF16 encoding adds BOM everywhere?

From: Mark H Weaver
Subject: Re: UTF16 encoding adds BOM everywhere?
Date: Wed, 20 Jul 2022 16:42:53 -0400


Jean Abou Samra <> wrote:

> With this code:
> (let ((p (open-output-file "x.txt")))
>    (set-port-encoding! p "UTF16")
>    (display "ABC" p)
>    (close-port p))
> the sequence of bytes in the output file x.txt is
> ['FF', 'FE', '41', '0', 'FF', 'FE', '42', '0', 'FF', 'FE', '43', '0']
> FFE is a little-endian Byte Order Mark (BOM), fine.
> But why is Guile adding it before every character
> instead of just at the beginning of the string?
> Is that expected?

No, this is certainly a bug.  It sounds like the
'at_stream_start_for_bom_write' port flag is not being cleared, as it
should be, after the first character is written.  I suspect that it
worked correctly when I first implemented proper BOM handling in 2013
(commit cdd3d6c9f423d5b95f05193fe3c27d50b56957e9), but the ports code
has seen some major reworking since then.  I guess that BOM handling was
broken somewhere along the way.

I would suggest filing a bug report.  I don't have time to look into it,
sorry.  I don't work on Guile anymore.  I only happened to see your
message by chance.


Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <>.

reply via email to

[Prev in Thread] Current Thread [Next in Thread]