bug#9747: M-x untabify with "ZERO WIDTH NO-BREAK SPACE" (aka "BYTE ORDER MARK")

bug-gnu-emacs

From:	Lars Ingebrigtsen
Subject:	bug#9747: M-x untabify with "ZERO WIDTH NO-BREAK SPACE" (aka "BYTE ORDER MARK")
Date:	Fri, 16 Jul 2021 15:57:52 +0200
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

Juri Linkov <juri@jurta.org> writes:

>> I often use C-x h TAB and M-x untabify to format C, C++, and Java code.
>>
>> If a document has an errant UTF-8 byte order mark (a UTF-8 BOM is EF
>> BB BF), Emacs cannot always format the source file.
>>
>> For example, the attached Java file (JavaEncryptor.java-backup) has
>> 1845 BOMs sprinkled throughout. I'm not sure what editor put them in,
>> but Emacs does not properly handle some operations with them present.
>> If I strip the errant BOMs with the attached program
>> (efbbbf-strip.cpp), Emacs will properly format the file.
>
> "BYTE ORDER MARK" is the old name of the U+FEFF character.
> The new name is "ZERO WIDTH NO-BREAK SPACE".

So I don't think there's anything here to fix on the Emacs side --
zero-width spaces aren't necessarily supposed to be handled identically
to other white space here.  So I'm closing this bug report.
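
For anyone who runs into the same situation, the workaround described in the
original report (strip the stray BOMs, then format) could be sketched roughly
as follows. This is not the attached efbbbf-strip.cpp, which is not reproduced
in this message; the function name and the keep_leading parameter are
hypothetical and only illustrate the idea of removing the EF BB BF byte
sequence from a file's contents.

```python
# Hypothetical helper, not the reporter's efbbbf-strip.cpp.
BOM_UTF8 = b"\xef\xbb\xbf"  # UTF-8 encoding of U+FEFF


def strip_stray_boms(data: bytes, keep_leading: bool = False) -> bytes:
    """Remove every UTF-8 encoded U+FEFF from data.

    If keep_leading is true, a BOM at offset 0 is preserved and only
    the later occurrences are removed.
    """
    if keep_leading and data.startswith(BOM_UTF8):
        rest = data[len(BOM_UTF8):]
        return BOM_UTF8 + rest.replace(BOM_UTF8, b"")
    return data.replace(BOM_UTF8, b"")
```

The sketch assumes the input is valid UTF-8, where the bytes EF BB BF can only
occur as a complete U+FEFF character, so a plain byte replacement should not
damage neighbouring characters. It would be applied to the raw bytes of the
file before opening the result in Emacs.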

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
