bug-gnu-emacs
bug#20623: XML and HTML files with encoding/charset="utf-8" declaration


From: Alain Schneble
Subject: bug#20623: XML and HTML files with encoding/charset="utf-8" declaration loose BOM; Coding system is reset from utf-8-with-signature to utf-8 on save
Date: Wed, 12 Oct 2016 23:44:57 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1.50 (windows-nt)

I'm joining this discussion and would like to report a recipe to
reproduce this issue on Windows:

- emacs -Q
- C-x C-f utf-8-bom-test.xml
- Enter the following text in the new buffer:
<?xml version="1.0" encoding="utf-8"?>
<root></root>
- C-x RET c utf-8-with-signature-dos RET C-x C-s yes RET
- C-x k RET
- C-x C-f utf-8-bom-test.xml
- M-: buffer-file-coding-system
  => utf-8-with-signature-dos
- Change buffer content, e.g. add some text to the root element:
<?xml version="1.0" encoding="utf-8"?>
<root>test</root>
- C-x C-s
- M-: buffer-file-coding-system
  => utf-8-dos
  (expected coding system: utf-8-with-signature-dos)
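
To confirm independently of buffer-file-coding-system that the BOM is
really gone from the file on disk, one can look at the first three bytes
of the file.  Below is a minimal sketch; my-file-has-utf-8-bom-p is a
hypothetical helper name, not something that ships with Emacs.

```elisp
;; Hypothetical helper (not part of Emacs): check the raw bytes on disk.
(defun my-file-has-utf-8-bom-p (file)
  "Return non-nil if FILE begins with the UTF-8 BOM bytes EF BB BF."
  (with-temp-buffer
    ;; Work with raw bytes so that no decoding can hide the BOM.
    (set-buffer-multibyte nil)
    ;; Read at most the first three bytes of FILE.
    (insert-file-contents-literally file nil 0 3)
    ;; Compare the bytes as a list of integers.
    (equal (append (buffer-string) nil) '(#xEF #xBB #xBF))))
```

Evaluating (my-file-has-utf-8-bom-p "utf-8-bom-test.xml") after the
first save and again after the second save should show whether the
signature survived.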

As already mentioned in this thread, simply visiting the file, changing
the buffer, and saving it loses the BOM.  This happens because
select-safe-coding-system (called by choose_write_coding_system) fully
trusts the coding system identified by find-auto-coding.  So far so
good.  The latter eventually runs the functions in
auto-coding-functions, which include the built-in
sgml-xml-auto-coding-function.  I think that function should take some
context into account and enrich the derived coding system with a
signature if needed, similar to how select-safe-coding-system enriches
the coding system with the proper eol-type.
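
To illustrate how close the two coding systems involved are, here is a
small sketch using only standard coding-system accessors.  The variable
names are mine and purely illustrative.

```elisp
;; Illustrative variable names; only the accessor functions are Emacs's.
(defvar my-visited-cs 'utf-8-with-signature-dos
  "Coding system of the buffer after visiting the file.")
(defvar my-saved-cs 'utf-8-dos
  "Coding system of the buffer after saving it.")

;; Both belong to the same family of coding systems ...
(coding-system-type my-visited-cs)      ; => utf-8
(coding-system-type my-saved-cs)        ; => utf-8

;; ... and differ only in their base, i.e. with or without signature.
(coding-system-base my-visited-cs)      ; => utf-8-with-signature
(coding-system-base my-saved-cs)        ; => utf-8

;; The eol-type can be merged into a base coding system like this:
(coding-system-change-eol-conversion 'utf-8 'dos)   ; => utf-8-dos
```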

Does that make sense to you?  If so, I'll try to come up with a patch
that enhances sgml-xml-auto-coding-function to take into account
buffer-file-coding-system (buffer + default value) in case it carries
the same text conversion but a different signature.  The proposed "auto
coding" shall inherit the signature in this case.
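
A rough sketch of the inheritance idea, only to make the proposal
concrete.  This is not a patch: my-inherit-signature is a hypothetical
helper, and treating "same text conversion" as "both coding systems have
type utf-8" is my simplifying assumption.

```elisp
;; Hypothetical helper sketching the proposal; not existing Emacs code.
(defun my-inherit-signature (auto-cs buffer-cs)
  "Return AUTO-CS, taking the signature from BUFFER-CS where applicable.
AUTO-CS is the coding system derived from the encoding declaration,
BUFFER-CS the current `buffer-file-coding-system'."
  (if (and auto-cs buffer-cs
           ;; Assumption: "same text conversion" means both are UTF-8.
           (eq (coding-system-type auto-cs) 'utf-8)
           (eq (coding-system-type buffer-cs) 'utf-8))
      (let ((base (coding-system-base buffer-cs))
            (eol (coding-system-eol-type auto-cs)))
        ;; Keep an explicit eol-type of AUTO-CS; otherwise leave the
        ;; eol-type open for the caller to fill in.
        (if (integerp eol)
            (coding-system-change-eol-conversion base eol)
          base))
    auto-cs))
```

With this, (my-inherit-signature 'utf-8 'utf-8-with-signature-dos) would
yield utf-8-with-signature, while a declaration naming a non-UTF-8
encoding is returned unchanged.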

Thanks for any help.
Alain





