bug-gnu-emacs

bug#15803: default-file-name-coding-system: utf-8 better than latin-1 th


From: Eli Zaretskii
Subject: bug#15803: default-file-name-coding-system: utf-8 better than latin-1 these days?
Date: Fri, 11 Sep 2020 15:24:14 +0300

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: rgm@gnu.org,  15803@debbugs.gnu.org
> Date: Fri, 11 Sep 2020 13:27:28 +0200
> 
> make[1]: Entering directory '/home/larsi/src/emacs/fóo/test'
>   ELC      lisp/eshell/eshell-tests.elc
> foo2: 
> "#(\"/home/larsi/src/emacs/fóo/test/lisp/eshell/eshell-tests.elcnjDFYY\" 0 65 
> (charset iso-8859-1))"
> >>Error occurred processing lisp/eshell/eshell-tests.el: File is missing 
> >>(("Doing chmod" "No such file or directory" 
> >>"/home/larsi/src/emacs/f\303\263o/test/lisp/eshell/eshell-tests.elcnjDFYY"))
> make[1]: *** [Makefile:165: lisp/eshell/eshell-tests.elc] Error 1
> 
> So it's created a tempfile, tagged with the correct charset (I had no
> idea that that's how it worked), but decoded, and then set-file-modes
> interprets that as an UTF-8 file name.
> 
> So...  it's a bug in set-file-modes?  Hm, nope, write-region has the
> same problem.

There be dragons ;-)

The difficulty in debugging these problems is that what you see is
not always what's there, because both Emacs and the display software
that drives your terminal perform their own decoding and encoding.

In particular, strings inside Emacs are always in a UTF-8-compatible
encoding, so the fact that you get UTF-8 in *Messages* doesn't prove
anything.  We need to look for two types of possible problems:

  . raw bytes from Latin-1 encoding inside Emacs buffers or strings
    that are supposed to be decoded
  . UTF-8 encoded (instead of Latin-1 encoded) characters passed to
    libc functions
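
To make the distinction between the two representations concrete,
here is a minimal sketch of how a single Latin-1 byte maps to its
UTF-8 form.  The helper name latin1_byte_to_utf8 is hypothetical and
is not Emacs code; it only illustrates the byte arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Emacs: convert one Latin-1 byte C
   to UTF-8, storing 1 or 2 bytes in OUT.  Returns the number of
   bytes stored.  */
static size_t
latin1_byte_to_utf8 (unsigned char c, unsigned char out[2])
{
  assert (out != NULL);
  if (c < 0x80)
    {
      /* ASCII is identical in Latin-1 and UTF-8.  */
      out[0] = c;
      return 1;
    }
  /* Code points 0x80..0xFF take two bytes in UTF-8:
     110xxxxx 10xxxxxx.  */
  out[0] = (unsigned char) (0xC0 | (c >> 6));
  out[1] = (unsigned char) (0x80 | (c & 0x3F));
  return 2;
}
```

For the byte 0xF3 this yields the pair 0xC3 0xB3, so a file name
holding one form where the other is expected names a different file
as far as libc is concerned.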

So if you found that the problem reveals itself in set-file-modes,
let's see what happens there.  The relevant code is this:

  char *fname = SSDATA (ENCODE_FILE (absname));
  mode_t imode = XFIXNUM (mode) & 07777;
  if (fchmodat (AT_FDCWD, fname, imode, nofollow) != 0)
    report_file_error ("Doing chmod", absname);

Please either run this under GDB, or add printf's, to show the byte
sequences of 'absname' and of 'fname'.  The former should be in UTF-8
(so you should see 0xC3 and 0xB3 for the ó character), the latter
should be in Latin-1 (so you should see 0xF3 for the same letter).
This should give us some hints wrt where to look for the cause of the
problem.
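
If you go the printf route, something along these lines could be
used to show the bytes unambiguously, independent of how the terminal
decodes them.  This is only a sketch: format_bytes_hex is a
hypothetical helper, not an existing Emacs function, and it assumes
the string is NUL-terminated.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical debugging helper, not part of Emacs: format every
   byte of the NUL-terminated string S as two hex digits separated by
   spaces, writing the result into OUT (capacity OUTSIZE).  Stops
   early if OUT is too small.  Returns the number of bytes of S that
   were formatted.  */
static size_t
format_bytes_hex (const char *s, char *out, size_t outsize)
{
  assert (s != NULL && out != NULL && outsize > 0);
  size_t used = 0, n = 0;
  out[0] = '\0';
  for (; s[n] != '\0'; n++)
    {
      /* Each byte needs at most 3 chars (separator + "xx") plus the
         terminating NUL.  */
      if (used + 4 > outsize)
        break;
      used += (size_t) snprintf (out + used, outsize - used, "%s%02x",
                                 n ? " " : "", (unsigned char) s[n]);
    }
  return n;
}
```

In set-file-modes one could then temporarily print the result for
both SSDATA (absname) and fname to stderr, and compare the bytes at
the position of the ó character.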

> That weird file name (decoded and tagged with a charset text parameter)
> comes from make-temp-file -- everything seems to be OK before that.
> target-file is:
> 
> foo: "\"/home/larsi/src/emacs/f\\363o/test/lisp/eshell/eshell-tests.elc\""
> 
> which seems to be correct,

Where does the "foo:" printout come from?  I wouldn't expect to see
Latin-1 encoded strings inside Emacs, not normally anyway.

> but
> 
>                      (tempfile
>                       (make-temp-file (expand-file-name target-file)))
> 
> is
> 
> "#(\"/home/larsi/src/emacs/fóo/test/lisp/eshell/eshell-tests.elcnjDFYY\" 0 65 
> (charset iso-8859-1))"

I see nothing wrong here: this is how decoding works in Emacs.  And
again, how did you produce this string?  As I explained above, the
details of how you display these strings matter in this case.




