--- Begin Message ---
Subject: No coding system used for environment variables
Date: Wed, 05 Mar 2008 00:24:13 +0000
User-agent: Thunderbird 2.0.0.12 (Windows/20080213)
-------- Original Message --------
Subject: No coding system used for environment variables
Date: Thu, 21 Feb 2008 22:40:40 +0100
From: Göran Uddeborg <goeran@uddeborg.se>
To: bug-gnu-emacs@gnu.org
It seems there is no coding system applied to values of environment
variables.
I'm running a system using UTF-8. My locale is sv_SE.utf8, and Emacs
uses UTF-8 as its default most of the time, for example when I open a
new file.
I do have issues with strings coming from environment variables
though. I first discovered this in the vm mail system, since it
misinterpreted the variable MAIL which has the value
/var/spool/mail/göran encoded in UTF-8. (In case your mailer mangles
it, the last file name component is g ö r a n.) But it also
causes problems in various places, for example with functions relating
to the home directory. $HOME is /home/göran (same last component as
before).
As an example, I start emacs in my home directory, and do a few
experiments in the scratch buffer (which has a "u" for coding system
in the mode line):
default-directory
"/home/göran/"
Looks good. I see my ö.
(expand-file-name "")
"/home/göran"
Ok too.
(expand-file-name "~")
"/home/g\303\266ran"
Here the octal codes for a UTF-8 encoded ö are shown instead of
the ö itself. The source of ~ is the environment variable HOME.
But if I explicitly ask for that variable:
(getenv "HOME")
"/home/göran"
Here I see the ö.
Let's have a bit more fun. Here I try to expand a file with my own
name:
(expand-file-name "göran")
"/home/göran/göran"
Looks the way I expected it. Now the same thing, explicitly saying to
put it in the home directory:
(expand-file-name "~/göran")
"/home/g\xc3\xb6ran/göran"
The ö in the file name is ok. The ö in the directory name
is strange again, only this time it is shown in hex rather than octal.
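The two odd results above differ only in how the bytes are printed. As a minimal sketch (Python is used purely for illustration here and is not part of the report; the variable names are hypothetical), the octal escapes `\303\266` and the hex escapes `\xc3\xb6` denote the same two bytes, and decoding them as UTF-8 gives back the expected ö:

```python
# Illustration only: Python stands in for a byte-level view of the strings;
# none of this is Emacs code.

# Directory part as printed by (expand-file-name "~"), octal escapes.
octal_form = b"/home/g\303\266ran"

# Directory part as printed by (expand-file-name "~/göran"), hex escapes.
hex_form = b"/home/g\xc3\xb6ran"

# Both spellings are the same byte sequence: 0xC3 0xB6 in place of the ö.
same_bytes = octal_form == hex_form

# Decoding the raw bytes as UTF-8 recovers the intended directory name.
decoded = octal_form.decode("utf-8")

print(same_bytes)
print(decoded)
```

This is consistent with the value of HOME reaching expand-file-name as undecoded bytes, while the rest of the string has already been decoded.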
I asked about this on gnu.emacs.help first
(http://groups.google.se/group/gnu.emacs.help/browse_thread/thread/80258d0a17e37138/75411fce63db9b2c#75411fce63db9b2c),
since I was unsure whether it was a bug or my lack of understanding. But two
other posters have suggested I report it as a bug.
In GNU Emacs 22.1.1 (x86_64-redhat-linux-gnu, GTK+ Version 2.12.1)
of 2007-11-06 on xenbuilder2.fedora.redhat.com
Windowing system distributor `The X.Org Foundation', version 11.0.70101000
configured using `configure '--build=x86_64-redhat-linux-gnu'
'--host=x86_64-redhat-linux-gnu' '--target=x86_64-redhat-linux-gnu'
'--program-prefix=' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin'
'--sbindir=/usr/sbin' '--sysconfdir=/etc' '--datadir=/usr/share'
'--includedir=/usr/include' '--libdir=/usr/lib64' '--libexecdir=/usr/libexec'
'--localstatedir=/var' '--sharedstatedir=/usr/com' '--mandir=/usr/share/man'
'--infodir=/usr/share/info' '--with-pop' '--with-sound' '--with-gtk'
'build_alias=x86_64-redhat-linux-gnu' 'host_alias=x86_64-redhat-linux-gnu'
'target_alias=x86_64-redhat-linux-gnu' 'CFLAGS=-DMAIL_USE_LOCKF
-DSYSTEM_PURESIZE_EXTRA=16777216 -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
-fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic''
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: nil
value of $LANG: sv_SE.utf8
locale-coding-system: utf-8
default-enable-multibyte-characters: t
Major mode: Fundamental
Minor modes in effect:
which-function-mode: t
tooltip-mode: t
mouse-wheel-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
unify-8859-on-encoding-mode: t
utf-translate-cjk-mode: t
auto-compression-mode: t
temp-buffer-resize-mode: t
line-number-mode: t
transient-mark-mode: t
Recent input:
? <return> M-< C-n C-k C-k <switch-frame> <switch-frame>
<switch-frame> C-y <switch-frame> <next> <down> <down>
<down> <up> <up> <up> <up> <up> <up> <up> <up> p <switch-frame>
<switch-frame> <switch-frame> <down-mouse-2> <mouse-2>
<backspace> C-j C-x 4 C-f . e m <tab> <return> C-s
v m - s p o o M-< C-x C-f . v m C-g C-x C-f ~ / . v
m <return> C-s C-g C-_ C-s v m - s p o o l - f i l
e s C-a ; C-x C-s <help-echo> <switch-frame> <switch-frame>
q <switch-frame> C-n C-n C-n C-n C-n C-n C-n C-n C-n
C-n C-n C-n C-n C-n C-c C-g <switch-frame> <help-echo>
<help-echo> C-x M q q <help-echo> C-x M n n n n n n
n M-< C-s S E C C-a SPC <switch-frame> <help-echo>
C-x C-f ~ / N <tab> <return> C-x c C-a C-k r p m g
r e p SPC l h a <return> ! <help-echo> <help-echo>
<switch-frame> <switch-frame> <help-echo> <switch-frame>
<switch-frame> <help-echo> <switch-frame> <switch-frame>
<switch-frame> <switch-frame> <help-echo> <switch-frame>
d <switch-frame> <help-echo> C-u C-u C-u <f6> C-x o
C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p
C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p M->
C-x o C-x C-f <M-backspace> <M-backspace> u p d <return>
<help-echo> <switch-frame> M-> C-p C-p C-p C-p C-p
C-p C-p C-p C-p C-p C-p SPC n n d d d e <next> <down>
<down> <down> <down> <down> <down> <down> <down> <down>
<down> <down> <down> <down> <down> <down> <down> <down>
<down> <down> <down> <down> <down> <down> <down> <down>
<down> <down> <down> <down> <down> <left> = C-c C-c
SPC SPC <backspace> <backspace> <down-mouse-2> <mouse-2>
<switch-frame> <switch-frame> <switch-frame> <switch-frame>
n <down-mouse-2> <mouse-2> s I <tab> <return> q d SPC
<switch-frame> M-x r e p o <tab> r <tab> <return>
Recent messages:
End of message 1059 from Göran Uddeborg
Loading vm-digest...done
Decoding MIME message... done
End of message 1 from Gunilla Christensson
1 message saved to buffer INBOX
Quitting...
Decoding MIME message... done
End of message 1060 from Göran Uddeborg
Making completion list...
Loading emacsbug...done
--- End Message ---
--- Begin Message ---
Subject: bug#38: Re: No coding system used for environment variables
Date: Tue, 24 Mar 2009 22:23:12 +0800
User-agent: Thunderbird 2.0.0.21 (Windows/20090302)
Chong Yidong wrote:
> Jason Rumney <jasonr@gnu.org> writes:
>>> Two objections were made to Jason's patch: (i) some coding systems are
>>> not ready until some .elc files get loaded (relevant for special cases,
>>> such as the EMACS_LOAD_PATH variable), and (ii) DECODE_FILE causes GC,
>>> so variables such as `nm' in Fexpand_file_name may not point to valid
>>> data after that.
>>> If no elegant solution is forthcoming, I'd suggest simply documenting
>>> (i) as a limitation, and dealing with (ii) by simply turning off GC in
>>> the affected part of the function.
>> I think the GC part can be handled the same way as in bug #93
> Okay. Could you put your patch back in, with the proper GC handling?
I've finally looked at this, and the case for Fsubstitute_in_file_name
looks much simpler than Fexpand_file_name was. Only one Lisp_String was
internally referenced, and that was already copied on DOS_NT, so I moved
the copy outside of the #ifdef so that all platforms now work with a
copy of the original string.
--- End Message ---