Re: setenv -> locale-coding-system cannot handle ASCII?!
From: Kenichi Handa
Subject: Re: setenv -> locale-coding-system cannot handle ASCII?!
Date: Wed, 26 Feb 2003 16:49:15 +0900 (JST)
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.2.92 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)
In article <address@hidden>, "Stefan Monnier" <monnier+gnu/address@hidden>
writes:
>> Why is it not needed? Strings and buffers are not that
>> different, both are containers of characters.
> They are used differently. Operations on strings generally apply to the
> whole string: you can only encode/decode a whole string at a time.
That's because of the limitation of the current
implementation, not because of the nature of strings.
There's no reason for keeping that limitation. Actually, as
we have changed the type Lisp_String in 21.1, it's not
difficult to make strings change length.
>> If we get a unibyte string from a unibyte buffer by buffer-substring,
>> how should we treat that string?
> Like any other unibyte string: as a sequence of raw bytes.
> If you want to treat it as a sequence of characters, then
> you need to pass it through `string-as-multibyte'.
If we regard that limitation as part of the nature of strings,
your idea is worth considering. It seems that we can at least
construct a consistent explanation of the behaviour based
on your idea too.
------------------------------------------------------------
What a character in a unibyte buffer represents depends on the
context. It may be a character represented by a single
byte, or a raw byte not yet decoded, or a byte constituting the
multibyte form of a different character.
On the other hand, a character in a unibyte string always
represents a raw byte. Emacs coerces it into a character
represented by that single byte when a unibyte string is
concatenated with a multibyte string, or it is inserted in a
multibyte buffer.
------------------------------------------------------------
But, I'm not sure such a change is really necessary. Are
you sure that the change doesn't break the current usage of
unibyte strings?
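The coercion described in the proposed text above can be observed with a small sketch. It assumes an Emacs recent enough to provide `unibyte-string`; `my-unibyte-demo` is a hypothetical name used only for illustration, and the character that each raw byte is turned into by the coercion has differed between Emacs versions, so it is not shown here.

```elisp
;; Hypothetical helper, not part of Emacs: report how a unibyte string
;; looks before and after it is concatenated with multibyte text.
(defun my-unibyte-demo ()
  "Return (RAW-MULTIBYTE-P MIXED-MULTIBYTE-P RAW-LENGTH MIXED-LENGTH)."
  (let* ((raw (unibyte-string #o201 #o300)) ; the two raw bytes from the thread
         (multi (string #x3042))            ; one non-ASCII char, so multibyte
         (mixed (concat raw multi)))        ; RAW is coerced at this point
    (list (multibyte-string-p raw)
          (multibyte-string-p mixed)
          (length raw)
          (length mixed))))
```

Evaluating `(my-unibyte-demo)` shows whether each string is multibyte and how many elements it holds; inspecting the individual characters of the concatenated string with `aref` shows what the running Emacs coerced each raw byte into.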
>> The latter yields multibyte, but I think it's a bug. I found
>> that "(format "%s" 1)" is implemented by using
>> prin1-to-string, and prin1-to-string prints an object to a
>> temporary buffer and gets that buffer string. So, in a
>> multibyte session "(format "%s" 1)" yields a multibyte
>> string. :-(
> I know: I bumped into it yesterday while playing around with tar-mode.
> How about the attached patch ?
Please see the comments below.
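Whether a given Emacs shows the `format` behaviour quoted above can be checked directly. This is only a probe, not the patch under discussion; `my-format-multibyte-report` is a hypothetical name, and the answers depend on the Emacs version and on whether the session is multibyte, so no expected values are given.

```elisp
;; Hypothetical probe, not part of Emacs: report whether strings built
;; from a number come back multibyte in the running session.
(defun my-format-multibyte-report ()
  "Return an alist of (FUNCTION-NAME . MULTIBYTE-P) for the number 1."
  (list (cons 'format
              (multibyte-string-p (format "%s" 1)))
        (cons 'prin1-to-string
              (multibyte-string-p (prin1-to-string 1)))
        (cons 'number-to-string
              (multibyte-string-p (number-to-string 1)))))
```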
>> So, do you mean that you want this?
>>
>> If a unibyte buffer has \201\300 in the region FROM and TO,
>>
>> (encode-coding-string (buffer-substring FROM TO) 'iso-latin-1)
>> => "\201\300"
>>
>> (encode-coding-region FROM TO 'iso-latin-1) changes the
>> region to \300.
> Yes, I guess I'd be happy with it.
>> Isn't it more confusing?
> Not to me.
What do the other people think about it?
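For anyone who wants to form an opinion, the two calls from the quoted example can be run side by side. The sketch below only sets up the situation described (a unibyte buffer holding \201\300) and returns both results; `my-compare-encodings` is a hypothetical name, and what comes back depends on the Emacs version, so the values given in the quoted example should be read as the proposed behaviour rather than as guaranteed output.

```elisp
;; Hypothetical helper, not part of Emacs: run the string form and the
;; region form of the encoding on the same two raw bytes.
(defun my-compare-encodings ()
  "Return (STRING-RESULT REGION-RESULT) for the bytes \\201\\300."
  (with-temp-buffer
    (set-buffer-multibyte nil)               ; make the buffer unibyte
    (insert (unibyte-string #o201 #o300))
    (let ((from-string
           (encode-coding-string
            (buffer-substring (point-min) (point-max))
            'iso-latin-1)))
      ;; Encode the same bytes in place, then read the buffer back.
      (encode-coding-region (point-min) (point-max) 'iso-latin-1)
      (list from-string (buffer-string)))))
```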
> PS: I wish there was a way to swap two buffers' content so that
> tar-mode could swap the (potentially very large) data to
> a helper buffer (without needing to copy this large data)
> and then use multibyte for the display and unibyte for
> the helper buffer.
I don't understand what you mean, especially the usage of
the helper buffer.
I think tar-mode should use multiple buffers: one unibyte
buffer for the tar file itself, one multibyte buffer for the
table of contents, and further multibyte buffers (created on
demand) for viewing/editing files contained in the tar file.
Then, tar-mode works almost the same way as dired. We can
see multibyte files in separate buffers. We can use
the same method in arc-mode and also in RMAIL.
Is that different from what you mean?
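The multi-buffer layout suggested above could be sketched roughly as follows. This is not tar-mode's code: the function names, the buffer names, and the idea of passing byte positions and a coding system explicitly are all assumptions made for illustration, and no tar header parsing is attempted.

```elisp
;; Hypothetical helpers, not part of tar-mode.
(defun my-open-archive-buffers (file)
  "Return (DATA . TOC): a unibyte buffer holding FILE and an empty
multibyte buffer meant for the table of contents."
  (let ((data (generate-new-buffer " *archive-data*"))
        (toc (generate-new-buffer "*archive-toc*")))
    (with-current-buffer data
      (set-buffer-multibyte nil)             ; raw bytes, never decoded
      (insert-file-contents-literally file))
    (with-current-buffer toc
      (set-buffer-multibyte t))              ; display buffer
    (cons data toc)))

(defun my-view-member (data start end coding)
  "Decode bytes START..END of DATA with CODING into a new multibyte buffer."
  (let ((bytes (with-current-buffer data
                 (buffer-substring-no-properties start end)))
        (view (generate-new-buffer "*archive-member*")))
    (with-current-buffer view
      (set-buffer-multibyte t)
      (insert (decode-coding-string bytes coding)))
    view))
```

With this split the archive data stays undecoded in one place, and each member is decoded only when a viewing buffer is created for it. The member's bytes are copied into that buffer, which is the copying cost the quoted PS wanted to avoid for the archive as a whole.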
---
Ken'ichi HANDA
address@hidden