Re: unibyte<->multibyte conversion [Re: Emacs-diffs Digest, Vol 2, Issue

From: Kenichi Handa
Subject: Re: unibyte<->multibyte conversion [Re: Emacs-diffs Digest, Vol 2, Issue 28]
Date: Wed, 29 Jan 2003 20:23:23 +0900 (JST)
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.2.92 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)

In article <address@hidden>, "Stefan Monnier" <monnier+gnu/address@hidden> 

>>  In one sense, it seems clean to use the concept of decoding
>>  and encoding for all unibyte<->multibyte conversions
>>  coherently.  But, that hides what Emacs actually does.

> You mean that string-FOO-multibyte uses special-cased code
> and that there is thus a difference of efficiency ?

Yes.  string-FOO-multibyte are more efficient than
decode-coding-string.  But, that is not the point.

>>  > unibyte strings are sequences of bytes while multibyte
>>  > strings are sequences of chars.
>>  Unfortunately no.

> I don't think there is any "truth" here.  There are simply different
> ways to look at the same thing.

I don't understand why you don't think my explanation is not

You wrote:
>> Converting between bytes and chars is the purpose of
>> coding-systems.

OK, then the resulting region of encode-coding-region is a
sequence of bytes, not chars, even if it's a multibyte
buffer.  Thus, the return string of buffer-substring on that
region (let's name it MULTI) is also a byte sequence.

Using (string-to-unibyte MULTI) to get the same byte
sequence but in unibyte form is ok as long as we adopt my
interpretation of that function.

But, doing (encode-coding-string MULTI 'raw-text) is
conceptually broken because MULTI is already a byte

Ken'ichi HANDA
