Re: setenv -> locale-coding-system cannot handle ASCII?!

From: Stefan Monnier
Subject: Re: setenv -> locale-coding-system cannot handle ASCII?!
Date: Wed, 26 Feb 2003 03:12:38 -0500

> In article <address@hidden>, "Stefan Monnier" <monnier+gnu/address@hidden> 
> writes:
> >>  Why is it not needed?  Strings and buffers are not that
> >>  different, both are containers of characters.
> > They are used differently.  Operations on strings generally apply to the
> > whole string: you can only encode/decode a whole string at a time.
> That's because of the limitation of the current
> implementation, not because of the nature of strings.

I think it's the nature of strings and of the functions
we provide on them.  If the user wants to do anything else,
she uses a buffer instead where modifications are easy to
make and where you have things like markers, point, ...

> There's no reason for keeping that limitation.  Actually, as
> we have changed the type Lisp_String in 21.1, it's not
> difficult to make strings change length.

Actually, strings are virtually never changed so it would be silly
to do that.  When a string needs to be changed, 99% of the functions
simply create a new string.  I know of only two cases where a string
is mutated: set-text-properties and aset.
The copy of Emacs that I use every day has aset disabled on strings
and works very well despite that (it did require a few minor
changes in a handful of packages).
Emacs strings are 99% immutable.  In practice that's also the case for
Scheme strings, BTW, and there have always been heated debates about
whether or not to change the Scheme language to specify that strings
are immutable.
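
To illustrate the distinction outside Emacs, here is a minimal Python
sketch.  It is only an analogy, not Emacs Lisp: Python's `str` stands in
for an immutable string, `bytearray` stands in for a buffer that is
edited in place, and `replace_char` is a hypothetical helper.

```python
# Analogy only: str plays the role of an immutable string,
# bytearray the role of a buffer that is edited in place.

def replace_char(s: str, index: int, ch: str) -> str:
    """Return a new string with s[index] replaced; s itself is untouched."""
    return s[:index] + ch + s[index + 1:]


original = "hello"
changed = replace_char(original, 0, "j")  # builds a fresh string

buf = bytearray(b"hello")
buf[0:1] = b"ye"  # in-place edit that also changes the length
```

The "modification" of the string allocates a new object and leaves the
old one alone, whereas the buffer-like object absorbs a length-changing
edit in place.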

> ------------------------------------------------------------
> What a character in a unibyte buffer represents depends on the
> context.  It may be a character represented by a single
> byte, or a raw byte not yet decoded, or a byte constituting
> part of the multibyte form of a different character.
> On the other hand, a character in a unibyte string always
> represents a raw byte.  Emacs coerces it into a character
> represented by that single byte when a unibyte string is
> concatenated with a multibyte string, or it is inserted in a
> multibyte buffer.
> ------------------------------------------------------------
> But, I'm not sure such a change is really necessary.  Are
> you sure that the change doesn't break the current usage of
> unibyte strings?

I'm pretty sure it'll break current usage in a few places.

> > PS: I wish there was a way to swap two buffers' contents so that
> >     tar-mode could swap the (potentially very large) data to
> >     a helper buffer (without needing to copy this large data)
> >     and then use multibyte for the display and unibyte for
> >     the helper buffer.
> I don't understand what you mean, especially the usage of
> the helper buffer.
> I think tar-mode should use multiple buffers, one unibyte
> buffer for tar-file itself, one multibyte buffer for table
> of contents, and the other multibyte buffers (created on
> demand) for viewing/editing files contained in the tar-file.
> Then, tar mode works almost the same way as dired.  We can
> see multibyte files in the different buffers.  We can use
> the same method in arc-mode and also in RMAIL.
> Is that different from what you mean?

No, that's exactly what I meant, but the problem is the following:

When tar-mode is called, the current buffer already contains the 24MB
binary content of the file and it is also the buffer that should
in the end contain the table of contents, so you need to somehow
move those 24MB from this buffer to a new one (the "helper" one).
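
The kind of swap wished for above can be modelled roughly as follows.
This is a hypothetical Python sketch, not Emacs code: `Buffer` and
`swap_text` are invented names, and a real implementation would also
have to deal with markers, text properties, undo data and the like,
which the sketch ignores.

```python
class Buffer:
    """Toy model of a buffer: a name, its text, and a multibyte flag."""

    def __init__(self, name, text=b"", multibyte=True):
        self.name = name
        self.text = bytearray(text)
        self.multibyte = multibyte


def swap_text(a, b):
    """Exchange text and multibyte flag of two buffers without copying."""
    # Only the references are exchanged; the underlying bytes stay put.
    a.text, b.text = b.text, a.text
    a.multibyte, b.multibyte = b.multibyte, a.multibyte


# The visited buffer starts out unibyte and holds the raw archive data.
tar_buf = Buffer("archive.tar", b"\x00" * 1024, multibyte=False)
helper = Buffer(" *tar-data*", b"", multibyte=True)

payload = tar_buf.text  # remember the object to show it is not copied
swap_text(tar_buf, helper)
```

After the swap the helper holds the very same data object that the
visited buffer used to hold, and the visited buffer is empty and
multibyte, ready to receive the table of contents.  The cost does not
depend on the size of the data.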

