Re: distinguishing multibyte/unibyte ASCII
From: Toke Høiland-Jørgensen
Subject: Re: distinguishing multibyte/unibyte ASCII
Date: Fri, 09 Sep 2016 22:17:58 +0200
Stefan Monnier <address@hidden> writes:
>> If you just generate an ASCII string from ASCII characters, it will
>> usually be unibyte. If you take it as a substring from a multibyte
>> buffer, it will usually be multibyte.
>
> And it's arguably a wart in Emacs's handling of chars-vs-bytes.
> But it's kind of hard to fix now.
>
> At some point I tried to change this handling (not exactly fix it) by
> treating multibyte ASCII strings specially (it's easy to recognize by
> checking that the char length is equal to the byte length and both are
> readily available in the "struct Lisp_String" object). Then when we
> read an ASCII string, instead of making it unibyte, I'd keep it as
> multibyte. And then change things like "concat" so that those "ASCII
> multibyte" strings don't force the result to be multibyte.
>
> My local Emacs still runs with those changes, but in the end I don't
> think the result is really better (or sufficiently better to justify
> the subtle incompatibilities it introduces).
>
> [ Also, I wouldn't be surprised to hear that such a change causes real
> problems with utf-7 or EBCDIC, or other systems where decoding/encoding
> a string of bytes/chars all <127 is not a no-op. ]
Isn't Unicode fun? :)
-Toke
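
For readers following along, the two technical points in the quoted message can be sketched in a few lines. This is only an illustrative model in Python, not Emacs code: the function names are hypothetical, and UTF-8 stands in for Emacs's internal multibyte representation. The first function mirrors the "char length equals byte length" test for an ASCII-only string; the second shows why bytes that are all below 128 do not necessarily decode to themselves in an encoding such as UTF-7.

```python
def is_ascii_only(text: str) -> bool:
    """Model of the length check described above (hypothetical helper).

    In UTF-8, used here as an approximation of the internal encoding,
    every non-ASCII character occupies at least two bytes, so equal
    character and byte lengths imply that all characters are ASCII.
    """
    return len(text) == len(text.encode("utf-8"))


def ascii_bytes_decode_unchanged(raw: bytes, codec: str) -> bool:
    """Return True if decoding ASCII-range bytes with `codec` is a no-op.

    `raw` is assumed to contain only bytes below 128 (hypothetical helper).
    """
    return raw.decode(codec) == raw.decode("ascii")


if __name__ == "__main__":
    print(is_ascii_only("cookie"))    # ASCII only
    print(is_ascii_only("Høiland"))   # contains a non-ASCII character

    # The bytes of "+AOk-" are all ASCII, yet UTF-7 treats them as an
    # escape sequence rather than as five literal characters.
    print(ascii_bytes_decode_unchanged(b"+AOk-", "utf-8"))
    print(ascii_bytes_decode_unchanged(b"+AOk-", "utf-7"))
```

The length comparison is cheap because it needs no scan of the string contents, which matches the observation that both lengths are readily available on the string object. The UTF-7 case is the kind of input where treating "all bytes below 128" as "already decoded" would give a wrong result.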