Re: One more string functions change

From: Dmitry Antipov
Subject: Re: One more string functions change
Date: Sun, 29 Jun 2014 20:38:26 +0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0

On 06/29/2014 07:13 PM, Eli Zaretskii wrote:

> It's possible that we should consider this now, but the answer to your
> question is not a trivial one in any case.  Emacs traditionally
> exposed to Lisp all the Unicode character properties, as char-tables.

Are these exposed properties really used from Lisp in a high-level,
user-defined manner? For example, is it desirable or even possible to
customize related things via .emacs? Or is there a major or minor mode
that relies on the Lisp-visible character properties?

> If we decide to use ICU, we'd need to think what to do with those
> char-tables: remove them, populate them using ICU, something else?
> (Having these databases twice would be an unnecessary bloat, IMO.)

Yes, ICU itself is bloated enough. On my system, the shared library
with compiled-in Unicode data is over 20M. Nevertheless, it's commonly
considered "not too bloated" even for relatively small systems like
modern Android-based gadgets.

> Some of these properties need to support very fast access (e.g., for
> bidi display), and the question is how fast is ICU in this regard.
> Also, many Unicode features are already implemented, so they should be
> reworked or refactored, or maybe the corresponding ICU features left
> unused.  And features that depend on Unicode, like font selection,
> will have to be adapted.

IIUC things are even worse because ICU uses 16- and 32-bit quantities
to represent Unicode characters; this doesn't look very compatible
with our internal variable-width encoding of 1-5 bytes per character.
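
To make the mismatch concrete, here is a rough sketch of the conversion
that would be needed at every boundary with a library working in 16- or
32-bit units. It is not Emacs or ICU code: decode_char and to_utf16 are
hypothetical names, and the byte layout is assumed to be UTF-8 extended
with a 5-byte form (lead byte 0xF8), which only approximates the
internal encoding.

```python
from typing import List, Tuple


def decode_char(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one character at buf[pos]; return (code_point, byte_length).

    Assumes a UTF-8-like layout extended with a 5-byte form whose lead
    byte is 0xF8 (an assumption made for this illustration).
    """
    lead = buf[pos]
    if lead < 0x80:
        return lead, 1
    if (lead & 0xE0) == 0xC0:
        length, value = 2, lead & 0x1F
    elif (lead & 0xF0) == 0xE0:
        length, value = 3, lead & 0x0F
    elif (lead & 0xF8) == 0xF0:
        length, value = 4, lead & 0x07
    elif lead == 0xF8:
        length, value = 5, 0
    else:
        raise ValueError("invalid lead byte")
    tail = buf[pos + 1:pos + length]
    if len(tail) != length - 1:
        raise ValueError("truncated sequence")
    for b in tail:
        if (b & 0xC0) != 0x80:
            raise ValueError("invalid continuation byte")
        value = (value << 6) | (b & 0x3F)
    return value, length


def to_utf16(cp: int) -> List[int]:
    """Convert a code point to the 16-bit units a UTF-16 API expects."""
    if cp > 0x10FFFF:
        # Values beyond the Unicode range have no UTF-16 representation.
        raise ValueError("code point outside the Unicode range")
    if cp < 0x10000:
        return [cp]
    cp -= 0x10000
    return [0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF)]
```

Under these assumptions, each character costs a decode step before it
can be handed over as a 32-bit value, and possibly a second step to
split it into two 16-bit units. Anything stored above 0x10FFFF cannot
be passed across at all.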
