Re: Unicode and Guile
From: Tom Lord
Subject: Re: Unicode and Guile
Date: Tue, 11 Nov 2003 17:40:28 -0800 (PST)
> From: Marius Vollmer <address@hidden>
> Tom Lord <address@hidden> writes:
> > ~ (grapheme=? g1 g2 [locale]) => <boolean>
> > ~ (grapheme<? g1 g2 [locale])
> > ~ (grapheme>? g1 g2 [locale])
> > [...]
> > ~ (grapheme-ci=? g1 g2 [locale])
> > ~ (grapheme-ci<? g1 g2 [locale])
> > ~ (grapheme-ci>? g1 g2 [locale])
> > The usual orderings.
> Is it a good idea to have an ordering among graphemes, or would it be
> better to only order texts, i.e., to allow for the context of a
> grapheme to determine the order?
I think it's a fine idea to order graphemes but, depending on the
locale, the ordering of texts is _not_ a lexical ordering grounded in
grapheme ordering.
It would be good to provide a locale, perhaps the default, in which
ordering of texts _is_ a lexical ordering grounded in (default)
grapheme order.
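
A minimal sketch of what such a default could look like. It assumes,
purely for illustration, that a text is a list of graphemes and that a
grapheme is a short string; TEXT-LEXICAL<? is a hypothetical name, and
the two GRAPHEME procedures here are stand-ins, not the proposal's
definitions.

```scheme
;; Assumed model (not part of the proposal): a text is a list of
;; graphemes, and each grapheme is a short string.
(define (grapheme<? g1 g2) (string<? g1 g2))   ; stand-in default order
(define (grapheme=? g1 g2) (string=? g1 g2))

;; Lexical ordering of texts grounded in grapheme ordering: compare
;; grapheme by grapheme; a proper prefix sorts first.
(define (text-lexical<? t1 t2)
  (cond ((null? t2) #f)                  ; nothing sorts before the empty text
        ((null? t1) #t)                  ; t1 is a proper prefix of t2
        ((grapheme<? (car t1) (car t2)) #t)
        ((grapheme=? (car t1) (car t2))
         (text-lexical<? (cdr t1) (cdr t2)))
        (else #f)))

(text-lexical<? '("a" "b") '("a" "c"))          ; => #t
(text-lexical<? '("c" "h" "a") '("c" "z" "a"))  ; => #t
```

The second call shows where a locale can disagree: traditional Spanish
collation treats "ch" as a single letter sorting after every other
"c..." sequence, so it would order those two texts the other way.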
> > ~ (make-text-marker text index) => <marker>
> What about having _only_ markers and not allowing integers as
> indices?
Seems excessive and arbitrary.  How do I implement (Emacs') GOTO-CHAR
without standing on my head?
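
To make the point concrete, here is a rough sketch of a GOTO-CHAR that
accepts either an integer or a marker, as the Emacs one does. The
buffer and marker representations and the names MAKE-BUFFER,
MAKE-MARKER and GOTO-CHAR! are assumptions for illustration only, and
positions are 0-based here, unlike Emacs.

```scheme
;; Assumed representations: a buffer is a vector #(text point) and a
;; marker is a pair (buffer . index).  Neither comes from the proposal.
(define (make-buffer text) (vector text 0))
(define (buffer-text b) (vector-ref b 0))
(define (buffer-point b) (vector-ref b 1))

(define (make-marker buffer index) (cons buffer index))
(define (marker-index m) (cdr m))

;; Move point to POS, which may be an integer or a marker.  The
;; position is clamped to the bounds of the text.
(define (goto-char! buffer pos)
  (let* ((i (if (pair? pos) (marker-index pos) pos))
         (len (string-length (buffer-text buffer)))
         (clamped (max 0 (min i len))))
    (vector-set! buffer 1 clamped)
    clamped))

(define b (make-buffer "hello world"))
(goto-char! b 6)                    ; => 6
(goto-char! b (make-marker b 3))    ; => 3
```

With markers as the only kind of index, the first call would have to
build a marker from the integer 6 somehow, which puts integer indexes
back in by another door.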
> Also, what about making TEXTs immutable by default and instead letting
> TEXT-REPLACE, etc. return a new text object?
Given an implementation that can do that efficiently, I see no
obstacle to implementing a new type, META-TEXT?, which is mutable in
exactly the way that TEXT? is in my proposal. That'd be ridiculously
inconvenient though. So, make META-TEXT? the same thing as TEXT?.
(I strongly suggest splay trees as an ideal implementation strategy
for TEXT?.  They would make _both_ mutating and functional
REPLACE efficient.)
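
A sketch of the META-TEXT? argument, using ordinary strings as a
stand-in for immutable texts. The procedure names are hypothetical,
and this version copies on every replace, so it shows only the shape
of the interface, not the efficiency a splay tree would give.

```scheme
;; Functional replace: returns a new text and leaves TEXT untouched.
;; Replaces the half-open range [START, END) with NEW.
(define (text-replace text start end new)
  (string-append (substring text 0 start)
                 new
                 (substring text end (string-length text))))

;; META-TEXT: a one-slot mutable cell holding an immutable text.
(define (make-meta-text text) (list text))
(define (meta-text-value mt) (car mt))

;; Mutating replace, built entirely on the functional one.
(define (meta-text-replace! mt start end new)
  (set-car! mt (text-replace (meta-text-value mt) start end new)))

(define mt (make-meta-text "hello world"))
(meta-text-replace! mt 0 5 "howdy")
(meta-text-value mt)                ; => "howdy world"
```

Every program that wants in-place editing ends up writing this wrapper,
which is why it might as well be what TEXT? already is.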
> > There is no essential difference between a grapheme and a text
> > object of length 1, and thus the proposal makes GRAPHEME? a
> > subtype of TEXT?.
> Do we need the concept of grapheme at all, then?
Interesting question! And it ties in with your question about "why
not just markers and not integer indexes".
I don't see a good way to ground markers _without_ integer indexes.
Graphemes are a reasonable model of "what the user thinks of as a
character".  What does DELETE-BACKWARD-CHAR delete, for example, at
least by default, if not a grapheme?  And in the non-default cases,
how does it analyze the TEXT? value to figure out what to do?
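
A rough illustration of the difference. This assumes a Guile recent
enough to provide CHAR-GENERAL-CATEGORY (2.0 or later) and
approximates a grapheme as a base character followed by combining
marks. Real grapheme boundaries, as specified in Unicode Standard
Annex #29, are more involved, and DELETE-BACKWARD-GRAPHEME is a
hypothetical name.

```scheme
;; True if CH is a combining mark (Unicode general categories Mn, Mc, Me).
(define (combining-mark? ch)
  (memq (char-general-category ch) '(Mn Mc Me)))

;; Return STR without its last grapheme: skip back over any trailing
;; combining marks, then drop the base character they attach to.
(define (delete-backward-grapheme str)
  (let loop ((i (- (string-length str) 1)))
    (cond ((< i 0) "")
          ((combining-mark? (string-ref str i)) (loop (- i 1)))
          (else (substring str 0 i)))))

;; "cafe" followed by U+0301 COMBINING ACUTE ACCENT: five code points,
;; but four graphemes as the user sees them.
(define cafe (string #\c #\a #\f #\e (integer->char #x301)))
(delete-backward-grapheme cafe)     ; => "caf"
```

Deleting a single CHAR? from the end of that string would strip only
the accent and leave "cafe", which is not what a user pressing
backspace expects.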
> > The proposal also makes it possible to pass strings everywhere that
> > text can be used. I think that's the more interesting direction:
> > just use text- and grapheme- procedures from now on except where you
> > _really_ want to refer to octets.
> Could we make strings/chars go away completely over time? For vectors
> of octets, there is u8vector? from SRFI-4.
I wouldn't object to seeing a complete unification of STRING? with
u8vector. I'm not so sure that the CHAR? type is particularly useful
in the long run -- it's rather culturally biased.
-t