guile-devel

Re: SRFI-14 and locale settings


From: Ludovic Courtès
Subject: Re: SRFI-14 and locale settings
Date: Wed, 13 Sep 2006 10:29:21 +0200
User-agent: Gnus/5.110006 (No Gnus v0.6) Emacs/21.4 (gnu/linux)

Hi,

Neil Jerram <address@hidden> writes:

> address@hidden (Ludovic Courtès) writes:
>
>> According to SRFI-14, a Latin-1 implementation should contain _both_ `ñ'
>> and `ê' in `char-set:letter', regardless of the current language
>> settings, hence the difficulty we might have building `char-set:letter'.
>>
>> Does that clarify things?
>
> Yes.  So it seems to me, therefore, that we should not be using
> isalpha() etc. to construct char-set:letter, but should instead hard
> code it as the intersection of (char-set:letter as specified by SRFI
> 14) with (the set of characters that Guile can represent).

In practice, I can think of two ways to determine the set of _letters_
available in the current encoding (which is what `char-set:letter' is
expected to contain).

1. Since SRFI-14 lists all the characters that have to be added to the
   ASCII `char-set:letter' to get the Latin-1 `char-set:letter', we
   could somehow hard-code them.  But this is ugly.

2. Or we can build a predicate from the `is' functions that we expect
   to be language-independent (i.e., those functions that depend only
   on the locale's charset), such as:

     (!isblank (c)) && (!ispunct (c)) && (!isdigit (c)) && (!iscntrl (c))

   This is certainly not perfect, but it should work for Latin-1, and
   hopefully for other 8-bit charsets as well.
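
As a rough sketch of option 2 (not actual Guile code; the name
`charset_letter_p' is made up for illustration), the predicate could be
wrapped in a small C function.  It assumes the argument is a character
code in the `unsigned char' range, which is what the <ctype.h> functions
require.  Note that any byte the locale assigns to no class at all would
also pass this test, which is one reason it is not perfect.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>

/* Hypothetical helper, not part of Guile: consider C a letter when the
   current locale's charset says it is not a blank, not punctuation,
   not a digit, and not a control character.  */
static int
charset_letter_p (int c)
{
  /* The <ctype.h> functions are only defined for EOF and values
     representable as unsigned char.  */
  assert (c >= 0 && c <= UCHAR_MAX);

  return !isblank (c) && !ispunct (c) && !isdigit (c) && !iscntrl (c);
}
```

In the default "C" locale this should agree with isalpha () over the
ASCII range; the interesting cases are the upper 128 characters of an
8-bit charset such as Latin-1.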

As Kevin mentioned earlier, all the char sets could be re-computed in
`scm_setlocale ()'.
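
To illustrate the recomputation idea, here is a minimal sketch that
rebuilds a table of letters whenever the locale is changed.  Everything
in it is an assumption made for illustration: `letter_table',
`recompute_letter_table', `set_locale_and_recompute' and
`letter_table_ref' are hypothetical names, the table is a stand-in for a
real char set, and characters are assumed to be 8 bits wide.  It is not
how `scm_setlocale ()' is actually implemented.

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical stand-in for a char set: one flag per 8-bit character.  */
static unsigned char letter_table[256];

/* Refill the table using the "not blank, not punctuation, not digit,
   not control" test under the locale that is current right now.  */
static void
recompute_letter_table (void)
{
  for (int c = 0; c < 256; c++)
    letter_table[c] =
      !isblank (c) && !ispunct (c) && !isdigit (c) && !iscntrl (c);
}

/* Hypothetical wrapper: change the locale with setlocale () and, if
   that succeeds, recompute the table.  Returns what setlocale ()
   returned (NULL on failure, in which case the table is untouched).  */
static const char *
set_locale_and_recompute (int category, const char *locale)
{
  const char *result = setlocale (category, locale);

  if (result != NULL)
    recompute_letter_table ();
  return result;
}

/* Look up character code C (0..255) in the table.  */
static int
letter_table_ref (int c)
{
  assert (c >= 0 && c < 256);
  return letter_table[c];
}
```

A caller would use `set_locale_and_recompute (LC_ALL, "")' in place of a
bare setlocale () call, so that the table never reflects a stale
charset.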

I think I'll give the second option a try in the next few days if
nobody considers it too silly.

Thanks,
Ludovic.



