bug#18051: [Emacs-diffs] trunk r117726: Add string collation.
From: Eli Zaretskii
Subject: bug#18051: [Emacs-diffs] trunk r117726: Add string collation.
Date: Mon, 25 Aug 2014 18:03:32 +0300

> From: Michael Albinus <michael.albinus@gmx.de>
> Date: Mon, 25 Aug 2014 08:41:03 +0200
> Cc: Paul Eggert <eggert@cs.ucla.edu>, 18051@debbugs.gnu.org
>
> > BTW, I think that collation functions with 3rd optional argument
> > to specify locale settings will be a bit more versatile, e.g.
> >
> > (string-collate-lessp a b "es_ES.UTF-8")
>
> We discuss this already, see
> <http://lists.gnu.org/archive/html/bug-gnu-emacs/2014-08/msg00623.html>
>
> My major reservation to this approach is that it doesn't fit well using
> string-collate-lessp as predicate of sort. That's why I have proposed a
> global variable as alternative, which could be let-bounded.

I think that binding a variable will indeed be cleaner. Using
process-environment for that purpose should be reserved for the
application level. Also, what if LC_COLLATE is not set in the
environment, but 'setlocale' does return some value for it? Shouldn't
we use that?
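
To illustrate the 'setlocale' question, here is a minimal sketch of how the two sources of the collation locale can be inspected separately. It is not taken from the Emacs sources; the function names (effective_collate_locale, first_set_env, collate_locale_from_environment) are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <stdlib.h>

/* Return the name of the locale currently in effect for LC_COLLATE.
   Passing NULL as the second argument only queries the setting; it
   does not change it.  The returned string belongs to the C library.  */
const char *
effective_collate_locale (void)
{
  const char *name = setlocale (LC_COLLATE, NULL);
  return (name && *name) ? name : "C";
}

/* Return the first non-empty value among the COUNT environment
   variables listed in NAMES, or NULL if none of them is set.  */
const char *
first_set_env (const char *const *names, size_t count)
{
  assert (names != NULL);
  for (size_t i = 0; i < count; i++)
    {
      const char *value = getenv (names[i]);
      if (value && *value)
        return value;
    }
  return NULL;
}

/* Return what the environment requests for collation, checking the
   variables in POSIX precedence order, or NULL if it says nothing.  */
const char *
collate_locale_from_environment (void)
{
  static const char *const names[] = { "LC_ALL", "LC_COLLATE", "LANG" };
  return first_set_env (names, sizeof names / sizeof names[0]);
}
```

The two answers can differ: the environment may be silent while effective_collate_locale still reports a locale, for instance when LC_COLLATE was set programmatically. Note also that a program which never calls setlocale (LC_ALL, "") stays in the "C" locale regardless of the environment.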

Here are a few more thoughts about related issues:

 1. Why does str_collate return a ptrdiff_t value? AFAIK, wcscoll
    etc. return an int, and the values are rather small.

 2. Should we signal an error if the input strings are neither
    pure-ASCII nor multibyte? Unibyte strings will at best cause
    incorrect results. And what about strings with invalid
    codepoints, e.g. those outside of the Unicode range, which can
    happen inside Lisp strings?

 3. What about errors in wcscoll? The current code ignores them;
    however, the value returned by wcscoll in case of an error is not
    documented, so it could be random. Should we signal an error if
    errno gets set by wcscoll?

 4. How do we control the optional features of the collating
    sequence? I mean, for example, the fact that punctuation
    characters are ignored in the .UTF-8 locales on glibc hosts (or so
    it seems). At least on Windows, a somewhat higher degree of
    control is available, but it must be specified separately from the
    locale ID. E.g., the comparison function accepts flags to ignore
    punctuation and symbols, width differences, diacritics, etc.
    Should we have another variable, perhaps w32-specific, to request
    these features? Alternatively, we could use .UTF-8 on Windows to
    communicate that, although that sounds like a kludge.

 5. The locale names on Windows are different from Posix: Windows uses
    3-letter abbreviations of the country and the language,
    e.g. "fra_FRA" instead of the Posix "fr_FR". Do we want the locale
    string values used for let-binding the above-mentioned variable to
    be portable across systems? Then we'd need some conversion
    database on MS-Windows.

 6. I think we will want a case-insensitive version of this function.
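
Regarding point 3, POSIX reserves no return value of wcscoll to indicate an error, and suggests clearing errno before the call and checking it afterwards. The following is a minimal sketch of that pattern, not the actual str_collate code; the name collate_checked and its calling convention are assumptions made for this example.

```c
#include <assert.h>
#include <errno.h>
#include <wchar.h>

/* Compare A and B with wcscoll under the current LC_COLLATE locale.
   On success, store -1, 0 or 1 in *RESULT and return 0.  If wcscoll
   set errno, leave *RESULT untouched and return that errno value.  */
int
collate_checked (const wchar_t *a, const wchar_t *b, int *result)
{
  assert (a != NULL && b != NULL && result != NULL);

  /* wcscoll cannot report failure through its return value, so
     clear errno first and inspect it after the call.  */
  errno = 0;
  int cmp = wcscoll (a, b);
  int err = errno;
  if (err != 0)
    return err;

  /* Normalize to a sign; an int is more than wide enough here.  */
  *result = (cmp > 0) - (cmp < 0);
  return 0;
}
```

A caller would treat a nonzero return as the point at which to signal an error instead of trusting the comparison result.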