bug-bash

Re: nocaseglob


From: Chet Ramey
Subject: Re: nocaseglob
Date: Fri, 16 Feb 2007 13:29:30 -0500
User-agent: Thunderbird 1.5.0.9 (Macintosh/20061207)

Tim Waugh wrote:

>> strcoll indicates that, in the "en_US" locale, `h' sorts between `A' and
>> `Z'.  In the "C" locale, it does not.  This is consistent with the
>> collating sequences I posted earlier.
> 
> Here is what Ulrich Drepper has to say on the matter (see
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=217359#c5):
> 
> "[...] The strcoll result has nothing whatsoever to do with
>  the range match.  strcoll uses collation weights, ranges use
>  collation sequence values, completely different concept.

This is an academic distinction of little or no practical value.  There
is no portable interface that exposes the difference to the programmer.
The only thing available is strcoll(), which, for its part, says nothing
about any such distinction.

POSIX says there is no difference:

        A collation sequence definition shall define the relative
        order between collating elements (characters and multi-character
        collating elements) in the locale. This order is expressed in
        terms of collation values; that is, by assigning each element
        one or more collation values (also known as collation weights).

This implies to me that the collation weights are what determine the
collation sequence order.

Maybe that's why there is no POSIX interface to expose the difference.

>  Not matching 'h' (note, lowercase) is correct since if you
>  look at the locale definition you'll see that first all
>  lower characters are described and then the uppercase.  So
>  h is not in A-Z.  H (uppercase) of course is.

But that's the point:  I can't "look at the locale definition."  And
there is no library function that will allow me to do so.  I make do
with what strcoll() gives me.
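
As an illustration of the point above (a minimal sketch, not bash's actual
source), a range test built only on what strcoll() reports might look like
the following.  The helper name range_match_strcoll is hypothetical, and
the sketch handles single-byte characters only.

```c
#include <locale.h>
#include <string.h>

/* Hypothetical helper: return nonzero if C falls within the range LO-HI
   according to strcoll() under the current LC_COLLATE locale.  Each
   character is compared as a one-character string, since strcoll() is
   the only portable view of the collating order. */
int
range_match_strcoll (char lo, char c, char hi)
{
  char l[2] = { lo, '\0' };
  char m[2] = { c, '\0' };
  char h[2] = { hi, '\0' };

  return strcoll (l, m) <= 0 && strcoll (m, h) <= 0;
}
```

In the "C" locale, where strcoll() orders characters by byte value,
range_match_strcoll ('A', 'h', 'Z') is 0 and range_match_strcoll ('A', 'H', 'Z')
is 1.  After setlocale (LC_COLLATE, "en_US"), the first result depends
entirely on the locale data the platform supplies.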

>  From all I can see so far it's entirely bash's fault by not
>  implementing globbing correctly.  bash really must use the
>  fnmatch code from glibc itself."

Why would I do that?  That does nothing to enhance portability.  I
can put the old subtraction code in that ignores the locale myself,
since, as far as I can tell, that's the only portable part of the
glibc fnmatch code.  It would be a step backward to ignore the
locale information, though.
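
For comparison, a locale-ignoring range test of the kind referred to above
can be sketched as a plain comparison of character values.  This is an
assumption about its general shape, not the code from bash or glibc, and
the name range_match_codepoint is hypothetical.

```c
/* Hypothetical helper: return nonzero if C falls within LO-HI by raw
   character value, ignoring LC_COLLATE entirely.  The casts avoid
   surprises on platforms where plain char is signed. */
int
range_match_codepoint (char lo, char c, char hi)
{
  unsigned char l = (unsigned char) lo;
  unsigned char m = (unsigned char) c;
  unsigned char h = (unsigned char) hi;

  return l <= m && m <= h;
}
```

On an ASCII-based system this never matches `h' against A-Z, whatever the
locale says, which is exactly the locale information being thrown away.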

Chet
-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
                       Live Strong.  No day but today.
Chet Ramey, ITS, CWRU    chet@case.edu    http://cnswww.cns.cwru.edu/~chet/



