bug-bash

Re: nocaseglob


From: Bruce Korb
Subject: Re: nocaseglob
Date: Fri, 16 Feb 2007 11:14:30 -0800

On 2/16/07, Chet Ramey <chet.ramey@case.edu> wrote:
Tim Waugh wrote:
>> strcoll indicates that, in the "en_US" locale, `h' sorts between `A' and
>> `Z'.  In the "C" locale, it does not.  This is consistent with the
>> collating sequences I posted earlier.
>
> Here is what Ulrich Drepper has to say on the matter (see
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=217359#c5):
>
> "[...] The strcoll result has nothing whatsoever to do with
>  the range match.  strcoll uses collation weights, ranges use
>  collation sequence values, completely different concept.

This is an academic distinction of little or no practical value.  There
is no portable interface that exposes the difference to the programmer.

except ``ls [a-z]*''

In the absence of any locale specification (via LANG and LC_* environment
variables), it is incomprehensible to claim that it is reasonable to include
upper case letters in the range of matched characters using a range of
lower case letters.  Your quoted paragraph:

        A collation sequence definition shall define the relative
        order between collating elements (characters and multi-character
        collating elements) in the locale. This order is expressed in
        terms of collation values; that is, by assigning each element
        one or more collation values (also known as collation weights).

This implies to me that the collation weights are what determine the
collation sequence order.

Two points:

1.  A collation sequence is not the same as a character range.  [a-z] is a range.
2.  There is no locale specified, so it must be the default.  The default is C,
   not en_US or whatever.
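
To make the two points concrete, here is a rough sketch in C of the two ways a
range like [a-z] could be tested.  The function names are made up for
illustration and this is not bash's actual globbing code.  A C program starts
in the "C" locale unless it calls setlocale(), and there both tests should
agree that `H' is outside a-z; per the strcoll results reported earlier in
this thread, the collation-based test may answer differently once a locale
such as en_US is selected.

```c
#include <assert.h>
#include <string.h>

/* Range test by raw character code: the "subtraction" style of check,
   which ignores the locale entirely. */
static int in_range_by_code(unsigned char c, unsigned char lo, unsigned char hi)
{
    assert(lo <= hi);   /* caller must pass a well-formed range */
    return c >= lo && c <= hi;
}

/* Range test by collation order: compare one-character strings with
   strcoll(), so the answer depends on the current LC_COLLATE setting. */
static int in_range_by_collation(char c, char lo, char hi)
{
    char cs[2]  = { c,  '\0' };
    char los[2] = { lo, '\0' };
    char his[2] = { hi, '\0' };

    return strcoll(los, cs) <= 0 && strcoll(cs, his) <= 0;
}
```

Calling in_range_by_collation('H', 'a', 'z') before and after
setlocale(LC_COLLATE, "") is one way to see whether the environment's locale
changes the answer on a given system.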

But that's the point:  I can't "look at the locale definition."  And
there is no library function that will allow me to do so.  I make do
with what strcoll() gives me.

bash should be entirely consistent with fnmatch.  The best way
to do that is to use fnmatch.  Then, if there is a problem, we jump
on Ulrich instead of you.  :)
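
For reference, delegating the match to fnmatch could look roughly like the
sketch below.  The wrapper name glob_matches is hypothetical; fnmatch() itself
is the POSIX interface from <fnmatch.h> and returns 0 on a match.  This only
illustrates the call and is not a proposal for how bash should wire it in.

```c
#include <assert.h>
#include <stddef.h>
#include <fnmatch.h>

/* Return 1 if name matches the glob pattern, 0 otherwise.
   An fnmatch() error is treated the same as "no match". */
static int glob_matches(const char *pattern, const char *name)
{
    assert(pattern != NULL && name != NULL);
    return fnmatch(pattern, name, 0) == 0;
}
```

In the default "C" locale, glob_matches("[a-z]*", "hello") should succeed and
glob_matches("[a-z]*", "Hello") should not; whatever fnmatch does in other
locales is then what the shell would do too.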

>  From all I can see so far it's entirely bash's fault by not
>  implementing globbing correctly.  bash really must use the
>  fnmatch code from glibc itself."

Why would I do that?  That does nothing to enhance portability.

Yes, it does.  You can use the fnmatch module from gnulib to
backfill where necessary.

I can put the old subtraction code in that ignores the locale myself,
since, as far as I can tell, that's the only portable part of the
glibc fnmatch code.  It would be a step backward to ignore the
locale information, though.

It is a step forward to ignore locale information when there is no
locale information, though.  :-)

Thanks - Bruce



