bug-bash

Re: bug on [A-Z] and [a-z]


From: Bruno Cesar Ribas
Subject: Re: bug on [A-Z] and [a-z]
Date: Mon, 2 May 2011 10:23:14 -0300
User-agent: Mutt/1.5.21 (2010-09-15)

On Mon, May 02, 2011 at 08:41:23AM -0400, Greg Wooledge wrote:
> On Sun, May 01, 2011 at 09:17:49PM -0500, Jonathan Nieder wrote:
> > Hi,
> > 
> > ribas@inf.ufpr.br wrote:
> > 
> > >   When  running "echo [A-Z]*" , it shows all files/dirs of current
> > >     directory, not only those starting with capital letters. I tried
> > >     different locales such as: POSIX, C, en_US, pt_BR
> > >
> > > Repeat-By:
> > >     $ mkdir a && cd a
> > >     $ touch a b c; mkdir D E F
> > >     $ echo [A-Z]*
> > >     b c D E F
> > >     $ echo [a-z]*
> > >     a b c D E F
> > 
> > See http://bugs.debian.org/301717 ('fnmatch("[a-z]", ...) matches
> > capital letters in most locales') for some details.
> 
> See also http://mywiki.wooledge.org/locale

Thanks for the explanations; now I understand what is happening.
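
For reference, a minimal sketch of the behaviour those links describe. A
range such as [A-Z] is evaluated against the locale's collating sequence,
which in many locales interleaves the cases (a A b B ...), whereas the POSIX
character classes test the case of the character itself. The scratch
directory and the variable names are illustrative assumptions, not part of
the original report.

```shell
# Scratch directory so the test case does not touch real files.
dir=$(mktemp -d) && cd "$dir" || exit 1
touch a b c; mkdir D E F

# Ranges follow the collating sequence, so this result depends on the locale.
echo [A-Z]*

# Character classes test the case of the character, not its collation position.
upper=$(echo [[:upper:]]*)
lower=$(echo [[:lower:]]*)
echo "$upper"   # D E F
echo "$lower"   # a b c
```

The class form gives the same answer for these names whichever locale is
active, which is usually what "only capital letters" is meant to express.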

> 
> > I'm puzzled by your comment on trying different locales, though:
> > I tried
> > 
> >     mkdir a && cd a
> >     touch a b c; mkdir D E F
> >     echo [A-Z]*
> > 
> > and got output
> > 
> >     b c D E F
> > 
> > as expected.  Then I tried
> > 
> >     LANG=C
> >     export LANG
> >     echo [A-Z]*
> > 
> > and got output
> > 
> >     D E F
> > 
> > Does your experience differ?  I'm using 4.1.5(1)-release fwiw.
> 
> Presumably, "ribas" did not correctly set the locale variables during
> his or her testing.

Indeed, I did not export the variable; I just ran it like LANG=C echo [A-Z]*.
Exporting works.
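
A small sketch of why the one-line form has no effect on the pattern: the
shell expands [A-Z]* itself, in its own locale, before echo is started, and
the assignment prefix only reaches echo's environment. The scratch directory
and the variable name "strict" are assumptions made for illustration.

```shell
# Scratch directory so the test case does not touch real files.
dir=$(mktemp -d) && cd "$dir" || exit 1
touch a b c; mkdir D E F

# The pattern is expanded by the shell in its current locale; LANG=C is
# only placed in the environment of echo, which never sees the pattern.
LANG=C echo [A-Z]*

# Setting the locale in the shell that performs the expansion does take
# effect. LC_ALL overrides LANG and LC_COLLATE, and the command
# substitution keeps the change from leaking into the calling shell.
strict=$(LC_ALL=C; export LC_ALL; echo [A-Z]*)
echo "$strict"   # D E F
```

The first echo prints whatever the surrounding locale yields, so its output
is not fixed; only the second form is independent of the caller's settings.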

> 
> > >     No Fix yet, looking on the source code.
> 
> There's nothing to fix.  This is in the realm of a new feature request.
> 
> > In the long run, a good fix might be to teach fnmatch a new
> > FNM_STRICTCASE flag and optionally use it.
> 
> If by "strict case" you mean "force POSIX locale" or "force US-ASCII
> ordering", then the option ought to be called something less confusing.
> 
> > The hardest part would
> > seem to be making tables so the system can know what "this range,
> > using the same case" means.
> 
> It already knows this, because it's what the POSIX (C) locale does.
> 
> > A separate aspect is documentation.  I imagine Chet wouldn't mind
> > a patch to bash.1 and bash.info to explain this pitfall under
> > "Pattern Matching" or even under "BUGS" (aka LIMITATIONS).
> 
> This is not a bug, so it does not belong in BUGS.
> 
> The first place I found in the man page that makes mention of this is
> the Pathname Expansion section.  This, I agree, should be changed.
> Perhaps this would be an acceptable wording:
> 
> 
> --- doc/bash.1.orig     Mon May  2 08:31:26 2011
> +++ doc/bash.1  Mon May  2 08:35:51 2011
> @@ -3121,8 +3121,8 @@
>  If one of these characters appears, then the word is
>  regarded as a
>  .IR pattern ,
> -and replaced with an alphabetically sorted list of
> -file names matching the pattern.
> +and replaced with a list of file names matching the pattern,
> +sorted alphabetically by the current locale's collating sequence.
>  If no matching file names are found,
>  and the shell option
>  .B nullglob
> 
> 
> Under Pattern Matching, there is already an explanation of how it uses the
> LC_COLLATE variable, the current locale, etc.  It's all there.  In fact,
> since Pattern Matching is a subsection of Pathname Expansion, one could
> argue that my patch is redundant, but since the pathname expansion stuff
> appears first, someone may stop reading before encountering the more
> verbose description, so IMHO it doesn't hurt to correct the introduction.
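
The sorting that the proposed wording describes can be observed directly.
This is only a sketch, assuming a scratch directory and an illustrative
variable name "c_order"; the first result depends on whichever locale the
shell is running in.

```shell
# Scratch directory so the test case does not touch real files.
dir=$(mktemp -d) && cd "$dir" || exit 1
touch a b c; mkdir D E F

# Order produced under the shell's current locale (varies between systems).
echo *

# Order under the C locale: byte order, so all capitals precede all lowercase.
c_order=$(LC_ALL=C; export LC_ALL; echo *)
echo "$c_order"   # D E F a b c
```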

-- 
Bruno Ribas - ribas@inf.ufpr.br
http://www.inf.ufpr.br/ribas


