Re: regexps and locales
From: Olivier Wittenberg
Subject: Re: regexps and locales
Date: Thu, 27 Jan 2005 00:22:49 +0100
User-agent: Mutt/1.4.2.1i
Hello,
What I wrote in my bug report was wrong: according to POSIX, the
pattern "*" should match any filename. I thought 'find' was right to
ignore filenames that are not valid character strings when given "*",
but actually the bug is in 'find'.
However, it does seem that the pattern "?" should match only filenames
that are exactly one character long, whereas bash currently also
matches any filename that is one byte long, even when that byte is not
a valid character in the current locale.
So my bug-report still applies with ? instead of *, unless I'm still
mistaken about POSIX (in which case I apologize for these wrong
bug-reports).
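
The ? case can be checked with a variant of the Repeat-By steps quoted
below. This is only a sketch: the helper name glob_demo and the use of
mktemp are assumptions added for illustration and are not part of the
original report. A one-character file "a" is created as well, because a
pattern that matches nothing is left as the literal word and would still
count as one argument.

```shell
# Counts its arguments (same helper as in the quoted Repeat-By steps).
nargs() { echo $#; }

# glob_demo is a hypothetical name used only for this sketch.
# It creates a scratch directory holding one ordinary one-character
# name and one name made of the single byte 0xE9 (octal 351), then
# reports how many names the pattern ? expands to.
glob_demo() {
    dir=$(mktemp -d) || return 1
    (
        cd "$dir" || exit 1
        touch a "$(printf '\351')"
        nargs ?
    )
    rm -rf "$dir"
}

glob_demo
```

Run under a UTF-8 locale (for example after LC_ALL=en_US.UTF-8 bash), a
result of 2 means ? also matched the one-byte name, which is the
behaviour described above; a result of 1 means only "a" matched.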
Best,
--Olivier
On Wed, Jan 26, 2005 at 10:21:43PM +0100, Olivier Wittenberg wrote:
> Hello,
>
> I have noticed the following behaviour in bash; it seems to me that it
> is a bug, but I'm not 100% sure.
>
> Configuration Information:
> Machine: i386
> OS: linux-gnu
> Compiler: i386-redhat-linux-gcc
> Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='i386'
> -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='i386-redhat-linux-gnu'
> -DCONF_VENDOR='redhat' -DLOCALEDIR='/usr/share/locale'
> -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I. -I./include -I./lib
> -D_FILE_OFFSET_BITS=64 -O2 -g -pipe -m32 -march=i386 -mtune=pentium4
> uname output: Linux foobar 2.6.10-1.741_FC3smp #1 SMP Thu Jan 13
> 16:53:16 EST 2005 i686 i686 i386 GNU/Linux
> Machine Type: i386-redhat-linux-gnu
>
> Bash Version: 3.0
> Patch Level: 14
> Release Status: release
>
> Description:
>
> When bash does pathname expansion, it finds that the pattern * matches
> any file, even if POSIXLY_CORRECT is set. I think the POSIX
> specification says it should only match those files whose names are
> valid sequences of characters according to the current locale. For
> instance, when using a UTF-8 locale, it should not match the file
> whose name is the output of "echo -e \\351".
>
> Note that GNU find version 4.1.20 works as expected:
> find -name \*
> does not print, e.g., the non-UTF-8 filenames when using a UTF-8
> locale.
>
> The POSIX-compliant behaviour may well be deemed braindead, but that's
> unfortunately another story...
>
>
> Repeat-By:
> LC_ALL=en_US.UTF-8 bash
> mkdir /tmp/foo ; cd /tmp/foo ; touch dummy
> nargs() { echo $#; }
> nargs *
> touch $(echo -e \\351)
> nargs *
>
> It prints 1 and then 2, it should print 1 and then 1.
>
> Best,
> --Olivier