Grepping an array in bash, odd/buggy behavior when using shopt and '!'

From: Davey E
Subject: Grepping an array in bash, odd/buggy behavior when using shopt and '!'
Date: Tue, 2 Feb 2010 19:29:25 -0800 (PST)

Perhaps I'm missing something obvious here, but I'm trying to figure out an
efficient way to search a large array in bash (I'm using bash 4.1 because I
needed to use associative arrays).

Given an array that contains alphanumeric strings, I want to create a
smaller array of those elements that match a simple regular expression. 
That is, I want to match on either a single name "abc123" or a regular
expression "abc*".  Obviously I can use a loop, but that's a bit slow and I
was hoping there might be a slick way to do this which is faster.  Since
there are so many variable operations which can also operate on an array it
seems like it should be easy, but perhaps isn't.

For instance, if I want to select everything BUT my regular expression, no
problem, I could use search and replace to search for and replace what I'm
looking for with the null string, leaving everything else:  
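As a minimal sketch of that kind of substitution, assuming a hypothetical
array named items with made-up sample values (neither the name nor the
values are taken from the post):

```bash
#!/usr/bin/env bash
# Hypothetical sample data; the array name and contents are assumptions.
items=(ABC123 XYZ234 ABZ555 ABC777 AXXXXX)

# ${items[@]/#AB*/} replaces, in each element, the longest leading match of
# the glob AB* with nothing. AB* can match a whole element, so every element
# that starts with "AB" becomes the empty string.
# The expansion is deliberately left unquoted so that word splitting drops
# the empty results. This assumes the elements contain no whitespace or
# glob characters.
not_ab=( ${items[@]/#AB*/} )

printf '%s\n' "${not_ab[@]}"   # prints XYZ234 and AXXXXX, one per line
```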

But that's the exact opposite of what I want, and I was wishing there was a
way to negate that expression.  So I'm searching around and find extglob,
which allows the use of a ! operator that will do exactly what I want!  So I
try address@hidden/#!(RE*)/} but it does not work as I would expect.

Seems to me like it should work, as this appears to be the negation of the
first example which gives me everything I don't want, so I should get
everything I do want.  But it behaves oddly, and either I've uncovered a
bash bug or I'm misunderstanding what is going on.  When I use the '!'
operator for file globbing it works as expected, but it behaves differently
when I use it here for variable substitution:


echo 'address@hidden/#A*/}' results in:  address@hidden/#A*/}
echo 'address@hidden/#AB*/}' results in:  address@hidden/#AB*/}
echo 'address@hidden/#XYZ*/}' results in:  address@hidden/#XYZ*/}
echo 'address@hidden/#ABZ555/}' results in:  address@hidden/#ABZ555/}

shopt -s extglob
echo 'address@hidden/#!(A*)/}' results in:  address@hidden/#!(A*)/}
echo 'address@hidden/#!(AB*)/}' results in:  address@hidden/#!(AB*)/}
echo 'address@hidden/#!(XYZ*)/}' results in:  address@hidden/#!(XYZ*)/}
echo 'address@hidden/#!(ABZ555)/}' results in:  address@hidden/#!(ABZ555)/}

When run, I see:

address@hidden/#A*/} results in: XYZ234
address@hidden/#AB*/} results in: XYZ234 AXXXXX
address@hidden/#XYZ*/} results in: ABC123 ABZ555 ABC777 AXXXXX
address@hidden/#ABZ555/} results in: ABC123 XYZ234 ABC777 AXXXXX
address@hidden/#!(A*)/} results in: ABC123 ABZ555 ABC777 AXXXXX
address@hidden/#!(AB*)/} results in: BC123 BZ555 BC777
address@hidden/#!(XYZ*)/} results in: Z234
address@hidden/#!(ABZ555)/} results in: 5
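
For anyone who wants to reproduce this, here is a self-contained sketch of
the same test.  The array name items is hypothetical, and the contents and
their order are only inferred from the results listed above, so treat both
as assumptions:

```bash
#!/usr/bin/env bash
shopt -s extglob   # enable !(...) patterns before they are used

# Assumed contents, inferred from the results in the post.
items=(ABC123 XYZ234 ABZ555 ABC777 AXXXXX)

# Unquoted on purpose: elements that become empty are dropped by word
# splitting (safe here because no element contains spaces or glob chars).
plain_a=(   ${items[@]/#A*/} )
neg_a=(     ${items[@]/#!(A*)/} )
neg_ab=(    ${items[@]/#!(AB*)/} )
neg_xyz=(   ${items[@]/#!(XYZ*)/} )
neg_exact=( ${items[@]/#!(ABZ555)/} )

echo "A*         -> ${plain_a[*]}"     # post reports: XYZ234
echo "!(A*)      -> ${neg_a[*]}"       # post reports: ABC123 ABZ555 ABC777 AXXXXX
echo "!(AB*)     -> ${neg_ab[*]}"      # post reports: BC123 BZ555 BC777
echo "!(XYZ*)    -> ${neg_xyz[*]}"     # post reports: Z234
echo "!(ABZ555)  -> ${neg_exact[*]}"   # post reports: 5
```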

It is actually finding the array member(s) it should, and works perfectly for
A*.  But AB* removes the initial 'A' and ABC* removes the initial 'AB'. 
WTF?  It doesn't do that when used for filename globbing.

It doesn't even work as intended when used without a wildcard, directly
searching for the negation of the string 'ABZ555'.  Again, it correctly
narrows the array down to that one member, but also unhelpfully strips the
first five characters off.  For some reason, each additional character in my
regular expression strips one more character off the otherwise correct
result!
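
One reading that is consistent with the results above: a filename glob has
to match the whole name, whereas the /# form of substitution deletes the
longest leading substring that matches the pattern.  For 'ABZ555' the
longest prefix that is not itself 'ABZ555' is 'ABZ55', which would leave
just '5'.  The sketch below illustrates that reading on a single string;
the variable names are hypothetical:

```bash
#!/usr/bin/env bash
shopt -s extglob

s=ABZ555   # one element from the results above

# Whole-string test, as with filename globbing or [[ ... == pattern ]]:
# the entire string would have to differ from ABZ555, so this is false.
if [[ $s == !(ABZ555) ]]; then whole=yes; else whole=no; fi

# Prefix substitution: /# deletes the longest leading substring that
# matches !(ABZ555). "ABZ55" is not "ABZ555", so it qualifies.
rest=${s/#!(ABZ555)/}    # what survives the substitution
removed=${s%"$rest"}     # the prefix that was matched and deleted

echo "whole-string match: $whole"                   # no
echo "removed prefix: $removed, remainder: $rest"   # ABZ55 and 5
```

Under the same reading, !(A*) leaves 'ABC123' untouched because the only
leading substring that fails to match A* is the empty one.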

Can anyone explain to me what is going on here, and what I can do to fix
this?  Or perhaps better yet, if there is some other way to accomplish this,
using a single variable substitution to search the entire array, so that the
result is a new array that contains only those members that match my regular
expression?  What I'm trying to do here is quite a bit faster than a loop
for the large arrays I'm dealing with, but maybe that's because it is easy
to be fast when producing the wrong answer :-)
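
For comparison, here is a sketch of the plain loop mentioned earlier, which
can serve as a correct baseline to check any faster approach against.  The
function name filter_array, the result array matches, and the sample data
are all hypothetical:

```bash
#!/usr/bin/env bash

# filter_array PATTERN ELEMENT...   (hypothetical helper)
# Sets the global array "matches" to the elements that match the glob
# PATTERN in full.
filter_array() {
    local pattern=$1 item
    shift
    matches=()
    for item in "$@"; do
        # The right-hand side is left unquoted so it is treated as a glob
        # that must match the whole string, not just a prefix.
        [[ $item == $pattern ]] && matches+=("$item")
    done
    return 0
}

# Assumed sample data.
items=(ABC123 XYZ234 ABZ555 ABC777 AXXXXX)
filter_array 'AB*' "${items[@]}"
printf '%s\n' "${matches[@]}"   # prints ABC123, ABZ555 and ABC777
```

This is not the single substitution asked about, only a reference result
to compare against.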

If this is a bug, is the above script a workable test case?  I'll be happy
to test patches, since I had to build from source to get version 4.x anyway
so adding a few patches isn't much extra work, if it means I might be able
to get this working properly.

Sent from the Gnu - Bash mailing list archive at Nabble.com.