Re: using mapfile is extreamly slow compared to oldfashinod ways to read
From: Greg Wooledge
Subject: Re: using mapfile is extreamly slow compared to oldfashinod ways to read files
Date: Thu, 26 Mar 2009 08:25:50 -0400
User-agent: Mutt/1.4.2.2i
On Thu, Mar 26, 2009 at 08:53:50AM +0100, Lennart Schultz wrote:
> I have a bash script which reads about 250000 lines of xml code generating
...
> mapfile < file
> for i in "${MAPFILE[@]}"
> do
> line=$(echo $i) # strip leading blanks
> case "$line" in
> ....
> done
>
> With this change the job now takes more than 48 minutes. :(
Oh... new builtin. New to me anyway.
A quarter of a million subshells (the $(echo) part) are probably the
reason for the slowness, not the array traversal (unless holding that
much data in memory is causing your system to thrash).
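A rough way to check this on your own machine is to time the two approaches side by side. The sketch below is only an illustration: the sample data and the function names strip_with_subshell and strip_with_expansion are made up for this example, and the timings will vary with the system and bash version.

```shell
#!/usr/bin/env bash
# Sketch: time a per-element command substitution against a pure-bash expansion.
shopt -s extglob

# Hypothetical sample data: 1000 lines with leading blanks.
lines=()
for ((n = 0; n < 1000; n++)); do
  lines+=("    <item id=\"$n\"/>")
done

# Forks a subshell for every element, as in the quoted script.
strip_with_subshell() {
  local i
  for i in "${lines[@]}"; do
    last_a=$(echo $i)
  done
}

# Strips leading whitespace inside the current shell, no fork.
strip_with_expansion() {
  local i
  for i in "${lines[@]}"; do
    last_b=${i##+([[:space:]])}
  done
}

# The time keyword reports to stderr; compare the two "real" figures.
time strip_with_subshell
time strip_with_expansion
```

Both functions leave the same stripped text in last_a and last_b for this data; only the cost per iteration differs.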
> It may be that I am new to mapfiles, and there are more efficient ways to
> traverse a mapfile array, but if this the case please document it.
for element in "${array[@]}"
for index in ${!array[*]}
are probably about the same. I haven't actually benchmarked them.
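For reference, here is a small sketch of both traversal forms on a made-up three-element array. The names by_value and by_index are hypothetical, used only to collect the results.

```shell
#!/usr/bin/env bash
# Sketch: two ways to walk an indexed array.
array=("first" "second" "third")

# Iterate over the values directly.
by_value=()
for element in "${array[@]}"; do
  by_value+=("$element")
done

# Iterate over the indices, then look each value up.
# Unquoted ${!array[*]} relies on the default IFS to split the index list.
by_index=()
for index in ${!array[*]}; do
  by_index+=("$index:${array[index]}")
done

printf '%s\n' "${by_value[@]}" "${by_index[@]}"
```

The index form is the one to reach for when the position matters, for example to look at the neighbouring element.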
> please introduce an option to strip leading blanks so mapfile acts like
> readline so constructions like:
> line=$(echo $i) # strip leading blanks
> above can be avoid.
Huh... most people go out of their way to get the opposite behavior
when using read. Typically, we have to throw in IFS= and -r just
to get read to act the way you *don't* want. Ironic.
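To make the difference concrete, here is a minimal sketch of the three common forms of read applied to the same input. The sample string and the variable names plain, stripped and preserved are invented for the illustration.

```shell
#!/usr/bin/env bash
# Sketch: how IFS= and -r change what read stores.
# The input has leading and trailing blanks and a literal backslash.
input=$'   indented \\t text   '

# Default: surrounding whitespace is trimmed and the backslash is consumed.
read plain <<< "$input"

# -r: the backslash is kept, but surrounding whitespace is still trimmed.
read -r stripped <<< "$input"

# IFS= and -r: the line is stored exactly as it was read.
IFS= read -r preserved <<< "$input"

printf '[%s]\n' "$plain" "$stripped" "$preserved"
```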
If you want to strip leading blanks without a subshell, you can do it
this way:
shopt -s extglob
line=${i##+([[:space:]])}
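A short sketch of that expansion on a made-up value follows. Note that it is not an exact replacement for $(echo $i): it removes leading whitespace only, so trailing blanks and runs of internal whitespace are left alone.

```shell
#!/usr/bin/env bash
# Sketch: strip leading whitespace without forking a subshell.
shopt -s extglob

# Hypothetical input with leading spaces and a tab, plus trailing spaces.
i=$'  \t <tag attr="x">  '

# +([[:space:]]) matches one or more whitespace characters;
# ## removes the longest such match from the front of the value.
line=${i##+([[:space:]])}

printf '[%s]\n' "$line"
```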
However, given the way you're stating your requirements, it seems you'd
actually prefer just using read:
unset array i
while read -r line; do
  array[i++]=$line
done < file
This will avoid the need to strip leading blanks yourself (read will
do that), and also doesn't use any subshells.
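Here is a self-contained sketch of that loop, run against a temporary file created just for the demonstration (the file contents and the tmpfile name are assumptions, not part of the original script). With the default IFS, read trims trailing whitespace as well as leading.

```shell
#!/usr/bin/env bash
# Sketch: fill an array with read, letting read trim the blanks.
tmpfile=$(mktemp)
printf '%s\n' '   <a>' $'\t<b attr="1">  ' '<c>' > "$tmpfile"

unset array i
while read -r line; do
  # i is unset at first, so the arithmetic subscript starts at 0.
  array[i++]=$line
done < "$tmpfile"
rm -f "$tmpfile"

printf '[%s]\n' "${array[@]}"
```

One thing to watch: if the last line of the file has no terminating newline, read returns non-zero for it and this loop skips that line.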