From: Chet Ramey
Subject: Re: using mapfile is extremely slow compared to old-fashioned ways to read files
Date: Thu, 26 Mar 2009 16:52:22 -0400
User-agent: Thunderbird 2.0.0.21 (Macintosh/20090302)

Lennart Schultz wrote:
> Bash Version: 4.0
> Patch Level: 10
> Release Status: release
>
> Description:
>
> I have a bash script which reads about 250000 lines of XML, generating
> about 850 files with information extracted from the XML file.
> It uses the construct:
>
> while read line
> do
>     case "$line" in
>     ....
>     esac
> done < file
>
> and this takes a little less than 2 minutes
>
> Trying to use mapfile I changed the above construct to:
>
> mapfile < file
> for i in "${MAPFILE[@]}"
> do
>     line=$(echo $i) # strip leading blanks
>     case "$line" in
>     ....
>     esac
> done
>
> With this change the job now takes more than 48 minutes. :(
The most important thing is using the right tool for the job. If you
have to introduce a command substitution for each line read with mapfile,
you probably don't have the problem mapfile is intended to solve:
quickly reading exact copies of lines from a file descriptor into an
array.
If another approach works better, you should use it.
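As a sketch of that point, and assuming the $(echo $i) above exists only
to strip leading blanks (as its comment says), a parameter expansion does
the same trimming without forking a subshell for every line. The case
pattern below is a hypothetical stand-in for the ones elided from the
original script:

    #!/bin/bash
    # mapfile loop without the per-line command substitution.
    # -t trims the trailing newline that plain mapfile keeps on each line.
    mapfile -t < file
    for i in "${MAPFILE[@]}"
    do
        # Remove the longest all-whitespace prefix of $i; no fork needed.
        line=${i#"${i%%[![:space:]]*}"}
        case "$line" in
        "<"*) : ;;    # hypothetical pattern standing in for the real ones
        esac
    done

Note that the original while loop never needed this step at all: read
with the default IFS already trims leading and trailing whitespace from
the line it assigns.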
If you're interested in why the mapfile solution is slower, you could
run the loop using a version of bash built for profiling and check
where the time goes. I believe you'd find that the command substitution
is responsible for much of it, and the rest is due to the significant
increase in memory usage resulting from the 250000-line array (which
also slows down fork and process creation).
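Short of building a profiling bash, a rough way to see the command
substitution's share of the cost is to time a fork-per-iteration loop
against a pure parameter-expansion loop; the iteration count and sample
string here are arbitrary:

    #!/bin/bash
    i='    some line with leading blanks'

    # One subshell fork per iteration.
    time for ((n = 0; n < 10000; n++)); do
        line=$(echo $i)
    done

    # Same trimming via parameter expansion; no forks.
    time for ((n = 0; n < 10000; n++)); do
        line=${i#"${i%%[![:space:]]*}"}
    done

On a typical system the forking loop is slower by orders of magnitude,
and this small test still understates the reporter's case, where each
fork also carries the memory of the 250000-line array.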
Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
Chet Ramey, ITS, CWRU chet@case.edu http://cnswww.cns.cwru.edu/~chet/