Chris,
I agree that one should use the right tool at the right time, and mapfile
does not seem to be the right tool for my problem. Still, here are some
facts from my observations.
Using a fast tool like egrep just to find a simple string in my data file
gives the following times:
time egrep '<pro' >/dev/null < dr.xml
real 0m54.628s
user 0m27.310s
sys 0m0.036s
My original bash script:
time xml2e2-loadepg
real 1m53.264s
user 1m22.145s
sys 0m30.674s
Since the question seems to have turned to spawning subshells and their
cost, I have checked my script. It calls only one external command, date,
which is called a little less than 20000 times in total. Just for this
test, I changed the call to date into an assignment of a constant, and now
it looks like this:
time xml2e2-loadepg
real 1m3.826s
user 1m2.700s
sys 0m1.004s
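To make the cost of those date calls visible in isolation, here is a minimal sketch. It is not taken from xml2e2-loadepg: the function names, the variable stamp and the constant 1300000000 are hypothetical stand-ins, and the loop count defaults to 2000 so that it finishes quickly (pass 20000 to match the figure above).

```shell
#!/usr/bin/env bash
# Sketch only: names and the constant are illustrative assumptions,
# not code from the real script.
n=${1:-2000}            # use 20000 to match the count mentioned above

with_date() {           # one fork/exec of date per iteration
    local i
    for ((i = 0; i < n; i++)); do
        stamp=$(date +%s)
    done
}

with_constant() {       # plain assignment, no process is created
    local i
    for ((i = 0; i < n; i++)); do
        stamp=1300000000
    done
}

TIMEFORMAT='%R real  %U user  %S sys'
echo "date called $n times:";       time with_date
echo "constant assigned $n times:"; time with_constant
```

If a real timestamp is needed rather than a constant, bash 4.2 and later can format times with the printf builtin ('%(fmt)T'), which also avoids the external process. Whether that fits this script is an assumption on my part.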
I also made the same change to the version of the program that uses
mapfile, and changed line=$(echo $i) to
line=${i##+([[:space:]])}
so the main loop runs without spawning any subshells at all:
time xml2e2-loadepg.new
real 65m2.378s
user 63m16.717s
sys 0m1.124s
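For reference, the two forms of the line assignment can be compared side by side. This is a sketch with a made-up input string. It assumes extglob is enabled, which the +( ... ) pattern requires. The third variant is a common idiom that needs no extglob; I am not claiming it explains or fixes the timing above.

```shell
#!/usr/bin/env bash
shopt -s extglob        # +( ... ) is only recognised with extglob enabled

# Hypothetical input line with leading, inner and trailing whitespace.
i=$'  \t<title>Some  show</title>  '

# Original form: spawns a subshell; the unquoted $i is word-split, so
# inner runs of whitespace collapse and trailing whitespace is dropped.
line_echo=$(echo $i)

# Replacement from this mail: strips leading whitespace only, no subshell.
line_ext=${i##+([[:space:]])}

# Alternative without extglob: the inner expansion yields the leading
# whitespace run, and the outer one removes exactly that prefix.
line_std=${i#"${i%%[![:space:]]*}"}

printf '[%s]\n' "$line_echo" "$line_ext" "$line_std"
```

Note that the two approaches are not strictly equivalent: the echo form also normalises inner and trailing whitespace, while both parameter expansions leave everything after the first non-blank character untouched.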