Performance regression for hash_insert when using bash's internal malloc
From: Eduardo A . Bustamante López
Subject: Performance regression for hash_insert when using bash's internal malloc
Date: Sat, 22 Nov 2014 16:50:24 -0600
User-agent: Mutt/1.5.23 (2014-03-12)
I noticed that bash from the devel branch was taking too much time to start
when invoked as an interactive shell. I happen to have bash-completion
installed.
See this:
| dualbus@hp:~$ for sh in /bin/bash /tmp/bash/devel-O2/bin/bash; do time "$sh" -i <<< ''; done
| dualbus@hp:~$
| dualbus@hp:~$ exit
|
| real 0m0.071s
| user 0m0.060s
| sys 0m0.004s
| dualbus@hp:~$
| dualbus@hp:~$ exit
|
| real 0m2.497s
| user 0m2.484s
| sys 0m0.012s
From 0.071s to 2.497s, that's a lot of time...
I managed to track the issue to the calls to 'complete ... NAME', which do a
hash_insert for each name (bash-completion does a lot of 'complete' calls).
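The pattern can be approximated without bash-completion installed. What follows is only a rough sketch: the dummy function `_noop`, the `cmd$i` names and the count of 3000 are all made up for illustration, and it is an assumption that this exercises the same code path as the real completion files.

```bash
#!/bin/bash
# Hypothetical reproducer: register many completion specs in a loop,
# roughly the way bash-completion does at startup.
_noop() { :; }      # hypothetical dummy completion function
n=3000              # assumed count, chosen to match the test script

for ((i = 0; i < n; i++)); do
    complete -F _noop "cmd$i"    # one completion spec per name
done

# Count the specs that were actually registered.
count=$(complete -p | grep -c -- '-F _noop')
echo "registered: $count"
```

Running it under `time` with each of the shells being compared should show whether the `complete` calls alone account for the difference.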
So, in order to try and isolate the issue, I wrote the following test script
(the relevant part is highlighted):
dualbus@hp ~ % cat test
#!/bin/bash

: <<'REQUIRES'
* moreutils (for 'ts')
* gnuplot
REQUIRES

unset tmpdir tmpimg
trap 'rm -rf "$tmpdir"' EXIT

tmpdir=$(mktemp -d)
tmpimg=$(mktemp)

shells=(
    /tmp/bash/master-O2/bin/bash
    /tmp/bash/4.3-rc2-O2/bin/bash
    /tmp/bash/devel-O2/bin/bash
    /tmp/bash/devel-nobashmalloc-O2/bin/bash
)

cd "$tmpdir" || exit 1

for i in "${!shells[@]}"; do
    sh=${shells[i]}
    for _ in {1..20}; do
        "$sh" -c '
            # sh -c code
            # THIS IS THE INTERESTING PART
            declare -A a;
            for ((i = 0; i < 3000; i++)); do
                a["$i"]=.; echo "$i";
            done
            # END INTERESTING PART
        ' | ts -s '%.s'
    done | awk '
        # we have: $timestamp $i
        # we want: $delta $i
        BEGIN { p_ts = 0 }
        {
            if(p_ts > $1) {
                p_ts = 0
            }
            if(! N[$2]) {
                S[l++] = $2
            }
            X[$2] += ($1 - p_ts)
            N[$2]++
            p_ts = $1
        }
        END {
            for(i = 0; i < l; i++) {
                k = S[i]
                printf "%f\t%s\n", X[k]/N[k], k
            }
        }
    ' > "sh-$i.dat"
done

gnuplot /dev/stdin <<'PLOT' > "$tmpimg"
reset
set terminal png notransparent size 1200,960
set xtics 512
set xlabel "item"
set ylabel "time (ms)"
set title "hash_insert performance"
set style data linespoints
plot "sh-0.dat" using 2:($1*1000) title "master", \
     "sh-1.dat" using 2:($1*1000) title "4.3", \
     "sh-2.dat" using 2:($1*1000) title "devel", \
     "sh-3.dat" using 2:($1*1000) title "devel no-bash-malloc"
PLOT

echo "image: $tmpimg"
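To make the awk stage easier to follow: it turns the cumulative timestamps from `ts -s` into per-item deltas, resets whenever a timestamp goes backwards (a new run has started), and averages each item over all runs. Here is a small sketch of the same logic on made-up input (two runs of two items); the sample numbers are invented for illustration, and `LC_ALL=C` is my addition to keep the decimal separator predictable.

```bash
#!/bin/bash
# Made-up sample: two runs, two items each, as "<cumulative seconds> <item>".
sample='0.001 0
0.003 1
0.002 0
0.006 1'

out=$(printf '%s\n' "$sample" | LC_ALL=C awk '
    BEGIN { p_ts = 0 }
    {
        if (p_ts > $1) p_ts = 0          # timestamp went backwards: a new run started
        if (!N[$2]) S[l++] = $2          # remember first-seen order of items
        X[$2] += $1 - p_ts               # time since the previous line
        N[$2]++
        p_ts = $1
    }
    END {
        for (i = 0; i < l; i++) printf "%f\t%s\n", X[S[i]] / N[S[i]], S[i]
    }
')
printf '%s\n' "$out"
```

With this input, item 0 averages (0.001 + 0.002) / 2 = 0.0015 s and item 1 averages (0.002 + 0.004) / 2 = 0.003 s, so each output line is the mean time spent reaching that item.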
I compiled the shells like this:
/tmp/bash/master-O2/bin/bash:
CFLAGS='-O2' ./configure --prefix=/tmp/bash/master-O2 && make install
/tmp/bash/4.3-rc2-O2/bin/bash:
CFLAGS='-O2' ./configure --prefix=/tmp/bash/4.3-rc2-O2 && make install
/tmp/bash/devel-O2/bin/bash:
CFLAGS='-O2' ./configure --prefix=/tmp/bash/devel-O2 && make install
/tmp/bash/devel-nobashmalloc-O2/bin/bash:
CFLAGS='-O2' ./configure --with-bash-malloc=no --prefix=/tmp/bash/devel-nobashmalloc-O2 && make install
I'm attaching the graph produced by the script. For those reading this
through the archive, here is a link to it:
http://i.imgur.com/ir8zJ6s.png
As you can see, the 4.3 and devel builds that use bash's internal malloc both
show a weird spike around item 2560.
Attachment: graph.png (PNG image)