bug-bash
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Only store revealed pids in bgpids data structure


From: John Fremlin
Subject: Only store revealed pids in bgpids data structure
Date: Tue, 14 Apr 2015 04:54:25 +0000
User-agent: Microsoft-MacOutlook/14.4.8.150116

Bash instances running in loops get slower over time, as the bgpids data structure grows. Here is a small patch to alleviate one issue :)

The jobs.c:bgpids data structure is used as a cache for the wait syscall, to store the status of dead processes, so that scripts can wait on pids even multiple times (a bash extension not in POSIX that only allows it once).

Some pids cannot ever be waited for in this way because they run in the foreground and there is no way to see their process id.

This patch chooses to only insert the dead pids that could have been known about into the bgpids structure. This has a *huge* performance implication for long running bash processes that naturally spawn many sub-shells over their life, and can gradually slow down.

First set ulimit -u 30000 (this is used by bash to determine the size of bgpids)

With the patch:

john@dev:~/Programs/bash$ (time -p ./bash -c 'j=30000; for i in $(seq 1 $j); do (:); (:) ; (:); done; wait')
real 32.31
user 1.21
sys 10.37

Without the patch:

john@dev:~/Programs/bash$ (time -p ./bash-unpatched -c 'j=30000; for i in $(seq 1 $j); do (:); (:) ; (:); done; wait')
real 104.03
user 44.16
sys 48.57

Maybe the number j=30000 will need to be modified according to your system’s capabilities.


The number of pids in bgpids is determined normally by the number of user processes or ulimit -u, which can be very high on modern systems (e.g. >100k). As this is the number of allowed *living* processes and bgpids is just about *dead* processes, maybe this limit should be revisited - actually to increase it (perhaps at least 1024 + number of living processes — this would accommodate those systems that set low limits on living processes to constrain resource usage but allow people to obtain the exit status of their processes termination). 

I ran make check and it has 0 exit status :) - not sure if this is the right way of guessing whether a pid was revealed, would love feedback! I have a bigger patch I changing bgpids to a hash table etc. that makes general workloads much faster

Attachment: bash-store-revealed-pids-only.patch
Description: bash-store-revealed-pids-only.patch


reply via email to

[Prev in Thread] Current Thread [Next in Thread]