bug-bash

Async processes started in functions not reliably started


From: Steffen Nurpmeso
Subject: Async processes started in functions not reliably started
Date: Sun, 04 Aug 2019 00:40:08 +0200
User-agent: s-nail v14.9.14-3-g96dc286e

Hello.

For the MUA I maintain I have recently implemented parallel tests, and
now want to add a reaper process that automatically kills tests
that take longer than X seconds.  That turns out to be more
complicated than I thought: it works just fine in mksh, does not
work at all in dash (which, it seems, also cannot access variables
from within traps), and in bash it works 100% reliably only when
the code is placed inline, not when it is wrapped in a function
jobreaper_start().
This is on CRUX Linux 3.5 (GNU C based), self-compiled (standard
/etc/pkgmk.conf flags -O2 -march=x86-64 -pipe), bash 5.0.7.
Consider:

  echo shell is $SHELL/$0
   (
      int= hot=
      echo 'Starting job reaper'
      trap 'int=1 hot=1' HUP
      trap 'int=1 hot=' INT
      trap 'echo "Stopping job reaper"; exit 0' TERM
      trap '' EXIT

      while [ 1 ]; do
         int=
         sleep ${JOBWAIT} &
         wait
         if [ -z "${int}" ] && [ -n "${hot}" ]; then
            i=0 l=
            while [ ${i} -lt ${MAXJOBS} ]; do
               i=`add ${i} 1`

               if [ -f t.${i}.pid ] && read p < t.${i}.pid; then
                  kill -KILL ${p}
                  ${rm} -f t.${i}.result
                  l="${l} ${i}"
               fi
            done
            [ -n "${l}" ] &&
               printf '%s!! Reaped job(s)%s after %s seconds%s\n' \
                  "${COLOR_ERR_ON}" "${l}" ${JOBWAIT} "${COLOR_ERR_OFF}"
         fi
      done
   ) </dev/null & #>/dev/null 2>&1 &
   JOBREAPER=$!

This works one hundred percent reliably if placed inline with the
normal code, but if I place it in a function jobreaper_start(),
I would say my success rate is about 60 percent:

  Spawing up to 4 tests in parallel
  shell is /bin/bash//home/steffen/src/nail.git/mx-test.sh
  Starting job reaper
  ... [1=vpospar] [2=atxplode] .. wait(1)
  /home/steffen/src/nail.git/mx-test.sh: line 401:  2487 Killed                 
 ( if ${mkdir} t.${JOBS}.d; then
      cd t.${JOBS}.d; eval t_${1} ${JOBS} ${1};
  fi; ${rm} -f ../t.${JOBS}.pid ) > t.${JOBS}.io 2>&1 < /dev/null
  [vpospar]
  1:ok ifs:ok
  !! Reaped job(s) 2 after 3 seconds
  ..
versus a failing run, in which the reaper never prints its start message:
  Spawing up to 4 tests in parallel
  shell is /bin/bash//home/steffen/src/nail.git/mx-test.sh
  ... [1=vpospar] [2=atxplode] .. wait(1)
  /home/steffen/src/nail.git/mx-test.sh: line 405: kill: (2560) - No such process
  [vpospar]
  1:ok ifs:ok
  [atxplode]
  1:ok
  /home/steffen/src/nail.git/mx-test.sh: line 364: kill: (2560) - No such process
  ...
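
For reference, the function-wrapped variant could be sketched roughly as
below.  The real jobreaper_start() and jobreaper_stop() are not shown in
this report, so both bodies are assumptions, and the reaping loop is
reduced to a comment.

```shell
#!/bin/sh
# Assumed shape of the wrappers; the original function bodies are not
# part of the report, only their names are.
JOBWAIT=${JOBWAIT:-3}

jobreaper_start() {
   (
      int= hot=
      trap 'int=1 hot=1' HUP
      trap 'int=1 hot=' INT
      trap 'exit 0' TERM

      while :; do
         int=
         sleep ${JOBWAIT} &
         wait
         # (reaping of overdue jobs elided in this sketch)
      done
   ) </dev/null &
   JOBREAPER=$!
}

jobreaper_stop() {
   if [ -n "${JOBREAPER}" ]; then
      # Ask the reaper to exit, then collect it so no job entry lingers.
      kill -TERM ${JOBREAPER} 2>/dev/null
      wait ${JOBREAPER} 2>/dev/null
      JOBREAPER=
   fi
}
```

JOBREAPER is deliberately set inside the function but left global, so
that jobreaper_stop can find the process id later.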

The code is driven via

  else
     if have_feat debug; then
        if have_feat devel; then
           DUMPERR=y
           ARGS="${ARGS} -Smemdebug"
        fi
     elif have_feat devel; then
        LOOPS_MAX=${LOOPS_BIG}
     fi
     color_init

     if [ -z "${RUN_TEST}" ] || [ ${#} -eq 0 ]; then
        jobs_max
        printf 'Spawing up to %s tests in parallel\n' ${MAXJOBS}
        jobreaper_start

(If the above code is inserted here inline instead, all seems ok.)

        t_all
     else
  ...
     fi

     jobreaper_stop

Inserting the code inline is a bit painful because I need it twice.
I hope I am not missing something, though I most likely am, since I
would have thought that the builtin kill would recognize that I am
actually killing something that came in via $! (though it seems it
never started up).
What I find interesting is that dash shows exactly the same errors
(but, in practice, always).
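
One thing that might be worth ruling out, purely as a guess that nothing
in this report confirms, is a startup race: if the parent signals the
reaper before the subshell has installed its traps, HUP still has its
default action and terminates it.  A minimal sketch of a handshake that
closes that window follows; reaper_start_synced and the t.reaper.ready
marker file are hypothetical names, not part of the test script.

```shell
#!/bin/sh
# Sketch only: hypothetical helper that returns once the child has
# installed its traps, so later signals cannot hit the default action.
reaper_start_synced() {
   rm -f t.reaper.ready
   (
      hot=
      trap 'hot=1' HUP
      trap 'exit 0' TERM
      : > t.reaper.ready      # traps are in place; tell the parent

      while :; do
         sleep 1 &
         wait
      done
   ) </dev/null &
   JOBREAPER=$!

   # Poll for the marker, giving up after roughly ten seconds.
   n=0
   while [ ! -f t.reaper.ready ]; do
      n=$((n + 1))
      [ ${n} -le 10 ] || return 1
      sleep 1
   done
   rm -f t.reaper.ready
}
```

If the failures disappear with such a handshake, the function wrapper
would only have changed the timing rather than the behaviour of the
shell; if they persist, the race can be excluded.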

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)


