[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[PATCH] Fix foreground dead jobs in trap handlers reported like backgrou
From: |
Koichi Murase |
Subject: |
[PATCH] Fix foreground dead jobs in trap handlers reported like background ones in `jobs' |
Date: |
Mon, 18 Jul 2022 12:52:31 +0900 |
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -g -O2 -Wno-parentheses -Wno-format-security
uname output: Linux chatoyancy 5.6.13-100.fc30.x86_64 #1 SMP Fri May
15 00:36:06 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu
Bash Version: 5.1.4, also in the current devel 87a6e89e
Description:
In interactive shells, the foreground dead jobs (i.e., foreground
jobs that have already terminated) in trap handlers, commands called
through `bind -x', and `PROMPT_COMMAND' (let us call these three
contexts as the affected context. Note: There might be even other
similar contexts) are treated as if they are the background dead
jobs. This causes several unintended behaviors: (1) When `jobs' is
called in the affected contexts, all the foreground jobs that have
already terminated in the same call of the affected contexts are
printed. (2) When `wait -n' is called in the affected contexts, it
immediately returns with the exit status of the previous foreground
dead job.
These behaviors are different from those in the normal case where
`jobs' and `wait -n' are executed outside the trap handler, and I do
not see the reasoning for this behavior difference between inside
and outside of the affected contexts. I hope the behavior in the
affected contexts would be similar to that in the normal contexts
where the foreground dead job information is not printed.
A quick example is here:
$ PROMPT_COMMAND='for a in {0..9};do (true);done;jobs'
This reports every time 10 foreground dead job entries, which is
unexpected. Its detailed explanation and other examples are
presented in the next section. I have received related reports
twice in different projects, once in 2022-12 and the other in the
last week:
https://github.com/akinomyoga/ble.sh/issues/78#issue-771281403
https://github.com/petobens/trueline/pull/46#issuecomment-1179853850
I have actually created a patch about one year ago after I first
received the report, but they were pending because I was not sure
what would be the intended behavior. Here, I submit possible
patches (depending on the intended behavior) with explanations.
Repeat-By:
(1) `jobs' prints the information of all the foreground dead jobs in
trap handlers. This reproduces in all the versions of Bash that I
tested from 1.14 to devel branch. This one was originally reported
by Fabio (@3ximus) at
https://github.com/akinomyoga/ble.sh/issues/78#issue-771281403
First consider the following function:
$ bash --norc
$ f(){(sleep 1;exit 1)&(exit 2);jobs;}
If one executes the function normally, `jobs' called inside the
function reports only the background job "%1 (sleep 1;exit 1)" and
not the foreground job as expected:
$ f
[1] 62465
[1]+ Running ( sleep 1; exit 1 ) &
$ <-- (wait more than 1 sec and press RET)
[1]+ Exit 1 ( sleep 1; exit 1 )
However, when this function is executed inside the trap handler, the
the foreground job "%2 (exit 2)" is also reported by `jobs', which
is not expected:
$ trap f INT
$ ^C /* <-- Here type C-c */
[1]+ Running ( sleep 1; exit 1 ) &
[2] Exit 2 ( exit 2 )
$ <-- (wait more than 1 sec and press RET)
[1]+ Exit 1 ( sleep 1; exit 1 )
When there are many forks (subshells and external command
executions), `jobs' will report as many job entries as forks in the
trap handler which is annoying:
$ trap 'for a in {0..99};do /bin/true;done;jobs' INT
$ ^C
[1] Done trap 'for a in {0..99};do /bin/true;done;jobs' INT
[2] Done trap 'for a in {0..99};do /bin/true;done;jobs' INT
[...]
[98] Done trap 'for a in {0..99};do /bin/true;done;jobs' INT
[99] Done trap 'for a in {0..99};do /bin/true;done;jobs' INT
[100] Done trap 'for a in {0..99};do /bin/true;done;jobs' INT
Actually, originally reported one had slightly different structure.
`jobs' in the `bind -x' binding reports the foreground dead jobs in
the immediately preceding invokation of trap handlers:
$ trap 'for a in {0..9};do /bin/true;done' INT
$ bind -x '"\C-t":jobs'
$ /* <-- Here change the terminal size and type C-t */
[1] Done bind -x '"\C-t":jobs'
[2] Done bind -x '"\C-t":jobs'
[3] Done bind -x '"\C-t":jobs'
[4] Done bind -x '"\C-t":jobs'
[5] Done bind -x '"\C-t":jobs'
[6] Done bind -x '"\C-t":jobs'
[7] Done bind -x '"\C-t":jobs'
[8] Done bind -x '"\C-t":jobs'
[9] Done bind -x '"\C-t":jobs'
[10] Done bind -x '"\C-t":jobs'
(2) The behavior of `wait -n' is also affected.
Consider the following function which waits for the first
termination of any background job.
$ bash --norc
$ g(){(sleep 1;exit 1)&(exit 2);wait -n;echo "wait status: $?";}
When it is executed normally, it waits for the termination of
`(sleep 1;exit 1)' for one second and report the exit status `1':
$ g
[1] 18082
[1]+ Exit 1 ( sleep 1; exit 1 )
wait status: 1
However, when it is executed in a trap handler, `wait -n'
immediately returns and reports the status `2' which is the exit
status of the already-terminated foreground job `(exit 2)':
$ trap g INT
$ ^C /* <-- Here type C-c */
wait status: 2
$
[1]+ Exit 1 ( sleep 1; exit 1 )
(3) The same also happens in PROMPT_COMMAND. The `jobs` command in
the following `PROMPT_COMMAND' always reports `(true)' as a
terminated job.
$ bash --norc
$ PROMPT_COMMAND='(true);jobs'
This was originally reported by Pedro Ferrari (@petobens) in the
following discussion:
https://github.com/petobens/trueline/pull/46
Fix:
The reason that the behavior changes between the outside and inside
of the affected contexts can be understood from the function body of
`notify_and_cleanup ()' in `jobs.c'.
> void
> notify_and_cleanup ()
> {
> if (jobs_list_frozen)
> return;
>
> if (interactive || interactive_shell == 0 || sourcelevel)
> notify_of_job_status ();
>
> cleanup_dead_jobs ();
> }
The condition of the second `if' statement is satisfied in normal
contexts but unsatisfied in the affected contexts. In a normal
situation, `notify_of_job_status ()' just marks a foreground dead
job as `J_NOTIFIED' without printing the job information unless the
job is in the `WIFSIGNALED' state (terminated by a signal?). Then,
the subsequent call of `clean_up_dead_jobs ()' removes the jobs
marked as `J_NOTIFIED'. In this way, the foreground jobs are
immediately removed from the job list in a normal context. However,
in the affected contexts, `notify_of_job_status ()' is not called,
so foreground dead jobs are never marked as `J_NOTIFIED' and
therefore survive after the call of `cleanup_dead_jobs ()'.
Actually, any foreground jobs created in a trap handler would never
be removed until the end of the trap handler.
The actual work of `wait -n' is done in the function
`wait_for_any_job (flags, ps)' in `jobs.c'. In the description of
`wait -n' in the manual and also the source-code comment of
`wait_for_any_job ()', this function is supposed to wait for any
background jobs. However, in the implementation, it isn't actually
checked whether the job is foreground or background. This is the
reason that `wait -n' immediately terminates with a foreground dead
job in the affected contexts. I think this implementation is based
on the assumption that the foreground dead jobs would be immediately
removed so that the dead jobs are only the background ones when
`wait_for_any_job ()' is called. In fact, this assumption is
correct in normal cases, but this is not the case in the affected
contexts.
Option 1: I think the quickest fix is to remove the foreground dead
jobs immediately in `cleanup_dead_jobs ()' regardless of
`J_NOTIFIED'. This is a one-line fix [see the attached patch
`r0022-foreground-dead-job-fix1.patch.txt']. There is mostly no use
for the job entries of foreground dead jobs, so they can be removed
safely in most cases.
However, the problem with option 1 is that the job information of
the foreground jobs of the `WIFSIGNALED' state (i.e., the foreground
jobs terminated by a signal) will not be printed.
$ bash-dev --norc <-- the current devel
$ trap 'echo begin;for a in {0..3}; do (kill -9 $BASHPID);
done;echo end' INT
$ ^C <-- press C-c
begin
end
Killed
Killed
Killed
Killed
$ bash-r0022-fix1 --norc <-- the devel patched with option 1
$ trap 'echo begin;for a in {0..3}; do (kill -9 $BASHPID);
done;echo end' INT
$ ^C
begin
end
Option 2: Here, another naive fix is that we print the information
of the foreground jobs terminated by a signal and removes the entry
[see the attached patch `r0022-foreground-dead-job-fix2.patch.txt'].
However, I guess this conflicts with the original reason that
`notify_of_job_status ()' is not called in the affected contexts; I
guess we do not want to print job information in the middle of trap
handlers or the other affected contexts, but instead print them at
the end of the trap handler. See the following example:
$ bash-r0022-fix2 --norc <-- the devel patched with option 2
$ trap 'echo begin;for a in {0..3}; do (kill -9 $BASHPID);
done;echo end' INT
$ ^C
begin
Killed
Killed
Killed
Killed
end
Option 3: So, we can defer the deletion of the foreground jobs
terminated by signals and only removes the foreground dead jobs that
are not terminated by signals. We may remove the foreground jobs
terminated by signals later [see
`r0022-foreground-dead-job-fix3.patch.txt']. With this, the
$ bash-r0022-fix3 --norc <-- the devel patched with option 3
$ trap 'echo begin;for a in {0..3}; do (kill -9 $BASHPID);
done;echo end' INT
$ ^C
begin
end
Killed
Killed
Killed
Killed
--
Koichi
- [PATCH] Fix foreground dead jobs in trap handlers reported like background ones in `jobs',
Koichi Murase <=