Re: [PATCH] A terminating signal has to complete a bash process
From: Andrei Vagin
Subject: Re: [PATCH] A terminating signal has to complete a bash process
Date: Tue, 1 May 2018 16:55:55 -0700
User-agent: Mutt/1.9.2 (2017-12-15)
On Tue, May 01, 2018 at 02:15:17PM -0400, Chet Ramey wrote:
> On 5/1/18 12:44 PM, Andrei Vagin wrote:
> > On Tue, May 01, 2018 at 10:40:18AM -0400, Chet Ramey wrote:
> >> On 4/30/18 6:05 PM, Andrei Vagin wrote:
> >>> bash sets a handler for all terminating signals, which saves history,
> >>> executes traps, sets a default signal handler and re-sends the same
> >>> signal to itself. It expects that this signal will kill it.
> >>>
> >>> Unfortunately it doesn't work in Linux, when a bash script is executed as
> >>> an init process in a pid namespaces, because all signals to the init
> >>> process, what are sent from the current pid namespace, are ignored.
> >>>
> >>> man 7 pid_namespaces
> >>> Only signals for which the "init" process has established a signal han‐
> >>> dler can be sent to the "init" process by other members of the PID
> >>> namespace. This restriction applies even to privileged processes, and
> >>> prevents other members of the PID namespace from accidentally killing
> >>> the "init" process.
> >>>
> >>> Chet Ramey suggested to add a call to exit() after the kill(). This
> >>> patch adds this call for signals, which do not result in a core dump.
> >>> For other signals, a null pointer is dereferenced to get a core file.
> >>
> >> What's the value of a core dump from a different signal in this case?
> >
> > If we get these signals from kernel, it means that we have a bug.
>
> Usually, yes. But the usefulness of a core dump depends on two things:
>
> 1. Whether the core dump contains enough useful information to point to
> a problem. This can be defeated by not compiling the program with
> debugging symbols or by the stack being corrupted enough to obscure
> the bug's origin. I am not sure that a core dump generated by a
> different signal (caused by dereferencing a random area in memory)
> will leave any resultant core file intact enough to be useful.
In modern Linux distributions, all packages are built with debuginfo;
the debuginfo is then stripped from the binaries and shipped in separate
packages. When a system detects a new core dump, we can install the
debuginfo packages and analyze the core file. We can get backtraces for
all threads of the crashed process, values of variables, registers, etc.

I maintain the CRIU project, and I can say that core dumps are very
useful for investigating user issues.

I also want to point out that bash already generates core dumps today. I
am not trying to invent anything new. I want bash to behave the same way
whether it runs as an init process in a pid namespace or as an ordinary
process.
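
To make the behaviour under discussion concrete, here is a rough shell
sketch of the pattern: run the cleanup, restore the default disposition,
re-send the signal, and fall back to exit if the process is still alive.
This is only an illustration, not bash's actual C code. The function name
`terminate` is hypothetical, and SIGTERM is used instead of SIGSEGV so that
the sketch does not leave core files behind.

```shell
# Sketch only: mimics in shell what the handler is expected to do.
# The child script is kept in a variable and run with "sh -c" so that
# "$$" refers to the child shell, not to the caller.
script='
terminate() {
    echo "cleanup for SIG$1"   # stands in for saving history and running traps
    trap - "$1"                # restore the default disposition
    kill -s "$1" "$$"          # re-send the same signal to ourselves
    exit $((128 + $2))         # reached only if the signal was ignored,
                               # e.g. for an init process in a pid namespace
}
trap "terminate TERM 15" TERM
kill -s TERM "$$"              # simulate a signal sent from outside
echo "not reached"
'

rc=0
out=$(sh -c "$script") || rc=$?
echo "output: $out"
echo "status: $rc"
```

When run as an ordinary process, the child should die from the re-sent
signal, so the fallback exit is never reached. It matters only when the
kernel drops the signal, which is the pid namespace init case.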
Take a look at these experiments:
1. bash from my system is executed without any additional changes.
$ cat init.sh
#!/bin/bash
function finish {
    echo Exit trap
}
trap finish EXIT
sleep 100
echo ok
$ bash init.sh &
[1] 14863
$ kill -SEGV $!
Exit trap
[1]+ Segmentation fault (core dumped) bash init.sh
$ coredumpctl | tail -n 1
Wed 2018-05-02 02:36:58 MSK 14863 0 0 11 present /usr/bin/bash
A core dump was generated, and this "crash" was detected by coredumpd.
2. Run a bash script as an init process in a new pid namespace. My patch
is applied.
$ unshare -fp ./bash init.sh &
[1] 14899
$ ps -fC bash | grep init
root 14900 14899 0 02:41 pts/2 00:00:00 ./bash init.sh
$ kill -SEGV 14900
Exit trap
$
[1]+ Segmentation fault (core dumped) unshare -fp ./bash init.sh
$ coredumpctl --since '-30sec'
TIME PID UID GID SIG COREFILE EXE
Wed 2018-05-02 02:41:19 MSK 14900 0 0 11 present /root/bash/bash
Wed 2018-05-02 02:41:19 MSK 14899 0 0 11 present /usr/bin/unshare
This "crash" was detected by coredumpd too.
3. Run a bash script as an init process in a new pid namespace. Bash is patched
to just exit.
$ unshare -fp ./bash init.sh &
[1] 15014
$ ps -fC bash | grep init
root 15015 15014 0 02:43 pts/2 00:00:00 ./bash init.sh
$ kill -SEGV 15015
$ Exit trap
[1]+ Exit 123 unshare -fp ./bash init.sh
$ coredumpctl --since '-30sec'
No coredumps found.
This "crash" was not detected by coredumpd.
>
> 2. Whether the problem that elicited the core dump can be reproduced. If
> it's not immediately obvious from the core dump, you have to be able
> to reproduce the problem in order to fix it. I'm skeptical that this
> will be the case.
I have often seen problems that reproduce only rarely, or only in a
user's environment. In my experience, we need as much information as we
can get, and a core dump is usually the most useful piece of information
for investigating a crash.
>
> > Modern linux distributions
> > automatically detect code dump files, and generates a bug report with
> > all required information.
>
> And what does "all required information" entail?
It contains backtraces for all threads of the crashed process, a list of
all packages on the system, kernel logs, lists of all memory mappings and
open files, a process tree, etc.
I have often seen such reports contain enough information to fix an
issue without asking any extra questions.
Here are a few examples from the Red Hat Bugzilla that show what these
automatic reports look like:
https://bugzilla.redhat.com/show_bug.cgi?id=1169147
https://bugzilla.redhat.com/show_bug.cgi?id=1450017
>
> If it's not obvious, I'm trying to determine whether making this change
> will add any more value than simply exiting (perhaps with a particular
> exit status).
It will add more value. Without this change, we will not know whether a
bash process crashed or exited. If bash does not generate a core dump after
a crash, tools like abrtd, coredumpd, etc. will not detect the crash
and will not report the abnormal behaviour.
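
As a small illustration of the difference a parent process sees, here is a
sketch that classifies a wait status. It assumes the bash/dash convention
that a child killed by signal N is reported as status 128+N. The helper
name `describe_status` is hypothetical, and SIGTERM stands in for SIGSEGV
to avoid producing a core file.

```shell
# describe_status STATUS
# Hypothetical helper: interpret the status that a shell reports for a child.
describe_status() {
    if [ "$1" -gt 128 ]; then
        echo "killed by signal $(($1 - 128))"
    else
        echo "exited with status $1"
    fi
}

# A child that really dies from a signal.
rc=0
sh -c 'kill -s TERM "$$"' || rc=$?
describe_status "$rc"

# A child that only calls exit, as in experiment 3 above.
rc=0
sh -c 'exit 123' || rc=$?
describe_status "$rc"
```

The status alone is ambiguous if a process exits with a value above 128,
and crash collectors do not look at it anyway: they are triggered by the
core dump itself. That is why a plain exit stays invisible to them.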
>
> --
> ``The lyf so short, the craft so long to lerne.'' - Chaucer
> ``Ars longa, vita brevis'' - Hippocrates
> Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/