bug-bash

Re: Segfault on recursive trap/kill


From: Bob Proulx
Subject: Re: Segfault on recursive trap/kill
Date: Sat, 6 Oct 2018 22:44:17 -0600
User-agent: Mutt/1.10.1 (2018-07-13)

Hi Mike,

Mike Gerwitz wrote:
> ... but are you saying that terminating with a segfault is the
> intended behavior for runaway recursion?

Let me frame the discussion this way, and I think you will be
convinced. :-)

How is your example any different from a C program?  Or Perl, Python,
Ruby, and so forth?  All of those also allow infinite recursion, and
unless the runtime imposes its own recursion limit the kernel will
terminate them with a segfault.  A program that executes an infinite
recursion would use infinite stack space.  But real machines have a
finite amount of stack available, and therefore the program dies when
the stack is exceeded.

The following complete C program recurses infinitely.  Or at least
until the stack is exhausted.  At that point it triggers a segfault
because it tries to use memory beyond the pages mapped for the stack.

  int main() {
    return main();
  }

  $ gcc -o forever forever.c
  $ ./forever
  Segmentation fault
  $ echo $?
  139  # Signal 11 + 128

       The return value of a simple command is its exit status, or 128+n if
       the command is terminated by signal n.
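The 128+n rule quoted above can be checked from any parent shell.  What
follows is only a minimal sketch: the child here kills itself with
SIGSEGV on purpose rather than overflowing its stack, and the variable
names status and signame are mine, chosen for illustration.

```shell
# The child shell sends SIGSEGV to itself; no real memory fault is involved.
status=0
sh -c 'kill -SEGV $$' || status=$?

if [ "$status" -gt 128 ]; then
  # kill -l maps a signal number back to its name.
  signame=$(kill -l "$((status - 128))")
  echo "child was terminated by SIG$signame (status $status)"
else
  echo "child exited normally with status $status"
fi
```

The parent only reads the status it was handed.  It has no part in
generating the signal.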

Would you say the segfault is a bug in the C language?  A bug in the
gcc that compiled it?  A bug in the Unix/Linux kernel memory management
that trapped the error?  A bug in the parent shell that reported the
exit code of the program?  Or a bug in the program source code?  I am
hoping that we will all agree that it is a bug in the program source
code and not in either gcc or the kernel. :-)

Shell script code is program source code.  Infinite loops or infinite
recursion are bugs in the shell script source code, not in the
interpreter that is executing the code as written.
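As an aside, a script author who wants runaway function recursion to
fail with a diagnostic instead of exhausting the stack can set bash's
FUNCNEST variable.  This is only a sketch: the limit of 100 and the
function name recurse are arbitrary choices of mine, and FUNCNEST
limits function nesting only, so I would not expect it to catch the
trap/kill recursion discussed in this thread.

```shell
# FUNCNEST caps function nesting depth (bash 4.2 and later).
# With the cap in place bash reports an error instead of recursing
# until the stack runs out.
status=0
bash -c 'FUNCNEST=100; recurse() { recurse; }; recurse' || status=$?
echo "exit status: $status"
```

The point is the same as before.  The recursion bug is in the script,
and it is the script that has to guard against it.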

This feels to me to be related to The Halting Problem.

> As long as there is no exploitable flaw here, then I suppose this isn't
> a problem;

It's not a privilege escalation.  Nor a buffer overflow.  Whether this
is otherwise exploitable depends upon the surrounding environment usage.

> I haven't inspected the code to see if this is an access violation
> or if Bash is intentionally signaling SIGSEGV.

It is the kernel that manages memory, maps pages, detects page faults,
and kills the program.  The parent bash shell is only reporting the
exit code that resulted.  The interpreting shell executed the shell
script source code as written.

Other shells are also fun to check:

  $ dash -c 'trap "kill 0" TERM; kill 0'
  Segmentation fault

  $ ash -c 'trap "kill 0" TERM; kill 0'
  Segmentation fault

  $ mksh -c 'trap "kill 0" TERM; kill 0'
  Segmentation fault

  $ ksh93 -c 'trap "kill 0" TERM; kill 0'
  $ echo $?
  0

  $ posh -c 'trap "kill 0" TERM; kill 0'
  Terminated
  Terminated
  Terminated
  ...
  Terminated
  ^C

Testing zsh is interesting because it seems to keep the interpreter
stack in data space and therefore can consume a large amount of memory
if it is available.  It can then trap the resulting out-of-memory
condition and kill itself with a SIGTERM.  Note that in my testing
I have Linux memory overcommit disabled.

This finds what look like bugs in posh and ksh93.

> it's just that most users assume that a segfault represents a
> problem with the program

Yes.  And here it indicates a bug too.  It is indicating a bug in the
shell program code which sets up the infinite recursion.  Programs
should avoid doing that. :-)

  bash -c 'trap "kill 0" TERM; kill 0'

The trap handler was not set back to the default before the program
sent the signal to itself.  The way to fix this is:

  $ bash -c 'trap "trap - TERM; kill 0" TERM; kill 0'
  Terminated
  $ echo $?
  143  # killed on SIGTERM as desired, good

    If ARG is absent (and a single SIGNAL_SPEC is supplied) or `-',
    each specified signal is reset to its original value.

The proper way for a program to terminate itself upon catching a
signal is to set the signal handler back to the default value and then
send the signal to itself so that it will be terminated as a result of
the signal and therefore the exit status will be set correctly.

For example the following is useful boilerplate:

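  unset tmpfile
  # Remove the temporary file, if one was created, at most once.
  cleanup() {
    test -n "$tmpfile" && rm -f "$tmpfile" && unset tmpfile
  }
  # Normal exit: just clean up.
  trap "cleanup" EXIT
  # Fatal signals: clean up, restore the default action, then re-send
  # the signal so the exit status reflects it.
  trap "cleanup; trap - HUP; kill -HUP $$" HUP
  trap "cleanup; trap - INT; kill -INT $$" INT
  trap "cleanup; trap - QUIT; kill -QUIT $$" QUIT
  trap "cleanup; trap - TERM; kill -TERM $$" TERM
  tmpfile=$(mktemp) || exit 1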
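The four signal traps above differ only in the signal name, so they can
also be installed from a loop.  This is a sketch of an equivalent form,
not a replacement for the boilerplate.  The loop variable sig is my own
addition.

```shell
unset tmpfile
cleanup() {
  test -n "$tmpfile" && rm -f "$tmpfile" && unset tmpfile
}
trap "cleanup" EXIT
# One trap per signal: clean up, restore the default action, re-send.
# $sig and $$ are expanded now, when the trap string is built.
for sig in HUP INT QUIT TERM; do
  trap "cleanup; trap - $sig; kill -$sig $$" "$sig"
done
tmpfile=$(mktemp) || exit 1
```

Because the trap strings are double quoted, each handler ends up with
its own signal name baked in, just as in the written-out version.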

If a program traps a signal then it should restore the default signal
handler for that signal and send the signal back to itself.  Otherwise
the exit code will be incorrect and parent programs won't know that
the child was killed with a signal.
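The difference is visible from the parent.  In this sketch the first
child uses a hypothetical handler that simply calls exit, and the
second uses the reset-and-resend pattern described above.  The variable
names swallowed and reraised are mine.

```shell
# Child 1: the handler calls exit, so the parent sees an ordinary
# exit status and cannot tell that a signal was involved.
swallowed=0
sh -c 'trap "exit 1" TERM; kill -TERM $$' || swallowed=$?

# Child 2: the handler restores the default action and re-sends the
# signal, so the child really is terminated by SIGTERM.
reraised=0
sh -c 'trap "trap - TERM; kill -TERM $$" TERM; kill -TERM $$' || reraised=$?

echo "exit in handler:  $swallowed"
echo "reset and resend: $reraised"
```

Only the second status is above 128, matching the 143 shown earlier,
which is what tells the parent that the child died from the signal.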

For a highly recommended deep dive into this:

  https://www.cons.org/cracauer/sigint.html

Hope this helps!
Bob
