bug-bash

REGRESSION: signal handlers not running in real-time


From: Linda Walsh
Subject: REGRESSION: signal handlers not running in real-time
Date: Fri, 31 Jul 2015 04:31:32 -0700
User-agent: Thunderbird


I decided to give 4.3 a try again and ended up writing a 100-line script
that takes the version number as input and auto-applies all the patches in
the patch dir.

ARG.... going down a rathole... oh well.

I think I got a build pretty much as I want it -- with everything built in
statically, except for a few key system libs that don't come with
a ".a" (static) version of the lib:
  $ ldd /bin/bash
      linux-vdso.so.1 (0x00007fc4c3137000)
      libdl.so.2 => /lib64/libdl.so.2 (0x0000003001400000)
      libc.so.6 => /lib64/libc.so.6 (0x0000003000c00000)
      libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003003000000)
      /lib64/ld-linux-x86-64.so.2 (0x000055e08a982000)
  $ echo "$BASH_VERSION"
  4.3.39(1)-release

But -- one initial bug came up -- I have a signal handler that
is called on window resizing -- but now it is no longer called when
the window is resized.  Instead it runs potentially hours later, when
I next press a key in that window.
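
For readers who have not written one, a resize handler of this kind is usually a trap on SIGWINCH. The following is only a minimal sketch, not the script from this report: the names on_resize, resize_count and term_size are made up for illustration, and the resize is simulated by sending the signal to the shell itself. In a non-interactive script like this the trap runs as soon as the current command finishes; the delay described above concerns a shell that is sitting in readline waiting for input.

```bash
#!/bin/bash
# Hypothetical names (on_resize, resize_count, term_size); the original
# handler is not shown in this message.
resize_count=0
term_size="unknown"

on_resize() {
    resize_count=$((resize_count + 1))
    # `stty size` fails when stdin is not a terminal, so fall back quietly.
    term_size=$(stty size 2>/dev/null) || term_size="unknown"
}
trap on_resize WINCH

# Simulate a window resize by sending SIGWINCH to this shell.
kill -s WINCH "$$"

echo "resize events handled: $resize_count (size: $term_size)"
```

A terminal emulator sends the same signal to the foreground process group whenever its window changes size, so the `kill` line is only a stand-in for a real resize.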

Yes Chet, you warned me you broke real-time sig-handlers... but I
think this is a bigger bug than you thought it was.  I was thinking
about other places one wants real-time sig-handling --
like in handling command timeouts, and being able to manage
child processes as I do in a few scripts.
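
As a rough illustration of the child-management case (the scheme itself is described in the next paragraph), here is a sketch of a signal-driven pool. Everything in it is an assumption made for the example: the names max_children, total_jobs, run_job and wakeups, the use of SIGUSR1 as the "a child finished" notification, and the choice to recount running jobs with `jobs -pr` instead of trusting a counter, since ordinary signals can coalesce. The control loop blocks in `wait`, which returns early when a trapped signal arrives.

```bash
#!/bin/bash
# Hypothetical names and values throughout; a sketch, not the original scripts.
max_children=2          # the "X" in "keep X children running"
total_jobs=5
log=$(mktemp)
parent=$$

run_job() {
    sleep 0.2                       # stand-in for real work
    echo "job $1 done" >>"$log"
    kill -s USR1 "$parent"          # tell the control process a slot is free
}

wakeups=0
trap 'wakeups=$((wakeups + 1))' USR1

next=1
while [ "$next" -le "$total_jobs" ]; do
    # Recount instead of trusting a counter: signals may coalesce.
    running=$(jobs -pr | wc -l)
    if [ "$running" -lt "$max_children" ]; then
        run_job "$next" &
        next=$((next + 1))
    else
        wait                        # returns early when the USR1 trap fires
    fi
done

# Drain: `wait` may be interrupted by a trap, so loop until no job is running.
while [ "$(jobs -pr | wc -l)" -gt 0 ]; do
    wait
done

completed=$(wc -l <"$log")
rm -f "$log"
echo "completed $completed jobs with $wakeups wakeups"
```

The design relies on the control process being parked somewhere a trapped signal can interrupt, such as `wait`. If it were instead parked in a readline-based read, then according to this report the trap would not run until the read returned.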

If I want to keep "X" children running, I would have background procs
increment a semaphore that is initialized to the number of
background children I want to run.  When a child finishes, it
can increment the semaphore, but now my foreground control proc, instead
of being stuck in select or such, will be stuck in readline's
"interrupts disabled" read routine.  And it won't be able to
wake up via SIGALRM or children dying.  Basically, you've now
changed BASH to be "POSIX-undefined" in its behaviors.  From the signal(2)
manpage:

NOTES
  The effects of signal() in a multithreaded process are unspecified.

  According  to  POSIX,  the  behavior of a process is undefined after it
  ignores a SIGFPE, SIGILL, or SIGSEGV signal that was not  generated  by
  kill(2)  or  raise(3).   Integer division by zero has undefined result.
  On some architectures it will generate a SIGFPE signal.  (Also dividing
  the  most  negative  integer by -1 may generate SIGFPE.)  Ignoring this
  signal might lead to an endless loop.

  See sigaction(2) for details on what happens when  SIGCHLD  is  set  to
  SIG_IGN.

So now if someone gets a div/0 and can't handle the signal, it could lock
up the machine -- only because the results are undefined.

I.e. you've taken away the ability to process sigs in real-time -- and
apparently *several* of these result in undefined and/or nasty behaviors.

It seems that POSIX solved that problem w/sigaction, which by default
blocks a signal while its handler runs (and can reset the signal's
disposition via SA_RESETHAND), so as to prevent getting into a situation
of being "nested", handling the same sigs.

Though apparently BSD's implementation further screwed up signal handling
-- and why is that important?  Linux has different libs -- some providing
the stable SysV interface, and others the BSD interface.  Problem there is
that after around 2003, most of the SysV vendors were out of business, and
now the BSD'ers had enough clout to change POSIX's charter from
"descriptive" to "prescriptive" -- so they can push through mandatory
changes that anyone who wants to continue to claim POSIX compat has to
comply with.

The GNU people are supporting the new POSIX compat so they can claim POSIX
compat as a feature of GNU SW (an added marketing bullet). So -- more from
the signal manpage:


  The situation on Linux is as follows:

  * The kernel's signal() system call provides System V semantics.

  * By default, in glibc 2 and later, the signal() wrapper function  does
    not  invoke  the  kernel system call.  Instead, it calls sigaction(2)
    using flags that supply BSD semantics.  This default behavior is
    provided as long as the _BSD_SOURCE feature test macro is defined.  By
    default, _BSD_SOURCE is defined; it is also implicitly defined if one
    defines _GNU_SOURCE, and can of course be explicitly defined.
    ...

  * The signal() function in Linux  libc4  and  libc5  provide  System  V
    semantics.   If one on a libc5 system includes <bsd/signal.h> instead
    of <signal.h>, then signal() provides BSD semantics.

That last one can be really bad if an autoconf run ends up linking
a program to the BSD semantics and not the SysV semantics.

It's quite possible that depending on how bash is configured, someone
might pull in one or the other -- but many of the signal problems are
there because of GNU not using Linux OS calls but re-implementing their
own stuff using BSD semantics.

Anyway -- not being able to respond to signals in ***real-time*** is
a problem that is only going to get worse as machines become more
*parallel*.  Not being able to load-balance and respond to sigs in
real-time is only going to become more costly as CPU resources keep
growing, but only through parallelism.

I'm guessing that readline would have to be turned "inside-out" to
become event driven and thread-safe?
*sigh*