qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] coroutine-sigaltstack: Keep SIGUSR2 handler up


From: Max Reitz
Subject: Re: [PATCH] coroutine-sigaltstack: Keep SIGUSR2 handler up
Date: Fri, 22 Jan 2021 18:58:27 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.5.0

On 22.01.21 17:38, Laszlo Ersek wrote:
On 01/22/21 11:20, Max Reitz wrote:
Modifying signal handlers is a process-global operation.  When two
threads run coroutine-sigaltstack's qemu_coroutine_new() concurrently,
they may interfere with each other: One of them may revert the SIGUSR2
handler back to the default between the other thread setting up
coroutine_trampoline() as the handler and raising SIGUSR2.  That SIGUSR2
will then lead to the process exiting.

Outside of coroutine-sigaltstack, qemu does not use SIGUSR2.  We can
thus keep the signal handler installed all the time.
CoroutineThreadState.tr_handler tells coroutine_trampoline() whether its
stack is set up so a new coroutine is to be launched (i.e., it should
invoke sigsetjmp()), or not (i.e., the signal came from an external
source and we should just perform the default action, which is to exit
the process).

Note that in user-mode emulation, the guest can register signal handlers
for any signal but SIGSEGV and SIGBUS, so if it registers a SIGUSR2
handler, sigaltstack coroutines will break from then on.  However, we do
not use coroutines for user-mode emulation, so that is fine.

Suggested-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
  util/coroutine-sigaltstack.c | 56 +++++++++++++++++++-----------------
  1 file changed, 29 insertions(+), 27 deletions(-)

(1) With SIGUSR2 permanently dedicated to "coroutine-sigaltstack.c", the
comment on the qemu_init_main_loop() declaration, in
"include/qemu/main-loop.h", also needs to be updated. SIGUSR2 is no
longer a "free for thread-internal usage" signal.

It’s possible that I haven’t understood that comment, but I haven’t adjusted it because I didn’t interpret it to mean that the signals listed there were free to use. For example, SIGUSR1 (aliased to SIG_IPI) is generated by cpus_kick_thread() and caught by KVM and HVF, so it doesn’t seem like it would be free for thread-internal usage either.

Instead, I understood it to mean that the signals listed there do not have to be blocked by non-main threads, because it is OK for non-main threads to catch them. I can’t think of a reason why SIGUSR2 should be blocked by any thread, so I decided not to change the comment.

But perhaps I just didn’t understand the whole comment. That’s definitely possible, given that I don’t even see a place where non-main threads would block the signals not listed there.

diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c
index aade82afb8..2d32afc322 100644
--- a/util/coroutine-sigaltstack.c
+++ b/util/coroutine-sigaltstack.c
@@ -59,6 +59,8 @@ typedef struct {

  static pthread_key_t thread_state_key;

+static void coroutine_trampoline(int signal);
+
  static CoroutineThreadState *coroutine_get_thread_state(void)
  {
      CoroutineThreadState *s = pthread_getspecific(thread_state_key);
@@ -80,6 +82,7 @@ static void qemu_coroutine_thread_cleanup(void *opaque)

  static void __attribute__((constructor)) coroutine_init(void)
  {
+    struct sigaction sa;
      int ret;

      ret = pthread_key_create(&thread_state_key, 
qemu_coroutine_thread_cleanup);
@@ -87,6 +90,20 @@ static void __attribute__((constructor)) coroutine_init(void)
          fprintf(stderr, "unable to create leader key: %s\n", strerror(errno));
          abort();
      }
+
+    /*
+     * Establish the SIGUSR2 signal handler.  This is a process-wide
+     * operation, and so will apply to all threads from here on.
+     */
+    sa = (struct sigaction) {
+        .sa_handler = coroutine_trampoline,
+        .sa_flags   = SA_ONSTACK,
+    };
+
+    if (sigaction(SIGUSR2, &sa, NULL) != 0) {
+        perror("Unable to install SIGUSR2 handler");
+        abort();
+    }
  }

  /* "boot" function
@@ -121,7 +138,17 @@ static void coroutine_trampoline(int signal)
      /* Get the thread specific information */
      coTS = coroutine_get_thread_state();
      self = coTS->tr_handler;
+
+    if (!self) {
+        /*
+         * This SIGUSR2 came from an external source, not from
+         * qemu_coroutine_new(), so perform the default action.
+         */
+        exit(0);
+    }

(2) exit() is generally unsafe to call in signal handlers.

We could reason whether or not it is safe in this particular case (POSIX
describes the exact conditions --
<https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03_03>),
but it's much simpler to just call _exit().


(3) "Performing the default action" would be slightly different from
calling _exit(). When a process is terminated with a signal, the parent
can distinguish that, when reaping the child. See waitpid() /
WIFSIGNALED() / WTERMSIG(), versus WIFEXITED() / WEXITSTATUS().

So for the "default action", we'd have to:
- restore the SIGUSR2 handler to SIG_DFL, and
- re-raise the signal for the thread, and
- because the signal being handled is automatically blocked unless
   SA_NODEFER was set: unblock the signal for the thread.

The pthread_sigmask() call, made for the last step, would be the one
that would not return.

*However*, all of this complexity is not really useful here. We don't
really want to simulate being "forcefully terminated" by the unexpected
(asynchronous) SIGUSR2. We just want to exit.

Therefore, _exit() is fine. But, we should use an exit code different
from 0, as this is definitely not a pristine exit from QEMU. I'm not
sure if a convention exists for nonzero exit codes in QEMU; if not, then
just pass EXIT_FAILURE to _exit().

I’m fine with calling _exit(). I hope, Eric is, too (as long as the comment no longer claims this were the default behavior).

Given that SIGUSR2 is not something that qemu is prepared to receive from the outside, EXIT_FAILURE seems right to me. (Even abort() could be justified, but I don’t think generating a core dump does any good here.)

(As for qemu-specific exit code conventions, the only ones I know of are for certain qemu-img subcommands. I’m sure you grepped, too, but I can’t find anything for the system emulator.)

(4) Furthermore, please update the comment, because "perform the default
action" is not precise.

Sure, that’s definitely easier than to re-raise SIGUSR2.

+
      coTS->tr_called = 1;
+    coTS->tr_handler = NULL;
      co = &self->base;

      /*

(5) I see that "tr_called" has type "volatile sig_atomic_t", which is
great (I think that "sig_atomic_t" is not even necessary here, because
of the careful signal masking that we do, and because the signal is
raised synchronously).

"volatile" is certainly justified though (the compiler may not be able
to trace the call through the signal), and that's what I'm missing from
"tr_handler" too. IMO, the "tr_handler" field should be decalered as
follow:

   volatile void * volatile tr_handler;

wherein the second "volatile" serves just the same purpose as the
"volatile" in "tr_called", and the first "volatile" follows from *that*
-- wherever we chase the *chain of pointers* rooted in "tr_handler"
should not be optimized out by the compiler.

But: my point (5) is orthogonal to this patch. In practice, it may not
matter at all. So feel free to ignore now, I guess.

I suppose signal handlers are indeed preempting functions (i.e., the compiler is not aware of them executing). I overlooked that, given that logically, we more or less explicitly invoke the SIGUSR2 handler, but of course, technically, we just trigger the signal and the handler is then invoked preemptively.

I suspect the pthread_kill() function is sufficiently complex that the compiler can’t prove that it won’t access *coTS (e.g. because perhaps it contains a syscall?), but better be safe than sorry.

But I don’t really like the “volatile void *volatile tr_handler”, particularly the “volatile void *” combinations. I find volatile voids weird.

Why is tr_handler even a void pointer, and not a pointer to CoroutineSigAltStack, if all it can point to is such an object? “volatile CoroutineSigAltStack *” would make more sense to me.

Given that perhaps the CoroutineSigAltStack object should be volatile, shouldn’t also the CoroutineThreadState object be volatile? If it were, we could drop the volatile from tr_called and wouldn’t need to make the pointer value tr_handler volatile.


OTOH, if I look more through the code, making everything volatile seems a bit excessive to me. I understand correctly that sigsetjmp()/siglongjmp() act as barriers to the compiler, don’t they?

The only value that I can see the in-coroutine code writing that the calling code reads (before the next sigsetjmp()) is tr_called. So shouldn’t it be sufficient to insert a barrier() before the pthread_kill(), and then it’d be sufficient to keep tr_called volatile, but the rest could stay as it is?

@@ -150,12 +177,9 @@ Coroutine *qemu_coroutine_new(void)
  {
      CoroutineSigAltStack *co;
      CoroutineThreadState *coTS;
-    struct sigaction sa;
-    struct sigaction osa;
      stack_t ss;
      stack_t oss;
      sigset_t sigs;
-    sigset_t osigs;
      sigjmp_buf old_env;

      /* The way to manipulate stack is with the sigaltstack function. We
@@ -172,24 +196,6 @@ Coroutine *qemu_coroutine_new(void)
      co->stack = qemu_alloc_stack(&co->stack_size);
      co->base.entry_arg = &old_env; /* stash away our jmp_buf */

-    coTS = coroutine_get_thread_state();
-    coTS->tr_handler = co;
-
-    /*
-     * Preserve the SIGUSR2 signal state, block SIGUSR2,
-     * and establish our signal handler. The signal will
-     * later transfer control onto the signal stack.
-     */
-    sigemptyset(&sigs);
-    sigaddset(&sigs, SIGUSR2);
-    pthread_sigmask(SIG_BLOCK, &sigs, &osigs);
-    sa.sa_handler = coroutine_trampoline;
-    sigfillset(&sa.sa_mask);
-    sa.sa_flags = SA_ONSTACK;
-    if (sigaction(SIGUSR2, &sa, &osa) != 0) {
-        abort();
-    }
-
      /*
       * Set the new stack.
       */
@@ -207,6 +213,8 @@ Coroutine *qemu_coroutine_new(void)
       * signal can be delivered the first time sigsuspend() is
       * called.
       */
+    coTS = coroutine_get_thread_state();
+    coTS->tr_handler = co;
      coTS->tr_called = 0;
      pthread_kill(pthread_self(), SIGUSR2);
      sigfillset(&sigs);
@@ -230,12 +238,6 @@ Coroutine *qemu_coroutine_new(void)
          sigaltstack(&oss, NULL);
      }

-    /*
-     * Restore the old SIGUSR2 signal handler and mask
-     */
-    sigaction(SIGUSR2, &osa, NULL);
-    pthread_sigmask(SIG_SETMASK, &osigs, NULL);
-
      /*
       * Now enter the trampoline again, but this time not as a signal
       * handler. Instead we jump into it directly. The functionally


(6) This change, for qemu_coroutine_new(), is too heavy-handed, in my
opinion. I suggest (a) removing only the sigaction() calls and their
directly needed aux variables, and (b) *not* moving around operations.

In particular, you remove the blocking of SIGUSR2 for the thread -- the
pthread_sigmask() call. That means the sigsuspend() later on becomes
superfluous, as the signal will not be delivered inside sigsuspend(),
but inside pthread_kill(). With the signal not blocked, *generation* and
*delivery* will coalesce into a single event.

Are you saying that (a) is a problem because this is too heavy-handed of a change for a single patch, or because it would actually be a problem to deliver the signal inside pthread_kill()?

(Either way will result in me dropping the change from this patch, so it’s just a question out of curiosity.)

As for (b), just FYI, I deliberately moved around the operation, so that tr_handler is only set once the coroutine stack is set up. Otherwise it might be a problem if an external SIGUSR2 comes in before then.

But if we keep SIGUSR2 blocked, there is no reason for that movement, hence the “just FYI”.

The general logic should stay the same, only the signal action
manipulation should be removed. IOW, for this function, I propose the
following change only:

diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c
index aade82afb8c0..722fed7b2502 100644
--- a/util/coroutine-sigaltstack.c
+++ b/util/coroutine-sigaltstack.c
@@ -149,107 +149,97 @@ static void coroutine_trampoline(int signal)
  Coroutine *qemu_coroutine_new(void)
  {
      CoroutineSigAltStack *co;
      CoroutineThreadState *coTS;
-    struct sigaction sa;
-    struct sigaction osa;
      stack_t ss;
      stack_t oss;
      sigset_t sigs;
      sigset_t osigs;
      sigjmp_buf old_env;

      /* The way to manipulate stack is with the sigaltstack function. We
       * prepare a stack, with it delivering a signal to ourselves and then
       * put sigsetjmp/siglongjmp where needed.
       * This has been done keeping coroutine-ucontext as a model and with the
       * pth ideas (GNU Portable Threads). See coroutine-ucontext for the basics
       * of the coroutines and see pth_mctx.c (from the pth project) for the
       * sigaltstack way of manipulating stacks.
       */

      co = g_malloc0(sizeof(*co));
      co->stack_size = COROUTINE_STACK_SIZE;
      co->stack = qemu_alloc_stack(&co->stack_size);
      co->base.entry_arg = &old_env; /* stash away our jmp_buf */

      coTS = coroutine_get_thread_state();
      coTS->tr_handler = co;

      /*
-     * Preserve the SIGUSR2 signal state, block SIGUSR2,
-     * and establish our signal handler. The signal will
-     * later transfer control onto the signal stack.
+     * Block SIGUSR2. The signal will later transfer control onto the signal
+     * stack.
       */
      sigemptyset(&sigs);
      sigaddset(&sigs, SIGUSR2);
      pthread_sigmask(SIG_BLOCK, &sigs, &osigs);
-    sa.sa_handler = coroutine_trampoline;
-    sigfillset(&sa.sa_mask);
-    sa.sa_flags = SA_ONSTACK;
-    if (sigaction(SIGUSR2, &sa, &osa) != 0) {
-        abort();
-    }

      /*
       * Set the new stack.
       */
      ss.ss_sp = co->stack;
      ss.ss_size = co->stack_size;
      ss.ss_flags = 0;
      if (sigaltstack(&ss, &oss) < 0) {
          abort();
      }

      /*
       * Now transfer control onto the signal stack and set it up.
       * It will return immediately via "return" after the sigsetjmp()
       * was performed. Be careful here with race conditions.  The
       * signal can be delivered the first time sigsuspend() is
       * called.
       */
      coTS->tr_called = 0;
      pthread_kill(pthread_self(), SIGUSR2);
      sigfillset(&sigs);
      sigdelset(&sigs, SIGUSR2);
      while (!coTS->tr_called) {
          sigsuspend(&sigs);
      }

      /*
       * Inform the system that we are back off the signal stack by
       * removing the alternative signal stack. Be careful here: It
       * first has to be disabled, before it can be removed.
       */
      sigaltstack(NULL, &ss);
      ss.ss_flags = SS_DISABLE;
      if (sigaltstack(&ss, NULL) < 0) {
          abort();
      }
      sigaltstack(NULL, &ss);
      if (!(oss.ss_flags & SS_DISABLE)) {
          sigaltstack(&oss, NULL);
      }

      /*
-     * Restore the old SIGUSR2 signal handler and mask
+     * Restore the old mask
       */
-    sigaction(SIGUSR2, &osa, NULL);
      pthread_sigmask(SIG_SETMASK, &osigs, NULL);

      /*
       * Now enter the trampoline again, but this time not as a signal
       * handler. Instead we jump into it directly. The functionally
       * redundant ping-pong pointer arithmetic is necessary to avoid
       * type-conversion warnings related to the `volatile' qualifier and
       * the fact that `jmp_buf' usually is an array type.
       */
      if (!sigsetjmp(old_env, 0)) {
          siglongjmp(coTS->tr_reenter, 1);
      }

      /*
       * Ok, we returned again, so now we're finished
       */

      return &co->base;
  }


(7) The sigaction() call has not been moved entirely correctly from
qemu_coroutine_new() to coroutine_init(), in my opinion.

Namely, the original call site fills the "sa_mask" member, meaning that,
while the signal handler is executing, not only SIGUSR2 itself should be
blocked, but *all* signals.

This is missing from the new call site, in coroutine_init() -- the
"sa_mask" member is left zeroed.

Now, in practice, this may not matter a whole lot, because "sa_mask" is
additive, and at the only place we allow the signal to be delivered,
namely in sigsuspend(), we already have everything blocked, except for
SIGUSR2. So when "sa_mask" is ORed with the set of blocked signals, upon
delivery of SIGUSR2, there is no actual change to the signal mask.

*But*, I feel that such a change would really deserve its own argument,
i.e. a separate patch, or at least a separate paragraph in the commit
message. And I suggest not doing those; instead please just faithfully
transfer the "sa_mask" setting too, to coroutine_init(). (My impression
is that the effective removal of the "sa_mask" population was an
oversight in this patch, not a conscious decision.)

Yes, it was totally an oversight.

Thanks a lot for your detailed review! I have absolutely no experience with coroutine-sigaltstack in particular, and very little experience with signal handling in projects where correctness actually matters. I’m grateful that you inspect this patch with great scrutiny.

(Btw, that’s why I was very close to just ending a new version of the mutex patch just with the commit message adjusted, because that felt like I could do much less wrong than here. Turns out I was right. O:))

Max




reply via email to

[Prev in Thread] Current Thread [Next in Thread]