Re: [BUG] Bash not reacting to Ctrl-C


From: Linus Torvalds
Subject: Re: [BUG] Bash not reacting to Ctrl-C
Date: Wed, 9 Feb 2011 13:57:35 -0800

On Wed, Feb 9, 2011 at 1:18 PM, Bob Proulx <bob@proulx.com> wrote:
>
> Since the exit status of /bin/true is ignored then I think that test
> case is flawed.  I think at the least needs to check the exit status
> of the /bin/true process.
>
>  bash -c 'while true; do /bin/true || exit 1; done'

The "|| exit 1" doesn't make any sense. If you seriously claim that
that is needed for ^C to work reliably, you're just totally mistaken.

Your whole premise that you should look at the error return code is
total and utter crap. Lookie here:

   while : ; do sleep 1; done

which is *exactly* the same case, and dammit, if ^C doesn't break out
of that loop, then the shell is a broken POS. Agreed? If you tell me
that it needs a "|| exit 1", you're just broken.

Try it.

And now go back to the original case. The same "^C should break out"
is true when you replace "sleep" with "/bin/true" or with anything
else. It had better break out every single time, on the first try.

And it really doesn't. And it's a bash bug. I don't understand why
bash people can't accept that. People even debugged it to the
particular line of source code in bash.

I just tried it:

  [torvalds@i5 ~]$ while : ; do /bin/sleep 1; done
  ^C
  [torvalds@i5 ~]$ while : ; do /bin/true; done
  ^C^C^C
  [torvalds@i5 ~]$ while : ; do /bin/true; done
  ^C
  [torvalds@i5 ~]$ while : ; do /bin/true; done
  ^C
  [torvalds@i5 ~]$ while : ; do /bin/true; done
  ^C^C^C

and the thing to notice is that it clearly is very much about some
race condition. Sometimes it works on the first try, sometimes it
doesn't.

Why are you arguing? Why are you bringing up totally idiotic
arguments, while others are ignoring it because they can't reproduce
it?

There were people who reproduced this on OS X too, btw, so it clearly
is not a Linux issue, even if you put your blinders on and ignore the
fact that it was already root-caused by Oleg. The problem is that
'set_job_status_and_cleanup()' does that

   if (wait_sigint_received && (WTERMSIG (child->status) == SIGINT) && ..

which just looks totally buggy and racy. There's even a comment about
it in the bash source code, for chrissake!

Here's the scenario:

 - wait_for() sets wait_sigint_received to zero (look for the comment
here!), and installs the sigint handler
 - it does other things too, but it does waitchld() that does the
actual waitpid() system call
 - now, imagine the following scenario: the ^C happens just as the
child already exited successfully!
 - so bash itself gets the sigint, and sets wait_sigint_received to 1

So what happens? child->status will be successful (the child was not
interrupted by the signal, it exited at just the right time), but bash
saw the SIGINT. But because it thinks it needs to see *both* the
sigint _and_ the WTERMSIG(child->status)==SIGINT, bash essentially
ignores the ^C.

Note how bash magically would have worked correctly if the child
process had taken one extra millisecond, and also seen the ^C and died
of it.

Notice how bash acts differently based on that millisecond difference?
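
The scenario above can be sketched as a small decision function. This is only an illustration, not bash's code: the function names are made up, the child's wait status is reduced to a single boolean, and the remaining "&& .." terms of the real condition (elided in this mail) are left out. The second policy is a hypothetical alternative for comparison, not the fix bash adopted.

```c
#include <assert.h>
#include <stdbool.h>

/* Policy as quoted from set_job_status_and_cleanup(): react to ^C only when
   the shell's own handler ran AND the child was itself killed by SIGINT.
   The other terms of the real condition are elided in the mail and are
   not modelled here. */
bool reacts_if_both(bool wait_sigint_received, bool child_died_of_sigint)
{
    return wait_sigint_received && child_died_of_sigint;
}

/* Hypothetical alternative (an assumption for comparison, not bash's fix):
   the shell having seen SIGINT is enough, whatever the child's status says. */
bool reacts_if_shell_saw_it(bool wait_sigint_received, bool child_died_of_sigint)
{
    (void)child_died_of_sigint;
    return wait_sigint_received;
}

/* Walk through the two timings described above. */
void race_window_example(void)
{
    /* Child was still running when ^C arrived: both see it, the loop breaks. */
    assert(reacts_if_both(true, true));
    /* Child had just exited successfully: only bash sees the ^C, and the
       conjunction drops it. */
    assert(!reacts_if_both(true, false));
    /* The alternative policy breaks out in both timings. */
    assert(reacts_if_shell_saw_it(true, true));
    assert(reacts_if_shell_saw_it(true, false));
}
```

Calling race_window_example() from a main() runs the four cases. The only input that differs between the first two is whether the child happened to still be alive when the signal arrived.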

So it's a bug.  Please don't make inane and incorrect excuses for it
("you didn't have an '|| exit 1' there"), and please don't say "I can't
reproduce it". Even without reproducing it, just looking at the source
code should be good enough, no?

Especially as Oleg already pinpointed the exact line for you.

Now, it does look like the problem is at least partly because bash has
a horrible time trying to figure out a truly ambiguous case: did the
child process explicitly ignore the ^C or not? It looks like bash is
trying to basically ignore the ^C in the case the child ignored it. I
think that's misguided, but that does seem to be what bash is trying
to do. It's misguided exactly because there is absolutely no way to
know whether the child returned successfully because it just happened
to exit just before the ^C came in, or whether it blocked ^C and
ignored it. So even _trying_ to make that judgement call seems to be a
bad idea.
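
The ambiguity can be seen from the parent's side with a minimal sketch, assuming a POSIX system. The three child bodies and the wait_status_of() helper are hypothetical stand-ins for real commands, and raise() stands in for the ^C from the terminal. A child that exits before the signal and a child that ignores it both come back as a normal exit with code 0; only the child that dies of the signal looks different.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child that finishes before any ^C arrives. */
void exits_before_sigint(void)
{
    _exit(0);
}

/* Child that explicitly ignores SIGINT, receives one, and exits normally. */
void ignores_sigint(void)
{
    signal(SIGINT, SIG_IGN);
    raise(SIGINT);   /* the "^C" arrives but is ignored */
    _exit(0);
}

/* Child with the default disposition: SIGINT terminates it. */
void dies_of_sigint(void)
{
    signal(SIGINT, SIG_DFL);
    raise(SIGINT);   /* default action: terminate */
    _exit(0);        /* not reached */
}

/* Fork, run `body` in the child, and return the raw waitpid() status. */
int wait_status_of(void (*body)(void))
{
    int status = 0;
    pid_t pid = fork();
    assert(pid >= 0);
    if (pid == 0) {
        body();
        _exit(0);
    }
    pid_t waited = waitpid(pid, &status, 0);
    assert(waited == pid);
    return status;
}
```

For the first two bodies WIFEXITED() is true and WEXITSTATUS() is 0, so nothing in the status lets the parent tell them apart. For the third, WIFSIGNALED() is true and WTERMSIG() is SIGINT.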

And no, I don't know bash sources all that well. I played around with
them a long time ago, and for this I only glanced at it quickly to get
more of a view into what bash is trying to do (all thanks should go to
Oleg who already pinpointed the line that breaks). Maybe there are
subtle issues, maybe there are broken historical shell semantics here.

But please don't ignore this bug just because you cannot reproduce it.

                                          Linus


