[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: wait_reading_process_ouput hangs in certain cases (w/ patches)

From: Eli Zaretskii
Subject: Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
Date: Thu, 26 Oct 2017 19:23:06 +0300

> Cc: address@hidden
> From: Matthias Dahl <address@hidden>
> Date: Thu, 26 Oct 2017 16:07:31 +0200
> When committing, Magit prepares a COMMIT_MSG buffer and does some
> process magic of its own which is pretty much irrelevant for this.
> At some point during that, while we are already in an instance of
> wait_reading_process_output (due to sit_for), the post-command-hooks are
> run.

AFAIK, post-command-hooks cannot be run while we are in sit-for, but I
guess this is not relevant to the rest of the description?

> And here things get interesting. Eventually flyspell-post-command-hook
> is run which executes flyspell-word synchronously. That basically does
> write out a word that needs to be checked to the spellchecker process,
> waits for the results from stdin via accept-process-output and goes on.
> Of special note here is that it a) specifies a wait_proc (spellchecker
> process) and no timeout or whatsoever.
> The output from the spellchecker is usually there instantaneously, so
> that is actually unnoticeable, unless wait_reading_process_output, that
> was invoked through that specific accept-process-output, decides to run
> the timers.
> And here comes the catch: At this point, usually the spellchecker output
> is already available but not yet read. When the timers run, one of them
> calls accept-process-output again which will read the entire available
> output of the spellchecker process.

I understand that this timer calls accept-process-output with its
argument nil, is that correct?  If so, isn't that a bug for a timer to
do that?  Doing that runs the risk of eating up output from some
subprocess for which the foreground Lisp program is waiting.

So please point out the timer that does this, because I think that
timer needs to be fixed.

> The gist of it is: If we have an active wait_reading_process_output call
> with a wait_proc set but no timeout that calls out to either timers or
> filters, it is entirely possible that those directly or indirectly call
> us again recursively, thus reading the output we are waiting for without
> us ever noticing it, if no further output becomes available in addition
> to what was read unnoticed... like it happens with flyspell.
> That is what my patches fix: They simply add a bytes read metric to each
> process structure that we can check for change at strategically relevant
> points and decide if we got some data back that went unnoticed and break
> out from wait_reading_process_output.

We already record the file descriptors on which we wait for process
output, see compute_non_keyboard_wait_mask.  Once
wait_reading_process_output exits, it clears these records.  So it
should be possible for us to prevent accept-process-output calls
issued by such runaway timers from waiting on the descriptors that are
already "taken": if, when we set the bits in the pselect mask, we find
that some of the descriptors are already watched by the same thread as
the current thread, we could exclude them from the pselect mask we are
setting up.  Wouldn't that be a better solution?  Because AFAIU, your
solution just avoids an infinite wait, but doesn't avoid losing the
process output, because it was read by the wrong Lisp code.  Right?

> But I also think that wait_reading_process_output violates its contract
> and is buggy in this regard as well, since it should properly function
> even if it calls out to filters or timers -- and it clearly does not and
> I would wager more hangs seen in the wild that weren't debugged, could
> be attributed to this very bug.

I think the basic contract that is violated here is that the output
from a subprocess is "stolen" by an unrelated call to
accept-process-output, and we should prevent that if we can.

> If there are any question marks left hanging over your head, please
> don't hesitate to ask and I will try my best to clear them up -- but it
> might end up being another longish mail, so be warned. ;)

Well, I'd like to eyeball the timer which commits this crime.

reply via email to

[Prev in Thread] Current Thread [Next in Thread]