Re: wait_reading_process_ouput hangs in certain cases (w/ patches)

From: Matthias Dahl
Subject: Re: wait_reading_process_ouput hangs in certain cases (w/ patches)
Date: Thu, 26 Oct 2017 20:56:03 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0

Hello Eli...

On 26/10/17 18:23, Eli Zaretskii wrote:

> AFAIK, post-command-hooks cannot be run while we are in sit-for, but I
> guess this is not relevant to the rest of the description?

This probably comes from server.el (server-visit-files) because Magit
uses emacsclient for some of its magic.

I have attached a backtrace, taken during the hang. Unfortunately it is
from an optimized build (I would have needed to recompile just now, and
I am a bit in a hurry), but it at least shows the callstack, more or less.

> I understand that this timer calls accept-process-output with its
> argument nil, is that correct?  If so, isn't that a bug for a timer to
> do that?  Doing that runs the risk of eating up output from some
> subprocess for which the foreground Lisp program is waiting.

To be quite honest, I haven't actually checked which timer it is, since
I didn't think of it as a bug at all.

Correct me if I am wrong, but calling accept-process-output without
arguments is expected to be quite harmless and can be useful. If you
name a specific process, you will most definitely wait at least as long
as it takes for that process to produce any output.

Nevertheless: if I am not completely mistaken, no data is lost at
all. It is read and passed to the filter function which was registered
by the interested party -- otherwise the default filter will simply
append it to the buffer it belongs to.

The only thing that is lost is the information that the data was ever
read at all, so the outer caller keeps waiting for output that has
already arrived, and an endless wait begins.
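
To make the scenario concrete, here is a minimal, self-contained C sketch.
It is not Emacs code: `struct proc`, `read_and_filter`, `fd_ready`,
`outer_wait` and the `nbytes_read` counter are hypothetical names invented
for illustration, and the counter is only one possible way for the outer
waiter to notice that a nested call already consumed the output.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical stand-in for a process object; not the real Emacs struct. */
struct proc {
  int fd;              /* read end of the pipe to the subprocess */
  size_t nbytes_read;  /* total bytes ever read from fd (assumed counter) */
  char buf[256];       /* what the "filter" has collected so far */
  size_t buf_len;
};

/* Read one chunk of available data and hand it to the "filter",
   which here simply appends it to the buffer.  Nothing is lost.  */
void read_and_filter(struct proc *p)
{
  assert(p != NULL);
  ssize_t n = read(p->fd, p->buf + p->buf_len, sizeof p->buf - p->buf_len);
  if (n > 0) {
    p->buf_len += (size_t) n;
    p->nbytes_read += (size_t) n;
  }
}

/* Return 1 if fd is readable right now, 0 otherwise (zero-timeout poll). */
int fd_ready(int fd)
{
  fd_set rfds;
  struct timeval tv = {0, 0};
  FD_ZERO(&rfds);
  FD_SET(fd, &rfds);
  return select(fd + 1, &rfds, NULL, NULL, &tv) > 0;
}

/* Simulate an outer wait during which a nested call (e.g. from a timer)
   reads the pending output first.  Returns 1 if the outer waiter learns
   that output arrived, 0 if it would go on to block in select.  */
int outer_wait(struct proc *p, int use_counter)
{
  size_t before = p->nbytes_read;
  read_and_filter(p);               /* the nested reader gets there first */
  if (use_counter && p->nbytes_read != before)
    return 1;                       /* data arrived while we were waiting */
  return fd_ready(p->fd);           /* descriptor is drained: not ready */
}
```

With pending data on the pipe, `outer_wait(&p, 0)` reports that the outer
waiter would block even though the filter already received the bytes,
while `outer_wait(&p, 1)` notices the change in the counter and returns.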

> So please point out the timer that does this, because I think that
> timer needs to be fixed.

If you still need that, I will do some digging and try to find it.

> We already record the file descriptors on which we wait for process
> output, see compute_non_keyboard_wait_mask.  Once
> wait_reading_process_output exits, it clears these records.  So it
> should be possible for us to prevent accept-process-output calls
> issued by such runaway timers from waiting on the descriptors that are
> already "taken": if, when we set the bits in the pselect mask, we find
> that some of the descriptors are already watched by the same thread as
> the current thread, we could exclude them from the pselect mask we are
> setting up.  Wouldn't that be a better solution?  Because AFAIU, your
> solution just avoids an infinite wait, but doesn't avoid losing the
> process output, because it was read by the wrong Lisp code.  Right?

Hm... at the moment I don't see where data is lost with my solution.
Maybe I am being totally blind and making a fool out of myself but I
honestly don't see it.

What you suggest could be dangerous as well, depending on how it is
implemented and the circumstances. What fds get excluded in recursive
calls? Only wait_proc ones? Or every one that is watched somewhere up
in the callstack? Depending on what we do, we could end up with an
almost empty list that doesn't become "ready" as easily as one would
expect from a bare accept-process-output call... and it could thus
potentially stall as well. Worst case, I know.
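
A small C sketch of that concern, under stated assumptions: the helper
`build_wait_mask` and the `taken` set are hypothetical and do not
correspond to any existing Emacs function. It only shows that excluding
every descriptor watched further up the callstack can leave a nested call
with nothing to wait on.

```c
#include <assert.h>
#include <sys/select.h>

/* Copy every descriptor in ALL that is not in TAKEN into OUT and
   return how many descriptors remain.  MAX_FD is the highest
   descriptor number to consider.  */
int build_wait_mask(fd_set *all, fd_set *taken, int max_fd, fd_set *out)
{
  int remaining = 0;
  assert(max_fd < FD_SETSIZE);
  FD_ZERO(out);
  for (int fd = 0; fd <= max_fd; fd++)
    if (FD_ISSET(fd, all) && !FD_ISSET(fd, taken)) {
      FD_SET(fd, out);
      remaining++;
    }
  return remaining;
}
```

If the function returns 0, a select or pselect call on `out` without a
timeout would never return because of process output, which is the stall
described above.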

Just thinking out loud here. I would really need to check this, those
are just my initial thoughts.

> Well, I'd like to eyeball the timer which commits this crime.

If you still do, let me know and I will try to track it down...

So long,

Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu

Attachment: emacs-bt.txt
Description: Text document
