Re: call-process should not block process filters from running

From: Spencer Baugh
Subject: Re: call-process should not block process filters from running
Date: Sat, 01 Jul 2023 14:24:59 -0400
User-agent: Gnus/5.13 (Gnus v5.13)

Spencer Baugh <sbaugh@janestreet.com> writes:
> On Wed, Jun 28, 2023 at 8:52 AM Eli Zaretskii <eliz@gnu.org> wrote:
>> > From: Spencer Baugh <sbaugh@janestreet.com>
>> > Cc: app-emacs-dev@janestreet.com
>> > Date: Tue, 27 Jun 2023 17:55:00 -0400
>> >
>> >
>> > When Lisp code calls call-process, then while call-process is running,
>> > all Lisp is blocked from running, including process filters and timers
>> > created beforehand by other Lisp.  This call-process behavior is
>> > harmful, but we can fix call-process to not behave this way.
>> >
>> > This call-process behavior is harmful:
>> > Many packages rely on their process filters and timers being able to
>> > run:
>> > - Packages which communicate over the network (such as ERC)
>> >   rely on being able to respond to heartbeats and prevent timeouts.
>> > - Packages which respond to requests from other programs (such as EXWM)
>> >   rely on being able to respond to those requests.
>> > A simple (shell-command "sleep 60") will cause most such packages to
>> > break or behave poorly.
>>
>> Packages that rely on the above should be defensive in the face of
>> delays, because Emacs calls blocking APIs all over the place;
>> call-process is just one of them.  Moreover, a long-running Lisp
>> program will also block timers and async communications, because Emacs
>> can only be relied upon to run timers and read subprocess output when
>> it is idle; otherwise we need some special calls or signals to arrive
>> to trigger reading input that waits.
>>
>> Therefore, I think the above exaggerates the problem, and also
>> "blames" a single API for something that is ubiquitous in Emacs Lisp
>> programming and should be accounted for by any package that is
>> sensitive to delays.
>>
>> > My suggestion is that we should create a new helper function in Lisp,
>> > perhaps called "process-run", which has the same interface as
>> > call-process, except that while it is running, other process filters and
>> > other Lisp are still able to run.  Then we can move users of
>> > call-process over to this new function.  Most (but perhaps not all) Lisp
>> > using call-process should be using process-run, since most Lisp doesn't
>> > actually want to block process filters from running.
>>
>> I have no objections to adding new functionality that could make
>> call-process less blocking (but rewriting it in Lisp based on
>> make-process and accept-process-output is not the only possible
>> implementation, see below).  But I am firmly against any massive
>> moving of the current users of call-process to this new functionality,
>> for several reasons:
>>   . some Lisp programs might rely on the fact that no other Lisp runs
>>     while the sub-process is running, and no input from any source but
>>     that subprocess can arrive;
>>   . using async subprocesses has its downsides:
>>     - it is hard to read only from a given subprocess without causing
>>       issues to other filters
>>     - reading from async subprocess has known problems when you need
>>       to decode its output
>>     - the timing and synchronization is problematic when using
>>       accept-process-output, and has its pitfalls
>>     - on Windows the number of simultaneous async subprocesses is
>>       limited to a relatively small number
>>   . reimplementing this in Lisp will increase consing, which will
>>     trigger more GCs, which will slow down callers of this new API wrt
>>     call-process
>>
>> So I think we must consider each caller of call-process separately, on
>> a case by case basis, and only switch where (a) the process can indeed
>> take a long time, and (b) after careful auditing of the code and its
>> expectations from the subprocess call.
>>
>> > - This does not require changing the C core of Emacs at all, nor
>> >   changing the call-process implementation.  The existing
>> >   accept-process-output API is sufficient for creating a synchronous
>> >   process-running API which does not block Lisp from running in process
>> >   filters and timers and so on.
>>
>> I would actually suggest considering a different approach, which does
>> need changes in C (but they are relatively simple changes).
>> Reimplementing all the complexities of the C code (don't forget
>> call-process-region) in Lisp will definitely cause bugs and
>> destabilize features that worked flawlessly for decades, so if a way
>> exists that allows us to have a "less blocking" call-process, without
>> requiring such massive reimplementation, I think we should seriously
>> consider it.
>>
>> And AFAIU, such a way does exist.  The implementation of call-process
>> actually forks a subprocess, then reads from its pipe until EOF, and
>> then waits in waitpid for the subprocess to exit.  Both the reading
>> and the waiting are done in a loop.  So one way of making call-process
>> less blocking is by adding to these loops calls to
>> wait_reading_process_output, like we do, for example, in sleep-for,
>> but conditioned by some new variable exposed to Lisp.  Lisp programs
>> which want this behavior could then activate it by binding that new
>> variable.  This could give us most or all of the advantages of
>> non-blocking API without most disadvantages.  (We'd still need to move
>> to this functionality on a case by case basis, because some of the
>> considerations against that could still be valid in individual cases.)
>
> This sounds great, I would be happy to implement this.  I think we
> would also want a tiny wrapper in Lisp which binds this new variable
> then calls call-process, rather than having lots of programs binding
> the variable directly, to make it easier to change the implementation
> strategy in the future.
>
> I'll work on implementing this new variable.  If you have any other
> suggestions for it, let me know.

Just to sanity check before I go down the wrong path: When this variable
is set, instead of doing the reading from the subprocess's pipe in
call-process, we'll need to do it in wait_reading_process_output, so
that other Lisp can run.  We can't just add in calls to
wait_reading_process_output alongside the existing calls to read,
because read is blocking, and we need to process other input if there is
some, even if there's no input from the process started by call-process.  A
similar change will be needed for the waitpid handling, since a plain
waitpid call also blocks until the subprocess exits.
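
To make the shape of that change concrete, here is a minimal standalone
sketch in plain POSIX C.  It is not Emacs code and does not use
wait_reading_process_output; select stands in for it, and every name in
it (run_and_collect, wait_for_input, service_other_input,
other_input_serviced) is hypothetical.  It only illustrates the two
points above: read is called only after select reports the child's pipe
ready, so other input is serviced in the meantime, and waitpid is polled
with WNOHANG rather than called in its blocking form.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/select.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for "other Lisp got to run": counts how many
   times the loop serviced input that did not come from the child.  */
static int other_input_serviced;

/* Consume up to one buffer of ready input from FD.  Return 0 once FD is
   at EOF or fails, so the caller can stop watching it.  */
static int
service_other_input (int fd)
{
  char buf[256];
  if (read (fd, buf, sizeof buf) <= 0)
    return 0;
  other_input_serviced++;
  return 1;
}

/* Wait up to TIMEOUT (NULL means forever) for input on CHILD_FD or
   *OTHER_FD (either may be -1 for "none"), servicing *OTHER_FD if ready.
   Return 1 if CHILD_FD is readable, 0 if not, -1 on a hard error.  */
static int
wait_for_input (int child_fd, int *other_fd, struct timeval *timeout)
{
  fd_set rfds;
  int maxfd = child_fd > *other_fd ? child_fd : *other_fd;
  FD_ZERO (&rfds);
  if (child_fd >= 0)
    FD_SET (child_fd, &rfds);
  if (*other_fd >= 0)
    FD_SET (*other_fd, &rfds);
  int n = select (maxfd + 1, &rfds, NULL, NULL, timeout);
  if (n < 0)
    return errno == EINTR ? 0 : -1;
  if (n == 0)
    return 0;
  if (*other_fd >= 0 && FD_ISSET (*other_fd, &rfds)
      && !service_other_input (*other_fd))
    *other_fd = -1;
  return child_fd >= 0 && FD_ISSET (child_fd, &rfds);
}

/* Run ARGV, collecting its stdout into OUT (NUL-terminated, truncated to
   OUTSIZE - 1 bytes) while also watching OTHER_FD (-1 for none).
   Return the child's exit status, or -1 on error.  */
static int
run_and_collect (char *const argv[], int other_fd, char *out, size_t outsize)
{
  int pipefd[2], status;
  size_t used = 0;
  assert (outsize > 0);
  out[0] = '\0';
  if (pipe (pipefd) != 0)
    return -1;
  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      /* Child: route stdout into the pipe, then exec.  */
      dup2 (pipefd[1], STDOUT_FILENO);
      close (pipefd[0]);
      close (pipefd[1]);
      execvp (argv[0], argv);
      _exit (127);
    }
  close (pipefd[1]);

  /* Phase 1: read until EOF, but call read only once select says the
     pipe is ready, so input on OTHER_FD is never starved.  */
  for (;;)
    {
      char buf[4096];
      int ready = wait_for_input (pipefd[0], &other_fd, NULL);
      if (ready < 0)
        break;
      if (ready == 0)
        continue;
      ssize_t n = read (pipefd[0], buf, sizeof buf);
      if (n <= 0)
        break;
      size_t room = outsize - 1 - used;
      size_t take = (size_t) n < room ? (size_t) n : room;
      memcpy (out + used, buf, take);
      used += take;
    }
  out[used] = '\0';
  close (pipefd[0]);

  /* Phase 2: poll for exit with WNOHANG instead of blocking in waitpid,
     still servicing OTHER_FD between polls.  */
  for (;;)
    {
      pid_t r = waitpid (pid, &status, WNOHANG);
      if (r == pid)
        break;
      if (r < 0 && errno != EINTR)
        return -1;
      struct timeval tv = { 0, 10000 };   /* 10 ms between polls */
      wait_for_input (-1, &other_fd, &tv);
    }
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

A caller would pass the read end of some other pipe or socket as
OTHER_FD; in Emacs the equivalent of service_other_input would be
running process filters and timers, which is exactly the part that needs
care on a case by case basis.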
