Re: Concurrency via isolated process/thread

From: Ihor Radchenko
Subject: Re: Concurrency via isolated process/thread
Date: Sun, 09 Jul 2023 15:49:41 +0000

Eli Zaretskii <eliz@gnu.org> writes:

>> I now understand that the gap can be moved by the code that is not
>> actually writing text in buffer. However, I do not see how this is a
>> problem we need to care about more than about generic problem with
>> simultaneous write.
> Imagine a situation where we need to process XML or HTML, and since
> that's quite expensive, we want to do that in a thread.  What you are
> saying is that this will either be impossible/impractical to do from a
> thread, or will require to lock the entire buffer from access, because
> the above processing moves the gap.  If that is not a problem, I don't
> know what is, because there could be a lot of such scenarios, and they
> all will be either forbidden or very hard to implement.

In this particular example, the need to move the gap is there because
htmlReadMemory requires a contiguous memory segment as input. Obviously,
that requires blocking in the current implementation.

Can it be done in an async-safe way? We would need to memcpy the buffer
block being parsed, or use up to 3 memcpy calls if we do not want to
move the gap.
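
To illustrate copying text out without touching the gap, here is a
minimal sketch on a toy gap buffer. The struct and function names are
made up for illustration and are not the Emacs `struct buffer' layout.
In this toy model the text itself takes one or two copies, depending on
whether the region straddles the gap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy gap buffer (hypothetical, not the Emacs layout): text occupies
   data[0 .. gap_start) and data[gap_end .. size).  */
struct toy_gapbuf {
  const char *data;
  size_t gap_start;   /* first byte of the gap */
  size_t gap_end;     /* first byte after the gap */
  size_t size;        /* total allocated bytes, gap included */
};

/* Copy LEN bytes of text starting at logical offset FROM into DST
   without moving the gap.  Returns the number of memcpy calls used.  */
static int
toy_copy_region (const struct toy_gapbuf *b, size_t from, size_t len,
                 char *dst)
{
  size_t gap_len = b->gap_end - b->gap_start;
  assert (from + len <= b->size - gap_len);

  if (from + len <= b->gap_start)
    {
      /* Region lies entirely before the gap.  */
      memcpy (dst, b->data + from, len);
      return 1;
    }
  if (from >= b->gap_start)
    {
      /* Region lies entirely after the gap: skip over it.  */
      memcpy (dst, b->data + from + gap_len, len);
      return 1;
    }
  /* Region straddles the gap: copy the two halves separately.  */
  size_t head = b->gap_start - from;
  memcpy (dst, b->data + from, head);
  memcpy (dst + head, b->data + b->gap_end, len - head);
  return 2;
}
```

The copy still has to happen while nobody else modifies the buffer, so
this only shortens the time the buffer must stay locked; the parser can
then run on the private copy.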

>> If a variable or object value is being written, we need to block it.
>> If a buffer object is being written (like when moving the gap or writing
>> text), we need to block it. And this blocking will generally pose a
>> problem only when multiple threads try to access the same object, which
>> is generally unlikely.
> My impression is that this is very likely, because of the many global
> objects in Emacs.

There are many objects, but each individual thread will use only a
subset of them. How likely is it that these subsets intersect? Not very,
except for certain frequently used objects. So we need to focus on
identifying these likely-to-clash objects and figuring out what to do
with them.

> ... Moreover, if you intend to allow several threads
> using the same buffer (and I'm not yet sure whether you want that or
> not),

It would be nice if multiple threads could work with the same buffer in
read-only mode, maybe with a single main thread editing the buffer (and
pausing the async read-only threads while doing so).
Writing simultaneously is a much bigger ask.

> ... then the buffer-local variables of that buffer present the same
> problem as global variables.  Take the case-table or display-table,
> for example: those are buffer-local in many cases, but their changes
> will affect all the threads that work on the buffer.

And how frequently are case-table and display-table changed? AFAIK, not
frequently at all.

>>    We need to ensure that simultaneous consing will never happen. AFAIU,
>>    it should be ok if something that does not involve consing is running
>>    at the same time with cons (correct me if I am wrong here).
> What do you do if some thread hits the memory-full condition?  The
> current handling includes GC.

Could you please explain a bit more about the situation you are
referring to? My statement above was about consing, not GC.

For GC, as I mentioned earlier, we can pause each thread once maybe_gc()
determines that GC is necessary, until all the threads are paused. Then,
GC is executed and the threads continue.
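
The bookkeeping for such a pause could look roughly like the sketch
below. It is a single-threaded model only: every name in it is
hypothetical, no real threads or locks are involved, and each call to
toy_safepoint stands for one thread reaching a point where it can be
paused.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_THREADS 8

/* Bookkeeping for a stop-the-world pause (hypothetical names).  */
struct toy_world {
  int n_threads;                 /* threads taking part */
  int n_paused;                  /* threads currently parked */
  bool gc_requested;             /* set once some thread needs GC */
  bool paused[TOY_MAX_THREADS];
  int gc_runs;                   /* how many collections have run */
};

/* Called by a thread whose allocation check decided GC is needed.  */
static void
toy_request_gc (struct toy_world *w)
{
  w->gc_requested = true;
}

/* Called by thread ID at a safepoint.  Returns true if the thread must
   stay parked, false if it may keep running.  The last thread to park
   runs the collection and releases everybody.  */
static bool
toy_safepoint (struct toy_world *w, int id)
{
  assert (w->n_threads <= TOY_MAX_THREADS);
  assert (id >= 0 && id < w->n_threads);
  if (!w->gc_requested)
    return false;
  if (!w->paused[id])
    {
      w->paused[id] = true;
      w->n_paused++;
    }
  if (w->n_paused < w->n_threads)
    return true;                 /* wait for the stragglers */

  /* Everyone is parked: collect, then resume all threads.  */
  w->gc_runs++;
  w->gc_requested = false;
  w->n_paused = 0;
  for (int i = 0; i < w->n_threads; i++)
    w->paused[i] = false;
  return false;
}
```

In real code the parked threads would sleep on a condition variable
rather than return, and the counters would need to be protected by a
mutex.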

>> 2. Redisplay cannot be asynchronous in a sense that it does not make
>>    sense that multiple threads, possibly working with different buffers
>>    and different points in those buffers, request redisplay
>>    simultaneously. Of course, it is impossible to display several places
>>    in a buffer at once.
> But what about different threads redisplaying different windows? is
> that allowed?  If not, here goes one more benefit of concurrent
> threads.

I think I need to elaborate on what I mean by "redisplay cannot be
asynchronous".

If an async thread wants to request redisplay, it should be possible. But
the redisplay itself must not be done by this same thread. Instead, the
thread will send a request that Emacs needs redisplay and optionally
block until that redisplay finishes (optionally, because something like
displaying notification may not require waiting). The redisplay requests
will be processed separately.
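
A rough sketch of that request/ticket idea follows. All names are
hypothetical, locking is left out, and serving several pending requests
with a single redisplay cycle is my assumption for the sketch, not
something Emacs does today.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical redisplay request counter.  Locking is omitted: with
   real threads every access would need synchronization.  */
struct toy_redisplay_queue {
  unsigned long posted;     /* tickets handed out so far */
  unsigned long completed;  /* tickets the main thread has served */
};

/* Called from an async thread: record a request, return its ticket.  */
static unsigned long
toy_post_redisplay (struct toy_redisplay_queue *q)
{
  return ++q->posted;
}

/* Called only from the main thread: serve all pending requests with
   one redisplay cycle.  Returns how many requests were satisfied.  */
static unsigned long
toy_process_redisplay (struct toy_redisplay_queue *q)
{
  unsigned long served = q->posted - q->completed;
  /* A real implementation would run redisplay here.  */
  q->completed = q->posted;
  return served;
}

/* A thread that chose to wait checks this until its ticket has been
   served; a thread that does not care simply never asks.  */
static bool
toy_redisplay_done (const struct toy_redisplay_queue *q,
                    unsigned long ticket)
{
  assert (ticket != 0 && ticket <= q->posted);
  return q->completed >= ticket;
}
```

A notification-style caller would call toy_post_redisplay and move on,
while a caller that needs the screen updated would wait until
toy_redisplay_done returns true for its ticket.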

Is Emacs display code even capable of redisplaying two different windows
at the same time?

> Also, that issue with prompting the user also needs some solution,
> otherwise the class of jobs that non-main threads can do will be even
> smaller.

We can handle reading input with an idea similar to the above, but it
will always block until the response arrives.

For non-blocking input, you said that it has been discussed.
I do vaguely recall such discussion in the past and I even recall some
ideas about it, but it would be better if you can link to that
discussion, so that the participants of this thread can review the
previously proposed ideas.

>>    Only a single `main-thread' should be allowed to modify frames,
>>    window configurations, and generally trigger redisplay. And thread
>>    that attempts to do such modifications must wait to become
>>    `main-thread' first.
> What about changes to frame-parameters?  Those don't necessarily
> affect display.

But doesn't it depend on the graphic toolkit? I got the impression (from
Po Lu's replies) that graphic toolkits generally do not handle async
requests well.

>>    This means that any code that is using things like
>>    `save-window-excursion', `display-buffer', and other display-related
>>    staff cannot run asynchronously.
> What about with-selected-window? also forbidden?

Yes. A given frame must always have a single window active, which is not
compatible with async threads.
In addition, `with-selected-window' triggers redisplay. In particular,
it triggers redisplaying mode-lines.

It is a problem similar to async redisplay.

>>    Async threads will make an assumption that
>>    (set-buffer "1") (goto-char 100) (set-buffer "2") (set-buffer "1")
>>    (= (point) 100) invalid.
> If this is invalid, I don't see how one can write useful Lisp
> programs, except of we request Lisp to explicitly define critical
> sections.

Hmm. I realized that it is already invalid. At least, if `thread-yield'
is triggered somewhere between the `set-buffer' calls and another thread
happens to move point in buffer "1".

But I realize that something like

(while (re-search-forward "foo" nil t)
  (with-current-buffer "bar" (insert (match-string 0))))

may be broken if point is moved when switching between "bar" and the
buffer being searched.

Maybe the last PT, ZV, and BEGV should not be stored in the buffer
object upon switching away and should instead be recorded in a
thread-local ((buffer PT ZV BEGV) ...) alist. Then a thread would set
PT, ZV, and BEGV from its local alist rather than by reading
buffer->... values.
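
On the C side, the per-thread record of point and the accessible region
could be sketched as below. The struct layout, the fixed-size table, and
all function names are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_SAVED 32

/* One saved entry: which buffer, plus this thread's view of point and
   of the accessible region.  Hypothetical layout, not Emacs's own.  */
struct toy_saved_pos {
  const void *buffer;       /* identity of the buffer */
  long pt, begv, zv;
};

/* Per-thread table, playing the role of the thread-local alist.  */
struct toy_thread_state {
  struct toy_saved_pos saved[TOY_MAX_SAVED];
  size_t n_saved;
};

/* Record this thread's positions when it switches away from BUFFER.
   Returns 0 on success, -1 if the table is full.  */
static int
toy_save_pos (struct toy_thread_state *t, const void *buffer,
              long pt, long begv, long zv)
{
  assert (begv <= pt && pt <= zv);
  struct toy_saved_pos entry = { buffer, pt, begv, zv };
  for (size_t i = 0; i < t->n_saved; i++)
    if (t->saved[i].buffer == buffer)
      {
        t->saved[i] = entry;      /* overwrite the older record */
        return 0;
      }
  if (t->n_saved == TOY_MAX_SAVED)
    return -1;
  t->saved[t->n_saved++] = entry;
  return 0;
}

/* Look up this thread's positions when it switches back to BUFFER.
   Returns NULL if the thread has never visited it, in which case the
   caller would fall back to the values stored in the buffer.  */
static const struct toy_saved_pos *
toy_lookup_pos (const struct toy_thread_state *t, const void *buffer)
{
  for (size_t i = 0; i < t->n_saved; i++)
    if (t->saved[i].buffer == buffer)
      return &t->saved[i];
  return NULL;
}
```

With one such table per thread, two threads visiting the same buffer
each get back their own point after switching away and returning. The
saved positions would still have to be adjusted (or kept as markers)
when another thread inserts or deletes text, which the sketch ignores.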

>> > What if the main thread modifies buffer text, while one of the other
>> > threads wants to read from it?
>> Reading and writing should be blocked while buffer is being modified.
> This will basically mean many/most threads will be blocked most of the
> time.  Lisp programs in Emacs read and write buffers a lot, and the
> notion of forcing a thread to work only on its own single set of
> buffers is quite a restriction, IMO.

But not the same buffers!

>> >> >> For example, `org-element-interpret-data' converts Org mode AST to
>> >> >> string. Just now, I tried it using AST of one of my large Org buffers.
>> >> >> It took 150seconds to complete, while blocking Emacs.
>> >> >
>> >> > It isn't side-effect-free, though.
>> >> 
>> >> It is, just not declared so.
>> >
>> > No, it isn't.  For starters, it changes obarray.
>> Do you mean `intern'? `intern-soft' would be equivalent there.
> "Equivalent" in what way?  AFAIU, the function does want to create a
> symbol when it doesn't already exist.

(intern (format "org-element-%s-interpreter" type)) is just to retrieve
existing function symbol used for a given AST element type.

                      (let ((fun (intern-soft
                                  (format "org-element-%s-interpreter" type))))
                        (if (and fun (fboundp fun)) fun (lambda (_ contents) 

would also work.

To be clear, I do know how this function is designed to work.
It may not be de facto pure, but that's just because nobody tried to
ensure it - the usefulness of pure declarations is questionable in Emacs.

>> There will indeed be a lot of work to make the range of Lisp functions
>> available for async code large enough. But it does not have to be done
>> all at once.
> No, it doesn't.  But until we have enough of those functions
> available, one will be unable to write applications without
> implementing and debugging a lot of those new functions as part of the
> job.  It will make simple programming jobs much larger and more
> complicated, especially since it will require the programmers to
> understand very well the limitations and requirements of concurrent
> code programming, something Lisp programmers don't know very well, and
> rightfully so.

I disagree.
If Emacs supports async threads, it does not mean that every single
piece of Elisp should be async-compatible.
But if a programmer is explicitly writing async code, it is natural to
expect limitations.

Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
