bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#16775: dbus interacts poorly with lisp-level event loops


From: Daniel Colascione
Subject: bug#16775: dbus interacts poorly with lisp-level event loops
Date: Mon, 17 Feb 2014 07:22:26 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0

On 02/17/2014 07:07 AM, Michael Albinus wrote:
Daniel Colascione <dancol@dancol.org> writes:

On 02/17/2014 05:27 AM, Michael Albinus wrote:
Daniel Colascione <dancol@dancol.org> writes:

dbus-call-method expects read-event to return the dbus event
immediately, but read_char in keyboard.c treats the dbus event as a
special event and runs it through special-event-map itself before
sitting and reading another event. The event waiting loop always times
out, so dbus-call-method always takes at least 100ms due to the
hard-coded 0.1 timeout parameter to read-event.

dbus-call-method does not expect the D-Bus event to be returned by
read-event.

Then why was it calling dbus-check-event on the result? I checked in a
hack that addresses the immediate issue.

Just for safety. Likely, I was not convinced the time I was writing
this, that it couldn't be returned by read-event. Don't remember
exactly.

Does it really hurt?

It certainly makes it more difficult to determine the intent of the code.

It simply calls read-event in order to trigger event
handling. The loop itself checks, whether the respective event has been
inserted into dbus-return-values-table. And when *other* but D-Bus events
do arrive in the meantime, they must be preserved in unread-command-events.

Why is it a problem to wait at least 100ms? D-Bus messages are not
expected to perform in real time (whatever this means).

Because secrets.el was taking a whole second to load due almost
entirely to dbus delays.

Well, that's serious. I'll check.

I used time emacs -Q -batch --eval "(require 'secrets)".

However, is it just secrets.el loading, or also normal operation in
secrets.el, which are delayed

I don't know. Secrets has other problems that I have to fix separately. (By the way: dbus appears to hang if there's an error in a message spec and the bus disconnects from underneath us. Try (dbus-call-method :session "org.freedesktop.secrets" "/org/freedesktop/secrets/collection/login" "org.freedesktop.Secret.Collection" "SearchItems" '(:array (:dict-entry "host" ("xxxx" "xxxx")) (:dict-entry "port" "imaps"))). A signal of some sort pertaining to the disconnection is delivered to xd_read_message_1, but it drops this signal on the floor, leading dbus-call-method to loop until its timeout expires.


This problem is hairy: special-event-map functions can execute arbitrary
code and re-enter the dbus synchronous event loop, and there's no way to
non-locally terminate a particular read-event loop. Here's the
problematic scenario: dbus-call-method works by setting up an
asynchronous dbus call and calling read-event until the specific
asynchronous call on which it is waiting completes.

Why do you want to terminate non-locally in dbus-call-method? If you
need asynchronous behaviour, there is dbus-call-method-asynchronously.

The goal is to make dbus-call-method return as soon as the method call
is complete.

We're speaking about 0.1sec delay. Does it really hurt? (Yes, I'll check
the secrets.el case).

Yes, 100 milliseconds is far about the threshold at which interactivity is visibly affected. It's completely unacceptable. It's worse when a single logical operation involves multiple dbus calls.

The immediate problem is that read-event never actually returns because
the dbus event is special

As said above this is not a problem but intended.

I find it hard to believe that the overall effects were
intentional. Randomly delaying all of Emacs because something tried to
make a dbus call is completely unacceptable.

You misunderstand me. Intentionally?

No. You seem to be operating under the something interesting notion that it's reasonable for an interactive program to simply stop responding for 100ms. The threshold of human perception is widely regarded to be somewhere in the 30ms neighborhood.

It is not intended to delay Emacs. But there was an error with the old
implementation (Emacs was blocked sometimes during dbus-call-method);
that's why it is changed to be called asynchronously internally. By
this, you could interrupt dbus-call-method.

Right. I looked up the change history, and making dbus-call-method async was the right choice. The event delivery, however, needs to be refined. There's no reason, in principle, that dbus-call-method shouldn't be able to return instantly on call completion.


--- but let's say we worked around that
problem by modifying special-event-map around the read-event call so
that read-event returned immediately. We'd still have a serious issue
because *other*, non-dbus special event handles can run arbitrary code
and enter an inner dbus-call-method reply-waiting loop. If the reply to
the outer synchronous dbus call arrives before the reply to the inner
synchronous dbus call, dbus-call-method-handler (which is run from
special-event-map inside read-event or, in our hypothetical partial fix,
manually from the wait loop) will dutifully put the reply on
dbus-return-values-table. But the inner event loop has no way of waking
the *outer* event loop, so when the special event handler that called
the inner dbus-call-method returns, read_char will loop around and wait
for the full timeout period before returning to the outer dbus-call-method.

I don't understand the scenario. Could you, please, give a code example?

No, because the current code is so broken that any example I gave
wouldn't actually demonstrate the problem I'm trying to explain above:

Soory, then it will be hard to debug ...

I wasn't expecting the desirability of random 100ms delays to be a
point of debate, so I jumped right to the problems one might encounter
with various solutions. You can trigger the scenario I'm worried about
by performing a dbus call in an X drag-and-drop handler while somebody
else is already blocked waiting for a dbus call.

I've written another small example, using file notifications (see
attached file). No problems there.

You misunderstand me. The code I checked in "fixes" the problem for both the base case and the scenario I described. The code that was there yesterday is similarly broken --- with respect to the 100ms delay --- for both uses. There's nothing to debug right now.

If dbus had been implemented as a process type instead of a special
event source, we'd just be able to use accept-process-output in dbus-call.

There is already the discussion that such events (dbus, file
notification) should be implemented differently. I don't know whether
this shall be done as process type or as a separate queue to be checked for.

I have a partial patch that implements this scheme: add another
argument early-return to Fread-event and sit-for. early-return can
either be nil for present behavior or a function of no
arguments. Plumb early-return all the way down to
wait_reading_process_output, and have that function call early-return
just before running xg_select. If early-return returns true, abort the
select loop and return nil from Fread-event. This way, you could pass
into read-event in dbus.el an early-return function that checked for
the existence of the return value on dbus-return-values-table. Then,
the function would return immediately on a dbus call being complete no
matter how deeply the calls were nested. The mechanism could also take
over the role of the current wait_for_cell stuff.

Definitely nothing for 24.4, we're in feature freeze. And before we're
adding such a patch, I recommend to discuss Stefan's proposal to add
another event queue in the main loop for such kind of special events.

I'll have to go find that thread. I'm not sure what "another event queue" means, exactly, or how it would help. If it's just a different kind of accept-process-input, one that pulls only events of a certain sort out of the kbd buffer, then it's still vulnerable to the reentrancy problem I described in my first message.





reply via email to

[Prev in Thread] Current Thread [Next in Thread]