bug#40665: 28.0.50; tls hang on local ssl
From: Robert Pluim
Subject: bug#40665: 28.0.50; tls hang on local ssl
Date: Sun, 19 Apr 2020 16:34:38 +0200
>>>>> On Sat, 18 Apr 2020 02:44:05 +0000 (UTC), Derek Zhou <derek@3qin.us> said:
Derek> Derek Zhou writes:
>> When this thing happens, the tls handshakes are done properly. However,
>> emacs did not write anything into gnutls before starting to read and
>> obviously cannot get anything out at all. It is not really a hang, but
>> write never happen and the display buffer stays empty.
>>
>> Derek
Derek> Took me nearly the whole day to debug, but this one-line patch fixed my
Derek> problem.
Derek> My server finishes tls handshake within the gnutls_boot itself, and if the
Derek> sentinel is not called right after, it will never be called so write
Derek> will not happen. Someone should review this carefully.
Derek> diff --git a/src/process.c b/src/process.c
Derek> index 91d426103d..6d497ef854 100644
Derek> --- a/src/process.c
Derek> +++ b/src/process.c
Derek> @@ -5937,8 +5937,7 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
Derek>                /* If we have an incompletely set up TLS connection,
Derek>                   then defer the sentinel signaling until
Derek>                   later. */
Derek> -              if (NILP (p->gnutls_boot_parameters)
Derek> -                  && !p->gnutls_p)
Derek> +              if (NILP (p->gnutls_boot_parameters))
Derek>  #endif
Derek>                  {
Derek>                    pset_status (p, Qrun);
Hereʼs what I think is happening:
The only way for p->gnutls_boot_parameters to become nil is here in
connect_network_socket:
    if (p->gnutls_initstage == GNUTLS_STAGE_READY)
      {
        p->gnutls_boot_parameters = Qnil;
        /* Run sentinels, etc. */
        finish_after_tls_connection (proc);
      }
and finish_after_tls_connection should call the sentinel, but
NON_BLOCKING_CONNECT_FD is still set, so it doesnʼt.

The next chance to call the sentinel would be from
wait_reading_process_output, but its handshake branch only runs if
handshaking has been tried and not completed, and here it is complete
already.

wait_reading_process_output then calls delete_write_fd, which clears
NON_BLOCKING_CONNECT_FD, but it doesnʼt run the sentinel, because
p->gnutls_boot_parameters is nil and p->gnutls_p is true, so the
condition before your patch is false.

finish_after_tls_connection never gets another chance to run, since
the socket is connected and handshaking is complete.
Your change fixes this case: if p->gnutls_boot_parameters is nil, the
handshake has already completed and the TLS connection is up, so
calling the sentinel is ok.
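
To make the sequence concrete, here is a toy model of it in C. This is
only a sketch and not the code in process.c: struct model_proc, the
model_* functions and sentinel_calls_fast_handshake are made-up names,
connect_fd_pending stands in for NON_BLOCKING_CONNECT_FD, and
boot_params_nil stands in for NILP (p->gnutls_boot_parameters). It
assumes the handshake completes inside Fgnutls_boot.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few process fields discussed here.  */
struct model_proc
{
  bool boot_params_nil;     /* NILP (p->gnutls_boot_parameters) */
  bool gnutls_p;            /* p->gnutls_p */
  bool connect_fd_pending;  /* NON_BLOCKING_CONNECT_FD still set */
  int sentinel_calls;       /* how often the sentinel has run */
};

/* Models finish_after_tls_connection: the sentinel is only run once
   the non-blocking connect flag has been cleared.  */
static void
model_finish_after_tls (struct model_proc *p)
{
  if (!p->connect_fd_pending)
    p->sentinel_calls++;
}

/* Models connect_network_socket when the handshake already reached
   GNUTLS_STAGE_READY inside Fgnutls_boot.  */
static void
model_connect_fast_handshake (struct model_proc *p)
{
  p->gnutls_p = true;
  p->connect_fd_pending = true;
  p->sentinel_calls = 0;
  p->boot_params_nil = true;   /* p->gnutls_boot_parameters = Qnil */
  model_finish_after_tls (p);  /* no-op: the flag is still set */
}

/* Models the connect-completed step in wait_reading_process_output:
   delete_write_fd clears the flag, then the deferral condition
   decides whether the sentinel runs.  */
static void
model_connect_completed (struct model_proc *p, bool patched)
{
  p->connect_fd_pending = false;
  bool run = patched
    ? p->boot_params_nil                    /* condition after the patch */
    : (p->boot_params_nil && !p->gnutls_p); /* condition before it */
  if (run)
    p->sentinel_calls++;
}

/* Runs the whole sequence and reports how often the sentinel ran.  */
int
sentinel_calls_fast_handshake (bool patched)
{
  struct model_proc p;
  model_connect_fast_handshake (&p);
  model_connect_completed (&p, patched);
  return p.sentinel_calls;
}
```

In this model sentinel_calls_fast_handshake (false) returns 0, which
is the "hang", and sentinel_calls_fast_handshake (true) returns 1.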

In other cases, where the handshake does not complete straight away in
Fgnutls_boot, it will complete here in wait_reading_process_output:
    /* Continue TLS negotiation. */
    if (p->gnutls_initstage == GNUTLS_STAGE_HANDSHAKE_TRIED
        && p->is_non_blocking_client)
      {
        gnutls_try_handshake (p);
        p->gnutls_handshakes_tried++;
        if (p->gnutls_initstage == GNUTLS_STAGE_READY)
          {
            gnutls_verify_boot (aproc, Qnil);
            finish_after_tls_connection (aproc);
          }
This always happens after delete_write_fd has been called and has
cleared NON_BLOCKING_CONNECT_FD, so finish_after_tls_connection calls
the sentinel.

One change we could make is to set p->gnutls_boot_parameters to nil
in that handshake branch of wait_reading_process_output as well, so
that in the sequence

  1. Fgnutls_boot, handshake does not complete
  2. handshake succeeds first time in wait_reading_process_output
  3. delete_write_fd then checks p->gnutls_boot_parameters

the sentinel ends up getting run. But Iʼve never seen the handshake
succeed straight away before the delete_write_fd, and if it ever had
in the wild we would have seen bug reports (and this is dragon-filled
code, so I donʼt want to make changes to it if I can help it :-))
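
For the deferred-handshake path, the two possible orderings can be
sketched with another toy model in C. Again this is not the real code:
struct sketch_proc, the sketch_* functions and the clear_params switch
are invented for illustration, the patched condition (only the NILP
check) is assumed, and the handshake is assumed to succeed on the
first try in wait_reading_process_output.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the process fields discussed here.  */
struct sketch_proc
{
  bool boot_params_nil;     /* NILP (p->gnutls_boot_parameters) */
  bool handshake_ready;     /* initstage == GNUTLS_STAGE_READY */
  bool connect_fd_pending;  /* NON_BLOCKING_CONNECT_FD still set */
  int sentinel_calls;
};

/* Models finish_after_tls_connection.  */
static void
sketch_finish_after_tls (struct sketch_proc *p)
{
  if (!p->connect_fd_pending)
    p->sentinel_calls++;
}

/* Models the "Continue TLS negotiation" branch.  CLEAR_PARAMS turns
   on the possible extra change: also set the boot parameters to nil
   once the handshake is ready.  */
static void
sketch_continue_handshake (struct sketch_proc *p, bool clear_params)
{
  if (p->handshake_ready)
    return;                  /* branch only runs while not completed */
  p->handshake_ready = true; /* assumed: succeeds first time */
  if (clear_params)
    p->boot_params_nil = true;
  sketch_finish_after_tls (p);
}

/* Models delete_write_fd followed by the patched sentinel check.  */
static void
sketch_connect_completed (struct sketch_proc *p)
{
  p->connect_fd_pending = false;
  if (p->boot_params_nil)
    p->sentinel_calls++;
}

/* HANDSHAKE_FIRST selects the ordering never observed in practice:
   the handshake completes before delete_write_fd has been called.  */
int
sketch_sentinel_calls (bool handshake_first, bool clear_params)
{
  struct sketch_proc p = { false, false, true, 0 };
  if (handshake_first)
    {
      sketch_continue_handshake (&p, clear_params);
      sketch_connect_completed (&p);
    }
  else
    {
      sketch_connect_completed (&p);
      sketch_continue_handshake (&p, clear_params);
    }
  return p.sentinel_calls;
}
```

In this model the usual ordering runs the sentinel exactly once with
or without the extra change. Only the handshake-first ordering differs:
sketch_sentinel_calls (true, false) returns 0 and
sketch_sentinel_calls (true, true) returns 1.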
In short: I think the change is ok. It passes the network-stream
tests, so Iʼll run with it for a while, and push it in a week or so.
Robert