bug#58485: [shepherd] Restarting guix-publish fails
From: Lars-Dominik Braun
Subject: bug#58485: [shepherd] Restarting guix-publish fails
Date: Mon, 20 Feb 2023 14:25:22 +0100
Hi Ludo,
> Can you confirm shepherd (PID 1) is 0.9.3?
It is:
root 1 0.2 0.2 308148 76816 ? Sl Feb07 52:08
/gnu/store/kphp5d85rrb3q1rdc2lfqc1mdklwh3qp-guile-3.0.9/bin/guile
--no-auto-compile
/gnu/store/4nw0zb4swga0cb8i35nvng3rg6z5qm8p-shepherd-0.9.3/bin/shepherd
--config /gnu/store/cvrai6z8777jf7860rnvppfznl1lcxi1-shepherd.conf
> ‘sudo herd restart ssh-daemon’ works fine on my laptop FWIW.
This works fine too. Only unattended-upgrades seems to have this issue :/
The strace output does not look suspicious right now:
---snip---
1 14:12:15.117035 read(21, "(shepherd-command (version 0) (action restart) (service ssh-daemon) (arguments ()) (directory \"/root\"))", 1024) = 103
1 14:12:15.117254 close(27) = 0
1 14:12:15.117283 close(30) = 0
1 14:12:15.117416 newfstatat(AT_FDCWD, "/etc/localtime", {st_dev=makedev(0x8, 0x2), st_ino=110100491, st_mode=S_IFREG|0444, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=2298, st_atime=1676898665 /* 2023-02-20T14:11:05.338746772+0100 */, st_atime_nsec=338746772, st_mtime=1676898664 /* 2023-02-20T14:11:04.874743456+0100 */, st_mtime_nsec=874743456, st_ctime=1676898664 /* 2023-02-20T14:11:04.874743456+0100 */, st_ctime_nsec=874743456}, 0) = 0
1 14:12:15.117475 write(17, "shepherd[1]: Service ssh-daemon has been stopped.\n", 50) = 50
1 14:12:15.117524 socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 26
1 14:12:15.117561 setsockopt(26, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
1 14:12:15.117598 bind(26, {sa_family=AF_INET, sin_port=htons(2222), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EADDRINUSE (Address already in use)
1 14:12:15.117724 write(21, "(reply (version 0) (result #f) (error (error (version 0) action-exception start ssh-daemon system-error (\"bind\" \"~A\" (\"Address already in use\") (98)))) (messages (\"Service ssh-daemon has been stopped.\")))", 204) = 204
1 14:12:15.117754 close(21) = 0
---snap---
But nginx seems to have the same issue, except that it does not fail
entirely and succeeds after waiting a short period of time:
---snip---
2023/02/20 14:12:14 [notice] 7136#0: signal 15 (SIGTERM) received from 6644, exiting
2023/02/20 14:12:14 [notice] 7137#0: exiting
2023/02/20 14:12:14 [notice] 7137#0: exit
2023/02/20 14:12:14 [notice] 7136#0: signal 17 (SIGCHLD) received from 7137
2023/02/20 14:12:14 [notice] 7136#0: worker process 7137 exited with code 0
2023/02/20 14:12:14 [emerg] 6645#0: bind() to 0.0.0.0:443 failed (98: Address already in use)
2023/02/20 14:12:14 [emerg] 6645#0: bind() to 0.0.0.0:80 failed (98: Address already in use)
2023/02/20 14:12:14 [emerg] 6645#0: bind() to [::]:80 failed (98: Address already in use)
2023/02/20 14:12:14 [notice] 7136#0: exit
2023/02/20 14:12:14 [notice] 6645#0: try again to bind() after 500ms
2023/02/20 14:12:14 [notice] 6645#0: using the "epoll" event method
2023/02/20 14:12:14 [notice] 6645#0: nginx/1.23.3
2023/02/20 14:12:14 [notice] 6645#0: OS: Linux 6.1.9
2023/02/20 14:12:14 [notice] 6645#0: getrlimit(RLIMIT_NOFILE): 1024:4096
2023/02/20 14:12:14 [notice] 6648#0: start worker processes
2023/02/20 14:12:14 [notice] 6648#0: start worker process 6649
2023/02/20 14:12:32 [info] 6649#0: epoll_wait() failed (4: Interrupted system call)
---snap---
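The "try again to bind() after 500ms" line suggests that nginx simply retries the bind after a short pause. A minimal Python sketch of that general pattern follows. It is not nginx's or shepherd's actual code; the function name `bind_with_retry` and its `attempts` and `delay` parameters are made up for illustration.

```python
import errno
import socket
import time


def bind_with_retry(address, attempts=5, delay=0.5):
    """Bind a listening TCP socket, retrying while the address is in use.

    Hypothetical helper: name and parameters are illustrative only.
    """
    for attempt in range(attempts):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            sock.bind(address)
            sock.listen()
            return sock
        except OSError as err:
            sock.close()
            # Retry only on EADDRINUSE, and give up after the last attempt.
            if err.errno != errno.EADDRINUSE or attempt == attempts - 1:
                raise
            time.sleep(delay)
```

With the defaults this waits up to two seconds in total for the previous listener to go away before the error is propagated to the caller.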
I see we’re already using SO_REUSEADDR, so all of this is a bit of a
mystery to me.
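For reference, on Linux SO_REUSEADDR allows bind() to succeed despite connections lingering in TIME_WAIT, but not while another socket is still listening on the same address and port. Whether that is what happens here is only an assumption: it would require the old listening socket to still be open somewhere at the time of the new bind(). The Python sketch below reproduces the effect in isolation; the helper names `reuseaddr_socket` and `try_bind` are made up for illustration.

```python
import errno
import socket


def reuseaddr_socket():
    """Create a TCP socket with SO_REUSEADDR set, as in the strace above."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    return sock


def try_bind(port):
    """Return None if bind() succeeds, or the errno if it fails."""
    sock = reuseaddr_socket()
    try:
        sock.bind(("127.0.0.1", port))
        return None
    except OSError as err:
        return err.errno
    finally:
        sock.close()


# Stand-in for the "old" service instance: it listens on a free port.
old = reuseaddr_socket()
old.bind(("127.0.0.1", 0))
old.listen()
port = old.getsockname()[1]

while_listening = try_bind(port)  # old listener is still open
old.close()
after_close = try_bind(port)      # old listener is gone

print("while listening:", errno.errorcode.get(while_listening, "ok"))
print("after close:", errno.errorcode.get(after_close, "ok"))
```

If the first bind fails and the second succeeds, the failure is explained by a listener that was still open rather than by a missing SO_REUSEADDR.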
Thanks,
Lars
--
Lars-Dominik Braun
Wissenschaftlicher Mitarbeiter/Research Associate
www.leibniz-psychology.org
ZPID - Leibniz-Institut für Psychologie /
ZPID - Leibniz Institute for Psychology
Universitätsring 15
D-54296 Trier - Germany
Tel.: +49–651–201-4964