bug-guix

bug#58485: [shepherd] Restarting guix-publish fails


From: Lars-Dominik Braun
Subject: bug#58485: [shepherd] Restarting guix-publish fails
Date: Mon, 20 Feb 2023 14:25:22 +0100

Hi Ludo,

> Can you confirm shepherd (PID 1) is 0.9.3?
It is:

root         1  0.2  0.2 308148 76816 ?        Sl   Feb07  52:08 
/gnu/store/kphp5d85rrb3q1rdc2lfqc1mdklwh3qp-guile-3.0.9/bin/guile 
--no-auto-compile 
/gnu/store/4nw0zb4swga0cb8i35nvng3rg6z5qm8p-shepherd-0.9.3/bin/shepherd 
--config /gnu/store/cvrai6z8777jf7860rnvppfznl1lcxi1-shepherd.conf

> ‘sudo herd restart ssh-daemon’ works fine on my laptop FWIW.
That works fine here too. Only unattended-upgrades seems to have this issue :/

The strace output looks unremarkable right now:

---snip---
1     14:12:15.117035 read(21, "(shepherd-command (version 0) (action restart) 
(service ssh-daemon) (arguments ()) (directory \"/root\"))", 1024) = 103
1     14:12:15.117254 close(27)         = 0
1     14:12:15.117283 close(30)         = 0
1     14:12:15.117416 newfstatat(AT_FDCWD, "/etc/localtime", 
{st_dev=makedev(0x8, 0x2), st_ino=110100491, st_mode=S_IFREG|0444, st_nlink=1, 
st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=2298, 
st_atime=1676898665 /* 2023-02-20T14:11:05.338746772+0100 */, 
st_atime_nsec=338746772, st_mtime=1676898664 /* 
2023-02-20T14:11:04.874743456+0100 */, st_mtime_nsec=874743456, 
st_ctime=1676898664 /* 2023-02-20T14:11:04.874743456+0100 */, 
st_ctime_nsec=874743456}, 0) = 0
1     14:12:15.117475 write(17, "shepherd[1]: Service ssh-daemon has been 
stopped.\n", 50) = 50
1     14:12:15.117524 socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 
IPPROTO_IP) = 26
1     14:12:15.117561 setsockopt(26, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
1     14:12:15.117598 bind(26, {sa_family=AF_INET, sin_port=htons(2222), 
sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EADDRINUSE (Address already in use)
1     14:12:15.117724 write(21, "(reply (version 0) (result #f) (error (error 
(version 0) action-exception start ssh-daemon system-error (\"bind\" \"~A\" 
(\"Address already in use\") (98)))) (messages (\"Service ssh-daemon has been 
stopped.\")))", 204) = 204
1     14:12:15.117754 close(21)         = 0
---snap---

But nginx seems to have the same issue, except that it does not fail
entirely and succeeds after waiting a short period of time:

---snip---
2023/02/20 14:12:14 [notice] 7136#0: signal 15 (SIGTERM) received from 6644, 
exiting
2023/02/20 14:12:14 [notice] 7137#0: exiting
2023/02/20 14:12:14 [notice] 7137#0: exit
2023/02/20 14:12:14 [notice] 7136#0: signal 17 (SIGCHLD) received from 7137
2023/02/20 14:12:14 [notice] 7136#0: worker process 7137 exited with code 0
2023/02/20 14:12:14 [emerg] 6645#0: bind() to 0.0.0.0:443 failed (98: Address 
already in use)
2023/02/20 14:12:14 [emerg] 6645#0: bind() to 0.0.0.0:80 failed (98: Address 
already in use)
2023/02/20 14:12:14 [emerg] 6645#0: bind() to [::]:80 failed (98: Address 
already in use)
2023/02/20 14:12:14 [notice] 7136#0: exit
2023/02/20 14:12:14 [notice] 6645#0: try again to bind() after 500ms
2023/02/20 14:12:14 [notice] 6645#0: using the "epoll" event method
2023/02/20 14:12:14 [notice] 6645#0: nginx/1.23.3
2023/02/20 14:12:14 [notice] 6645#0: OS: Linux 6.1.9
2023/02/20 14:12:14 [notice] 6645#0: getrlimit(RLIMIT_NOFILE): 1024:4096
2023/02/20 14:12:14 [notice] 6648#0: start worker processes
2023/02/20 14:12:14 [notice] 6648#0: start worker process 6649
2023/02/20 14:12:32 [info] 6649#0: epoll_wait() failed (4: Interrupted system 
call)
---snap---
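
For illustration, the "try again to bind() after 500ms" behaviour in the
nginx log can be sketched roughly as below. This is not nginx's code;
bind_with_retry, its parameters and its defaults are hypothetical, and
only the 500 ms pause is taken from the log.

```python
import errno
import socket
import time


def bind_with_retry(host, port, attempts=5, delay=0.5):
    """Bind a listening TCP socket, retrying while the address is in use.

    `attempts` and `delay` are illustrative defaults (assumptions); only
    the 500 ms pause appears in the log above.
    Returns (socket, number_of_the_attempt_that_succeeded).
    """
    for attempt in range(1, attempts + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            sock.bind((host, port))
            sock.listen()
            return sock, attempt
        except OSError as err:
            sock.close()
            # Give up on unrelated errors or once the attempts are used up.
            if err.errno != errno.EADDRINUSE or attempt == attempts:
                raise
            time.sleep(delay)
```

A service that is started this way survives a short overlap with the old
process, whereas a single bind() call (as in the strace above) fails right
away.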

I see we’re already using SO_REUSEADDR, so all of this is a bit of a
mystery to me.
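
One possible explanation, which is only a guess and not verified on the
affected machine: on Linux, SO_REUSEADDR does not allow bind() to succeed
while another socket is still listening on the same address and port. If
the old process had not yet closed its listening socket when the new one
was started, that would produce exactly this EADDRINUSE. A minimal sketch
of that behaviour (try_bind and the use of 127.0.0.1 with a kernel-chosen
port are made up for the demonstration):

```python
import errno
import socket


def try_bind(port):
    """Try to bind with SO_REUSEADDR; return None on success, else the errno."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    try:
        sock.bind(("127.0.0.1", port))
        return None
    except OSError as err:
        return err.errno
    finally:
        sock.close()


# Stand-in for the old daemon: a socket that is still listening.
old = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
old.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
old.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
old.listen()
port = old.getsockname()[1]

while_listening = try_bind(port)  # on Linux this should be errno.EADDRINUSE
old.close()
after_close = try_bind(port)      # the bind should succeed once the listener is gone
```

If this is what happens here, the interesting question would be why the
old listener is still around when the service is started again.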

Thanks,
Lars

-- 
Lars-Dominik Braun
Wissenschaftlicher Mitarbeiter/Research Associate

www.leibniz-psychology.org
ZPID - Leibniz-Institut für Psychologie /
ZPID - Leibniz Institute for Psychology
Universitätsring 15
D-54296 Trier - Germany
Tel.: +49–651–201-4964

