[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[bug#32408] [PATCH shepherd] Allow replacement of services
From: |
Ludovic Courtès |
Subject: |
[bug#32408] [PATCH shepherd] Allow replacement of services |
Date: |
Tue, 21 Aug 2018 12:27:25 +0200 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux) |
Hey Carlo,
Carlo Zancanaro <address@hidden> skribis:
> On Tue, Aug 21 2018, Ludovic Courtès wrote:
>> For instance, the ‘actions’ field of <shepherd-service> could, by
>> default, include an “upgrade” action that simply sets the
>> ‘replacement’ slot. For nginx, we’d provide a custom “upgrade”
>> action that does “nginx -s restart” or whatever it is that needs to
>> be done.
>>
>> ‘guix system reconfigure’ would automatically invoke the ‘upgrade’
>> action for each new service.
>>
>> WDYT?
>
> How many services can we meaningfully upgrade like this?
Probably very few.
> My understanding is that most of our system services a fed an immutable
> configuration file, and thus restarting/reloading won't actually
> upgrade them. In order to make an upgrade action work the service
> would have to mutate itself into a new correct configuration, as well
> as restarting/reloading the underlying daemon. It's even trickier if
> the daemon itself has been upgraded, because then the process will
> have to be restarted anyway.
Yeah.
> At any rate, I think the replacement mechanism (this patch) is just
> one way that a service can be reloaded. It would probably be a good
> idea to create a higher-level abstraction over it. I think other
> mechanisms (like a upgrade/reload action) should be handled on the
> Guix side of things.
Sounds good.
>>> + (let ((replacement (slot-ref service 'replacement)))
>>> + (define (copy-slot! slot)
>>> + (slot-set! service slot (slot-ref replacement slot)))
>>> + (when replacement
>>> + (copy-slot! 'provides)
>>> + (copy-slot! 'requires)
>>> + (copy-slot! 'respawn?)
>>> + (copy-slot! 'start)
>>> + (copy-slot! 'stop)
>>> + (copy-slot! 'actions)
>>> + (copy-slot! 'running)
>>> + (copy-slot! 'docstring))
>>> + service))
>>
>> Having a hardcoded list of slots sounds error-prone—surely we’ll
>> forget to update it down the road. I wonder what else could be
>> done.
>>
>> One option would be to grab the block asyncs and atomically replace
>> the service in the ‘%services’ hash table. Then we only need to
>> copy the ‘last-respawns’ slot to the new service, I believe. (This
>> changes the object identity of the service but I think its OK.)
>>
>> Another option would be to use GOOPS tricks to iterate over the list
>> of slots and have a list of slots *not* to copy. I’m not a big fan
>> of this option, though.
>
> My favourite option for this would be to separate the <service> object
> into an immutable <service> and a mutable <service-state>. The
> <service-state> object would have a reference to a <service> object in
> order to invoke actions on it, and it could also hold a second
> <service> object as a replacement. Then the swap would be much more
> straightforward. I haven't done any real work towards this, though.
I agree that separating state is the right approach longer-term (it’s
beyond the scope of this patch.)
> In the short term, I'd rather replace it in the %services hash
> table. I did it by copying slots because I wasn't sure I would get the
> details of the swap right and didn't have time to properly work out
> how to do it. I'll give it a go!
Alright!
>>> +(let ((service (lookup-running 'test)))
>>> + (slot-set! service 'replacement
>>> + (make <service>
>>
>> I wonder if we should let users fiddle with ‘replacement’ directly,
>> or if we should provide a higher-level construct.
>>
>> For instance, ‘register-services’ could transparently set the
>> ‘replacement’ field for services already registered instead of
>> doing:
>>
>> (assert (null? (lookup-services (canonical-name new))))
>>
>> Not sure if there are cases where this behavior would be
>> undesirable, though.
>>
>> Thoughts?
>
> With this current patch the replacement field is only checked at the
> point when the service is stopped, so the field could only be set when
> the service is actually running. I think it makes the most sense to
> just replace the service directly if it's not stopped.
>
> I can't think of any undesirable cases, but having a higher-level
> interface is a good idea.
Right. Currently the Guix side of things does the equivalent of:
herd eval root '(register-services (load "a.scm") (load "b.scm"))'
for services currently stopped, which is okay but not great. IWBN if it
could directly use a higher-level action.
> At the very least we need to control the inherent race condition
> involved in (if running? do-x do-y) for if the service is stopped
> after the running? check. At the moment I think the only thing we have
> to worry about there is signals, but if we're going to move to have
> more parallelism through fibers then we might need to be even more
> careful.
Indeed.
Thank you,
Ludo’.