[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v3 5/7] spapr_drc.c: introduce unplug_timeout_timer
From: |
David Gibson |
Subject: |
Re: [PATCH v3 5/7] spapr_drc.c: introduce unplug_timeout_timer |
Date: |
Wed, 17 Feb 2021 12:20:54 +1100 |
On Thu, Feb 11, 2021 at 07:52:44PM -0300, Daniel Henrique Barboza wrote:
> The LoPAR spec provides no way for the guest kernel to report failure of
> hotplug/hotunplug events. This wouldn't be bad if those operations were
> granted to always succeed, but that's far for the reality.
>
> What ends up happening is that, in the case of a failed hotunplug,
> regardless of whether it was a QEMU error or a guest misbehavior, the pSeries
> machine is retaining the unplug state of the device in the running guest.
> This state is cleanup in machine reset, where it is assumed that this state
> represents a device that is pending unplug, and the device is hotunpluged
> from the board. Until the reset occurs, any hotunplug operation of the same
> device is forbid because there is a pending unplug state.
>
> This behavior has at least one undesirable side effect. A long standing
> pending
> unplug state is, more often than not, the result of a hotunplug error. The
> user
> had to dealt with it, since retrying to unplug the device is noy allowed, and
> then
> in the machine reset we're removing the device from the guest. This means that
> we're failing the user twice - failed to hotunplug when asked, then
> hotunplugged
> without notice.
>
> Solutions to this problem range between trying to predict when the hotunplug
> will
> fail and forbid the operation from the QEMU layer, from opening up the IRQ
> queue
> to allow for multiple hotunplug attempts, from telling the users to 'reboot
> the
> machine if something goes wrong'. The first solution is flawed because we
> can't
> fully predict guest behavior from QEMU, the second solution is a trial and
> error
> remediation that counts on a hope that the unplug will eventually succeed,
> and the
> third is ... well.
>
> This patch introduces a crude, but effective solution to hotunplug errors in
> the pSeries machine. For each unplug done, we'll timeout after some time. If
> a certain amount of time passes, we'll cleanup the hotunplug state from the
> machine.
> During the timeout period, any unplug operations in the same device will still
> be blocked. After that, we'll assume that the guest failed the operation, and
> allow the user to try again. If the timeout is too short we'll prevent
> legitimate
> hotunplug situations to occur, so we'll need to overestimate the regular time
> an unplug operation takes to succeed to account that.
>
> The true solution for the hotunplug errors in the pSeries machines is a PAPR
> change
> to allow for the guest to warn the platform about it. For now, the work done
> in this
> timeout design can be used for the new PAPR 'abort hcall' in the future,
> given that
> for both cases we'll need code to cleanup the existing unplug states of the
> DRCs.
>
> At this moment we're adding the basic wiring of the timer into the DRC. Next
> patch
> will use the timer to timeout failed CPU hotunplugs.
>
> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
> ---
> hw/ppc/spapr_drc.c | 36 ++++++++++++++++++++++++++++++++++++
> include/hw/ppc/spapr_drc.h | 2 ++
> 2 files changed, 38 insertions(+)
>
> diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> index 67041fb212..c88bb524c5 100644
> --- a/hw/ppc/spapr_drc.c
> +++ b/hw/ppc/spapr_drc.c
> @@ -57,6 +57,8 @@ static void spapr_drc_release(SpaprDrc *drc)
> drck->release(drc->dev);
>
> drc->unplug_requested = false;
> + timer_del(drc->unplug_timeout_timer);
> +
> g_free(drc->fdt);
> drc->fdt = NULL;
> drc->fdt_start_offset = 0;
> @@ -453,6 +455,24 @@ static const VMStateDescription
> vmstate_spapr_drc_unplug_requested = {
> }
> };
>
> +static bool spapr_drc_unplug_timeout_timer_needed(void *opaque)
> +{
> + SpaprDrc *drc = opaque;
> +
> + return timer_pending(drc->unplug_timeout_timer);
> +}
> +
> +static const VMStateDescription vmstate_spapr_drc_unplug_timeout_timer = {
> + .name = "DRC unplug timeout timer",
> + .version_id = 1,
> + .minimum_version_id = 1,
> + .needed = spapr_drc_unplug_timeout_timer_needed,
> + .fields = (VMStateField[]) {
> + VMSTATE_TIMER_PTR(unplug_timeout_timer, SpaprDrc),
> + VMSTATE_END_OF_LIST()
> + }
> +};
> +
> static bool spapr_drc_needed(void *opaque)
> {
> SpaprDrc *drc = opaque;
> @@ -486,10 +506,20 @@ static const VMStateDescription vmstate_spapr_drc = {
> },
> .subsections = (const VMStateDescription * []) {
> &vmstate_spapr_drc_unplug_requested,
> + &vmstate_spapr_drc_unplug_timeout_timer,
> NULL
> }
> };
>
> +static void drc_unplug_timeout_cb(void *opaque)
> +{
> + SpaprDrc *drc = opaque;
> +
> + if (drc->unplug_requested) {
> + drc->unplug_requested = false;
> + }
Sorry, forgot to mention in first pass - I think we want some kind of
reporting here. At least a trace, and maybe also a warn_report() or
the like.
Hrm.. looking wider, I wonder if we should add a DEVICE_DELETE_FAILED
message to QAPI to mirror the DEVICE_DELETED message. Fixing PAPR is
pretty tedious, but fixing at least qemu's own interfaces is a bit
more approachable. That could certainly be a follow up enhancement,
though.
> +}
> +
> static void drc_realize(DeviceState *d, Error **errp)
> {
> SpaprDrc *drc = SPAPR_DR_CONNECTOR(d);
> @@ -512,6 +542,11 @@ static void drc_realize(DeviceState *d, Error **errp)
> object_property_add_alias(root_container, link_name,
> drc->owner, child_name);
> g_free(link_name);
> +
> + drc->unplug_timeout_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
> + drc_unplug_timeout_cb,
> + drc);
> +
> vmstate_register(VMSTATE_IF(drc), spapr_drc_index(drc),
> &vmstate_spapr_drc,
> drc);
> trace_spapr_drc_realize_complete(spapr_drc_index(drc));
> @@ -529,6 +564,7 @@ static void drc_unrealize(DeviceState *d)
> name = g_strdup_printf("%x", spapr_drc_index(drc));
> object_property_del(root_container, name);
> g_free(name);
> + timer_free(drc->unplug_timeout_timer);
> }
>
> SpaprDrc *spapr_dr_connector_new(Object *owner, const char *type,
> diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
> index 02a63b3666..b2e6222d09 100644
> --- a/include/hw/ppc/spapr_drc.h
> +++ b/include/hw/ppc/spapr_drc.h
> @@ -187,6 +187,8 @@ typedef struct SpaprDrc {
> bool unplug_requested;
> void *fdt;
> int fdt_start_offset;
> +
> + QEMUTimer *unplug_timeout_timer;
> } SpaprDrc;
>
> struct SpaprMachineState;
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
signature.asc
Description: PGP signature
- [PATCH v3 2/7] spapr_pci.c: simplify spapr_pci_unplug_request() function handling, (continued)
[PATCH v3 3/7] spapr_drc.c: use spapr_drc_release() in isolate_physical/set_unusable, Daniel Henrique Barboza, 2021/02/11
[PATCH v3 5/7] spapr_drc.c: introduce unplug_timeout_timer, Daniel Henrique Barboza, 2021/02/11
[PATCH v3 4/7] spapr: rename spapr_drc_detach() to spapr_drc_unplug_request(), Daniel Henrique Barboza, 2021/02/11
[PATCH v3 6/7] spapr_drc.c: add hotunplug timeout for CPUs, Daniel Henrique Barboza, 2021/02/11
[PATCH v3 7/7] spapr_drc.c: use DRC reconfiguration to cleanup DIMM unplug state, Daniel Henrique Barboza, 2021/02/11