[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH] block: asynchronously stop the VM on I/O errors
From: |
Kevin Wolf |
Subject: |
Re: [Qemu-devel] [PATCH] block: asynchronously stop the VM on I/O errors |
Date: |
Tue, 3 Jun 2014 16:37:19 +0200 |
User-agent: |
Mutt/1.5.21 (2010-09-15) |
Am 03.06.2014 um 16:16 hat Paolo Bonzini geschrieben:
> With virtio-blk dataplane, I/O errors might occur while QEMU is
> not in the main I/O thread. However, it's invalid to call vm_stop
> when we're neither in a VCPU thread nor in the main I/O thread,
> even if we were to take the iothread mutex around it.
>
> To avoid this problem, simply raise a request to the main I/O thread,
> similar to what QEMU does when vm_stop is called from a CPU thread.
> We know that bdrv_error_action is called from an AIO callback, and
> the moment at which the callback will fire is not well-defined; it
> depends on the moment at which the disk or OS finishes the operation,
> which can happen at any time.
>
> Note that QEMU is certainly not in a CPU thread and we do not need to
> call cpu_stop_current() like vm_stop() does.
Do I understand correctly that this is not a fundamental truth of qemu's
operation, but holds true only because the drivers that do support
rerror/werror all use bdrv_aio_readv/writev(), which guarantees that a
BH is used in error cases? Otherwise I think an I/O handler in a vcpu
thread could directly call into the block layer and fail immediately
(might happen for example if we added rerror/werror support to ATAPI).
> This makes bdrv_error_action() thread safe.
>
> Signed-off-by: Paolo Bonzini <address@hidden>
> ---
> block.c | 2 +-
> stubs/vm-stop.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/block.c b/block.c
> index fc2edd3..fa41598 100644
> --- a/block.c
> +++ b/block.c
> @@ -3515,7 +3515,7 @@ void bdrv_error_action(BlockDriverState *bs,
> BlockErrorAction action,
> assert(error >= 0);
> bdrv_emit_qmp_error_event(bs, QEVENT_BLOCK_IO_ERROR, action, is_read);
> if (action == BDRV_ACTION_STOP) {
> - vm_stop(RUN_STATE_IO_ERROR);
> + qemu_system_vmstop_request(RUN_STATE_IO_ERROR);
> bdrv_iostatus_set_err(bs, error);
By delaying the actual state change, does this break the invariant that
bs->iostatus is BLOCK_DEVICE_IO_STATUS_OK while the VM is running?
I know this invariant was mentioned occasionally. Not sure if anything
actually breaks when it's violated, though.
Kevin