Re: [RFC PATCH 0/7] block-backend: Introduce I/O hang
From: Kevin Wolf
Subject: Re: [RFC PATCH 0/7] block-backend: Introduce I/O hang
Date: Mon, 28 Sep 2020 12:57:11 +0200
On 27.09.2020 at 15:04, Ying Fang wrote:
> A VM in a cloud environment may use a virtual disk as its backend storage,
> and there are usually filesystems on the virtual block device. When the
> backend storage is temporarily down, any I/O issued to the virtual block
> device will cause an error. For example, an error in an ext4 filesystem
> would make the filesystem read-only. However, cloud backend storage can
> often be recovered soon.
> For example, an IP-SAN may go down due to a network failure and come back
> online soon after the network recovers. The error in the filesystem may not
> be recoverable without a device reattach or a system restart. So an I/O
> rehandle mechanism is needed to implement self-healing.
>
> This patch series proposes a feature called I/O hang. It can rehandle AIOs
> that fail with EIO without sending the error back to the guest. From the
> guest's point of view, it is just like an I/O request that hangs and does
> not return. With this feature enabled, the guest can resume running smoothly
> once I/O has recovered.
What is the problem with setting werror=stop and rerror=stop for the
device? Is it that QEMU won't automatically retry, but management tool
interaction is required to resume the guest?
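
To make the question concrete, here is a minimal sketch of the management tool interaction that werror=stop/rerror=stop implies. It assumes the tool already holds a QMP connection and receives decoded events as Python dicts. The helpers qmp_cont_command() and handle_event(), and the storage_is_healthy callback, are hypothetical names for illustration; only the BLOCK_IO_ERROR event, its "action" field, and the "cont" command come from QMP.

```python
import json


def qmp_cont_command():
    """Return the QMP message that resumes a stopped guest."""
    return json.dumps({"execute": "cont"})


def handle_event(event, storage_is_healthy):
    """Decide what to send to QEMU in response to one QMP event.

    event: a decoded QMP event (dict).
    storage_is_healthy: hypothetical callable; how a real tool would
        probe the backend storage is an assumption of this sketch.
    Returns a QMP command string, or None if nothing should be sent.
    """
    if event.get("event") != "BLOCK_IO_ERROR":
        return None
    # With werror=stop/rerror=stop the event reports action "stop".
    if event.get("data", {}).get("action") != "stop":
        return None
    if not storage_is_healthy():
        return None  # keep the guest paused; the tool must check again later
    return qmp_cont_command()


# Example event as a management tool might see it (device name is made up).
example = {
    "event": "BLOCK_IO_ERROR",
    "data": {"device": "drive0", "operation": "write", "action": "stop"},
}
reply = handle_event(example, storage_is_healthy=lambda: True)
```

The point of the sketch is that something outside QEMU has to notice the stop and decide when to send "cont"; QEMU does not retry on its own in this mode.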
I haven't checked your patches in detail yet, but implementing this
functionality in the backend means that blk_drain() will hang (or if it
doesn't hang, it doesn't do what it's supposed to do), making the whole
QEMU process unresponsive until the I/O succeeds again. Amongst others,
this would make it impossible to migrate away from a host with storage
problems.
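
The drain concern can be illustrated with a toy model. This is not QEMU code: ToyBackend and all of its members are hypothetical, and the max_polls bound exists only so the example terminates (a real drain has no such bound). The sketch assumes that a request retried inside the backend still counts as in flight.

```python
class ToyBackend:
    """Toy stand-in for a block backend that tracks in-flight requests."""

    def __init__(self, storage_up, retry_in_backend):
        self.storage_up = storage_up
        self.retry_in_backend = retry_in_backend
        self.in_flight = 0

    def submit(self):
        """Queue one request."""
        self.in_flight += 1

    def poll(self):
        """Process one round of completions."""
        if self.in_flight == 0:
            return
        if self.storage_up or not self.retry_in_backend:
            # Either the I/O succeeded, or the error was passed up to the
            # device; in both cases the requests complete.
            self.in_flight = 0
        # Otherwise the requests failed and are retried internally, so
        # they stay in flight.

    def drain(self, max_polls):
        """Poll until nothing is in flight; return True if drained."""
        for _ in range(max_polls):
            if self.in_flight == 0:
                return True
            self.poll()
        return self.in_flight == 0


# Storage is down and errors are retried inside the backend: never drains.
hanging = ToyBackend(storage_up=False, retry_in_backend=True)
hanging.submit()
hang_result = hanging.drain(max_polls=1000)

# Storage is down but the error is reported upward: drains immediately.
reporting = ToyBackend(storage_up=False, retry_in_backend=False)
reporting.submit()
report_result = reporting.drain(max_polls=1000)
```

In the first case drain() cannot finish until the storage comes back, which is the situation that would block operations such as migration.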
Kevin