qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC 0/4] mirror: implement incremental and bitmap modes


From: Vladimir Sementsov-Ogievskiy
Subject: Re: [RFC 0/4] mirror: implement incremental and bitmap modes
Date: Thu, 29 Feb 2024 14:48:05 +0300
User-agent: Mozilla Thunderbird

On 29.02.24 13:11, Fiona Ebner wrote:
Am 28.02.24 um 17:06 schrieb Vladimir Sementsov-Ogievskiy:
On 28.02.24 19:00, Vladimir Sementsov-Ogievskiy wrote:
On 16.02.24 13:55, Fiona Ebner wrote:
Now, the IO test added in patch 4/4 actually contains yet another use
case, namely doing incremental mirrors to stand-alone qcow2 "diff"
images, that only contain the delta and can be rebased later. I had to
adapt the IO test, because its output expected the mirror bitmap to
still be dirty, but nowadays the mirror is apparently already done
when the bitmaps are queried. So I thought, I'll just use
'write-blocking' mode to avoid any potential timing issues.

But this exposed an issue with the diff image approach. If a write is
not aligned to the granularity of the mirror target, then rebasing the
diff image onto a backing image will not yield the desired result,
because the full cluster is considered to be allocated and will "hide"
some part of the base/backing image. The failure can be seen by either
using 'write-blocking' mode in the IO test or setting the (bitmap)
granularity to 32 KiB rather than the current 64 KiB.

The question is how to deal with these edge cases? Some possibilities
that would make sense to me:

For 'background' mode:
* prohibit if target's cluster size is larger than the bitmap
    granularity
* document the limitation

For 'write-blocking' mode:
* disallow in combination with bitmap mode (would not be happy about
    it, because I'd like to use this without diff images)

why not just require the same: bitmap granularity must be >= target
granularity


For the iotest's use-case, that only works for background mode. I'll
explain below.

* for writes that are not aligned to the target's cluster size, read
    the relevant/missing parts from the source image to be able to write
    whole target clusters (seems rather complex)

There is another approach: consider and unaligned part of the request,
fit in one cluster (we can always split any request to "aligned"
middle part, and at most two small "unligned" parts, each fit into one
cluster).

We have two possibilities:

1. the cluster is dirty (marked dirty in the bitmap used by background
process)

We can simply ignore this part and rely on background process. This
will not affect the convergence of the mirror job.


Agreed.

2. the cluster is clear (i.e. background process, or some previous
write already copied it)


The iotest creates a new target image for each incremental sync which
only records the diff relative to the previous mirror and those diff
images are later rebased onto each other to get the full picture.

Thus, it can be that a previous mirror job (not just background process
or previous write) already copied a cluster, and in particular, copied
it to a different target!

Aha understand.

For simplicity, let's consider case, when source "cluster size" = "job cluster size" = "bitmap 
granularity" = "target cluster size".

Which types of clusters we should consider, when we want to handle guest write?

1. Clusters, that should be copied by background process

These are dirty clusters from user-given bitmap, or if we do a full-disk 
mirror, all clusters, not yet copied by background process.

For such clusters we simply ignore the unaligned write. We can even ignore the 
aligned write too: less disturbing the guest by delays.

2. Clusters, already copied by background process during this mirror job and 
not dirtied by guest since this time.

For such clusters we are safe to do unaligned write, as target cluster must be 
allocated.

3. Clusters, not marked initially by dirty bitmap.

What to do with them? We can't do unaligned write. I see two variants:

- do additional read from source, to fill the whole cluster, which seems a bit 
too heavy

- just mark the cluster as dirty for background job. So we behave like in "background" mode. But why not? The 
maximum count of such "hacks" is limited to number of "clear" clusters at start of mirror job, 
which means that we don't seriously affect the convergence. Mirror is guaranteed to converge anyway. And the whole 
sense of "write-blocking" mode is to have a guaranteed convergence. What do you think?


----

Of course, we can't distinguish 3 types by on dirty bitmap, so we need the second one. 
For example "done_bitmap", where we can mark clusters that were successfully 
copied. That would be a kind of block-status of target image. But using bitmap is a lot 
better than querying block-status from target.



In this case, we are safe to do unaligned write, as target cluster
must be allocated.

Because the diff image is new, the target's cluster is not necessarily
allocated. When using write-blocking and a write of, e.g., 9 bytes to a
clear source cluster comes in, only those 9 bytes are written to the
target. Now the target's cluster is allocated but with only those 9
bytes of data. When rebasing, the previously copied cluster is "masked"
and when reading the rebased image, we only see the cluster with those 9
bytes (and IIRC, zeroes for the rest of the cluster rather than the
previously copied data).


(for bitmap-mode, I don't consider here clusters that are clear from
the start, which we shouldn't copy in any case)


We do need to copy new writes to any cluster, and with a clear cluster
and write-blocking, the issue can manifest.

OK right, I was misunderstanding bitmap-mode for mirror. Now I understand.



Hmm, right, and that's exactly the logic we already have in
do_sync_target_write(). So that's enough just to require that
bitmap_granularity >= target_granularity


Best Regards,
Fiona


--
Best regards,
Vladimir




reply via email to

[Prev in Thread] Current Thread [Next in Thread]