qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v13 0/5] UFFD write-tracking migration/snapshots


From: Andrey Gruzdev
Subject: Re: [PATCH v13 0/5] UFFD write-tracking migration/snapshots
Date: Thu, 11 Feb 2021 12:21:51 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0

On 09.02.2021 23:31, Peter Xu wrote:
On Tue, Feb 09, 2021 at 03:09:28PM -0500, Peter Xu wrote:
Hi, David, Andrey,

On Tue, Feb 09, 2021 at 08:06:58PM +0100, David Hildenbrand wrote:
Hi,

just stumbled over this, quick question:

I recently played with UFFD_WP and notices that write protection is
only effective on pages/ranges that have already pages populated (IOW:
!pte_none() in the kernel).

In case memory was never populated (or was discarded using e.g.,
madvice(DONTNEED)), write-protection will be skipped silently and you
won't get WP events for applicable pages.

So if someone writes to a yet unpoupulated page ("zero"), you won't
get WP events.

I can spot that you do a single uffd_change_protection() on the whole
RAMBlock.

How are you handling that scenario, or why don't you have to handle
that scenario?
Good catch..  Indeed I overlooked that as well when reviewing the code.


            
Hi David,

I really wonder if such a problem exists.. If we are talking about a
I immediately ran into this issue with my simplest test cases. :)

write to an unpopulated page, we should get first page fault on
non-present page and populate it with protection bits from respective vma.
For UFFD_WP vma's  page will be populated non-writable. So we'll get
another page fault on present but read-only page and go to handle_userfault.
The problem is even if the page is read-only, it does not yet have the uffd-wp
bit set, so it won't really trigger the handle_userfault() path.

You might have to register also for MISSING faults and place zero pages.
So I think what's missing for live snapshot is indeed to register with both
missing & wp mode.

Then we'll receive two messages: For wp, we do like before.  For missing, we do
UFFDIO_ZEROCOPY and at the same time dump this page as a zero page.

I bet live snapshot didn't encounter this issue simply because normal live
snapshots would still work, especially when there's the guest OS. Say, the
worst case is we could have migrated some zero pages with some random data
filled in along with the snapshot, however all these pages were zero pages and
not used by the guest OS after all, then when we load a snapshot we won't
easily notice either..
I'm thinking some way to verify this from live snapshot pov, and I've got an
idea so I just share it out...  Maybe we need a guest application that does
something like below:

  - mmap() a huge lot of memory

  - call mlockall(), so that pages will be provisioned in the guest but without
    data written.  IIUC on the host these pages should be backed by missing
    pages as long as guest app doesn't write.  Then...

  - the app starts to read input from user:

    - If user inputs "dirty" and enter: it'll start to dirty the whole range.
      Write non-zero to the 1st byte of each page would suffice.

    - If user inputs "check" and enter: it'll read the whole memory chunk to
      see whether all the pages are zero pages.  If it reads any non-zero page,
      it should bail out and print error.

With the help of above program, we can do below to verify the live snapshot
worked as expected on zero pages:

  - Guest: start above program, don't input yet (so waiting to read either
    "dirty" or "check" command)

  - Host: start live snapshot

  - Guest: input "dirty" command, so start quickly dirtying the ram

  - Host: live snapshot completes

Then to verify the snapshot image, we do:

  - Host: load the snapshot we've got

  - Guest: (should still be in the state of waiting for cmd) this time we enter
    "check"

Thanks,

Hi David, Peter,

A little unexpected behavior, from my point of view, for UFFD write-protection.
So, that means that UFFD_WP protection/events works only for locked memory?
I'm now looking at kernel implementation, to understand..
-- 
Andrey Gruzdev, Principal Engineer
Virtuozzo GmbH  +7-903-247-6397
                virtuzzo.com

reply via email to

[Prev in Thread] Current Thread [Next in Thread]