qemu-devel
From: Sean Christopherson
Subject: Re: [PATCH v6 4/8] KVM: Extend the memslot to support fd-based private memory
Date: Mon, 23 May 2022 15:22:32 +0000

On Mon, May 23, 2022, Chao Peng wrote:
> On Fri, May 20, 2022 at 06:31:02PM +0000, Sean Christopherson wrote:
> > On Fri, May 20, 2022, Andy Lutomirski wrote:
> > > The alternative would be to have some kind of separate table or bitmap (part
> > > of the memslot?) that tells KVM whether a GPA should map to the fd.
> > > 
> > > What do you all think?
> > 
> > My original proposal was to have explicit shared vs. private memslots, and punch
> > holes in KVM's memslots on conversion, but due to the way KVM (and userspace)
> > handle memslot updates, conversions would be painfully slow.  That's how we ended
> > up with the current proposal.
> > 
> > But a dedicated KVM ioctl() to add/remove shared ranges would be easy to implement
> > and wouldn't necessarily even need to interact with the memslots.  It could be a
> > consumer of memslots, e.g. if we wanted to disallow registering regions without an
> > associated memslot, but I think we'd want to avoid even that because things will
> > get messy during memslot updates, e.g. if dirty logging is toggled or a shared
> > memory region is temporarily removed then we wouldn't want to destroy the tracking.
> 
> Even if we don't tie that to memslots, that info can only be effective
> for private memslots, right? Applying this ioctl to memory ranges defined
> in traditional non-private memslots just makes no sense; I guess we can
> note that in the API documentation.

Hrm, applying it universally would be funky, e.g. emulated MMIO would need to be
declared "shared".  But, applying it selectively would arguably be worse, e.g.
letting userspace map memory into the guest as shared for a region that's
registered as private...

One option to avoid that mess would be to make memory shared by default, and so
userspace must declare regions that are private.  Then there's no weirdness with
emulated MMIO or "legacy" memslots.

On page fault, KVM does a lookup to see if the GPA is shared or private.  If the
GPA is private, but there is no memslot or the memslot doesn't have a private fd,
KVM exits to userspace.  If there's a memslot with a private fd, the shared/private
flag is used to resolve the fault.

And to handle the ioctl(), KVM can use kvm_zap_gfn_range(), which will bump the
notifier sequence, i.e. force the page fault to retry if the GPA may have been
(un)registered between checking the type and acquiring mmu_lock.
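
Roughly, something like the below (completely untested and purely illustrative;
the mem_attr_array XArray, the helper names, and the locking that's glossed over
are all hypothetical, with any present entry keyed by gfn meaning "private"):

  static int kvm_set_private_region(struct kvm *kvm, gfn_t start, gfn_t end)
  {
          gfn_t gfn;
          int r;

          for (gfn = start; gfn < end; gfn++) {
                  r = xa_err(xa_store(&kvm->mem_attr_array, gfn,
                                      xa_mk_value(1), GFP_KERNEL_ACCOUNT));
                  if (r)
                          return r;
          }

          /* Bumps the notifier sequence so that in-flight faults retry. */
          kvm_zap_gfn_range(kvm, start, end);
          return 0;
  }

  static bool kvm_gfn_is_private(struct kvm *kvm, gfn_t gfn)
  {
          /* Shared by default: only explicitly registered GFNs are private. */
          return xa_load(&kvm->mem_attr_array, gfn) != NULL;
  }

The page-fault path would then consult kvm_gfn_is_private() before acquiring
mmu_lock and rely on the bumped sequence to catch a concurrent (un)registration.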

> > I don't think we'd want to use a bitmap, e.g. for a well-behaved guest, XArray
> > should be far more efficient.
> 
> What about a misbehaving guest? I don't want to design for the worst
> case, but people may raise concerns about attacks from such a guest.

That's why cgroups exist.  E.g. a malicious/broken L1 can similarly abuse nested
EPT/NPT to generate a large number of shadow page tables.

> > One benefit to explicitly tracking this in KVM is that it might be useful for
> > software-only protected VMs, e.g. KVM could mark a region in the XArray as
> > "pending" based on guest hypercalls to share/unshare memory, and then complete
> > the transaction when userspace invokes the ioctl() to complete the share/unshare.
> 
> OK, then this can be another state/flag/attribute. Let me dig into
> the details a bit:
> 
> First, introduce the KVM ioctl below:
> 
> KVM_SET_MEMORY_ATTR

Actually, if the semantics are that userspace declares memory as private, then we
can reuse KVM_MEMORY_ENCRYPT_REG_REGION and KVM_MEMORY_ENCRYPT_UNREG_REGION.  It'd
be a little gross because we'd need to slightly redefine the semantics for TDX, SNP,
and software-protected VM types, e.g. the ioctls() currently require a pre-existing
memslot.  But I think it'd work...
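
For reference, the userspace side of that reuse would look roughly like the below
(sketch only, error handling omitted; treating .addr as a GPA is itself part of
the redefined semantics, since for SEV today it's a userspace address):

  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  /* Declare [gpa, gpa + size) as private via the existing ioctl. */
  static int declare_private(int vm_fd, __u64 gpa, __u64 size)
  {
          struct kvm_enc_region region = {
                  .addr = gpa,    /* redefined: GPA rather than an HVA */
                  .size = size,
          };

          return ioctl(vm_fd, KVM_MEMORY_ENCRYPT_REG_REGION, &region);
  }

KVM_MEMORY_ENCRYPT_UNREG_REGION would take the same struct to flip a range back
to shared.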

I'll think more on this...


