qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC PATCH v2 24/32] hw/cxl/device: Add a memory device (8.2.8.5)


From: Jonathan Cameron
Subject: Re: [RFC PATCH v2 24/32] hw/cxl/device: Add a memory device (8.2.8.5)
Date: Thu, 28 Jan 2021 17:40:09 +0000

On Thu, 28 Jan 2021 08:58:01 -0800
Ben Widawsky <ben@bwidawsk.net> wrote:

> On 21-01-28 08:51:51, Ben Widawsky wrote:
> > On 21-01-28 07:14:44, Ben Widawsky wrote:  
> > > On 21-01-28 07:03:18, Ben Widawsky wrote:  
> > > > On 21-01-28 10:25:38, Jonathan Cameron wrote:  
> > > > > On Wed, 27 Jan 2021 13:26:45 -0800
> > > > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > > >   
> > > > > > On 21-01-27 22:03:12, Igor Mammedov wrote:  
> > > > > > > On Tue,  5 Jan 2021 08:53:15 -0800
> > > > > > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > > > > >     
> > > > > > > > A CXL memory device (AKA Type 3) is a CXL component that 
> > > > > > > > contains some
> > > > > > > > combination of volatile and persistent memory. It also 
> > > > > > > > implements the
> > > > > > > > previously defined mailbox interface as well as the memory 
> > > > > > > > device
> > > > > > > > firmware interface.
> > > > > > > > 
> > > > > > > > The following example will create a 256M device in a 512M 
> > > > > > > > window:
> > > > > > > > 
> > > > > > > > -object 
> > > > > > > > "memory-backend-file,id=cxl-mem1,share,mem-path=cxl-type3,size=512M"
> > > > > > > > -device 
> > > > > > > > "cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0,size=256M"    
> > > > > > > 
> > > > > > > I'd expect whole backend used by frontend, so one would not need 
> > > > > > > "size" property
> > > > > > > on frontend (like we do with memory devices).
> > > > > > > So question is why it partially uses memdev?    
> > > > > > 
> > > > > > Answered in a separate thread...  
> > > > > 
> > > > > One possible suggestion inline.
> > > > >   
> > > > > > > > +
> > > > > > > > +static void cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
> > > > > > > > +{
> > > > > > > > +    MemoryRegionSection mrs;
> > > > > > > > +    MemoryRegion *mr;
> > > > > > > > +    uint64_t offset = 0;
> > > > > > > > +    size_t remaining_size;
> > > > > > > > +
> > > > > > > > +    if (!ct3d->hostmem) {
> > > > > > > > +        error_setg(errp, "memdev property must be set");
> > > > > > > > +        return;
> > > > > > > > +    }
> > > > > > > > +
> > > > > > > > +    /* FIXME: need to check mr is the host bridge's MR */
> > > > > > > > +    mr = host_memory_backend_get_memory(ct3d->hostmem);
> > > > > > > > +
> > > > > > > > +    /* Create our new subregion */
> > > > > > > > +    ct3d->cxl_dstate.pmem = g_new(MemoryRegion, 1);
> > > > > > > > +
> > > > > > > > +    /* Find the first free space in the window */
> > > > > > > > +    WITH_RCU_READ_LOCK_GUARD()
> > > > > > > > +    {
> > > > > > > > +        mrs = memory_region_find(mr, offset, 1);
> > > > > > > > +        while (mrs.mr && mrs.mr != mr) {
> > > > > > > > +            offset += memory_region_size(mrs.mr);
> > > > > > > > +            mrs = memory_region_find(mr, offset, 1);
> > > > > > > > +        }
> > > > > > > > +    }
> > > > > > > > +
> > > > > > > > +    remaining_size = memory_region_size(mr) - offset;
> > > > > > > > +    if (remaining_size < ct3d->size) {
> > > > > > > > +        g_free(ct3d->cxl_dstate.pmem);
> > > > > > > > +        error_setg(errp,
> > > > > > > > +                   "Not enough free space (%zd) required for 
> > > > > > > > device (%" PRId64  ")",
> > > > > > > > +                   remaining_size, ct3d->size);
> > > > > > > > +    }
> > > > > > > > +
> > > > > > > > +    /* Register our subregion as non-volatile */
> > > > > > > > +    memory_region_init_ram(ct3d->cxl_dstate.pmem, OBJECT(ct3d),
> > > > > > > > +                           "cxl_type3-memory", ct3d->size, 
> > > > > > > > errp);    
> > > > > > > this allocates ct3d->size of anon RAM, was this an intention?
> > > > > > > If yes, can you clarify why extra RAM is used instead of using 
> > > > > > > what
> > > > > > > backend provides?    
> > > > > > 
> > > > > > It sounds like I'm doing the wrong thing then. There should be one 
> > > > > > chunk of
> > > > > > memory which is a subset of the full memory backend object. Could 
> > > > > > you please
> > > > > > advise on what I should be doing instead? Is add_subregion() 
> > > > > > sufficient?  
> > > > > 
> > > > > Taking inspiration from nvdimm I'm carrying a patch that uses
> > > > > memory_region_init_alias(ct3d->cxl_dstate.pmem, OBJECT(qct3d)q,
> > > > >                        "cxl_type3-memory", mr, offset, ct3d->size);
> > > > > 
> > > > > I 'think' that's doing the right thing, but haven't fully tested it 
> > > > > yet
> > > > > so may be completely wrong :)
> > > > > 
> > > > > Then for the pmem addr, call memory_region_set_address() to put it
> > > > > in a particular location.
> > > > >   
> > > > 
> > > > Yes - this is what I'd like to do and what I initially tried, and I 
> > > > also believe
> > > > it's right, but it doesn't work.
> > > > 
> > > > range_invariant: Assertion `range->lob <= range->upb || range->lob == 
> > > > range->upb + 1' failed.
> > > > 
> > > > I was digging into this yesterday, but opted to start a new thread on 
> > > > the
> > > > matter.
> > > >   
> > > 
> > > Hmm. I think I need to figure out the right add_subregion after this and 
> > > it
> > > might work. I'll keep digging, but if you have ideas, let me know.  
> > 
> > [snip]
> > 
> > I managed to get a bit further. With the following, I start getting 
> > complaints
> > about fragmented memory when adding devices later.
> > 
> >      memory_region_init_alias(ct3d->cxl_dstate.pmem, OBJECT(ct3d),
> >              "cxl_type3-memory", mr, mr->addr + offset, ct3d->size);
> >      memory_region_set_nonvolatile(ct3d->cxl_dstate.pmem, true);
> >      memory_region_add_subregion(mr, offset, ct3d->cxl_dstate.pmem);
> > 
> > -device nvdimm,memdev=nvmem1,id=nv1,label-size=2M,node=5: could not find 
> > position in guest address space for memory device - memory fragmented due 
> > to alignments
> >   
> 
> Ignore this. It was a problem with my commandline.
> 
> I think I have something limping along now.

Great.  It's going to get more interesting to do this 'right' though.

We'll need to define a memory window similar to device_mem which is used for
normal memory hotplug then export that window via the relevant ACPI tables
(once defined somewhere public).  Finally we'll then want to actually
do the memory_region_add_subregion() only on the OS / firmware having
configured the type3 device by setting it's PA base.

Going to be fiddly - but should be doable.  With it basically working
should be a series of sensible (ish) steps to get there.

Jonathan








reply via email to

[Prev in Thread] Current Thread [Next in Thread]