qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 4/5] ramfb: make migration conditional


From: Laszlo Ersek
Subject: Re: [PATCH v2 4/5] ramfb: make migration conditional
Date: Mon, 2 Oct 2023 22:46:49 +0200

On 10/2/23 22:38, Alex Williamson wrote:
> On Mon, 2 Oct 2023 21:41:55 +0200
> Laszlo Ersek <lersek@redhat.com> wrote:
> 
>> On 10/2/23 21:26, Alex Williamson wrote:
>>> On Mon, 2 Oct 2023 20:24:11 +0200
>>> Laszlo Ersek <lersek@redhat.com> wrote:
>>>   
>>>> On 10/2/23 16:41, Alex Williamson wrote:  
>>>>> On Mon, 2 Oct 2023 15:38:10 +0200
>>>>> Cédric Le Goater <clg@redhat.com> wrote:
>>>>>     
>>>>>> On 10/2/23 13:11, marcandre.lureau@redhat.com wrote:    
>>>>>>> From: Marc-André Lureau <marcandre.lureau@redhat.com>
>>>>>>>
>>>>>>> RAMFB migration was unsupported until now, let's make it conditional.
>>>>>>> The following patch will prevent machines <= 8.1 to migrate it.
>>>>>>>
>>>>>>> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>      
>>>>>> Maybe localize the new 'ramfb_migrate' attribute close to 'enable_ramfb'
>>>>>> in VFIOPCIDevice. Anyhow,    
>>>>>
>>>>> Shouldn't this actually be tied to whether the device is migratable
>>>>> (which for GVT-g - the only ramfb user afaik - it's not)?  What does it
>>>>> mean to have a ramfb-migrate=true property on a device that doesn't
>>>>> support migration, or false on a device that does support migration.  I
>>>>> don't understand why this is a user controllable property.  Thanks,    
>>>>
>>>> The comments in <https://bugzilla.redhat.com/show_bug.cgi?id=1859424>
>>>> (which are unfortunately not public :/ ) suggest that ramfb migration
>>>> was simply forgotten when vGPU migration was implemented. So, "now
>>>> that vGPU migration is done", this should be added.
>>>>
>>>> Comment 8 suggests that the following domain XML snippet
>>>>
>>>>     <hostdev mode='subsystem' type='mdev' managed='no'
>>>> model='vfio-pci' display='on' ramfb='on'> <source>
>>>>         <address uuid='b155147a-663a-4009-ae7f-e9a96805b3ce'/>
>>>>       </source>
>>>>       <alias name='ua-b155147a-663a-4009-ae7f-e9a96805b3ce'/>
>>>>       <address type='pci' domain='0x0000' bus='0x07' slot='0x00'  
>>>> function='0x0'/> </hostdev>  
>>>>
>>>> is migratable, but the ramfb device malfunctions on the destination
>>>> host.
>>>>
>>>> There's also a huge QEMU cmdline in comment#0 of the bug; I've not
>>>> tried to read that.
>>>>
>>>> AIUI BTW the property is not for the user to control, it's just a
>>>> compat knob for versioned machine types. AIUI those are usually
>>>> implemented with such (user-visible / -tweakable) device properties.  
>>>
>>> If it's not for user control it's unfortunate that we expose it to the
>>> user at all, but should it at least use the "x-" prefix to indicate that
>>> it's not intended to be an API?  
>>
>> I *think* it was your commit db32d0f43839 ("vfio/pci: Add option to
>> disable GeForce quirks", 2018-02-06) that hda introduced me to the "x-"
>> prefixed properties!
>>
>> For some reason though, machine type compat knobs are never named like
>> that, AFAIR.
> 
> Maybe I'm misunderstanding your comment, but it appears quite common to
> use "x-" prefix things in the compat tables...

You didn't misunderstand; I was wrong. I judged this off the compat prop
backports to RHEL that I remembered. Your examples from the tree are
good evidence.

> 
> GlobalProperty hw_compat_8_0[] = {
>     { "migration", "multifd-flush-after-each-section", "on"},
>     { TYPE_PCI_DEVICE, "x-pcie-ari-nextfn-1", "on" },
>     { TYPE_VIRTIO_NET, "host_uso", "off"},
>     { TYPE_VIRTIO_NET, "guest_uso4", "off"},
>     { TYPE_VIRTIO_NET, "guest_uso6", "off"},
> };
> const size_t hw_compat_8_0_len = G_N_ELEMENTS(hw_compat_8_0);
> 
> GlobalProperty hw_compat_7_2[] = {
>     { "e1000e", "migrate-timadj", "off" },
>     { "virtio-mem", "x-early-migration", "false" },
>     { "migration", "x-preempt-pre-7-2", "true" },
>     { TYPE_PCI_DEVICE, "x-pcie-err-unc-mask", "off" },
> };
> const size_t hw_compat_7_2_len = G_N_ELEMENTS(hw_compat_7_2);
> [etc]
> 
>>> It's still odd to think that we can
>>> have scenarios of a non-migratable vfio device registering a migratable
>>> ramfb, and vice versa, but I suppose in the end it doesn't matter.  
>>
>> I do think it matters! For one, if migration is not possible with
>> vfio-pci-nohotplug, then how can QE (or anyone else) *test* the patch
>> (i.e. that it makes a difference)? In that case, the ramfb_setup() call
>> from vfio-pci-nohotplug should just open-code "false" for the
>> "migratable" parameter.
> 
> Some vfio devices support migration, most don't.  I was thinking
> ramfb_setup might be called with something like:
> 
>       (vdev->ramfb_migrate && vdev->enable_migration)
> 
> so that at least the ramfb migration state matches the device, but I
> think ultimately it only saves a little bit of overhead in registering
> the vmstate, either one not supporting migration should block migration.
> 
> Hmm, since enable_migration is auto/on/off, it seems like device
> realize should fail if set to 'on' and ramfb_migrate is false.  I think
> that's the only way the device options don't become self contradictory.
> Thanks,

... easy-looking migration patchset becomes quite complex; isn't that
the story with almost all QEMU work? :)

Thanks!
Laszlo




reply via email to

[Prev in Thread] Current Thread [Next in Thread]