qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PULL 1/5] hw/pcie-root-port: Fix hotplug for PCI devices requiring


From: Marcel Apfelbaum
Subject: Re: [PULL 1/5] hw/pcie-root-port: Fix hotplug for PCI devices requiring IO
Date: Wed, 29 Sep 2021 16:55:10 +0300

Hi Daniel,

On Wed, Sep 29, 2021 at 12:05 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
On Mon, Sep 27, 2021 at 05:49:15AM -0400, Michael S. Tsirkin wrote:
> On Mon, Sep 27, 2021 at 10:33:42AM +0100, Daniel P. Berrangé wrote:
> > On Tue, Aug 03, 2021 at 04:52:03PM -0400, Michael S. Tsirkin wrote:
> > > From: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
> > >
> > > Q35 has now ACPI hotplug enabled by default for PCI(e) devices.
> > > As opposed to native PCIe hotplug, guests like Fedora 34
> > > will not assign IO range to pcie-root-ports not supporting
> > > native hotplug, resulting into a regression.
> > >
> > > Reproduce by:
> > >     qemu-bin -M q35 -device pcie-root-port,id=p1 -monitor stdio
> > >     device_add e1000,bus=p1
> > > In the Guest OS the respective pcie-root-port will have the IO range
> > > disabled.
> > >
> > > Fix it by setting the "reserve-io" hint capability of the
> > > pcie-root-ports so the firmware will allocate the IO range instead.
> > >
> > > Acked-by: Igor Mammedov <imammedo@redhat.com>
> > > Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
> > > Message-Id: <20210802090057.1709775-1-marcel@redhat.com>
> > > Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > ---
> > >  hw/pci-bridge/gen_pcie_root_port.c | 5 +++++
> > >  1 file changed, 5 insertions(+)
> >
> > This change, when combined with the switch to ACPI based hotplug by
> > default, is responsible for a significant regression in QEMU 6.1.0
> >
> > It is no longer possible to have more than 15 pcie-root-port devices
> > added to a q35 VM in 6.1.0.  Before this I've had as many as 80+ devices
> > present before I stopped trying to add more.
> >
> >   https://gitlab.com/qemu-project/qemu/-/issues/641
> >
> > This regression is significant, because it has broken the out of the
> > box default configuration that OpenStack uses for booting all VMs.
> > They add 16 pcie-root-ports by defalt to allow empty slots for device
> > hotplug under q35 [1].


That's bad!
 
>
>
> Indeed, oops. Thanks for the report!
>
> Going back and looking at seabios code, didn't we get confused?
> Shouldn't we have reserved memory and not IO?
>
> I see:
>             int resource_optional = pcie_cap && (type == PCI_REGION_TYPE_IO);
>             if (!sum && hotplug_support && !resource_optional)
>                 sum = align; /* reserve min size for hot-plug */
>
>
> generally maybe we should just add an ACPI-hotplug capability and
> teach seabios about it?

Looking at the commit message example:

   qemu-bin -M q35 -device pcie-root-port,id=p1 -monitor stdio
   device_add e1000,bus=p1

IIUC, that is plugging a PCI device into the PCIe root port.

docs/pcie.txt says that IO space is not allocated by SeaBIOS
or OVMF for pcie-root-port, if

    (1) the port is empty, or
    (2) the device behind the port has no IO BARs.

Point (2) is satisified if you only ever plug PCIe devces into
the pcie-root-port. The docs/pcie.txt recommends exactly that,
with any PCI device to be placed downstream of a pcie-pci-bridge
and pci-pci-bridge pair.

IOW that example from the commit message should have been

  qemu-bin -M q35 -device pcie-root-port,id=p1 -monitor stdio \
       -device pcie-pci-bridge,id=br1,bus=pcie.0] \
       -device pci-bridge,id=br2,bus=br1,chassis_nr=1
  device-add e1000,bus=br2

So why did we need IO space for the pcie-root-port at all ?

We don't... we need it for the pci-bridge.
The patch addressed a regression seen in Fedora 33/34 hosts.
PCIe hot-plug for pcie-root ports allowed legacy PCI devices to be hot-plugged,
while the ACPI based hotplug didn't.

The patch tried to fix the problem by pre-allocating IO space at SeaBIOS level,
but it seems it is not the optimal solution.

That means it was the Guest OS that allocated the IO range when necessary,
making the decision at firmware level is wrong.

I confirm we have to find a better solution.

Thanks,
marcel
 
The example given as the reason just looks like user error
per the docs/pcie.txt

Regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


reply via email to

[Prev in Thread] Current Thread [Next in Thread]