qemu-devel

Re: [PATCH] pci: Abort if pci_add_capability fails


From: Alex Williamson
Subject: Re: [PATCH] pci: Abort if pci_add_capability fails
Date: Tue, 30 Aug 2022 12:00:14 -0600

On Tue, 30 Aug 2022 13:37:35 +0200
Markus Armbruster <armbru@redhat.com> wrote:
>        if (!offset) {
>            offset = pci_find_space(pdev, size);
>            /* out of PCI config space is programming error */
>            assert(offset);
>        } else {
>            /* Verify that capabilities don't overlap.  Note: device assignment
>             * depends on this check to verify that the device is not broken.
>             * Should never trigger for emulated devices, but it's helpful
>             * for debugging these. */
> 
> The comment makes me suspect that device assignment of a broken device
> could trigger the error.  It goes back to
> 
> commit c9abe111209abca1b910e35c6ca9888aced5f183
> Author: Jan Kiszka <jan.kiszka@siemens.com>
> Date:   Wed Aug 24 14:29:30 2011 +0200
> 
>     pci: Error on PCI capability collisions
>     
>     Nothing good can happen when we overlap capabilities. This may happen
>     when plugging in assigned devices or when devices models contain bugs.
>     Detect the overlap and report it.
>     
>     Based on qemu-kvm commit by Alex Williamson.
>     
>     Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>     Acked-by: Don Dutile <ddutile@redhat.com>
>     Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> If this is still correct, then your patch is a regression: QEMU is no
> longer able to gracefully handle assignment of a broken device.  Does
> this matter?  Alex, maybe?

Ok, that was a long time ago.  I have some vague memories of hitting
something like this with a Broadcom NIC, but a Google search for the
error string doesn't turn up anything recent.  So there's a fair
chance this wouldn't break anyone initially.

Even back when the above patch was proposed, there were some
suggestions to turn the error path into an abort, which I pushed back
on since clearly enumerating capabilities of a device can occur due to
a hot-plug and we don't necessarily have control of the device being
added.  This is only more true with the possibility of soft-devices out
of tree, through things like vfio-user.

Personally I think the right approach is to support an error path such
that we can abort when triggered by a cold-plug device, while simply
rejecting a broken hot-plug device, but that seems to be the minority
opinion among QEMU developers afaict.  Thanks,

Alex
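
For illustration, the split Alex describes (abort for a cold-plugged
device, reject a broken hot-plugged one) could be sketched roughly as
below.  This is a standalone toy model, not QEMU code: ToyDev,
toy_add_capability() and toy_plug() are hypothetical names, and the
error buffer only stands in for QEMU's Error ** convention.  The point
is that the capability helper reports the overlap and leaves the policy
decision to its caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TOY_CFG_SIZE  256   /* conventional PCI config space size */
#define TOY_CAP_START 0x40  /* first byte past the standard header */

/* Hypothetical toy device; this is NOT QEMU's PCIDevice. */
typedef struct ToyDev {
    uint8_t used[TOY_CFG_SIZE]; /* 1 = byte already claimed by a capability */
    bool hotplugged;            /* true if added after machine start */
} ToyDev;

/*
 * Claim [offset, offset + size) for a capability.
 * Returns the offset on success, or -EINVAL (with a message in err)
 * if the range is out of bounds or overlaps an existing capability.
 * It never aborts: the caller decides how severe a failure is.
 */
static int toy_add_capability(ToyDev *d, uint8_t offset, uint8_t size,
                              char *err, size_t errlen)
{
    int end = offset + size;    /* computed as int, so no uint8_t wrap */

    if (size == 0 || offset < TOY_CAP_START || end > TOY_CFG_SIZE) {
        snprintf(err, errlen, "capability 0x%x+0x%x out of range",
                 offset, size);
        return -EINVAL;
    }
    for (int i = offset; i < end; i++) {
        if (d->used[i]) {
            snprintf(err, errlen,
                     "capability 0x%x+0x%x overlaps byte 0x%x",
                     offset, size, i);
            return -EINVAL;
        }
    }
    memset(&d->used[offset], 1, size);
    return offset;
}

/*
 * Policy lives in the caller: a cold-plugged device with a bad layout
 * is treated as fatal, a hot-plugged one is merely rejected.
 */
static bool toy_plug(ToyDev *d, uint8_t offset, uint8_t size)
{
    char err[128];

    if (toy_add_capability(d, offset, size, err, sizeof(err)) < 0) {
        fprintf(stderr, "toy_plug: %s\n", err);
        if (!d->hotplugged) {
            abort();            /* cold-plug: fail loudly at startup */
        }
        return false;           /* hot-plug: refuse the device, keep running */
    }
    return true;
}
```

With hotplugged set, an overlapping capability makes toy_plug() return
false and the process keeps running; with it clear, the same overlap
ends in abort().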



