Re: XIVE VFIO kernel resample failure in INTx mode under heavy load
From: Timothy Pearson
Subject: Re: XIVE VFIO kernel resample failure in INTx mode under heavy load
Date: Fri, 11 Mar 2022 12:53:53 -0600 (CST)
Correction -- the desynchronization appears to be on the DisINTx line.
Host:
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=slow >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Guest:
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=slow >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
This is with the driver stuck, not receiving any interrupts in the guest
despite the card issuing them every 1ms.
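
For anyone comparing the two dumps by hand: lspci derives DisINTx from bit 10 of the PCI Command register and INTx from bit 3 of the Status register. Below is a minimal sketch of that decoding. The register words 0x0546, 0x0146 and 0x0400 are reconstructed from the flags printed above, not read from the hardware, and the function names are illustrative only.

```python
# Bit masks from the PCI Local Bus specification; Linux's pci_regs.h uses the
# same values for PCI_COMMAND_INTX_DISABLE and PCI_STATUS_INTERRUPT.
PCI_COMMAND_INTX_DISABLE = 0x0400  # Command register bit 10, shown as "DisINTx"
PCI_STATUS_INTERRUPT = 0x0008      # Status register bit 3, shown as "INTx"


def intx_state(command: int, status: int) -> dict:
    """Return the two INTx-related flags for one view of the device."""
    return {
        "DisINTx": bool(command & PCI_COMMAND_INTX_DISABLE),
        "INTx": bool(status & PCI_STATUS_INTERRUPT),
    }


def differing_flags(host: dict, guest: dict) -> list:
    """List the flag names on which the host and guest views disagree."""
    return [name for name in host if host[name] != guest[name]]


# ASSUMPTION: raw values rebuilt from the lspci flags in this message.
host = intx_state(command=0x0546, status=0x0400)
guest = intx_state(command=0x0146, status=0x0400)
print(differing_flags(host, guest))
```

With these reconstructed values the only differing flag is DisINTx, which matches the correction at the top of this message.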
----- Original Message -----
> From: "Timothy Pearson" <tpearson@raptorengineering.com>
> To: "qemu-devel" <qemu-devel@nongnu.org>
> Sent: Friday, March 11, 2022 12:35:45 PM
> Subject: XIVE VFIO kernel resample failure in INTx mode under heavy load
> All,
>
> I've been struggling for some time with what is looking like a potential
> bug in QEMU/KVM on the POWER9 platform. It appears that in XIVE mode, when
> the in-kernel IRQ chip is enabled, an external device that rapidly asserts
> IRQs via the legacy INTx level mechanism will only receive one interrupt in
> the KVM guest.
>
> Changing any one of those items appears to avoid the glitch, e.g. XICS mode
> with the in-kernel IRQ chip works (all interrupts are passed through), and
> XIVE mode with the in-kernel IRQ chip disabled also works. We are also not
> seeing any problems in XIVE mode with the in-kernel chip from MSI/MSI-X
> devices.
>
> The device in question is a real time card that needs to raise an interrupt
> every 1ms. It works perfectly on the host, but fails in the guest -- with the
> in-kernel IRQ chip and XIVE enabled, it receives exactly one interrupt, at
> which point the host continues to see INTx+ but the guest sees INTx-, and the
> IRQ handler in the guest kernel is never reentered.
>
> We have also seen some very rare glitches where, over a long period of time,
> we can enter a similar deadlock in XICS mode. Disabling the in-kernel IRQ
> chip in XIVE mode will also lead to the lockup with this device, since the
> userspace IRQ emulation cannot keep up with the rapid interrupt firing
> (measurements show around 100ms required for processing each interrupt in
> the user mode).
>
> My understanding is the resample mechanism does some clever tricks with level
> IRQs, but that QEMU needs to check if the IRQ is still asserted by the device
> on guest EOI. Since a failure here would explain these symptoms I'm wondering
> if there is a bug in either QEMU or KVM for POWER / pSeries (sPAPR) where the
> IRQ is not resampled and therefore not re-fired in the guest?
>
> Unfortunately I lack the resources at the moment to dig through the QEMU
> codebase and try to find the bug. Any IBMers here that might be able to help
> out? I can provide access to a test setup if desired.
>
> Thanks!
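
The quoted message describes the resample idea for level-triggered INTx: once an interrupt has been forwarded, the line is held masked until the guest issues an EOI, at which point the line is unmasked and checked again. The toy model below illustrates why a missed resample would produce exactly one interrupt in the guest. It is a sketch under stated assumptions, not QEMU or KVM code; the class, its methods and the `resample_on_eoi` switch are all hypothetical names chosen for illustration.

```python
class LevelIntxModel:
    """Toy model of a level-triggered INTx line forwarded to a guest."""

    def __init__(self, resample_on_eoi: bool = True):
        self.device_asserting = False  # level currently driven by the device
        self.masked = False            # host side masks the line after forwarding
        self.delivered = 0             # interrupts seen by the guest
        self.resample_on_eoi = resample_on_eoi  # hypothetical fault switch

    def _maybe_deliver(self) -> None:
        # Forward only when the line is high and not masked, then mask it.
        if self.device_asserting and not self.masked:
            self.masked = True
            self.delivered += 1

    def device_assert(self) -> None:
        self.device_asserting = True
        self._maybe_deliver()

    def device_deassert(self) -> None:
        self.device_asserting = False

    def guest_eoi(self) -> None:
        if not self.resample_on_eoi:
            return  # modelled failure: EOI never unmasks or rechecks the line
        self.masked = False
        self._maybe_deliver()  # resample: re-fire if the device still asserts


def run(ticks: int, resample_on_eoi: bool) -> int:
    """Device asserts once per tick; the guest services it and sends an EOI."""
    model = LevelIntxModel(resample_on_eoi)
    for _ in range(ticks):
        before = model.delivered
        model.device_assert()
        if model.delivered > before:   # guest handler ran for this tick
            model.device_deassert()    # handler quiesces the device
            model.guest_eoi()
    return model.delivered


print(run(5, resample_on_eoi=True), run(5, resample_on_eoi=False))
```

In this model a working resample path delivers one interrupt per tick, while the broken path delivers the first interrupt and then leaves the line masked indefinitely, which is the same shape as the symptom reported above. Whether the real failure sits in the unmask step, the recheck, or somewhere else is not established by this thread.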