qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] hw/intc/ioapic: Update KVM routes before redelivering IRQ, o


From: David Woodhouse
Subject: Re: [PATCH] hw/intc/ioapic: Update KVM routes before redelivering IRQ, on RTE update
Date: Thu, 09 Mar 2023 09:16:08 +0000
User-agent: Evolution 3.44.4-0ubuntu1

On Wed, 2023-03-08 at 18:09 -0500, Peter Xu wrote:
> On Mon, Mar 06, 2023 at 05:28:24PM +0000, David Woodhouse wrote:
> > Indeed, I don't think we care about the in-kernel I/OAPIC. I don't
> > think we care about the kernel knowing about e.g. "GSI #11" at all. We
> > can just deliver it as MSI (for the I/OAPIC) or using KVM_INTERRUPT and
> > the interrupt window as we do for the PIC. Which is why I'd happily rip
> > that out and let it be delivered via the APIC intercept at 0xfeexxxxx.
> > 
> > The existing code which just keeps IRQ routes updated when they're
> > valid is kind of OK, and well-behaved guests can function. But it isn't
> > *right* in the case where they aren't valid.
> > 
> > What *ought* to happen is that the IOMMU should raise a fault at the
> > moment the interrupt occurs, if the translation isn't valid. And we
> > don't have that at all.
> 
> Right, that's definitely not ideal as an emulator.
> 
> > 
> > As for why I care? I don't really *need* it, as I have everything I
> > need for Xen PIRQ support already merged in 
> > https://gitlab.com/qemu-project/qemu/-/commit/6096cf7877
> > 
> > So while the thread at
> > https://lore.kernel.org/qemu-devel/aaef9961d210ac1873153bf3cf01d984708744fc.camel@infradead.org/
> > was partly driven by expecting to need this for Xen PIRQ support
> > (because in $DAYJOB I did those things in the other order and the PIRQ
> > support ended up just being a trivial different translator like the
> > IOMMU's IR)... I'd still quite like to fix it up in QEMU anyway, just
> > for correctness and fidelity in the faulting cases.
> > 
> > We can do more efficient invalidation too, rather than blowing away the
> > entire routing table every time. Just disconnect the IRQFD for the
> > specific interrupts that get invalidated, and let them get fixed up
> > again next time they occur.
> 
> I'm curious whether there's anything else beside the "correctness of
> emulation" reason.

Nah, at this point it's just because it annoys me. I did this in
another VMM and it works nicely, and I'd like QEMU to do the same. ;)

> I would think it nice if it existed or trivial to have as what you said.  I
> just don't know whether it's as easy, at least so far a new kernel
> interface seems still needed, allowing a kernel irq to be paused until
> being translated by QEMU from some channel we provide.

It doesn't need a new kernel interface.

With IRQ remapping we have a userspace I/OAPIC, so how we deliver its
MSIs is entirely up to us — we absolutely don't need the kernel to have
*any* KVM_IRQ_ROUTING_IRQCHIP entries.

The only IRQs that are handled fully in the kernel are events arriving
on some eventfd which is bound as an IRQFD to some IRQ in the KVM
routing table. (Mostly MSIs coming from VFIO).

If we want to "pause" one of those, all we have to do is unbind the
IRQFD and then we can handle that eventfd in userspace instead. Which
is what we do *anyway* in the case where IRQFD support isn't available.

In
https://lore.kernel.org/kvm/20201027143944.648769-1-dwmw2@infradead.org/
I *optimised* that dance, so userspace doesn't have to stop listening
on the eventfd when the IRQFD is bound; the IRQFD code consumes the
event first. But we can live without that in older kernels.

Basically, it's just treating the IRQFD support as an *optimisation*,
and retaining the 'slow path' of handling the event in userspace and
injecting the resulting MSI.

It's just that in that slow path, as we're translating and injecting
the MSI, we *also* update the IRQ routing table and reattach the IRQFD
for the next time that interrupt fires. Like a cache.

And we stash an invalidation cookie (for Intel-IOMMU the IRTE index)
for each routing table entry, so that when asked to invalidate a
*range* of cookies (that's how the Intel IOMMU invalidate works), we
can just detach the IRQFD from those ones, letting them get handled in
userspace next time and retranslated (with associated fault report, as
appropriate).


Attachment: smime.p7s
Description: S/MIME cryptographic signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]