Re: [PATCH v6 1/4] memory: prevent dma-reentracy issues


From: Alexander Bulekov
Subject: Re: [PATCH v6 1/4] memory: prevent dma-reentracy issues
Date: Fri, 10 Mar 2023 07:31:17 -0500

On 230310 0723, Alexander Bulekov wrote:
> On 230310 1214, Fiona Ebner wrote:
> > Am 05.02.23 um 05:07 schrieb Alexander Bulekov:
> > > Add a flag to the DeviceState, when a device is engaged in PIO/MMIO/DMA.
> > > This flag is set/checked prior to calling a device's MemoryRegion
> > > handlers, and set when device code initiates DMA.  The purpose of this
> > > flag is to prevent two types of DMA-based reentrancy issues:
> > > 
> > > 1.) mmio -> dma -> mmio case
> > > 2.) bh -> dma write -> mmio case
> > > 
> > > These issues have led to problems such as stack-exhaustion and
> > > use-after-frees.
> > > 
> > > Summary of the problem from Peter Maydell:
> > > https://lore.kernel.org/qemu-devel/CAFEAcA_23vc7hE3iaM-JVA6W38LK4hJoWae5KcknhPRD5fPBZA@mail.gmail.com
> > > 
> > > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/62
> > > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/540
> > > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/541
> > > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/556
> > > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/557
> > > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/827
> > > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1282
> > > 
> > > Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
> > > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
> > > Acked-by: Peter Xu <peterx@redhat.com>
> > > ---
> > >  include/hw/qdev-core.h |  7 +++++++
> > >  softmmu/memory.c       | 17 +++++++++++++++++
> > >  softmmu/trace-events   |  1 +
> > >  3 files changed, 25 insertions(+)
> > > 
> > Hi,
> > there seems to be an issue with this patch, or an existing issue exposed by
> > this patch, in combination with the LSI SCSI controller.
> > 
> > After applying this patch on current master (i.e.
> > ee59483267de29056b5b2ee2421ef3844e5c9932), a Debian 11 with the LSI
> > controller would not boot properly anymore:
> > > [    7.540907] sym0: <895a> rev 0x0 at pci 0000:00:05.0 irq 10
> > > [    7.546028] sym0: No NVRAM, ID 7, Fast-40, LVD, parity checking
> > > [    7.559724] sym0: SCSI BUS has been reset.
> > > [    7.560820] sym0: interrupted SCRIPT address not found.
> > > [    7.563802] scsi host2: sym-2.2.3
> > > [    7.881111] e1000 0000:00:03.0 eth0: (PCI:33MHz:32-bit) 
> > > 52:54:00:12:34:56
> > > [    7.881998] e1000 0000:00:03.0 eth0: Intel(R) PRO/1000 Network 
> > > Connection
> > > [    7.925902] e1000 0000:00:03.0 ens3: renamed from eth0
> > > [   32.654811] scsi 2:0:0:0: tag#192 ABORT operation started
> > > [   37.764283] scsi 2:0:0:0: ABORT operation timed-out.
> > > [   37.774974] scsi 2:0:0:0: tag#192 DEVICE RESET operation started
> > > [   42.882488] scsi 2:0:0:0: DEVICE RESET operation timed-out.
> > > [   42.883606] scsi 2:0:0:0: tag#192 BUS RESET operation started
> > > [   48.002437] scsi 2:0:0:0: BUS RESET operation timed-out.
> > > [   48.003030] scsi 2:0:0:0: tag#192 HOST RESET operation started
> > > [   48.010226] sym0: SCSI BUS has been reset.
> > > [   53.122472] scsi 2:0:0:0: HOST RESET operation timed-out.
> > > [   53.123030] scsi 2:0:0:0: Device offlined - not ready after error 
> > > recovery
> > 
> > The commandline I used is:
> > ./qemu-system-x86_64 \
> >    -cpu 'kvm64' \
> >    -m 4096 \
> >    -serial 'stdio' \
> >    -device 'lsi,id=scsihw0,bus=pci.0,addr=0x5' \
> >    -drive
> > 'file=/dev/zvol/myzpool/vm-9006-disk-0,if=none,id=drive-scsi0,format=raw' \
> >    -device
> > 'scsi-hd,bus=scsihw0.0,scsi-id=0,drive=drive-scsi0,id=scsi0,bootindex=100' \
> >    -machine 'pc'
> > 
> > Happy to provide any more information if necessary!
> > 
> > CC-ing Fam Zheng (reviewer:SCSI)
> > 
> > Originally reported by one of our community members:
> > https://forum.proxmox.com/threads/123843/
> > 
> > Best Regards,
> > Fiona
> > 
> 
> Thanks, I confirmed this by booting up a livecd iso with an lsi device
> attached.  I will do some digging.
> 
> Stack-trace:
> 
> #0  trace_memory_region_reentrant_io (cpu_index=<optimized out>, 
> mr=<optimized out>, offset=<optimized out>, size=<optimized out>) at 
> trace/trace-softmmu.h:337
> #1  0x000055555815ce67 in access_with_adjusted_size (addr=addr@entry=0x1000, 
> value=0x7ffef01fb980, size=size@entry=0x4, access_size_min=0x1, 
> access_size_min@entry=0x0, access_size_max=0x4, access_fn=0x555558181370 
> <memory_region_read_accessor>, mr=0x627000000c50, attrs=...
> ) at ../softmmu/memory.c:552
> #2  0x000055555815aec7 in memory_region_dispatch_read1 (mr=0x627000000c50, 
> addr=0x1000, pval=<optimized out>, size=0x4, attrs=...) at 
> ../softmmu/memory.c:1448

This MemoryRegion (mr=0x627000000c50 in the trace) seems to be "lsi-ram".

From hw/scsi/lsi53c895a.c:

memory_region_init_io(&s->ram_io, OBJECT(s), &lsi_ram_ops, s,
        "lsi-ram", 0x2000);

So the LSI device is reading an LSI "Script" from its own IO region. In
this particular case, I think there was no reason to use
memory_region_init_io rather than memory_region_init_ram, but it makes
me worried that other devices do something similar.
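
To make the failure mode concrete, here is a minimal toy model of the
situation. It is not QEMU code: ToyDevice, toy_dispatch and the toy_*
handlers are hypothetical names invented for this sketch, and the flag
only approximates what the patch adds to DeviceState. It assumes that the
script fetch happens while one of the device's own handlers is still
running, which is what the stack trace above suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins only; none of these are QEMU's real types or functions. */
typedef struct ToyDevice {
    bool engaged_in_io;      /* models the per-device reentrancy flag */
    uint8_t script_ram[16];  /* models the device's on-board script RAM */
    int blocked;             /* number of accesses the guard rejected */
} ToyDevice;

typedef bool (*toy_handler)(ToyDevice *dev, unsigned off, uint8_t *out);

/* Guarded dispatch: refuse to enter a handler of a device that is
 * already inside one of its handlers. */
static bool toy_dispatch(ToyDevice *dev, toy_handler fn,
                         unsigned off, uint8_t *out)
{
    if (dev->engaged_in_io) {
        dev->blocked++;
        return false;
    }
    dev->engaged_in_io = true;
    bool ok = fn(dev, off, out);
    dev->engaged_in_io = false;
    return ok;
}

/* Read handler for the script RAM when it is exposed as an IO region. */
static bool toy_ram_read(ToyDevice *dev, unsigned off, uint8_t *out)
{
    assert(off < sizeof(dev->script_ram));
    *out = dev->script_ram[off];
    return true;
}

/* Handler that starts script execution and fetches the script through
 * the guarded IO path, as an IO-backed region would require. */
static bool toy_run_script_via_io(ToyDevice *dev, unsigned off, uint8_t *out)
{
    return toy_dispatch(dev, toy_ram_read, off, out);
}

/* Same handler, but reading the backing memory directly, which is what
 * a RAM-backed region would allow. No handler is re-entered. */
static bool toy_run_script_via_ram(ToyDevice *dev, unsigned off, uint8_t *out)
{
    assert(off < sizeof(dev->script_ram));
    *out = dev->script_ram[off];
    return true;
}
```

In this model, calling toy_dispatch() with toy_run_script_via_io fails
because the nested read is rejected by the guard, while the same call with
toy_run_script_via_ram succeeds. The real device is more involved, so
treat this only as an illustration of the mmio -> own-IO-region pattern.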

-Alex


