qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Crash with CXL + TCG on 8.2: Was Re: qemu cxl memory expander shows


From: Jonathan Cameron
Subject: Re: Crash with CXL + TCG on 8.2: Was Re: qemu cxl memory expander shows numa_node -1
Date: Thu, 15 Feb 2024 15:04:33 +0000

On Thu, 1 Feb 2024 16:00:56 +0000
Peter Maydell <peter.maydell@linaro.org> wrote:

> On Thu, 1 Feb 2024 at 15:17, Alex Bennée <alex.bennee@linaro.org> wrote:
> >
> > Peter Maydell <peter.maydell@linaro.org> writes:  
> > > So, that looks like:
> > >  * we call cpu_tb_exec(), which executes some generated code
> > >  * that generated code calls the lookup_tb_ptr helper to see
> > >    if we have a generated TB already for the address we're going
> > >    to execute next
> > >  * lookup_tb_ptr probes the TLB to see if we know the host RAM
> > >    address for the guest address
> > >  * this results in a TLB walk for an instruction fetch
> > >  * the page table descriptor load is to IO memory
> > >  * io_prepare assumes it needs to do a TLB recompile, because
> > >    can_do_io is clear
> > >
> > > I am not surprised that the corner case of "the guest put its
> > > page tables in an MMIO device" has not yet come up :-)
> > >
> > > I'm really not sure how the icount handling should interact
> > > with that...  
> >
> > Its not just icount - we need to handle it for all modes now. That said
> > seeing as we are at the end of a block shouldn't can_do_io be set?  
> 
> The lookup_tb_ptr helper gets called from tcg_gen_goto_tb(),
> which happens earlier than the tb_stop callback (it can
> happen in the trans function for branch etc insns, for
> example).
> 
> I think it should be OK to clear can_do_io at the start
> of the lookup_tb_ptr helper, something like:
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index 977576ca143..7818537f318 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -396,6 +396,15 @@ const void *HELPER(lookup_tb_ptr)(CPUArchState *env)
>      uint64_t cs_base;
>      uint32_t flags, cflags;
> 
> +    /*
> +     * By definition we've just finished a TB, so I/O is OK.
> +     * Avoid the possibility of calling cpu_io_recompile() if
> +     * a page table walk triggered by tb_lookup() calling
> +     * probe_access_internal() happens to touch an MMIO device.
> +     * The next TB, if we chain to it, will clear the flag again.
> +     */
> +    cpu->neg.can_do_io = true;
> +
>      cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
> 
>      cflags = curr_cflags(cpu);
> 
> -- PMM

Hi Peter,

I've included this in the series I just sent out:
20240215150133.2088-1-Jonathan.Cameron@huawei.com/T/#t">https://lore.kernel.org/qemu-devel/20240215150133.2088-1-Jonathan.Cameron@huawei.com/T/#t

Could you add your Signed-off-by if you are happy doing so?

Jonathan



reply via email to

[Prev in Thread] Current Thread [Next in Thread]