[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 6/6] accel/tcg: Always require can_do_io
From: |
Clément Chigot |
Subject: |
Re: [PATCH 6/6] accel/tcg: Always require can_do_io |
Date: |
Thu, 26 Oct 2023 10:10:54 +0200 |
On Thu, Oct 26, 2023 at 2:44 AM Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 10/24/23 02:50, Clément Chigot wrote:
> > Hi Richard,
> >
> > This commit has broken some of our internal bareboard testing on
> > Risc-V 64. At some point in our programs, there is an AMOSWAP (=
> > atomic swap) instruction on I/O. But since this commit, can_do_io is
> > set to false triggering an infinite loop.
> > IIUC the doc (cf [1]), atomic operations on I/O are allowed.
> >
> > I think there is a CF_LAST_IO flag missing somewhere to allow it, but
> > I'm not sure where this should be. Do you have any ideas ?
> >
> > Sadly I cannot provide a reproducer that easily, mainly because our
> > microchip has a few patches not yet merged making our binaries not
> > running on the upstream master.
> > But here is a bit of the in_asm backtrace:
> >
> > | IN: system__bb__riscv_plic__initialize
> > | Priv: 3; Virt: 0
> > | 0x80000eb4: 1141 addi sp,sp,-16
> > | 0x80000eb6: 0c0027b7 lui a5,49154
> > | 0x80000eba: e406 sd ra,8(sp)
> > | 0x80000ebc: 00010597 auipc a1,16
> > # 0x80010ebc
> > | 0x80000ec0: 47458593 addi a1,a1,1140
> > | 0x80000ec4: f3ffe637 lui a2,-49154
> > | 0x80000ec8: 01878693 addi a3,a5,24
> > | 0x80000ecc: 00f58733 add a4,a1,a5
> > | 0x80000ed0: 9732 add a4,a4,a2
> > | 0x80000ed2: 4318 lw a4,0(a4)
> > | 0x80000ed4: 2701 sext.w a4,a4
> > | 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
> > | 0x80000eda: 0791 addi a5,a5,4
> > | 0x80000edc: fed798e3 bne a5,a3,-16
> > # 0x80000ecc
> > |
> > | ----------------
> > | IN: system__bb__riscv_plic__initialize
> > | Priv: 3; Virt: 0
> > | 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
> > |
> > | ----------------
> > | IN: system__bb__riscv_plic__initialize
> > | Priv: 3; Virt: 0
> > | 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
> > | * Freeze *
>
> I would expect two translations:
>
> (1) with the original TB, aborts execution on !can_do_io.
> (2) with the second TB, we get further into the actual execution and abort
> execution on
> TLB_MMIO.
> (3) with the third TB, we clear CF_PARALLEL and execute under
> cpu_exec_step_atomic.
>
> Both 2 and 3 should have had CF_LAST_IO set.
> You can verify this with '-d exec' output.
>
> As a trivial example from qemu-system-alpha bios startup:
>
> > Trace 0: 0x7f2584008380 [00000000/fffffc0000003ee4/01000000/ff000000]
> > uart_init_line
> > cpu_io_recompile: rewound execution of TB to fffffc0000003ee4
> > ----------------
> > IN: uart_init_line
> > 0xfffffc0000003f20: stb t8,0(t6)
> >
> > Trace 0: 0x7f2584008a00 [00000000/fffffc0000003f20/01000000/ff018001]
> > uart_init_line
>
> Note that the final "/" field is cflags. The first "Trace" corresponds to
> (1), where the
> store is in the middle of the TB. You can see the io_recompile abort, then
> the second
> "Trace" contains {CF_COUNT=1, CF_LAST_IO, CF_MEMI_ONLY}.
With the exec, it's indeed a bit clearer.
I do have a new translation made by cpu_io_recompile which sets
CF_LAST_IO (2). But afterward, it somehow loops back to the previous
translation which doesn't have CF_LAST_IO which calls cpu_io_recompile
again, etc. This triggers an infinite loop between these two
translations
| ----------------
| IN: system__bb__riscv_plic__initialize
| 0x80000eb4: 1141 addi sp,sp,-16
| ...
| 0x80000ed4: 2701 sext.w a4,a4
| 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
| 0x80000eda: 0791 addi a5,a5,4
| 0x80000edc: fed798e3 bne a5,a3,-16
# 0x80000ecc
|
| Linking TBs 0x7f0d3199db40 index 0 -> 0x7f0d3199dd00
| Trace 1: 0x7f0d3199dd00
[00000000/0000000080000eb4/0b02401b/ff280000]
system__bb__riscv_plic__initialize
(1) First exec of amoswap without CF_LAST_IO.
| ----------------
| IN: system__bb__riscv_plic__initialize
| Priv: 3; Virt: 0
| 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
|
| Trace 1: 0x7f0d3199df80
[00000000/0000000080000ed6/0b02401b/ff200601]
system__bb__riscv_plic__initialize
(2) Second exec with CF_LAST_IO.
| cpu_io_recompile: rewound execution of TB to 0000000080000ed6
| ----------------
| IN: system__bb__riscv_plic__initialize
| Priv: 3; Virt: 0
| 0x80000ed6: 08e7a02f amoswap.w zero,a4,(a5)
|
| Trace 1: 0x7f0d3199e140
[00000000/0000000080000ed6/0b02401b/ff298001]
system__bb__riscv_plic__initialize
Loop back (1) and start the infinite loop between (1) and (2).
| Trace 1: 0x7f0d3199df80
[00000000/0000000080000ed6/0b02401b/ff200601]
system__bb__riscv_plic__initialize
| cpu_io_recompile: rewound execution of TB to 0000000080000ed6
| Trace 1: 0x7f0d3199e140
[00000000/0000000080000ed6/0b02401b/ff298001]
system__bb__riscv_plic__initialize
| Trace 1: 0x7f0d3199df80
[00000000/0000000080000ed6/0b02401b/ff200601]
system__bb__riscv_plic__initialize
| cpu_io_recompile: rewound execution of TB to 0000000080000ed6
I'll continue to investigate. Even if the workaround you provided
works, there is something weird happening here.
> In the short term, try adding CF_LAST_IO to cflags in cpu_exec_step_atomic.
Thanks for that workaround.
Thanks,
Clément
> I think probably the logic of CF_LAST_IO should always be applied now, since
> can_do_io is
> always live, and thus the flag itself should go away.
>
>
> r~