[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PULL 01/39] accel/tcg: mttcg remove false-negative halted assertion
From: |
Nicholas Piggin |
Subject: |
Re: [PULL 01/39] accel/tcg: mttcg remove false-negative halted assertion |
Date: |
Mon, 18 Sep 2023 20:53:23 +1000 |
On Mon Sep 18, 2023 at 5:59 PM AEST, Alex Bennée wrote:
>
> "Nicholas Piggin" <npiggin@gmail.com> writes:
>
> > On Sat Sep 16, 2023 at 1:29 PM AEST, Richard Henderson wrote:
> >> From: Nicholas Piggin <npiggin@gmail.com>
> >>
> >> mttcg asserts that an execution ending with EXCP_HALTED must have
> >> cpu->halted. However between the event or instruction that sets
> >> cpu->halted and requests exit and the assertion here, an
> >> asynchronous event could clear cpu->halted.
> >>
> >> This leads to crashes running AIX on ppc/pseries because it uses
> >> H_CEDE/H_PROD hcalls, where H_CEDE sets self->halted = 1 and
> >> H_PROD sets other cpu->halted = 0 and kicks it.
> >>
> >> H_PROD could be turned into an interrupt to wake, but several other
> >> places in ppc, sparc, and semihosting follow what looks like a similar
> >> pattern setting halted = 0 directly. So remove this assertion.
> >>
> >> Reported-by: Ivan Warren <ivan@vmfacility.fr>
> >> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> >> Message-Id: <20230829010658.8252-1-npiggin@gmail.com>
> >> [rth: Keep the case label and adjust the comment.]
> >
> > Hey Richard,
> >
> > Thanks for picking this up.
> >
> > I think EXCP_HLT and EXCP_HALTED are effectively the same, so they could
> > be merged after this.
> >
> > I couldn't quite decipher the intended difference between them, HLT is
> > "hlt instruction reached", but it does tend to go into a mode where it
> > is halted waiting for external event. Is there some useful difference in
> > semantics we should retain (and at least try to find a way to assert)?
>
> I always thought HALTED was where the system was halted (e.g. during a
> shutdown) but I agree its less than clear.
Maybe that was so. I didn't manage to track down the original intention
of them, but now they are not different, HALTED does just wait for event
too. EXCP_HALTED did previously require the operation set ->halted = 1
before calling (the assert only breaks due to concurrent wakeup clearing
it). But some ops that use EXCP_HLT also set ->halted.
So nowadays halted == 1 means to check ->cpu_has_work() before running
the CPU again (and otherwise wait on io event as you say). And
EXCP_HLT/HALTED are both just ways to return from the cpu exec loop.
One thing I'm not sure of is why you would set EXCP_HLT without setting
halted. In some cases it could be a bug (e.g., avr helper_sleep()), but
there are a few ops that use it after a CPU reset or shutdown which
might be valid. Could call those ones something like EXCP_RESET or
EXCP_REEXEC.
Thanks,
Nick
[PULL 02/39] accel/tcg: Fix the comment for CPUTLBEntryFull, Richard Henderson, 2023/09/15
[PULL 03/39] util: Delete checks for old host definitions, Richard Henderson, 2023/09/15
[PULL 05/39] thunk: Delete checks for old host definitions, Richard Henderson, 2023/09/15
[PULL 04/39] softmmu: Delete checks for old host definitions, Richard Henderson, 2023/09/15
[PULL 07/39] tcg/loongarch64: Lower basic tcg vec ops to LSX, Richard Henderson, 2023/09/15
[PULL 08/39] tcg: pass vece to tcg_target_const_match(), Richard Henderson, 2023/09/15
[PULL 10/39] tcg/loongarch64: Lower add/sub_vec to vadd/vsub, Richard Henderson, 2023/09/15
[PULL 11/39] tcg/loongarch64: Lower vector bitwise operations, Richard Henderson, 2023/09/15
[PULL 09/39] tcg/loongarch64: Lower cmp_vec to vseq/vsle/vslt, Richard Henderson, 2023/09/15