qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PULL 01/39] accel/tcg: mttcg remove false-negative halted assertion


From: Nicholas Piggin
Subject: Re: [PULL 01/39] accel/tcg: mttcg remove false-negative halted assertion
Date: Mon, 18 Sep 2023 20:53:23 +1000

On Mon Sep 18, 2023 at 5:59 PM AEST, Alex Bennée wrote:
>
> "Nicholas Piggin" <npiggin@gmail.com> writes:
>
> > On Sat Sep 16, 2023 at 1:29 PM AEST, Richard Henderson wrote:
> >> From: Nicholas Piggin <npiggin@gmail.com>
> >>
> >> mttcg asserts that an execution ending with EXCP_HALTED must have
> >> cpu->halted. However between the event or instruction that sets
> >> cpu->halted and requests exit and the assertion here, an
> >> asynchronous event could clear cpu->halted.
> >>
> >> This leads to crashes running AIX on ppc/pseries because it uses
> >> H_CEDE/H_PROD hcalls, where H_CEDE sets self->halted = 1 and
> >> H_PROD sets other cpu->halted = 0 and kicks it.
> >>
> >> H_PROD could be turned into an interrupt to wake, but several other
> >> places in ppc, sparc, and semihosting follow what looks like a similar
> >> pattern setting halted = 0 directly. So remove this assertion.
> >>
> >> Reported-by: Ivan Warren <ivan@vmfacility.fr>
> >> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> >> Message-Id: <20230829010658.8252-1-npiggin@gmail.com>
> >> [rth: Keep the case label and adjust the comment.]
> >
> > Hey Richard,
> >
> > Thanks for picking this up.
> >
> > I think EXCP_HLT and EXCP_HALTED are effectively the same, so they could
> > be merged after this.
> >
> > I couldn't quite decipher the intended difference between them, HLT is
> > "hlt instruction reached", but it does tend to go into a mode where it
> > is halted waiting for external event. Is there some useful difference in
> > semantics we should retain (and at least try to find a way to assert)?
>
> I always thought HALTED was where the system was halted (e.g. during a
> shutdown) but I agree its less than clear.

Maybe that was so. I didn't manage to track down the original intention
of them, but now they are not different, HALTED does just wait for event
too. EXCP_HALTED did previously require the operation set ->halted = 1
before calling (the assert only breaks due to concurrent wakeup clearing
it). But some ops that use EXCP_HLT also set ->halted.

So nowadays halted == 1 means to check ->cpu_has_work() before running
the CPU again (and otherwise wait on io event as you say). And
EXCP_HLT/HALTED are both just ways to return from the cpu exec loop.

One thing I'm not sure of is why you would set EXCP_HLT without setting
halted. In some cases it could be a bug (e.g., avr helper_sleep()), but
there are a few ops that use it after a CPU reset or shutdown which
might be valid. Could call those ones something like EXCP_RESET or
EXCP_REEXEC.

Thanks,
Nick



reply via email to

[Prev in Thread] Current Thread [Next in Thread]