qemu-ppc

Re: [RFC v2 PATCH] record-replay: support SMP target machine


From: Nicholas Piggin
Subject: Re: [RFC v2 PATCH] record-replay: support SMP target machine
Date: Fri, 25 Aug 2023 18:22:48 +1000

On Tue Aug 22, 2023 at 2:44 PM AEST, Pavel Dovgalyuk wrote:
> On 11.08.2023 04:47, Nicholas Piggin wrote:
> > RR CPU switching is driven by timers and events so it is deterministic
> > like everything else. Record a CPU switch event and use that to drive
> > the CPU switch on replay.
> > 
> > Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> > ---
> > This is still in RFC phase because so far I've only really tested ppc
> > pseries, and only with patches that are not yet upstream (but posted
> > to list).
> > 
> > It works with smp 2, can step, reverse-step, reverse-continue, etc.
> > throughout a Linux boot.
>
> I still didn't have time to test it, but here are some comments.

That's okay, I got a little further, mainly adding vmstate to
migrate it (otherwise we can only use the initial snapshot).

Unless there is more interest, I will focus on getting ppc fixes
upstream first. Let me know if you have more time to look and I can
send you the latest.

[snip]

> > @@ -294,9 +346,9 @@ static void *rr_cpu_thread_fn(void *arg)
> >               qatomic_set_mb(&cpu->exit_request, 0);
> >           }
> >   
> > -        if (all_cpu_threads_idle()) {
> > -            rr_stop_kick_timer();
> > +        qatomic_set(&rr_next_cpu, cpu);
>
> This does not seem to be in the mainline.

Sorry, I meant to squash that in or send it out. The kick timer
init vs. start needed to be moved to make it work.

[snip]

> > -bool replay_exception(void)
> > +bool replay_switch_cpu(void)
> > +{
> > +    if (replay_mode == REPLAY_MODE_RECORD) {
> > +        g_assert(replay_mutex_locked());
> > +        replay_save_instructions();
> > +        replay_put_event(EVENT_SWITCH_CPU);
> > +        return true;
> > +    } else if (replay_mode == REPLAY_MODE_PLAY) {
> > +        bool res = replay_has_switch_cpu();
> > +        if (res) {
> > +            replay_finish_event();
> > +        } else {
> > +            g_assert_not_reached();
> > +        }
> > +        return res;
> > +    }
> > +
> > +    return true;
> > +}
> > +
> > +bool replay_has_switch_cpu(void)
>
> Is this function really needed?

I found it easier to fit into the way CPU scheduling is done
in rr.

I think the main scheduling loop could be refactored a bit to
avoid the need for this (e.g., a helper function that returns the
next CPU, with all the selection code, including rr, in there).
But that became non-trivial, and the code looks a bit delicate.
I might try to tackle that afterwards.
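
To make the helper idea concrete, here is a rough standalone sketch of
what I have in mind. It is not QEMU code: every name in it
(rr_next_cpu_index, struct rr_switch_log, RR_MODE_*) is made up for
illustration, and it logs one boolean per scheduling decision, whereas
the patch puts an EVENT_SWITCH_CPU into the replay event stream. The
point is only that the record/replay choice lives in one place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RR_LOG_MAX 64

/* Hypothetical stand-ins for illustration; none of these are QEMU names. */
enum rr_mode { RR_MODE_NONE, RR_MODE_RECORD, RR_MODE_PLAY };

struct rr_switch_log {
    bool switched[RR_LOG_MAX]; /* one entry per scheduling decision */
    size_t len;                /* entries written while recording */
    size_t pos;                /* next entry to consume on replay */
};

/*
 * Return the index of the CPU to run next (simple round-robin).
 *
 * Record: the live condition (e.g. the kick timer fired) decides, and
 * the decision is appended to the log.
 * Play: the log decides and the live condition is ignored, so the
 * switch happens at the same points as in the recording.
 */
static int rr_next_cpu_index(enum rr_mode mode, struct rr_switch_log *log,
                             int cur, int ncpus, bool want_switch)
{
    bool do_switch = want_switch;

    assert(ncpus > 0 && cur >= 0 && cur < ncpus);

    if (mode == RR_MODE_RECORD) {
        assert(log->len < RR_LOG_MAX);
        log->switched[log->len++] = do_switch;
    } else if (mode == RR_MODE_PLAY) {
        assert(log->pos < log->len);
        do_switch = log->switched[log->pos++];
    }

    return do_switch ? (cur + 1) % ncpus : cur;
}
```

The scheduling loop would then call this once per iteration and never
look at the replay mode itself. Whether the real loop can be reshaped
that way without breaking something is the part I have not worked out.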

Thanks,
Nick


