qemu-devel

Re: Detecting Faulting Instructions From Plugins


From: Alex Bennée
Subject: Re: Detecting Faulting Instructions From Plugins
Date: Fri, 05 Feb 2021 15:03:27 +0000
User-agent: mu4e 1.5.7; emacs 28.0.50

Aaron Lindsay <aaron@os.amperecomputing.com> writes:

> On Feb 05 11:19, Alex Bennée wrote:
>> Aaron Lindsay <aaron@os.amperecomputing.com> writes:
>> 
>> > For the below output, I've got a plugin which registers a callback via
>> > `qemu_plugin_register_vcpu_insn_exec_cb` for each instruction executed.
>> > I've enabled `-d in_asm` and added prints in my instruction execution
>> > callback when it sees the opcode for the `ldr` instruction in question.
>> > I'm running a local source build of the v5.2.0 release.
>> >
>> > Note in the output below the instruction at 0xffffdd2f1d4102c0 is
>> > getting re-translated for some reason, and that two callbacks are made
>> > to my function registered with qemu_plugin_register_vcpu_insn_exec_cb
>> > (the "*** saw encoding"... output) for what should be one instruction
>> > execution.
>> 
>> I wonder, is that load accessing a HW location? I suspect what is
>> happening is we detect a io_readx/io_writex when ->can_do_io is not
>> true. As HW access can only happen at the end of a block (because it may
>> change system state) we trigger a recompile of that instruction and try 
>> again.
>
> I just added additional instrumentation, and
> `qemu_plugin_hwaddr_is_io(hwaddr)` returns true in the mem_cb for this
> access.
>
>> > Do you have any tips for debugging this further or ideas for ensuring the
>> > callback is called only once for this instruction?
>> 
>> If you also plant a memory callback you should only see one load
>> happening for that instruction. Could you verify that?
>
> Yes, I've verified there is only one load happening for the instruction,
> and that the ordering of callbacks for this instruction is 1) first
> insn_exec_cb, 2) second insn_exec_cb, 3) mem_cb.
>
> Is there anything else you'd like me to check to validate your theory?

No, I think that pretty much confirms the theory.

>> > ----------------
>> > IN:
>> > 0xffffdd2f1d410250:  aa1e03e9  mov      x9, x30
>> > 0xffffdd2f1d410254:  d503201f  nop
>> > 0xffffdd2f1d410258:  a9bc7bfd  stp      x29, x30, [sp, #-0x40]!
>> > 0xffffdd2f1d41025c:  910003fd  mov      x29, sp
>> > 0xffffdd2f1d410260:  a90153f3  stp      x19, x20, [sp, #0x10]
>> > 0xffffdd2f1d410264:  b000f2d3  adrp     x19, #0xffffdd2f1f269000
>> > 0xffffdd2f1d410268:  911c4273  add      x19, x19, #0x710
>> > 0xffffdd2f1d41026c:  a9025bf5  stp      x21, x22, [sp, #0x20]
>> > 0xffffdd2f1d410270:  f000cad6  adrp     x22, #0xffffdd2f1ed6b000
>> > 0xffffdd2f1d410274:  aa0003f5  mov      x21, x0
>> > 0xffffdd2f1d410278:  f9409674  ldr      x20, [x19, #0x128]
>> > 0xffffdd2f1d41027c:  913d42d6  add      x22, x22, #0xf50
>> > 0xffffdd2f1d410280:  f9001bf7  str      x23, [sp, #0x30]
>> > 0xffffdd2f1d410284:  91003297  add      x23, x20, #0xc
>> > 0xffffdd2f1d410288:  91004294  add      x20, x20, #0x10
>> > 0xffffdd2f1d41028c:  1400000d  b        #0xffffdd2f1d4102c0
>> >
>> > ----------------
>> > IN:
>> > 0xffffdd2f1d4102c0:  b94002e2  ldr      w2, [x23]
>> > 0xffffdd2f1d4102c4:  12002441  and      w1, w2, #0x3ff
>> > 0xffffdd2f1d4102c8:  710fec3f  cmp      w1, #0x3fb
>> > 0xffffdd2f1d4102cc:  54fffe29  b.ls     #0xffffdd2f1d410290
>> >
>> > *** saw encoding 0xb94002e2 (@ 504107673 instructions)
>> > ----------------
>> > IN:
>> > 0xffffdd2f1d4102c0:  b94002e2  ldr      w2, [x23]
>> >
>> > *** saw encoding 0xb94002e2 (@ 504107674 instructions)
>> > ----------------
>> > IN:
>> > 0xffffdd2f1d4102c4:  12002441  and      w1, w2, #0x3ff
>> > 0xffffdd2f1d4102c8:  710fec3f  cmp      w1, #0x3fb
>> > 0xffffdd2f1d4102cc:  54fffe29  b.ls     #0xffffdd2f1d410290
>> 
>> I think you can work around this in your callback by looking for a
>> double execution but that exposes rather more of the knowledge of what
>> is going on behind the scenes than we intended for the plugin interface.
>> The point is you shouldn't need to know the details of the translator to
>> write your instruments.
>
> Yes, working around it in that way was my initial thought as well. I
> think there may be a few (albeit unlikely) corner cases this wouldn't
> work correctly for - like self-branches. I don't think that's a major
> roadblock for now, but I'd love to help work towards a cleaner solution
> in the long-term.
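
For reference, the double-execution workaround could look roughly like
the sketch below. It is a standalone model, not plugin code: the
`replay_filter` type and both function names are hypothetical, and it
assumes one filter per vCPU, fed only from the exec callbacks of
load/store instructions and from the memory callback. It relies on the
ordering reported above (insn_exec_cb, insn_exec_cb, mem_cb): a second
exec callback at the same address with no memory callback in between is
treated as a replay.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-vCPU state; not part of the QEMU plugin API. */
typedef struct {
    uint64_t pending_vaddr; /* load/store seen by exec cb, access not yet observed */
    bool pending;           /* true until a memory callback arrives */
} replay_filter;

/*
 * Call from the exec callback of a load/store instruction.
 * Returns true if this execution should be counted, false if it looks
 * like a replay of the same instruction after a recompile.
 */
bool replay_filter_on_exec(replay_filter *f, uint64_t vaddr)
{
    if (f->pending && f->pending_vaddr == vaddr) {
        /* Same instruction again and its access never completed. */
        return false;
    }
    f->pending = true;
    f->pending_vaddr = vaddr;
    return true;
}

/* Call from the memory callback: the pending access has completed. */
void replay_filter_on_mem(replay_filter *f)
{
    f->pending = false;
}
```

Fed with the trace above, the first callback for 0xffffdd2f1d4102c0 is
counted, the second is dropped, and the next trip round the loop is
counted again because the memory callback cleared the pending state.
Keeping non-memory instructions out of the filter sidesteps the
self-branch case, but an access that never produces a memory callback
would still confuse it.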

A perhaps lighter-weight mechanism is to detect load/store insns and
install a memory callback for those instructions instead of an
instruction callback. That way you only have one callback and it will
always be one that will execute once. Plugins are certainly allowed to
make decisions based on the guest instructions - hence we give access to
that data at translation time.
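
As a rough sketch of that idea, assuming an AArch64 guest: the A64
top-level decode puts loads and stores in the group where bit 27 is set
and bit 25 is clear, so a translation-time callback could read the
opcode (for example via `qemu_plugin_insn_data`) and register with
`qemu_plugin_register_vcpu_mem_cb` for those instructions and with
`qemu_plugin_register_vcpu_insn_exec_cb` for everything else. The
helper names below are hypothetical and the block is standalone C, not
wired to the plugin API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; not part of the QEMU plugin API. */
enum cb_kind { CB_INSN_EXEC, CB_MEM };

/*
 * A64 top-level decode uses op0 = bits [28:25]; the "loads and stores"
 * group is op0 = x1x0, i.e. bit 27 set and bit 25 clear.
 */
bool a64_is_load_store(uint32_t opcode)
{
    return (opcode & 0x0a000000u) == 0x08000000u;
}

/* Decide at translation time which kind of callback to install. */
enum cb_kind choose_callback(uint32_t opcode)
{
    return a64_is_load_store(opcode) ? CB_MEM : CB_INSN_EXEC;
}
```

With the encodings from the trace, 0xb94002e2 (the `ldr`) would get a
memory callback while 0x12002441 (the `and`) keeps an instruction
callback. The group test is coarse: it also matches prefetch hints,
which may never produce a memory callback, so a real plugin would
probably want a finer decode.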

>
>> My initial thought is that maybe when we install the callbacks we should
>> place them after translation if we know there is a guest load/store
>> happening. However, my concern is that such heuristics might miss other
>> cases - could you see a load-from-HW indirect jump instruction, for
>> example? It also has the potential to get confusing when we add the
>> ability to access register values.
>
> Assuming you're right that TCG is detecting "a io_readx/io_writex when
> ->can_do_io is not true", could we detect this case when it occurs and
> omit the instruction callbacks for the re-translation of the single
> instruction (allow the initial callback to stand instead of trying to
> turn back time, in a way, to prevent it)? Maybe there would have to be some
> bookkeeping in the plugin infrastructure side rather than entirely
> omitting the callbacks when re-translating, in case that translation got
> re-used in a case which didn't hit the same behavior and shouldn't be
> skipped?

They are happening in two separate phases. The translation phase has no
idea what the runtime condition will be. Once we get to runtime it's too
late - and we trigger a new translation phase.

I'll see what Richard thinks. I must admit I thought can_do_io was only
an issue for -icount modes, but I think the real picture is slightly more
confused than that.

>
> I admit I don't understand all the intricacies here, so what I suggest
> may not be reasonable...
>
> -Aaron


-- 
Alex Bennée


