qemu-ppc

Re: [PATCH] tcg/ppc: Fix race in goto_tb implementation


From: Michael Tokarev
Subject: Re: [PATCH] tcg/ppc: Fix race in goto_tb implementation
Date: Mon, 17 Jul 2023 11:56:09 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0

17.07.2023 04:23, Jordan Niethe wrote:
Commit 20b6643324 ("tcg/ppc: Reorg goto_tb implementation") modified
goto_tb to ensure only a single instruction was patched to prevent
incorrect behaviour if a thread was in the middle of multiple
instructions when they were replaced. However, this introduced a race
between loading the jmp target into TCG_REG_TB and patching and
executing the direct branch.

The relevant part of the goto_tb implementation:

     ld TCG_REG_TB, TARGET_ADDR_LOCATION(TCG_REG_TB)
   patch_location:
     mtctr TCG_REG_TB
     bctr

tb_target_set_jmp_target() will replace 'patch_location' with a direct
branch if the target is in range. The direct branch now relies on
TCG_REG_TB being set up correctly by the ld. Prior to commit 20b6643324,
multiple instructions were patched in for the direct branch case; these
instructions would initialise TCG_REG_TB to the same value as the branch
target.
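
The "in range" test and the single-instruction patch described above can be sketched as follows. This is only an illustration, not the QEMU source: the helper names in_range_b_sketch and encode_b_sketch and the PPC_OPCODE_B macro are made up for this example. It assumes the PowerPC I-form branch (primary opcode 18) with a signed 26-bit, word-aligned displacement, which matches the 0x3fffffc mask visible in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name: primary opcode 18 (unconditional branch), AA=0, LK=0. */
#define PPC_OPCODE_B 0x48000000u

/* Stand-in for the range check: the displacement must fit in a signed
 * 26-bit field, i.e. -32 MiB <= diff < 32 MiB. */
static bool in_range_b_sketch(intptr_t diff)
{
    return diff >= -0x2000000 && diff < 0x2000000;
}

/* Build the single instruction word that would replace patch_location.
 * The mask keeps displacement bits 2..25; the low two bits are the
 * AA and LK flags and stay clear. */
static uint32_t encode_b_sketch(intptr_t diff)
{
    assert(in_range_b_sketch(diff));
    return PPC_OPCODE_B | ((uint32_t)diff & 0x3fffffcu);
}
```

Note that the resulting instruction carries only the destination. Nothing in it updates TCG_REG_TB, which is why the direct branch depends on the preceding ld.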

Imagine the following sequence:

1) Thread A is executing the goto_tb sequence and loads the jmp
    target into TCG_REG_TB.

2) Thread B updates the jmp target address and calls
    tb_target_set_jmp_target(). This patches a new direct branch into the
    goto_tb sequence.

3) Thread A executes the newly patched direct branch. TCG_REG_TB
    still contains the old jmp target.

TCG_REG_TB MUST contain the translation block's tc.ptr. Execution will
eventually crash after performing memory accesses generated from a
faulty value in TCG_REG_TB.

This presents as segfaults or illegal instruction exceptions.
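
The interleaving above can be replayed deterministically with a small single-threaded model. This is a sketch under stated assumptions, not QEMU code: the struct and function names are hypothetical, the addresses 0x1000 and 0x2000 are arbitrary, and "consistent" simply means that the modelled TCG_REG_TB equals the address execution ends up at.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one goto_tb slot. */
struct goto_tb_slot {
    uintptr_t target_addr;    /* word read by the ld */
    bool      patched_direct; /* patch_location holds a direct branch */
    uintptr_t direct_target;  /* destination encoded in that branch */
};

struct cpu_model {
    uintptr_t reg_tb; /* models TCG_REG_TB */
    uintptr_t pc;     /* where execution lands */
};

/* Thread A, first instruction: ld TCG_REG_TB, TARGET_ADDR_LOCATION(...) */
static void step_load(struct cpu_model *cpu, const struct goto_tb_slot *s)
{
    cpu->reg_tb = s->target_addr;
}

/* Thread B: update the stored target and, if allowed, patch a direct branch. */
static void retarget(struct goto_tb_slot *s, uintptr_t new_target,
                     bool allow_direct)
{
    s->target_addr = new_target;
    if (allow_direct) {
        s->patched_direct = true;
        s->direct_target = new_target;
    }
}

/* Thread A, at patch_location: either the patched direct branch
 * or the original mtctr/bctr through TCG_REG_TB. */
static void step_branch(struct cpu_model *cpu, const struct goto_tb_slot *s)
{
    cpu->pc = s->patched_direct ? s->direct_target : cpu->reg_tb;
}

/* Returns true if TCG_REG_TB matches the block that was entered. */
static bool run_interleaving(bool patch_between, bool allow_direct)
{
    struct goto_tb_slot slot = { .target_addr = 0x1000 };
    struct cpu_model cpu = { 0, 0 };

    if (!patch_between) {
        retarget(&slot, 0x2000, allow_direct);
    }
    step_load(&cpu, &slot);
    if (patch_between) {
        retarget(&slot, 0x2000, allow_direct);
    }
    step_branch(&cpu, &slot);
    return cpu.reg_tb == cpu.pc;
}
```

In this model, run_interleaving(true, true) reproduces the stale register: execution lands at the new target while the modelled TCG_REG_TB still holds the old one. With allow_direct set to false, the same interleaving stays consistent because the branch always goes through the register.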

Do not revert commit 20b6643324 as it did fix a different race
condition. Instead remove the direct branch optimization and always use
indirect branches.

The direct branch optimization can be re-added later with a race-free
sequence.

Gitlab issue: https://gitlab.com/qemu-project/qemu/-/issues/1726

I confirm this fixes the issue hit by Debian as well: 30 runs in
a row already with this patch and counting, whereas before it failed
almost reliably on the first try, sometimes on the second.

Tested-by: Michael Tokarev <mjt@tls.msk.ru>

(This is the most I can do, as I don't know TCG at all :) )

Thank you very much for the fix!

/mjt

@@ -2565,10 +2564,11 @@ void tb_target_set_jmp_target(const TranslationBlock *tb, int n,
      intptr_t diff = addr - jmp_rx;
      tcg_insn_unit insn;
+    if (USE_REG_TB)
+        return;
+
      if (in_range_b(diff)) {
          insn = B | (diff & 0x3fffffc);




