qemu-devel


From: Liren Wei
Subject: Re: [QUESTION] tcg: Is concurrent storing and code translation of the same code page considered as racing in MTTCG?
Date: Tue, 2 Feb 2021 00:59:08 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.7.0

On 2/1/21 7:01 AM, Richard Henderson wrote:
> Yes, this is a bug, because we are trying to support e.g. x86 which does not
> require an icache flush.

That is too bad :(

I know very little about modern hardware; it's really hard to imagine what
the CPU does to maintain coherence when this kind of cross-modifying
scenario happens.

> I think the page lock, the TLB_NOTDIRTY setting, and a possible sync on the
> setting, needs to happen before the bytes are read during translation.
> Otherwise we don't catch the case above, nor do we catch
>
>     CPU1                  CPU2
>     ------------------    --------------------------
>     TLB check -> fast
>                           tb_gen_code() -> all of it
>       write to ram
>
> Also because of x86 (and other architectures in which a single instruction can
> span a page boundary), I think this lock+set+sync sequence needs to happen on
> demand in something called from the function set defined in
> include/exec/translator.h
>
> That also means that any target/cpu/ which has not been converted to use that
> interface remains broken, and should be converted or deprecated.

I could not figure out what you mean by lock+set+sync. In particular:
  - What is the use of the page lock here? (Is this the lock of PageDesc?)
  - Does the "possible sync" mean some kind of wait, so that TLB_NOTDIRTY is
    guaranteed to catch any further "write to ram"?

> Are you planning to work on this?

No, sorry about that. I don't consider myself qualified enough for this job,
nor do I have enough time for it. But I did consider the following:

Since "TLB check" and "fast path write to ram" are separate steps, it seems
to me that CPU1 can always (in the extreme case) enter the fast path before
CPU2 starts doing translation, and then write to already-translated code
of CPU2 without informing it.
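
To make the interleaving concrete, here is a small single-threaded toy model
that replays the three steps in the problematic order. It is only a sketch:
struct toy_page and every toy_* function are made-up names for illustration
and are not QEMU code; the "notdirty" flag merely stands in for TLB_NOTDIRTY.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { TOY_PAGE_BYTES = 16 };

/* Hypothetical model of one guest code page (not a QEMU structure). */
struct toy_page {
    uint8_t ram[TOY_PAGE_BYTES];     /* guest RAM contents */
    uint8_t tb_copy[TOY_PAGE_BYTES]; /* bytes the translator consumed */
    bool tb_valid;                   /* a translation exists for this page */
    bool notdirty;                   /* stands in for TLB_NOTDIRTY */
};

/* CPU1, step 1: the TLB check. True means the store may use the fast path. */
bool toy_tlb_check_fast(const struct toy_page *p)
{
    return !p->notdirty;
}

/* CPU2: translate the whole page, then ask for future stores to be trapped. */
void toy_translate(struct toy_page *p)
{
    memcpy(p->tb_copy, p->ram, TOY_PAGE_BYTES);
    p->tb_valid = true;
    p->notdirty = true;
}

/* CPU1, step 2: the store itself, acting on the decision made in step 1. */
void toy_store(struct toy_page *p, bool fast, size_t off, uint8_t val)
{
    assert(off < TOY_PAGE_BYTES);
    if (!fast) {
        p->tb_valid = false; /* the slow path invalidates the translation */
    }
    p->ram[off] = val;
}

/* True if a translation is still considered valid but no longer matches RAM. */
bool toy_tb_is_stale(const struct toy_page *p)
{
    return p->tb_valid && memcmp(p->tb_copy, p->ram, TOY_PAGE_BYTES) != 0;
}
```

Calling toy_tlb_check_fast(), then toy_translate(), then toy_store() with the
earlier result leaves a translation that is stale but still marked valid.
Doing the check after toy_translate() takes the slow path instead, and the
translation is invalidated.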

Therefore maybe we can mark the RAM backing page in QEMU's page table as
non-writable at an early stage in tb_gen_code(), using the facilities of the
underlying OS, register a signal handler to intercept the first "write to ram"
that happens, restore the page to be writable, and eventually inform the
translating thread so it can do something about it (e.g. queue_work_on_cpu()
and cpu_exit() the translating vCPU, so that it has a chance to invalidate the
TB after possibly running that TB several times).
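
The OS-level part of this idea can be sketched on a POSIX host roughly as
follows. This is a minimal illustration, not QEMU code: the guard_* names are
hypothetical, a single anonymous page stands in for guest RAM, and the step
that would notify the translating vCPU is reduced to setting a flag. It also
assumes that calling mprotect() from a SIGSEGV handler works on the host,
which POSIX does not formally guarantee (Linux permits it in practice).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical state: the one page being watched and a "was written" flag. */
static uint8_t *guard_base;
static size_t guard_len;
static volatile sig_atomic_t guard_hit;

/* On a fault inside the watched page: record it, make the page writable
 * again, and return so the faulting store is restarted and succeeds. */
static void guard_on_segv(int sig, siginfo_t *si, void *uctx)
{
    uintptr_t addr = (uintptr_t)si->si_addr;
    uintptr_t base = (uintptr_t)guard_base;
    (void)sig;
    (void)uctx;

    if (guard_base != NULL && addr >= base && addr < base + guard_len) {
        guard_hit = 1; /* a real design would notify the translating vCPU */
        mprotect(guard_base, guard_len, PROT_READ | PROT_WRITE);
        return;
    }
    /* Not our page: fall back to the default action on the retried fault. */
    signal(SIGSEGV, SIG_DFL);
}

/* Install the SIGSEGV handler. Returns 0 on success. */
int guard_install(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = guard_on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Allocate one writable page standing in for guest RAM (NULL on failure). */
uint8_t *guard_alloc(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : (uint8_t *)p;
}

/* Write-protect the page, as would be done before its bytes are read. */
int guard_arm(uint8_t *page)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    assert(((uintptr_t)page % len) == 0); /* mprotect needs page alignment */
    guard_base = page;
    guard_len = len;
    guard_hit = 0;
    return mprotect(page, len, PROT_READ);
}

/* Nonzero once the first write to the armed page has been intercepted. */
int guard_was_written(void)
{
    return guard_hit != 0;
}
```

A caller would run guard_install() once, guard_arm() the page before reading
its bytes for translation, and later check guard_was_written(). Reads of the
armed page do not fault; only the first store does, after which the page is
writable again and further stores go unnoticed until it is re-armed.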

But all of this sounds very intrusive to the existing code base, and I'm not
sure whether it makes sense...

Thanks
Liren Wei




