qemu-riscv

Re: [PATCH v3 08/14] RISC-V: Adding T-Head MemPair extension


From: LIU Zhiwei
Subject: Re: [PATCH v3 08/14] RISC-V: Adding T-Head MemPair extension
Date: Tue, 31 Jan 2023 10:34:07 +0800


On 2023/1/31 3:03, Richard Henderson wrote:
On 1/29/23 22:41, LIU Zhiwei wrote:

On 2023/1/30 13:43, Richard Henderson wrote:
On 1/29/23 16:03, LIU Zhiwei wrote:
Thanks. It's a bug. We should load from all memory addresses into local TCG temps first.

Do you think we should probe all the memory addresses for the store pair instructions? If so, can we avoid the use of a helper function?

Depends on what the hardware does.  Even with a trap in the middle the stores are restartable, since no register state changes.
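To illustrate the restartability point for stores, here is a minimal sketch in C of a toy machine. It is not QEMU or T-Head code: `Machine`, `store_word`, `store_pair`, and the `fault_at` field are hypothetical names, and a trap is modelled as a `false` return value rather than a real exception.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MEM_WORDS 8

/* Toy machine; every name here is hypothetical, not a QEMU or T-Head API. */
typedef struct {
    uint64_t reg[4];          /* general-purpose registers */
    uint64_t mem[MEM_WORDS];  /* word-addressed memory */
    int fault_at;             /* word index that traps, or -1 for none */
} Machine;

/* Store one word; returns false to model a trap on this access. */
static bool store_word(Machine *m, uint64_t addr, uint64_t val)
{
    assert(addr < MEM_WORDS);
    if ((int)addr == m->fault_at) {
        return false;
    }
    m->mem[addr] = val;
    return true;
}

/*
 * Store pair: mem[reg[rs1]] = reg[rs2a]; mem[reg[rs1] + 1] = reg[rs2b].
 * No register is written, so a trap between the two stores leaves every
 * input intact and the whole instruction can simply be executed again.
 */
static bool store_pair(Machine *m, int rs2a, int rs2b, int rs1)
{
    uint64_t base = m->reg[rs1];

    return store_word(m, base, m->reg[rs2a])
        && store_word(m, base + 1, m->reg[rs2b]);
}
```

If the second store traps, the first store has already reached memory, but re-executing `store_pair` after the fault is resolved writes the same values to the same addresses.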

I referred to the specification of LDP and STP on AArch64. The specification allows

"any access performed before the exception was taken is repeated".

In detail,

"If, according to these rules, an instruction is executed as a sequence of accesses, exceptions, including interrupts, can be taken during that sequence, regardless of the memory type being accessed. If any of these exceptions are returned from using their preferred return address, the instruction that generated the sequence of accesses is re-executed, and so any access performed before the exception was taken is repeated. See also Taking an interrupt during a multi-access load or store on page D1-4664."

However, I see that QEMU implements LDP and STP differently. LDP only writes the first register once it has ensured that the second access does not trap.

So I have two questions here.

1) One about the QEMU implementation of LDP. Can we implement LDP as two direct loads to CPU registers instead of local TCG temps?

For the T-Head specification, where rd1 != rs1 (and you enforce it), then yes, I suppose you could load directly to the cpu registers, because on restart rs1 would be unmodified.

For AArch64, which you quote above, there is no constraint that the destinations do not overlap the address register, so we must implement "LDP r0, r1, [r0]" as a load into temps.
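The difference between the two cases can be sketched with a toy model in C. This is an illustration only, not the QEMU translator code: `Cpu`, `load_word`, `load_pair_direct`, and `load_pair_temps` are hypothetical names, and a trap is modelled as a `false` return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPU_MEM_WORDS 8

/* Toy CPU; every name here is hypothetical, not a QEMU API. */
typedef struct {
    uint64_t reg[4];              /* general-purpose registers */
    uint64_t mem[CPU_MEM_WORDS];  /* word-addressed memory */
    int fault_at;                 /* word index that traps, or -1 for none */
} Cpu;

/* Load one word into *out; returns false to model a trap. */
static bool load_word(const Cpu *c, uint64_t addr, uint64_t *out)
{
    if (addr >= CPU_MEM_WORDS || (int)addr == c->fault_at) {
        return false;
    }
    *out = c->mem[addr];
    return true;
}

/* Writes each destination register as soon as its load completes. */
static bool load_pair_direct(Cpu *c, int rd1, int rd2, int rs1)
{
    uint64_t base = c->reg[rs1];

    if (!load_word(c, base, &c->reg[rd1])) {
        return false;
    }
    /* If rd1 == rs1, the base register has already been overwritten here. */
    return load_word(c, base + 1, &c->reg[rd2]);
}

/* Loads both words into temporaries and commits only if neither trapped. */
static bool load_pair_temps(Cpu *c, int rd1, int rd2, int rs1)
{
    uint64_t base = c->reg[rs1];
    uint64_t t1, t2;

    if (!load_word(c, base, &t1) || !load_word(c, base + 1, &t2)) {
        return false;  /* no register state has changed */
    }
    c->reg[rd1] = t1;
    c->reg[rd2] = t2;
    return true;
}
```

With rd1 == rs1, as in "LDP r0, r1, [r0]", a trap on the second access leaves `load_pair_direct` with a clobbered base register, so a restart would compute a different address. `load_pair_temps` keeps the base intact until both accesses have succeeded. When the encoding guarantees that no destination overlaps rs1, the direct variant restarts correctly as well.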

Got it. Thanks.

2) One about the comment. Why do register state changes make the instruction non-restartable? Do you mean that if the first register changes, it may affect the address calculation after the trap?

Yes, that's what I mean about non-restartable -- if any of the input registers are changed before the trap is recognized.


Thanks for the clarification.

I once thought the reason for non-restartability was the side effects of repeated execution, which may cause a watchpoint to match twice or an MMIO device to be accessed twice.

Yes.  Consider what happens when the insn is encoded with .long. Does the hardware trap an illegal instruction?  Is the behavior simply unspecified?  The manual could be improved to specify, akin to the Arm terms: UNDEFINED, CONSTRAINED UNPREDICTABLE, IMPLEMENTATION DEFINED, etc.


Thanks, I will fix the manual.

The manual has been fixed by Christopher.  Thanks.

Best Regards,
Zhiwei


Excellent, thanks.


r~


