qemu-riscv

Re: [PATCH v3 08/14] RISC-V: Adding T-Head MemPair extension


From: Richard Henderson
Subject: Re: [PATCH v3 08/14] RISC-V: Adding T-Head MemPair extension
Date: Mon, 30 Jan 2023 09:03:32 -1000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2

On 1/29/23 22:41, LIU Zhiwei wrote:

On 2023/1/30 13:43, Richard Henderson wrote:
On 1/29/23 16:03, LIU Zhiwei wrote:
Thanks. It's a bug. We should load all memory addresses to local TCG temps first.

Do you think we should probe all the memory addresses for the store pair instructions? If so, can we avoid the use of a helper function?

Depends on what the hardware does.  Even with a trap in the middle the stores are restartable, since no register state changes.

I refer to the specification of LDP and STP on AARCH64. The specification allows

"any access performed before the exception was taken is repeated".

In detail:

"If, according to these rules, an instruction is executed as a sequence of accesses, exceptions, including interrupts, can be taken during that sequence, regardless of the memory type being accessed. If any of these exceptions are returned from using their preferred return address, the instruction that generated the sequence of accesses is re-executed, and so any access performed before the exception was taken is repeated. See also Taking an interrupt during a multi-access load or store on page D1-4664."

However, I see that QEMU implements LDP and STP differently. LDP writes the first register only once it has ensured that the second access will not trap.

So I have two questions here.

1) One about the QEMU implementation of LDP. Can we implement LDP as two loads directly into the CPU registers instead of into local TCG temps?

For the T-Head specification, where rd1 != rs1 (and you enforce it), yes, I suppose you could load directly to the cpu registers, because on restart rs1 would be unmodified.

For AArch64, which you quote above, there is no constraint that the destinations do not overlap the address register, so we must implement "LDP r0, r1, [r0]" as a load into temps.
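
To make the restart problem concrete, here is a minimal sketch in plain C rather than TCG. The toy `Machine`, the faulting word index, and the function names are all hypothetical and exist only for illustration; this is not how QEMU's translator is written. It contrasts a load pair that writes each destination as soon as its load completes with one that stages both values in temporaries and commits them only after both accesses succeed.

```c
#include <stdbool.h>
#include <stdint.h>

#define MEM_WORDS 8

/* Toy machine (hypothetical): 4 registers, a small word-indexed memory,
 * and one word index that traps when accessed (-1 means no trap). */
typedef struct {
    uint64_t reg[4];
    uint64_t mem[MEM_WORDS];
    int fault_word;
} Machine;

/* Returns false to model a trap on this access. */
static bool load_word(const Machine *m, uint64_t idx, uint64_t *out)
{
    if (idx >= MEM_WORDS || (int)idx == m->fault_word) {
        return false;
    }
    *out = m->mem[idx];
    return true;
}

/* Direct variant: rd1 is written before the second access is attempted.
 * If rd1 == rs and the second access traps, rs is already clobbered,
 * so re-executing the instruction computes the wrong address. */
static bool ldp_direct(Machine *m, int rd1, int rd2, int rs)
{
    uint64_t addr = m->reg[rs];
    uint64_t v;

    if (!load_word(m, addr, &v)) {
        return false;
    }
    m->reg[rd1] = v;
    if (!load_word(m, addr + 1, &v)) {
        return false;
    }
    m->reg[rd2] = v;
    return true;
}

/* Staged variant: both values go to temporaries first; registers are
 * written only once no further trap is possible. */
static bool ldp_temps(Machine *m, int rd1, int rd2, int rs)
{
    uint64_t addr = m->reg[rs];
    uint64_t t1, t2;

    if (!load_word(m, addr, &t1) || !load_word(m, addr + 1, &t2)) {
        return false;   /* no register state has changed */
    }
    m->reg[rd1] = t1;
    m->reg[rd2] = t2;
    return true;
}
```

With `rd1 == rs` and a trap on the second word, `ldp_direct` leaves the base register overwritten, while `ldp_temps` leaves it intact and can simply be re-executed once the fault is resolved. When the encoding guarantees that no destination overlaps the base register, the direct variant is restartable too, because re-execution just repeats the first load.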


2) One about the comment. Why do register state changes make the instruction non-restartable? Do you mean that if the first register changes, it may influence the address calculation after the trap?

Yes, that's what I mean about non-restartable -- if any of the input registers are changed before the trap is recognized.


Yes.  Consider what happens when the insn is encoded with .long. Does the hardware trap an illegal instruction?  Is the behavior simply unspecified?  The manual could be improved to specify, akin to the Arm terms: UNDEFINED, CONSTRAINED UNPREDICTABLE, IMPLEMENTATION DEFINED, etc.
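
As an illustration of one possible answer, an emulator could reject an overlapping encoding at decode time and raise an illegal-instruction exception. The sketch below assumes that policy, and assumes that all three register operands must be distinct; the thread leaves the real hardware behavior open, and the function name is hypothetical.

```c
#include <stdbool.h>

/* Hypothetical decode-time check for a load-pair encoding.
 * Returns false when the operands overlap; a caller would then treat
 * the encoding as an illegal instruction (an assumed policy, not
 * something the manual is known to specify). */
static bool mempair_load_operands_valid(int rd1, int rd2, int rs1)
{
    return rd1 != rs1 && rd2 != rs1 && rd1 != rd2;
}
```

A hand-assembled `.long` encoding with `rd1 == rs1` would fail this check, so the emulator never reaches the state where the base register has been overwritten before a trap.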


Thanks, I will fix the manual.

Excellent, thanks.


r~



