qemu-riscv

Re: [PATCH] target/riscv: reduce overhead of MSTATUS_SUM change


From: Wu, Fei
Subject: Re: [PATCH] target/riscv: reduce overhead of MSTATUS_SUM change
Date: Wed, 22 Mar 2023 10:47:29 +0800

On 3/22/2023 9:58 AM, LIU Zhiwei wrote:
> 
> On 2023/3/22 0:10, Richard Henderson wrote:
>> On 3/20/23 23:37, fei2.wu@intel.com wrote:
>>> From: Fei Wu <fei2.wu@intel.com>
>>>
>>> Kernel needs to access user mode memory e.g. during syscalls, the window
>>> is usually opened up for a very limited time through MSTATUS.SUM, the
>>> overhead is too much if tlb_flush() gets called for every SUM change.
>>> This patch saves addresses accessed when SUM=1, and flushes only these
>>> pages when SUM changes to 0. If the buffer is not large enough to save
>>> all the pages during SUM=1, it will fall back to tlb_flush when
>>> necessary.
>>>
>>> The buffer size is set to 4 since in this MSTATUS.SUM open-up window,
>>> most of the time kernel accesses 1 or 2 pages, it's very rare to see
>>> more than 4 pages accessed.
>>>
>>> It's not necessary to save/restore this newly added state, as
>>> tlb_flush() is always called after restore.
>>>
>>> Result of 'pipe 10' from unixbench boosts from 223656 to 1327407. Many
>>> other syscalls benefit a lot from this one too.
>>
>> This is not the correct approach.
>>
>> You should be making use of different softmmu indexes, similar to how
>> ARM uses a separate index for PAN (privileged access never) mode.  If
>> I read the manual properly, PAN == !SUM.
>>
>> When you do this, you need no additional flushing.
> 
> Hi Fei,
> 
> Let's follow Richard's advice.
> 

Yes, I'm thinking about how to do it, and thanks to Richard for the advice.

My questions are:
* If we ensure that this separate index (S+SUM) has no TLB entries
overlapping with the S-mode index (ignoring M-mode for now), then during
SUM=1 we have to look into both the (S+SUM) and S indexes for kernel
address translation, which is not desirable.

* If all TLB operations go to (S+SUM) during SUM=1, then (S+SUM) could
contain duplicates of kernel-address TLB entries that already exist in
the S index; the duplication means extra TLB lookups and fills. Also, if
we want to flush the TLB entry for a specific address addr0, we have to
flush it in both indexes.
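
To make the second case concrete, here is a toy Python model. It is only
a sketch and not QEMU code: the class ToyTLB, the index names, the helper
mmu_index() and the two addresses are all hypothetical, and an identity
mapping stands in for a real page walk. It assumes each index keeps its
own independent table, so a kernel page touched under both SUM=0 and
SUM=1 is filled twice, while changing SUM itself needs no flush.

```python
# Toy model only: not QEMU code. All names here are hypothetical.
MMU_S = "S"            # S-mode with SUM=0
MMU_S_SUM = "S+SUM"    # S-mode with SUM=1
PAGE_MASK = ~0xFFF     # 4 KiB pages


class ToyTLB:
    def __init__(self):
        # One independent page -> translation table per index.
        self.tables = {MMU_S: {}, MMU_S_SUM: {}}
        self.fills = 0

    def access(self, idx, vaddr):
        page = vaddr & PAGE_MASK
        table = self.tables[idx]
        if page not in table:
            # Miss: an identity mapping stands in for a page walk.
            table[page] = page
            self.fills += 1
        return table[page]

    def flush_page(self, vaddr):
        # A per-address flush has to visit every index.
        page = vaddr & PAGE_MASK
        for table in self.tables.values():
            table.pop(page, None)


def mmu_index(sum_bit):
    # Selecting the index from SUM replaces the flush on a SUM change.
    return MMU_S_SUM if sum_bit else MMU_S


tlb = ToyTLB()
kernel, user = 0xFFFFFFC000001000, 0x0000000000402000

tlb.access(mmu_index(0), kernel)  # fill in S
tlb.access(mmu_index(1), kernel)  # same page filled again in S+SUM
tlb.access(mmu_index(1), user)    # user page, touched only with SUM=1
tlb.access(mmu_index(0), kernel)  # SUM back to 0: hit in S, no flush
fills_before_flush = tlb.fills

tlb.flush_page(kernel)            # removes the entry from both indexes
```

In this model the cost of the duplication is one extra fill per kernel
page touched while SUM=1, and a per-address flush walks both tables.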

I will take a look at how arm handles this.

Thanks,
Fei.

> Zhiwei
> 
>>
>>
>> r~



