qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 1/2] hw/arm/virt: Improve address assignment for highmem IO r


From: Eric Auger
Subject: Re: [PATCH 1/2] hw/arm/virt: Improve address assignment for highmem IO regions
Date: Wed, 3 Aug 2022 14:52:51 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.11.0

Hi,
On 8/3/22 15:02, Gavin Shan wrote:
> Hi Marc,
>
> On 8/3/22 5:01 PM, Marc Zyngier wrote:
>> On Wed, 03 Aug 2022 04:01:04 +0100,
>> Gavin Shan <gshan@redhat.com> wrote:
>>> On 8/2/22 7:41 PM, Eric Auger wrote:
>>>> On 8/2/22 08:45, Gavin Shan wrote:
>>>>> There are 3 highmem IO regions as below. They can be disabled in
>>>>> two situations: (a) The specific region is disabled by user. (b)
>>>>> The specific region doesn't fit in the PA space. However, the base
>>>>> address and highest_gpa are still updated no matter if the region
>>>>> is enabled or disabled. It's incorrectly incurring waste in the PA
>>>>> space.
>>>> If I am not wrong highmem_redists and highmem_mmio are not user
>>>> selectable
>>>>
>>>> Only highmem ecam depends on machine type & ACPI setup. But I would
>>>> say
>>>> that in server use case it is always set. So is that optimization
>>>> really
>>>> needed?
>>>
>>> There are two other cases you missed.
>>>
>>> - highmem_ecam is enabled after virt-2.12, meaning it stays disabled
>>>    before that.
>>
>> I don't get this. The current behaviour is to disable highmem_ecam if
>> it doesn't fit in the PA space. I can't see anything that enables it
>> if it was disabled the first place.
>>
>
> There are several places or conditions where vms->highmem_ecam can be
> disabled:
>
> - virt_instance_init() where vms->highmem_ecam is inherited from
>   !vmc->no_highmem_ecam. The option is set to true after virt-2.12
>   in virt_machine_2_12_options().
>
> - machvirt_init() where vms->highmem_ecam can be disable if we have
>   32-bits vCPUs and failure on loading firmware.
>
> - Another place is where we're talking about. It's address assignment
>   to fit the PA space.
>
>>>
>>> - The high memory region can be disabled if user is asking large
>>>    (normal) memory space through 'maxmem=' option. When the requested
>>>    memory by 'maxmem=' is large enough, the high memory regions are
>>>    disabled. It means the normal memory has higher priority than those
>>>    high memory regions. This is the case I provided in (b) of the
>>>    commit log.
>>
>> Why is that a problem? It matches the expected behaviour, as the
>> highmem IO region is floating and is pushed up by the memory region.
>>
>
> Eric thought that VIRT_HIGH_GIC_REDIST2 and VIRT_HIGH_PCIE_MMIO regions
> aren't user selectable. I tended to explain why it's not true. 'maxmem='
> can affect the outcome. When 'maxmem=' value is big enough, there will be
> no free area in the PA space to hold those two regions.
>
>>>
>>> In the commit log, I was supposed to say something like below for
>>> (a):
>>>
>>> - The specific high memory region can be disabled through changing
>>>    the code by user or developer. For example, 'vms->highmem_mmio'
>>>    is changed from true to false in virt_instance_init().
>>
>> Huh. By this principle, the user can change anything. Why is it
>> important?
>>
>
> Still like above. I was explaining the possible cases where those
> 3 switches can be turned on/off by users or developers. Our code
> needs to be consistent and comprehensive.
>
>   vms->highmem_redists
>   vms->highmem_ecam
>   vms->mmio
>
>>>
>>>>>
>>>>> Improve address assignment for highmem IO regions to avoid the waste
>>>>> in the PA space by putting the logic into virt_memmap_fits().
>>
>> I guess that this is what I understand the least. What do you mean by
>> "wasted PA space"? Either the regions fit in the PA space, and
>> computing their addresses in relevant, or they fall outside of it and
>> what we stick in memap[index].base is completely irrelevant.
>>
>
> It's possible that we run into the following combination. we should
> have enough PA space to enable VIRT_HIGH_PCIE_MMIO region. However,
> the region is disabled in the original implementation because
> VIRT_HIGH_{GIC_REDIST2, PCIE_ECAM} regions consumed 1GB, which is
> unecessary and waste in the PA space.
each region's base is aligned on its size.
>
>     static MemMapEntry extended_memmap[] = {
>         [VIRT_HIGH_GIC_REDIST2] =   { 0x0, 64 * MiB },
>         [VIRT_HIGH_PCIE_ECAM] =     { 0x0, 256 * MiB },
>         [VIRT_HIGH_PCIE_MMIO] =     { 0x0, 512 * GiB },
so anyway MMIO is at least at 512GB. Having a 1TB IPA space does not
imply any amount of RAM. This depends on the address space.
I    };
>
>     IPA_LIMIT           = (1UL << 40)
>     '-maxmem'           = 511GB              /* Memory starts from 1GB */
>     '-slots'            = 0
>     vms->highmem_rdist2 = false
How can this happen? the only reason for highmem_redists to be reset is
if it does not fit into map_ipa. So if mmio fits, highmem_redists does
too. What do I miss?
>     vms->highmem_ecam   = false
    vms->highmem_mmio   = true
I am still skeptic about the relevance of such optim. Isn't it sensible
to expect the host to support large IPA if you want to use such amount
of memory.
I think we should rather better document the memory map from a user
point of view.
Besides if you change the memory map, you need to introduce yet another
option to avoid breaking migration, no?

Eric
>
>>>>>
>>>>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>>>>> ---
>>>>>    hw/arm/virt.c | 54
>>>>> +++++++++++++++++++++++++++++----------------------
>>>>>    1 file changed, 31 insertions(+), 23 deletions(-)
>>>>>
>>>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>>>> index 9633f822f3..bc0cd218f9 100644
>>>>> --- a/hw/arm/virt.c
>>>>> +++ b/hw/arm/virt.c
>>>>> @@ -1688,6 +1688,34 @@ static uint64_t
>>>>> virt_cpu_mp_affinity(VirtMachineState *vms, int idx)
>>>>>        return arm_cpu_mp_affinity(idx, clustersz);
>>>>>    }
>>>>>    +static void virt_memmap_fits(VirtMachineState *vms, int index,
>>>>> +                             bool *enabled, hwaddr *base, int
>>>>> pa_bits)
>>>>> +{
>>>>> +    hwaddr size = extended_memmap[index].size;
>>>>> +
>>>>> +    /* The region will be disabled if its size isn't given */
>>>>> +    if (!*enabled || !size) {
>>>> In which case do you have !size?
>>>
>>> Yeah, we don't have !size and the condition should be removed.
>>>
>>>>> +        *enabled = false;
>>>>> +        vms->memmap[index].base = 0;
>>>>> +        vms->memmap[index].size = 0;
>>>> It looks dangerous to me to reset the region's base and size like
>>>> that.
>>>> for instance fdt_add_gic_node() will add dummy data in the dt.
>>>
>>> I would guess you missed that the high memory regions won't be exported
>>> through device-tree if they have been disabled. We have a check there,
>>> which is "if (nb_redist_regions == 1)"
>>>
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    /*
>>>>> +     * Check if the memory region fits in the PA space. The
>>>>> memory map
>>>>> +     * and highest_gpa are updated if it fits. Otherwise, it's
>>>>> disabled.
>>>>> +     */
>>>>> +    *enabled = (ROUND_UP(*base, size) + size <= BIT_ULL(pa_bits));
>>>> using a 'fits' local variable would make the code more obvious I think
>>>
>>> Lets confirm if you're suggesting something like below?
>>>
>>>          bool fits;
>>>
>>>          fits = (ROUND_UP(*base, size) + size <= BIT_ULL(pa_bits));
>>>
>>>          if (fits) {
>>>             /* update *base, memory mapping, highest_gpa */
>>>          } else {
>>>             *enabled = false;
>>>          }
>>>
>>> I guess we can simply do
>>>
>>>          if (ROUND_UP(*base, size) + size <= BIT_ULL(pa_bits)) {
>>>             /* update *base, memory mapping, highest_gpa */
>>>          } else {
>>>             *enabled = false;
>>>          }
>>>
>>> Please let me know which one looks best to you.
>>
>> Why should the 'enabled' flag be updated by this function, instead of
>> returning the value and keeping it as an assignment in the caller
>> function? It is purely stylistic though.
>>
>
> The idea to put the logic, address assignment for those 3 high memory
> regions or updating the flags (vms->high_redist2/ecam/mmio), to the
> newly introduced function, to make virt_set_memmap() a bit simplified.
> Eric tends to agree the changes with minor adjustments. So I'm going
> to keep it as of being, which doesn't mean the stylistic thought is
> a bad one :)
>
> Lastly, I need to rewrite the comit log completely because it seems
> causing confusions to Eric and you.
>
> Thanks,
> Gavin
>




reply via email to

[Prev in Thread] Current Thread [Next in Thread]