qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4] hw/i386/pc: improve physical address space bound check fo


From: Ani Sinha
Subject: Re: [PATCH v4] hw/i386/pc: improve physical address space bound check for 32-bit x86 systems
Date: Fri, 22 Sep 2023 18:35:52 +0530


> On 22-Sep-2023, at 6:26 PM, Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
> 
> On 22/9/23 14:00, Ani Sinha wrote:
>>> On 22-Sep-2023, at 4:12 PM, Philippe Mathieu-Daudé <philmd@linaro.org> 
>>> wrote:
>>> 
>>> On 22/9/23 06:16, Ani Sinha wrote:
>>>> 32-bit x86 systems do not have a reserved memory for hole64. On those 
>>>> 32-bit
>>>> systems without PSE36 or PAE CPU features, hotplugging memory devices are 
>>>> not
>>>> supported by QEMU as QEMU always places hotplugged memory above 4 GiB 
>>>> boundary
>>>> which is beyond the physical address space of the processor. Linux guests 
>>>> also
>>>> does not support memory hotplug on those systems. Please see Linux
>>>> kernel commit b59d02ed08690 ("mm/memory_hotplug: disable the functionality
>>>> for 32b") for more details.
>>>> Therefore, the maximum limit of the guest physical address in the absence 
>>>> of
>>>> additional memory devices effectively coincides with the end of
>>>> "above 4G memory space" region for 32-bit x86 without PAE/PSE36. When users
>>>> configure additional memory devices, after properly accounting for the
>>>> additional device memory region to find the maximum value of the guest
>>>> physical address, the address will be outside the range of the processor's
>>>> physical address space.
>>>> This change adds improvements to take above into consideration.
>>>> For example, previously this was allowed:
>>>> $ ./qemu-system-x86_64 -cpu pentium -m size=10G
>>>> With this change now it is no longer allowed:
>>>> $ ./qemu-system-x86_64 -cpu pentium -m size=10G
>>>> qemu-system-x86_64: Address space limit 0xffffffff < 0x2bfffffff phys-bits 
>>>> too low (32)
>>>> However, the following are allowed since on both cases physical address
>>>> space of the processor is 36 bits:
>>>> $ ./qemu-system-x86_64 -cpu pentium2 -m size=10G
>>>> $ ./qemu-system-x86_64 -cpu pentium,pse36=on -m size=10G
>>>> For 32-bit, without PAE/PSE36, hotplugging additional memory is no longer 
>>>> allowed.
>>>> $ ./qemu-system-i386 -m size=1G,maxmem=3G,slots=2
>>>> qemu-system-i386: Address space limit 0xffffffff < 0x1ffffffff phys-bits 
>>>> too low (32)
>>>> $ ./qemu-system-i386 -machine q35 -m size=1G,maxmem=3G,slots=2
>>>> qemu-system-i386: Address space limit 0xffffffff < 0x1ffffffff phys-bits 
>>>> too low (32)
>>>> A new compatibility flag is introduced to make sure pc_max_used_gpa() keeps
>>>> returning the old value for machines 8.1 and older.
>>>> Therefore, the above is still allowed for older machine types in order to 
>>>> support
>>>> compatibility. Hence, the following still works:
>>>> $ ./qemu-system-i386 -machine pc-i440fx-8.1 -m size=1G,maxmem=3G,slots=2
>>>> $ ./qemu-system-i386 -machine pc-q35-8.1 -m size=1G,maxmem=3G,slots=2
>>>> Further, following is also allowed as with PSE36, the processor has 36-bit
>>>> address space:
>>>> $ ./qemu-system-i386 -cpu 486,pse36=on -m size=1G,maxmem=3G,slots=2
>>>> After calling CPUID with EAX=0x80000001, all AMD64 compliant processors
>>>> have the longmode-capable-bit turned on in the extended feature flags (bit 
>>>> 29)
>>>> in EDX. The absence of CPUID longmode can be used to differentiate between
>>>> 32-bit and 64-bit processors and is the recommended approach. QEMU takes 
>>>> this
>>>> approach elsewhere (for example, please see x86_cpu_realizefn()), With
>>>> this change, pc_max_used_gpa() also uses the same method to detect 32-bit
>>>> processors.
>>>> Unit tests are modified to not run 32-bit x86 tests that use memory 
>>>> hotplug.
>>>> Suggested-by: David Hildenbrand <david@redhat.com>
>>>> Signed-off-by: Ani Sinha <anisinha@redhat.com>
>>>> ---
>>>>  hw/i386/pc.c                   | 31 ++++++++++++++++++++++++++++---
>>>>  hw/i386/pc_piix.c              |  4 ++++
>>>>  hw/i386/pc_q35.c               |  2 ++
>>>>  include/hw/i386/pc.h           |  6 ++++++
>>>>  tests/qtest/bios-tables-test.c | 26 ++++++++++++++++++--------
>>>>  tests/qtest/numa-test.c        |  7 ++++++-
>>>>  6 files changed, 64 insertions(+), 12 deletions(-)
>>> 
>>> 
>>>> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
>>>> index 54838c0c41..2a689cf0bd 100644
>>>> --- a/hw/i386/pc.c
>>>> +++ b/hw/i386/pc.c
>>>> @@ -907,12 +907,37 @@ static uint64_t pc_get_cxl_range_end(PCMachineState 
>>>> *pcms)
>>>>  static hwaddr pc_max_used_gpa(PCMachineState *pcms, uint64_t 
>>>> pci_hole64_size)
>>>>  {
>>>>      X86CPU *cpu = X86_CPU(first_cpu);
>>>> +    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
>>>> +    MachineState *ms = MACHINE(pcms);
>>>> +    uint64_t devmem_start = 0;
>>>> +    ram_addr_t devmem_size = 0;
>>>>  -    /* 32-bit systems don't have hole64 thus return max CPU address */
>>>> -    if (cpu->phys_bits <= 32) {
>>>> -        return ((hwaddr)1 << cpu->phys_bits) - 1;
>>>> +    /*
>>>> +     * 32-bit systems don't have hole64 but they might have a region for
>>>> +     * memory devices. Even if additional hotplugged memory devices might
>>>> +     * not be usable by most guest OSes, we need to still consider them 
>>>> for
>>>> +     * calculating the highest possible GPA so that we can properly report
>>>> +     * if someone configures them on a CPU that cannot possibly address 
>>>> them.
>>>> +     */
>>>> +    if (!(cpu->env.features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM)) {
>>>> +        /* 32-bit systems */
>>>> +        if (!pcmc->broken_32bit_mem_addr_check) {
>>> 
>>> Nitpicking, code is simplified if you invert this condition check.
>> Maybe it reads better, but simplified? .. I am not so sure.
> 
> As:
> 
> -- >8 --
> @@ -907,12 +907,35 @@ static uint64_t pc_get_cxl_range_end(PCMachineState 
> *pcms)
> static hwaddr pc_max_used_gpa(PCMachineState *pcms, uint64_t pci_hole64_size)
> {
>     X86CPU *cpu = X86_CPU(first_cpu);
> +    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> +    MachineState *ms = MACHINE(pcms);
> 
> -    /* 32-bit systems don't have hole64 thus return max CPU address */
> -    if (cpu->phys_bits <= 32) {
> -        return ((hwaddr)1 << cpu->phys_bits) - 1;
> +    /*
> +     * 32-bit systems don't have hole64 but they might have a region for
> +     * memory devices. Even if additional hotplugged memory devices might
> +     * not be usable by most guest OSes, we need to still consider them for
> +     * calculating the highest possible GPA so that we can properly report
> +     * if someone configures them on a CPU that cannot possibly address them.
> +     */
> +    if (!(cpu->env.features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM)) {
> +        /* 32-bit systems */
> +        if (pcmc->broken_32bit_mem_addr_check) {
> +            /* old value for compatibility reasons */
> +            return ((hwaddr)1 << cpu->phys_bits) - 1;
> +        }
> +        if (pcmc->has_reserved_memory && (ms->ram_size < ms->maxram_size)) {
> +            uint64_t devmem_start = 0;
> +            ram_addr_t devmem_size = 0;
> +
> +            pc_get_device_memory_range(pcms, &devmem_start, &devmem_size);
> +            devmem_start += devmem_size;
> +
> +            return devmem_start - 1;
> +        }
> +        return pc_above_4g_end(pcms) - 1;
>     }
> 
> +    /* 64-bit systems */
>     return pc_pci_hole64_start() + pci_hole64_size - 1;
> }
> ---
> 
> But then even simpler (to review):
> 
> -- >8 --
> @@ -907,13 +907,35 @@ static uint64_t pc_get_cxl_range_end(PCMachineState 
> *pcms)
> static hwaddr pc_max_used_gpa(PCMachineState *pcms, uint64_t pci_hole64_size)
> {
>     X86CPU *cpu = X86_CPU(first_cpu);
> +    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> +    MachineState *ms = MACHINE(pcms);
> 
> -    /* 32-bit systems don't have hole64 thus return max CPU address */
> -    if (cpu->phys_bits <= 32) {
> -        return ((hwaddr)1 << cpu->phys_bits) - 1;
> +    if (cpu->env.features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) {
> +        /* 64-bit systems */
> +        return pc_pci_hole64_start() + pci_hole64_size - 1;
>     }
> 
> -    return pc_pci_hole64_start() + pci_hole64_size - 1;
> +    /*
> +     * 32-bit systems don't have hole64 but they might have a region for
> +     * memory devices. Even if additional hotplugged memory devices might
> +     * not be usable by most guest OSes, we need to still consider them for
> +     * calculating the highest possible GPA so that we can properly report
> +     * if someone configures them on a CPU that cannot possibly address them.
> +     */
> +    if (pcmc->broken_32bit_mem_addr_check) {
> +        /* old value for compatibility reasons */
> +        return ((hwaddr)1 << cpu->phys_bits) - 1;
> +    }
> +    if (pcmc->has_reserved_memory && (ms->ram_size < ms->maxram_size)) {
> +        uint64_t devmem_start = 0;
> +        ram_addr_t devmem_size = 0;
> +
> +        pc_get_device_memory_range(pcms, &devmem_start, &devmem_size);
> +        devmem_start += devmem_size;
> +
> +        return devmem_start - 1;
> +    }
> +    return pc_above_4g_end(pcms) - 1;
> }
> ---

Yeah I started reworking it before your email and realized it does make the 
function much cleaner.

> 
> Personally I'd do it in 2 steps, first invert the if statement:
> 
> -- >8 --
> @@ -908,12 +908,13 @@ static hwaddr pc_max_used_gpa(PCMachineState *pcms, 
> uint64_t pci_hole64_size)
> {
>     X86CPU *cpu = X86_CPU(first_cpu);
> 
> -    /* 32-bit systems don't have hole64 thus return max CPU address */
> -    if (cpu->phys_bits <= 32) {
> -        return ((hwaddr)1 << cpu->phys_bits) - 1;
> +    if (cpu->phys_bits > 32) {
> +        /* 64-bit systems */
> +        return pc_pci_hole64_start() + pci_hole64_size - 1;
>     }
> 
> -    return pc_pci_hole64_start() + pci_hole64_size - 1;
> +    /* 32-bit systems don't have hole64 thus return max CPU address */
> +    return ((hwaddr)1 << cpu->phys_bits) - 1;
> }
> ---
> 
> Then your patch as:
> 
> -- >8 --
> @@ -907,14 +907,35 @@ static uint64_t pc_get_cxl_range_end(PCMachineState 
> *pcms)
> static hwaddr pc_max_used_gpa(PCMachineState *pcms, uint64_t pci_hole64_size)
> {
>     X86CPU *cpu = X86_CPU(first_cpu);
> +    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> +    MachineState *ms = MACHINE(pcms);
> 
> -    if (cpu->phys_bits > 32) {
> +    if (cpu->env.features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) {
>         /* 64-bit systems */
>         return pc_pci_hole64_start() + pci_hole64_size - 1;
>     }
> 
> -    /* 32-bit systems don't have hole64 thus return max CPU address */
> -    return ((hwaddr)1 << cpu->phys_bits) - 1;
> +    /*
> +     * 32-bit systems don't have hole64 but they might have a region for
> +     * memory devices. Even if additional hotplugged memory devices might
> +     * not be usable by most guest OSes, we need to still consider them for
> +     * calculating the highest possible GPA so that we can properly report
> +     * if someone configures them on a CPU that cannot possibly address them.
> +     */
> +    if (pcmc->broken_32bit_mem_addr_check) {
> +        /* old value for compatibility reasons */
> +        return ((hwaddr)1 << cpu->phys_bits) - 1;
> +    }
> +    if (pcmc->has_reserved_memory && (ms->ram_size < ms->maxram_size)) {
> +        uint64_t devmem_start = 0;
> +        ram_addr_t devmem_size = 0;
> +
> +        pc_get_device_memory_range(pcms, &devmem_start, &devmem_size);
> +        devmem_start += devmem_size;
> +
> +        return devmem_start - 1;
> +    }
> +    return pc_above_4g_end(pcms) - 1;
> }
> ---
> 
>>> 
>>>> +            if (pcmc->has_reserved_memory &&
>>>> +                (ms->ram_size < ms->maxram_size)) {
>>>> +                pc_get_device_memory_range(pcms, &devmem_start,
>>>> +                                           &devmem_size);
>>>> +                devmem_start += devmem_size;
>>>> +                return devmem_start - 1;
>>>> +            } else {
>>>> +                return pc_above_4g_end(pcms) - 1;
>>>> +            }
>>>> +        } else {
>>>> +            /* old value for compatibility reasons */
>>>> +            return ((hwaddr)1 << cpu->phys_bits) - 1;
> 
> Left shifting by N and removing 1 makes a mask of N bits,
> but I had to do the logical operation mentally to figure it out,
> 
>>> 
>>> Since you change this line, can we convert to
>>> MAKE_64BIT_MASK(0, cpu->phys_bits) ?
> 
> while here this macro says "make a mask" without having to look
> at the definition.

I will leave this part as is for now. 

> 
> Anyhow as I said, "nitpicking...".
> 
>> Doesn’t the existing code reads better? Assuming that the macro does exactly 
>> the same thing, one has to still look up the definition. And
>>  (((~0ULL) >> (64 - (length))) << (shift))
>> Is such a brain twister :-)
>>> 
>>>> +        }
>>>>      }
>>>>  +    /* 64-bit systems */
>>>>      return pc_pci_hole64_start() + pci_hole64_size - 1;
>>>>  }




reply via email to

[Prev in Thread] Current Thread [Next in Thread]