qemu-riscv
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: [RFC v1 1/3] hw/acpi/aml-build: Add cache structure table creation f


From: JeeHeng Sia
Subject: RE: [RFC v1 1/3] hw/acpi/aml-build: Add cache structure table creation for PPTT table
Date: Tue, 30 Jan 2024 05:00:57 +0000


> -----Original Message-----
> From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> Sent: Monday, January 29, 2024 7:03 PM
> To: JeeHeng Sia <jeeheng.sia@starfivetech.com>
> Cc: qemu-devel@nongnu.org; qemu-arm@nongnu.org; qemu-riscv@nongnu.org; 
> mst@redhat.com; imammedo@redhat.com;
> anisinha@redhat.com; shannon.zhaosl@gmail.com; peter.maydell@linaro.org; 
> sunilvl@ventanamicro.com; palmer@dabbelt.com;
> alistair.francis@wdc.com; bin.meng@windriver.com; liwei1518@gmail.com; 
> dbarboza@ventanamicro.com;
> zhiwei_liu@linux.alibaba.com
> Subject: Re: [RFC v1 1/3] hw/acpi/aml-build: Add cache structure table 
> creation for PPTT table
> 
> On Mon, 29 Jan 2024 00:14:21 -0800
> Sia Jee Heng <jeeheng.sia@starfivetech.com> wrote:
> 
> > Adds cache structure table generation for the Processor Properties
> > Topology Table (PPTT) to describe cache hierarchy information for
> > ACPI guests.
> >
> > A 3-level cache topology is employed here, referring to the type 1 cache
> > structure according to ACPI spec v6.3. The L1 cache and L2 cache are
> > private resources for the core, while the L3 cache is the private
> > resource for the cluster.
> >
> > In the absence of cluster values in the QEMU command, a 2-layer cache is
> > expected. The default cache value should be passed in from the
> > architecture code.
> >
> > Examples:
> > 3-layer: -smp 4,sockets=1,clusters=2,cores=2,threads=1
> > 2-layer: -smp 4,sockets=1,cores=2,threads=2
> >
> > Signed-off-by: Sia Jee Heng <jeeheng.sia@starfivetech.com>
> 
> Hi,
> 
> I'm not keen on the topology assumptions this is making.
> If were to use this on our Kunpeng 920 for guests then the description would
> be wrong as we only share the l3 tags at the cluster level, the
> L3 is die level (NUMA node). So for the physical machine we present
> a cluster with no associated caches.  For other platforms this would be
> even further from the truth.
Should you consider a file like kunpeng920.c and then pass the necessary
value to the build_pptt() function?
> 
> If we are presenting caches in PPTT (which I do want to see) then
> we need additional controls to specify the levels at which the
> appropriate caches are found.
I understood and I'm wonder if adding default value meet your needs?
> 
> There have been various proposals for how to do that description:
> https://lore.kernel.org/qemu-devel/20230808115713.2613-2-Jonathan.Cameron@huawei.com/
> was my brief go at this (and had PPTT cache descriptions).
I can spend time to try out your patches, but it will be good for a
short command. Btw, it seems you stop for many months, do you
plan for a v2 or I will continue by update with your v2?
> 
> Maybe it's acceptable to have some defaults.
I would suggest to have some default value.
> 
> A few other review comments inline.
> 
> Give an example of the disassembled PPTT so we can see what is being
> built.  Need to clear if you are sharing descriptions across multiple
> instances of a given cache (which is allowed if no cache IDs).
> Looks like you do separate entries which is good because that's needed
> in latest definition (but wasn't in 6.3 and people built systems that
> didn't do separate entries).
Sure, here is the example output with clusters=2,core=2,thread=1
[000h 0000 004h]                   Signature : "PPTT"    [Processor Properties 
Topology Table]
[004h 0004 004h]                Table Length : 00000208
[008h 0008 001h]                    Revision : 02
[009h 0009 001h]                    Checksum : 88
[00Ah 0010 006h]                      Oem ID : "BOCHS "
[010h 0016 008h]                Oem Table ID : "BXPC    "
[018h 0024 004h]                Oem Revision : 00000001
[01Ch 0028 004h]             Asl Compiler ID : "BXPC"
[020h 0032 004h]       Asl Compiler Revision : 00000001


[024h 0036 001h]               Subtable Type : 00 [Processor Hierarchy Node]
[025h 0037 001h]                      Length : 14
[026h 0038 002h]                    Reserved : 0000
[028h 0040 004h]       Flags (decoded below) : 00000001
                            Physical package : 1
                     ACPI Processor ID valid : 0
                       Processor is a thread : 0
                              Node is a leaf : 0
                    Identical Implementation : 0
[02Ch 0044 004h]                      Parent : 00000000
[030h 0048 004h]           ACPI Processor ID : 00000000
[034h 0052 004h]     Private Resource Number : 00000000
[038h 0056 001h]               Subtable Type : 01 [Cache Type]
[039h 0057 001h]                      Length : 18
[03Ah 0058 002h]                    Reserved : 0000
[03Ch 0060 004h]       Flags (decoded below) : 0000007F
                                  Size valid : 1
                        Number of Sets valid : 1
                         Associativity valid : 1
                       Allocation Type valid : 1
                            Cache Type valid : 1
                          Write Policy valid : 1
                             Line Size valid : 1
                              Cache ID valid : 0
[040h 0064 004h]         Next Level of Cache : 00000000
[044h 0068 004h]                        Size : 00400000
[048h 0072 004h]              Number of Sets : 00002000
[04Ch 0076 001h]               Associativity : 08
[04Dh 0077 001h]                  Attributes : 0A
                             Allocation Type : 2
                                  Cache Type : 2
                                Write Policy : 0
[04Eh 0078 002h]                   Line Size : 0040

[050h 0080 001h]               Subtable Type : 00 [Processor Hierarchy Node]
[051h 0081 001h]                      Length : 18
[052h 0082 002h]                    Reserved : 0000
[054h 0084 004h]       Flags (decoded below) : 00000000
                            Physical package : 0
                     ACPI Processor ID valid : 0
                       Processor is a thread : 0
                              Node is a leaf : 0
                    Identical Implementation : 0
[058h 0088 004h]                      Parent : 00000024
[05Ch 0092 004h]           ACPI Processor ID : 00000000
[060h 0096 004h]     Private Resource Number : 00000001
[064h 0100 004h]            Private Resource : 00000038
[068h 0104 001h]               Subtable Type : 01 [Cache Type]
[069h 0105 001h]                      Length : 18
[06Ah 0106 002h]                    Reserved : 0000
[06Ch 0108 004h]       Flags (decoded below) : 0000007F
                                  Size valid : 1
                        Number of Sets valid : 1
                         Associativity valid : 1
                       Allocation Type valid : 1
                            Cache Type valid : 1
                          Write Policy valid : 1
                             Line Size valid : 1
                              Cache ID valid : 0
[070h 0112 004h]         Next Level of Cache : 00000000
[074h 0116 004h]                        Size : 00200000
[078h 0120 004h]              Number of Sets : 00001000
[07Ch 0124 001h]               Associativity : 08
[07Dh 0125 001h]                  Attributes : 0A
                             Allocation Type : 2
                                  Cache Type : 2
                                Write Policy : 0
[07Eh 0126 002h]                   Line Size : 0040

[080h 0128 001h]               Subtable Type : 01 [Cache Type]
[081h 0129 001h]                      Length : 18
[082h 0130 002h]                    Reserved : 0000
[084h 0132 004h]       Flags (decoded below) : 0000007F
                                  Size valid : 1
                        Number of Sets valid : 1
                         Associativity valid : 1
                       Allocation Type valid : 1
                            Cache Type valid : 1
                          Write Policy valid : 1
                             Line Size valid : 1
                              Cache ID valid : 0
[088h 0136 004h]         Next Level of Cache : 00000068
[08Ch 0140 004h]                        Size : 00010000
[090h 0144 004h]              Number of Sets : 00000100
[094h 0148 001h]               Associativity : 04
[095h 0149 001h]                  Attributes : 02
                             Allocation Type : 2
                                  Cache Type : 0
                                Write Policy : 0
[096h 0150 002h]                   Line Size : 0040
[098h 0152 001h]               Subtable Type : 01 [Cache Type]
[099h 0153 001h]                      Length : 18
[09Ah 0154 002h]                    Reserved : 0000
[09Ch 0156 004h]       Flags (decoded below) : 0000007F
                                  Size valid : 1
                        Number of Sets valid : 1
                         Associativity valid : 1
                       Allocation Type valid : 1
                            Cache Type valid : 1
                          Write Policy valid : 1
                             Line Size valid : 1
                              Cache ID valid : 0
[0A0h 0160 004h]         Next Level of Cache : 00000068
[0A4h 0164 004h]                        Size : 00010000
[0A8h 0168 004h]              Number of Sets : 00000100
[0ACh 0172 001h]               Associativity : 04
[0ADh 0173 001h]                  Attributes : 04
                             Allocation Type : 0
                                  Cache Type : 1
                                Write Policy : 0
[0AEh 0174 002h]                   Line Size : 0040

[0B0h 0176 001h]               Subtable Type : 00 [Processor Hierarchy Node]
[0B1h 0177 001h]                      Length : 20
[0B2h 0178 002h]                    Reserved : 0000
[0B4h 0180 004h]       Flags (decoded below) : 0000000A
                            Physical package : 0
                     ACPI Processor ID valid : 1
                       Processor is a thread : 0
                              Node is a leaf : 1
                    Identical Implementation : 0
[0B8h 0184 004h]                      Parent : 00000050
[0BCh 0188 004h]           ACPI Processor ID : 00000000
[0C0h 0192 004h]     Private Resource Number : 00000003
[0C4h 0196 004h]            Private Resource : 00000068
[0C8h 0200 004h]            Private Resource : 00000080
[0CCh 0204 004h]            Private Resource : 00000098
> 
> 
> > ---
> >  hw/acpi/aml-build.c         | 65 ++++++++++++++++++++++++++++++++++---
> >  include/hw/acpi/aml-build.h | 26 ++++++++++++++-
> >  2 files changed, 85 insertions(+), 6 deletions(-)
> >
> > diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
> > index af66bde0f5..416275fdcc 100644
> > --- a/hw/acpi/aml-build.c
> > +++ b/hw/acpi/aml-build.c
> > @@ -1994,18 +1994,48 @@ static void build_processor_hierarchy_node(GArray 
> > *tbl, uint32_t flags,
> >      }
> >  }
> >
> > +/* ACPI spec, Revision 6.3 Cache type structure (Type 1) */
> > +static void build_cache_structure(GArray *tbl,
> > +                                  uint32_t next_level,
> > +                                  CPUCacheInfo *cache_info)
> > +{
> > +    /* 1 – Cache type structure */
> > +    build_append_byte(tbl, 1);
> > +    /* Length */
> > +    build_append_byte(tbl, 24);
> 
> If we are introducing cache descriptions, can we jump directly to the latest
> definition. That has an extra 4 byte Cache ID field so length is 28.
We don’t have the compatible acpi guest to test it, do you?
> 
> I need that for MPAM support and I'd rather we didn't go through the churn
> of first introducing cache descriptions then updating them (and the tests
> etc) soon after.
Does your acpi os support the cache id?
> 
> > +    /* Reserved */
> > +    build_append_int_noprefix(tbl, 0, 2);
> > +    /* Flags */
> > +    build_append_int_noprefix(tbl, 0x7f, 4);
> > +    /* Next level cache */
> > +    build_append_int_noprefix(tbl, next_level, 4);
> > +    /* Size */
> > +    build_append_int_noprefix(tbl, cache_info->size, 4);
> > +    /* Number of sets */
> > +    build_append_int_noprefix(tbl, cache_info->sets, 4);
> > +    /* Associativity */
> > +    build_append_byte(tbl, cache_info->associativity);
> > +    /* Attributes */
> > +    build_append_byte(tbl, cache_info->attributes);
> > +    /* Line size */
> > +    build_append_int_noprefix(tbl, cache_info->line_size, 2);
> > +}
> > +
> >  /*
> >   * ACPI spec, Revision 6.3
> >   * 5.2.29 Processor Properties Topology Table (PPTT)
> >   */
> >  void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
> > -                const char *oem_id, const char *oem_table_id)
> > +                const char *oem_id, const char *oem_table_id,
> > +                const CPUCaches *CPUCaches)
> >  {
> >      MachineClass *mc = MACHINE_GET_CLASS(ms);
> >      CPUArchIdList *cpus = ms->possible_cpus;
> >      int64_t socket_id = -1, cluster_id = -1, core_id = -1;
> >      uint32_t socket_offset = 0, cluster_offset = 0, core_offset = 0;
> >      uint32_t pptt_start = table_data->len;
> > +    uint32_t l3_offset = 0, priv_num = 0;
> > +    uint32_t priv_rsrc[3] = {0};
> >      int n;
> >      AcpiTable table = { .sig = "PPTT", .rev = 2,
> >                          .oem_id = oem_id, .oem_table_id = oem_table_id };
> > @@ -2024,10 +2054,11 @@ void build_pptt(GArray *table_data, BIOSLinker 
> > *linker, MachineState *ms,
> >              socket_id = cpus->cpus[n].props.socket_id;
> >              cluster_id = -1;
> >              core_id = -1;
> > +            priv_num = 0;
> >              socket_offset = table_data->len - pptt_start;
> >              build_processor_hierarchy_node(table_data,
> >                  (1 << 0), /* Physical package */
> > -                0, socket_id, NULL, 0);
> > +                0, socket_id, NULL, priv_num);
> >          }
> >
> >          if (mc->smp_props.clusters_supported && 
> > mc->smp_props.has_clusters) {
> > @@ -2035,20 +2066,44 @@ void build_pptt(GArray *table_data, BIOSLinker 
> > *linker, MachineState *ms,
> >                  assert(cpus->cpus[n].props.cluster_id > cluster_id);
> >                  cluster_id = cpus->cpus[n].props.cluster_id;
> >                  core_id = -1;
> > +                priv_num = 0;
> > +                l3_offset = table_data->len - pptt_start;
> > +                /* L3 cache type structure */
> > +                if (CPUCaches && CPUCaches->l3_cache) {
> > +                    priv_num = 1;
> > +                    build_cache_structure(table_data, 0, 
> > CPUCaches->l3_cache);
> > +                }
> >                  cluster_offset = table_data->len - pptt_start;
> >                  build_processor_hierarchy_node(table_data,
> >                      (0 << 0), /* Not a physical package */
> > -                    socket_offset, cluster_id, NULL, 0);
> > +                    socket_offset, cluster_id, &l3_offset, priv_num);
> >              }
> >          } else {
> >              cluster_offset = socket_offset;
> >          }
> >
> > +        if (CPUCaches) {
> > +            /* L2 cache type structure */
> > +            priv_rsrc[0] = table_data->len - pptt_start;
> > +            build_cache_structure(table_data, 0, CPUCaches->l2_cache);
> > +
> > +            /* L1d cache type structure */
> > +            priv_rsrc[1] = table_data->len - pptt_start;
> > +            build_cache_structure(table_data, priv_rsrc[0],
> > +                                  CPUCaches->l1d_cache);
> > +
> > +            /* L1i cache type structure */
> > +            priv_rsrc[2] = table_data->len - pptt_start;
> > +            build_cache_structure(table_data, priv_rsrc[0],
> > +                                  CPUCaches->l1i_cache);
> > +
> > +            priv_num = 3;
> Ah.  This one - whilst it's hard to derive from the ACPI spec,
> intent is that the hierarchy node should only point to the the caches
> that are nearest to that node. So here priv_num should be
> covering both the l1i and l1d but not the l2 which should only be
We can follow kunpeng arch, np.
> found by following the next level info in the other two caches.
> 
> See the example in Figure 5.15 of ACPI 6.5
> - the spec doesn't 'enforce' it because the original text
>  was vague so that would be backwards compatability issue,
> but does include
> "Only the head of the list needs to be listed as a resource by
> a processor node (and counted toward Number of Private Resources")).
> Take that as a strong hint!
> 
> 
> > +        }
> >          if (ms->smp.threads == 1) {
> >              build_processor_hierarchy_node(table_data,
> >                  (1 << 1) | /* ACPI Processor ID valid */
> >                  (1 << 3),  /* Node is a Leaf */
> > -                cluster_offset, n, NULL, 0);
> > +                cluster_offset, n, priv_rsrc, priv_num);
> >          } else {
> >              if (cpus->cpus[n].props.core_id != core_id) {
> >                  assert(cpus->cpus[n].props.core_id > core_id);
> > @@ -2063,7 +2118,7 @@ void build_pptt(GArray *table_data, BIOSLinker 
> > *linker, MachineState *ms,
> >                  (1 << 1) | /* ACPI Processor ID valid */
> >                  (1 << 2) | /* Processor is a Thread */
> >                  (1 << 3),  /* Node is a Leaf */
> > -                core_offset, n, NULL, 0);
> > +                core_offset, n, priv_rsrc, priv_num);
> >          }
> >      }
> >
> > diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
> > index ff2a310270..2dd949f41e 100644
> > --- a/include/hw/acpi/aml-build.h
> > +++ b/include/hw/acpi/aml-build.h
> > @@ -234,6 +234,29 @@ struct CrsRangeSet {
> >      GPtrArray *mem_64bit_ranges;
> >  } CrsRangeSet;
> >
> > +enum CacheType {
> > +    DATA_CACHE,
> > +    INSTRUCTION_CACHE,
> > +    UNIFIED_CACHE
> > +};
> > +
> > +typedef
> > +struct CPUCacheInfo {
> > +    enum CacheType type;      /* Cache Type*/
> > +    uint32_t size;            /* Size of the cache in bytes */
> > +    uint32_t sets;            /* Number of sets in the cache */
> > +    uint8_t associativity;    /* Cache associativity */
> > +    uint8_t attributes;       /* Cache attributes */
> 
> Incorporates the type.  I would avoid duplication by having a couple more
> enums to cover the other flags in here rather than having to fill type
> in 2 places.
This struct is almost identical to the acpi guest os, it gives great 
readability.
> 
> > +    uint16_t line_size;       /* Line size in bytes */
> > +} CPUCacheInfo;
> > +
> > +typedef
> > +struct CPUCaches {
> > +        CPUCacheInfo *l1d_cache;
> > +        CPUCacheInfo *l1i_cache;
> > +        CPUCacheInfo *l2_cache;
> > +        CPUCacheInfo *l3_cache;
> > +} CPUCaches;
> >
> >  /*
> >   * ACPI 5.0: 6.4.3.8.2 Serial Bus Connection Descriptors
> > @@ -490,7 +513,8 @@ void build_slit(GArray *table_data, BIOSLinker *linker, 
> > MachineState *ms,
> >                  const char *oem_id, const char *oem_table_id);
> >
> >  void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
> > -                const char *oem_id, const char *oem_table_id);
> > +                const char *oem_id, const char *oem_table_id,
> > +                const CPUCaches *CPUCaches);
> >
> >  void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
> >                  const char *oem_id, const char *oem_table_id);




reply via email to

[Prev in Thread] Current Thread [Next in Thread]