Re: [PATCH 1/7] target/ppc: Add infrastructure for prefixed instructions

qemu-ppc
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 1/7] target/ppc: Add infrastructure for prefixed instructions

From:	David Gibson
Subject:	Re: [PATCH 1/7] target/ppc: Add infrastructure for prefixed instructions
Date:	Thu, 17 Dec 2020 14:47:33 +1100
On Wed, Dec 16, 2020 at 05:01:29AM -0300, Gustavo Romero wrote:
> Hi David,
> 
> Thanks a lot for the review. Please find my comments inline.
> 
> On 10/8/20 9:43 PM, David Gibson wrote:
> > On Mon, Oct 05, 2020 at 01:03:13AM -0300, Gustavo Romero wrote:
> > > From: Michael Roth <mdroth@linux.vnet.ibm.com>
> > 
> > Probably a good idea to CC future spins to Richard Henderson
> > <rth@twiddle.net> - by knowledge of how TCG works is only middling.
> 
> OK.

Well, I said that at the time, but rth@twiddle.net seems to have been
bouncing for a while now.  I'm not sure what Richard's current
preferred address is.

> > > Some prefixed instructions (Type 0 and 1, e.g. 8-byte Load/Store or 8LS),
> > > have a completely seperate namespace for their primary opcodes.
> > > 
> > > Other prefixed instructions (Type 2 and 3, e.g. Modified Load/Store or 
> > > MLS)
> > > actually re-use existing opcodes to provide a modified variant. We could
> > > handle these by extending the existing opcode handlers to check for the
> > > prefix and handle accordingly, but in some cases it is cleaner to define
> > > seperate handlers for these in their own table entry, and avoids the need
> > > to add error-handling to existing handlers for cases where they are called
> > > for a prefixed Type 2/3 instruction but don't implement the handling for
> > > them. In the future we can re-work things to support both approaches if
> > > cases arise where that seems warranted.
> > > 
> > > Then we have the normal non-prefixed instructions.
> > > 
> > > To handle all 3 of these cases we extend the table size 3x, with normal
> > > instructions indexed directly by their primary opcodes, Type 0/1 indexed 
> > > by
> > > primary opcode + PPC_CPU_PREFIXED_OPCODE_OFFSET, and Type 2/3 indexed by
> > > primary opcode + PPC_CPU_PREFIXED_MODIFIED_OPCODE_OFFSET.
> > 
> > Is there an advantage to having one table with magic offsets, rather
> > than 3 separately declared tables?
> 
> Not that I can see in terms of size or performance. However since we
> are dealing here with the PO (primary opcode) for the prefixed insns
> as well I think it makes sense to have it related to the table that
> already handles the PO for the normal instruction. So the offset are
> not really magic, but 64 (PPC_CPU_PREFIXED_OPCODE_OFFSET) and 128
> (PPC_CPU_PREFIXED_MODIFIED_OPCODE_OFFSET), because the slots for the
> normal opcodes is triplicated.

Sure, I see why the offsets are what they are - it's basically just 3
64 entry tables appended together.  I still see no point in appending
them together, rather than naming them separately.

> In addition to Michael's comment above I've add the following comment
> in the code for v2:
> 
>  982 /*
>  983  * The primary op-code field extracted from the instruction (PO) is used 
> as the
>  984  * index in the top-level QEMU opcode table. That table is then further 
> used to
>  985  * find the proper handler, or a pointer to another table (OPC1, OPC2 
> etc). For
>  986  * the normal opcodes (PO field of normal insns) the 64 first entries 
> are used.
>  987  *
>  988  * Prefixed instructions can be divided into two major groups of 
> instructions:
>  989  * the first group is formed by type 0 and 1, and the second group is by 
> type 2
>  990  * and 3. Opcodes (PO) related to prefixed type 0/1 can have the same 
> opcodes
>  991  * as the normal instructions but don't have any semantics related to 
> them. But
>  992  * since it's an primary opcode anyway it was decided to simply extend 
> the
>  993  * top-level table that handles the primary opcodes for the normal 
> instructions.
>  994  * On the other handle, prefixed instructions type 2/3 work like 
> modifiers for
>  995  * existing normal instructions. In that case it would be possible to 
> reuse the
>  996  * handlers for the normal instructions, but that would require changing 
> the
>  997  * normal instruction handler in a way it would have to check, for 
> instance, on
>  998  * which ISA it has to take care of prefixes insns, and that is already 
> done by
>  999  * GEN_HANDLER_* helpers when registering a new instruction, where there 
> is a
> 1000  * a field to inform on which ISA the instruction exists. Hence, it was 
> decided
> 1001  * to add another 64 entries for specific handler for the prefixed 
> instructions
> 1002  * of type 2/3. Thus the size of top-level PO lookup table is 3*64 = 
> 192, hence
> 1003  * representing three namespaces for decoding PO: for normal insns, for 
> type 0/1
> 1004  * and for type 2/3.
> 1005  */
> 
> 
> > > Various exception/SRR handling changes related to prefixed instructions 
> > > are
> > > yet to be implemented. For instance, no alignment interrupt is generated 
> > > if
> > > a prefixed instruction crosses the 64-byte boundary; no SRR bit related to
> > > prefixed instructions is set on any interrupt, like for FP unavailable
> > > interrupt, data storage interrupt, etc.
> > > 
> > > For details, please see PowerISA v3.1, section 1.6.3 and chapter 7,
> > > particularly the new changes regarding the prefixed instructions.
> > > 
> > > Signed-off-by: Michael Roth <mroth@lamentation.net>
> > > [ gromero: - comments clean up and updates
> > >             - additional comments in the commit log ]
> > > Signed-off-by: Gustavo Romero <gromero@linux.ibm.com>
> > > ---
> > >   target/ppc/cpu.h                |   4 +-
> > >   target/ppc/internal.h           |  14 +++
> > >   target/ppc/translate.c          | 167 +++++++++++++++++++++++++++++++-
> > >   target/ppc/translate_init.c.inc |  11 ++-
> > >   4 files changed, 190 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> > > index 766e9c5c26..c5de33572c 100644
> > > --- a/target/ppc/cpu.h
> > > +++ b/target/ppc/cpu.h
> > > @@ -979,8 +979,10 @@ struct ppc_radix_page_info {
> > >   #define PPC_TLB_EPID_LOAD 8
> > >   #define PPC_TLB_EPID_STORE 9
> > > -#define PPC_CPU_OPCODES_LEN          0x40
> > > +#define PPC_CPU_OPCODES_LEN          0xc0
> > >   #define PPC_CPU_INDIRECT_OPCODES_LEN 0x20
> > > +#define PPC_CPU_PREFIXED_OPCODE_OFFSET 0x40
> > > +#define PPC_CPU_PREFIXED_MODIFIED_OPCODE_OFFSET 0x80
> > 
> > Might be worth including some of that description from the commit
> > message as a comment here (or on the table declaration itself).
> 
> Yes. Fixed for v2.
> 
> > >   struct CPUPPCState {
> > >       /* Most commonly used resources during translated code execution 
> > > first */
> > > diff --git a/target/ppc/internal.h b/target/ppc/internal.h
> > > index 15d655b356..d03d691a44 100644
> > > --- a/target/ppc/internal.h
> > > +++ b/target/ppc/internal.h
> > > @@ -129,6 +129,7 @@ EXTRACT_SHELPER(SIMM5, 16, 5);
> > >   EXTRACT_HELPER(UIMM5, 16, 5);
> > >   /* 4 bits unsigned immediate value */
> > >   EXTRACT_HELPER(UIMM4, 16, 4);
> > > +
> > >   /* Bit count */
> > >   EXTRACT_HELPER(NB, 11, 5);
> > >   /* Shift count */
> > > @@ -146,6 +147,19 @@ EXTRACT_HELPER(TO, 21, 5);
> > >   EXTRACT_HELPER(CRM, 12, 8);
> > > +/*
> > > + * Instruction prefix fields
> > > + *
> > > + * as per PowerISA 3.1 1.6.3
> > > + */
> > > +
> > > +/* prefixed instruction type */
> > > +EXTRACT_HELPER(PREFIX_TYPE, 24, 2);
> > > +/* 1-bit sub-type */
> > > +EXTRACT_HELPER(PREFIX_ST1, 23, 1);
> > > +/* 4-bit sub-type */
> > > +EXTRACT_HELPER(PREFIX_ST4, 20, 4);
> > > +
> > >   #ifndef CONFIG_USER_ONLY
> > >   EXTRACT_HELPER(SR, 16, 4);
> > >   #endif
> > > diff --git a/target/ppc/translate.c b/target/ppc/translate.c
> > > index fedb9b2271..96c2997d3f 100644
> > > --- a/target/ppc/translate.c
> > > +++ b/target/ppc/translate.c
> > > @@ -180,6 +180,8 @@ struct DisasContext {
> > >       uint32_t flags;
> > >       uint64_t insns_flags;
> > >       uint64_t insns_flags2;
> > > +    uint32_t prefix;
> > > +    uint32_t prefix_subtype;
> > >   };
> > >   /* Return true iff byteswap is needed in a scalar memop */
> > > @@ -376,6 +378,12 @@ GEN_OPCODE(name, opc1, opc2, opc3, inval, type, 
> > > PPC_NONE)
> > >   #define GEN_HANDLER_E(name, opc1, opc2, opc3, inval, type, type2)       
> > >       \
> > >   GEN_OPCODE(name, opc1, opc2, opc3, inval, type, type2)
> > > +#define GEN_HANDLER_E_PREFIXED(name, opc1, opc2, opc3, inval, type, 
> > > type2)    \
> > > +GEN_OPCODE_PREFIXED(name, opc1, opc2, opc3, inval, type, type2, false)
> > > +
> > > +#define GEN_HANDLER_E_PREFIXED_M(name, opc1, opc2, opc3, inval, type, 
> > > type2)  \
> > > +GEN_OPCODE_PREFIXED(name, opc1, opc2, opc3, inval, type, type2, true)
> > > +
> > >   #define GEN_HANDLER2(name, onam, opc1, opc2, opc3, inval, type)         
> > >       \
> > >   GEN_OPCODE2(name, onam, opc1, opc2, opc3, inval, type, PPC_NONE)
> > > @@ -395,6 +403,8 @@ typedef struct opcode_t {
> > >   #endif
> > >       opc_handler_t handler;
> > >       const char *oname;
> > > +    bool prefixed;
> > > +    bool modified;
> > >   } opcode_t;
> > >   /* Helpers for priv. check */
> > > @@ -449,6 +459,23 @@ typedef struct opcode_t {
> > >       },                                                                  
> > >       \
> > >       .oname = stringify(name),                                           
> > >       \
> > >   }
> > > +#define GEN_OPCODE_PREFIXED(name, op1, op2, op3, invl, _typ, _typ2, 
> > > _modified)\
> > > +{                                                                        
> > >      \
> > > +    .opc1 = op1,                                                         
> > >      \
> > > +    .opc2 = op2,                                                         
> > >      \
> > > +    .opc3 = op3,                                                         
> > >      \
> > > +    .opc4 = 0xff,                                                        
> > >      \
> > > +    .handler = {                                                         
> > >      \
> > > +        .inval1  = invl,                                                 
> > >      \
> > > +        .type = _typ,                                                    
> > >      \
> > > +        .type2 = _typ2,                                                  
> > >      \
> > > +        .handler = &gen_##name,                                          
> > >      \
> > > +        .oname = stringify(name),                                        
> > >      \
> > > +    },                                                                   
> > >      \
> > > +    .oname = stringify(name),                                            
> > >      \
> > > +    .prefixed = true,                                                    
> > >      \
> > > +    .modified = _modified,                                               
> > >      \
> > > +}
> > >   #define GEN_OPCODE_DUAL(name, op1, op2, op3, invl1, invl2, _typ, _typ2) 
> > >       \
> > >   {                                                                       
> > >       \
> > >       .opc1 = op1,                                                        
> > >       \
> > > @@ -525,6 +552,22 @@ typedef struct opcode_t {
> > >       },                                                                  
> > >       \
> > >       .oname = stringify(name),                                           
> > >       \
> > >   }
> > > +#define GEN_OPCODE_PREFIXED(name, op1, op2, op3, invl, _typ, _typ2, 
> > > _modified)\
> > > +{                                                                        
> > >      \
> > > +    .opc1 = op1,                                                         
> > >      \
> > > +    .opc2 = op2,                                                         
> > >      \
> > > +    .opc3 = op3,                                                         
> > >      \
> > > +    .opc4 = 0xff,                                                        
> > >      \
> > > +    .handler = {                                                         
> > >      \
> > > +        .inval1  = invl,                                                 
> > >      \
> > > +        .type = _typ,                                                    
> > >      \
> > > +        .type2 = _typ2,                                                  
> > >      \
> > > +        .handler = &gen_##name,                                          
> > >      \
> > > +    },                                                                   
> > >      \
> > > +    .oname = stringify(name),                                            
> > >      \
> > > +    .prefixed = true,                                                    
> > >      \
> > > +    .modified = _modified,                                               
> > >      \
> > > +}
> > >   #define GEN_OPCODE_DUAL(name, op1, op2, op3, invl1, invl2, _typ, _typ2) 
> > >       \
> > >   {                                                                       
> > >       \
> > >       .opc1 = op1,                                                        
> > >       \
> > > @@ -7991,6 +8034,98 @@ static bool 
> > > ppc_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
> > >       return true;
> > >   }
> > > +
> > > +
> > > +/*
> > > + * Prefixed instruction sub-types
> > > + *
> > > + * as per PowerISA 3.1 1.6.3
> > > + */
> > > +enum {
> > > +    /* non-prefixed instruction */
> > 
> > Two things are labelled as "non-prefixed instruction" here..
> 
> Fixed for v2.
> 
> 
> > > +    PREFIX_ST_INVALID   = -1,
> > > +    /* non-prefixed instruction */
> > > +    PREFIX_ST_NONE      = 0,
> > > +    /* Type 00 prefixed insn sub-types */
> > 
> > I'm guessing the types here are in binary, maybe list as 0b00 0b01 etc
> > for clarity.
> 
> ISA actually uses like that, but I think it's good to clarify
> here it's a binary. Fixed for v2.
> 
> > > +    PREFIX_ST_8LS,      /* 8-byte load/store */
> > > +    PREFIX_ST_8MLS,     /* 8-byte masked load/store */
> > > +    /* Type 01 prefixed insn sub-types */
> > > +    PREFIX_ST_8RR,      /* 8-byte reg-to-reg */
> > > +    PREFIX_ST_8MRR,     /* 8-byte masked reg-to-reg */
> > > +    /* Type 10 prefixed insn sub-types */
> > > +    PREFIX_ST_MLS,      /* modified load/store */
> > > +    PREFIX_ST_MMLS,     /* modified masked load/store */
> > > +    /* Type 11 prefixed insn sub-types */
> > > +    PREFIX_ST_MRR,      /* modified reg-to-reg */
> > > +    PREFIX_ST_MMRR,     /* modified mask reg-to-reg */
> > > +    PREFIX_ST_MMIRR,    /* modified masked immediate reg-to-reg */
> > > +};
> > > +
> > > +static int32_t parse_prefix_subtype(uint32_t prefix)
> > > +{
> > > +    int32_t prefix_subtype = PREFIX_ST_INVALID;
> > > +
> > > +    /* primary opcode 1 is reserved for instruction prefixes */
> > > +    if (opc1(prefix) != 1) {
> > > +        return PREFIX_ST_NONE;
> > > +    }
> > > +
> > > +    switch (PREFIX_TYPE(prefix)) {
> > > +    case 0:
> > > +        if (PREFIX_ST1(prefix) == 0) {
> > > +            prefix_subtype = PREFIX_ST_8LS;
> > > +        } else if (PREFIX_ST1(prefix) == 1) {
> > > +            prefix_subtype = PREFIX_ST_8MLS;
> > > +        }
> > > +        break;
> > > +    case 1:
> > > +        if (PREFIX_ST4(prefix) == 0) {
> > > +            prefix_subtype = PREFIX_ST_8RR;
> > > +        } else if (PREFIX_ST4(prefix) == 8) {
> > > +            prefix_subtype = PREFIX_ST_8MRR;
> > > +        }
> > 
> > Is this fallthrough intentional?  If so it should be commented.
> 
> No. Fixed for v2.
> 
> > > +    case 2:
> > > +        if (PREFIX_ST1(prefix) == 0) {
> > > +            prefix_subtype = PREFIX_ST_MLS;
> > > +        } else if (PREFIX_ST1(prefix) == 1) {
> > > +            prefix_subtype = PREFIX_ST_MMLS;
> > > +        }
> > > +        break;
> > > +    case 3:
> > > +        if (PREFIX_ST4(prefix) == 0) {
> > > +            prefix_subtype = PREFIX_ST_MRR;
> > > +        } else if (PREFIX_ST4(prefix) == 8) {
> > > +            prefix_subtype = PREFIX_ST_MMRR;
> > > +        } else if (PREFIX_ST4(prefix) == 9) {
> > > +            prefix_subtype = PREFIX_ST_MMIRR;
> > > +        }
> > 
> > Fallthrough here doesn't matter, but probably better to have a break;
> > for consistency.
> 
> Right. Fixed in opc1_idx() as well for v2.
> 
> > > +    }
> > > +
> > > +    return prefix_subtype;
> > > +}
> > > +
> > > +static uint32_t opc1_idx(DisasContext *ctx)
> > > +{
> > > +    uint32_t table_offset = 0;
> > > +
> > > +    switch (ctx->prefix_subtype) {
> > > +    case PREFIX_ST_8LS:
> > > +    case PREFIX_ST_8MLS:
> > > +    case PREFIX_ST_8RR:
> > > +    case PREFIX_ST_8MRR:
> > > +        table_offset = PPC_CPU_PREFIXED_OPCODE_OFFSET;
> > > +        break;
> > > +    case PREFIX_ST_MLS:
> > > +    case PREFIX_ST_MMLS:
> > > +    case PREFIX_ST_MRR:
> > > +    case PREFIX_ST_MMRR:
> > > +    case PREFIX_ST_MMIRR:
> > > +        table_offset = PPC_CPU_PREFIXED_MODIFIED_OPCODE_OFFSET;
> > > +    }
> > > +
> > > +    return table_offset + opc1(ctx->opcode);
> > > +}
> > > +
> > >   static void ppc_tr_translate_insn(DisasContextBase *dcbase, CPUState 
> > > *cs)
> > >   {
> > >       DisasContext *ctx = container_of(dcbase, DisasContext, base);
> > > @@ -8004,14 +8139,40 @@ static void 
> > > ppc_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
> > >       ctx->opcode = translator_ldl_swap(env, ctx->base.pc_next,
> > >                                         need_byteswap(ctx));
> > > +    /* check for prefix */
> > > +    ctx->prefix_subtype = parse_prefix_subtype(ctx->opcode);
> > > +    if (ctx->prefix_subtype == PREFIX_ST_INVALID) {
> > > +        qemu_log_mask(LOG_GUEST_ERROR, "invalid/unsupported prefix: "
> > > +                      "%08x " TARGET_FMT_lx "\n",
> > > +                      ctx->prefix, ctx->base.pc_next);
> > 
> > We should generate a 0x700 here, shouldn't we?
> 
> I don't know for sure. I could not find any document saying
> explicitly what we should do when subtype is invalid. Nor
> that an invalid subtype means an invalid instruction. In the
> past for some TM instructions the User Manual would specify
> how to treat certain invalid fields in the instructions... like
> just ignoring it or even treating like certain valid form (!).

Hm, ok.  Unless we have specific information saying the architecture,
or hw implementations do something different, I think we should
generate a 0x700.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
signature.asc
Description: PGP signature
[Prev in Thread]
Current Thread
[Next in Thread]
Re: [PATCH 1/7] target/ppc: Add infrastructure for prefixed instructions, Gustavo Romero, 2020/12/16
- Re: [PATCH 1/7] target/ppc: Add infrastructure for prefixed instructions, David Gibson <=
  - Re: [PATCH 1/7] target/ppc: Add infrastructure for prefixed instructions, Peter Maydell, 2020/12/17
Prev by Date: Re: [PATCH] configure: Fail when specified cross compiler cannot be found
Next by Date: Re: [for-6.0 v5 12/13] securable guest memory: Alter virtio default properties for protected guests
Previous by thread: Re: [PATCH 1/7] target/ppc: Add infrastructure for prefixed instructions
Next by thread: Re: [PATCH 1/7] target/ppc: Add infrastructure for prefixed instructions
Index(es):
- Date
- Thread