emacs-devel

Re: Time to merge scratch/correct-warning-pos into master, perhaps?


From: Alan Mackenzie
Subject: Re: Time to merge scratch/correct-warning-pos into master, perhaps?
Date: Fri, 25 Feb 2022 22:29:01 +0000

Hello, Eli.

On Sat, Feb 19, 2022 at 19:02:22 +0200, Eli Zaretskii wrote:
> > Date: Sat, 19 Feb 2022 16:42:07 +0000
> > Cc: gregory@heytings.org, monnier@iro.umontreal.ca, mattiase@acm.org,
> >   larsi@gnus.org, emacs-devel@gnu.org
> > From: Alan Mackenzie <acm@muc.de>
> > 
> > I haven't got any useful information out of the exercise, so far.  I
> > can't help feeling that I'm missing something.  Is there anything I ought
> > to be doing that I've not yet done?

> Maybe you should make EQ real function, with an attribute that would
> preclude its inlining.

> I have no other ideas.  Maybe someone else does.

I now have some numbers.

I've compared the versions of the master branch just before and just
after the merge of scratch/correct-warning-pos, with one change made to
each.

In the new version:
(i) A binding of load-read-function was removed (this should be
  irrelevant).

In the old version:
(i) The macro/function BASE_EQ was copied from the new version and NILP
  amended to use it.  This excludes NILP from contributing to the EQ
  measurements in the old version, just as in the new.

In both versions, the inlining was removed from EQ, which was then
inserted into xdisp.c as a normal function which invokes the macro
lisp_h_EQ.  This macro differs between the old and new versions.
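
For illustration, a minimal sketch of what such a non-inlined EQ might look like.  The Lisp_Object typedef and the body of lisp_h_EQ here are simplified stand-ins (assumptions), not the code from lisp.h; the plain word comparison mirrors the cmp/sete/ret sequence in the old version's disassembly, and NO_INLINE is a hypothetical helper name wrapping the GCC/Clang noinline attribute.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for Emacs's Lisp_Object: here just a machine word (assumption).  */
typedef intptr_t Lisp_Object;

/* Stand-in for the old lisp_h_EQ: a plain word comparison (assumption).  */
#define lisp_h_EQ(x, y) ((x) == (y))

/* Forbid inlining, so that EQ keeps its own symbol and appears as a
   separate entry in the profile.  NO_INLINE is a hypothetical name.  */
#if defined __GNUC__
# define NO_INLINE __attribute__ ((noinline))
#else
# define NO_INLINE
#endif

NO_INLINE bool
EQ (Lisp_Object x, Lisp_Object y)
{
  return lisp_h_EQ (x, y);
}
```

With the function out of line, every call to EQ goes through a real call and return, which is what lets the profiler attribute samples to it; the cost of the call itself is then included in both versions' figures.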

These two versions were configured the same, without native-compilation,
and built.  A make check was first run to compile (most of) the
test-foo.elc files.  Then the following were run in each:

    $ perf record -e cpu-clock make check
    $ perf report -i perf.data --tui

The proportions of the profiler samples in EQ were:
   (old): 0.48%
   (new): 0.86%

This is fairly close to the guessed factor of 2 difference (0.86/0.48 is
about 1.8).  However, it doesn't, by itself, account for the difference
in total run time between the two versions.  The total number of events
counted for these make check runs was:
  (old): 372k
  (new): 419k
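
To make the "doesn't account for the difference" point concrete, here is a small sketch of the arithmetic.  It assumes the percentages above are shares of the total event counts, and it uses the rounded totals (372k, 419k), so the results are only approximate.  The function names are made up for this example.

```c
/* Estimated number of samples attributed to EQ, given the total sample
   count and EQ's percentage share as reported by perf.  */
static double
eq_samples (double total_samples, double eq_percent)
{
  return total_samples * eq_percent / 100.0;
}

/* Fraction of the overall increase in samples that is explained by the
   increase in EQ's own samples.  */
static double
eq_share_of_increase (double old_total, double old_percent,
                      double new_total, double new_percent)
{
  double eq_delta = (eq_samples (new_total, new_percent)
                     - eq_samples (old_total, old_percent));
  return eq_delta / (new_total - old_total);
}
```

Calling eq_share_of_increase (372e3, 0.48, 419e3, 0.86) gives roughly 0.039: EQ went from about 1,800 to about 3,600 samples, which is only around 4% of the 47k additional events.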

The breakdown of samples on individual instructions in the old and new
versions of EQ is thus:

(old):
EQ  /home/acm/emacs/emacs.git/sub-master-b/src/emacs [Percent: local period]
Samples│
       │
       │
       │   Disassembly of section .text:
       │
       │   0000000000071550 <EQ>:
       │   EQ():
       │
       │   /* STOUGH, 2022-02-19 */
       │   bool
       │   EQ (Lisp_Object x, Lisp_Object y)
       │   {
       │   return lisp_h_EQ (x, y);
   946 │     cmp  %rdi,%rsi
   172 │     sete %al
       │   }
   675 │   ← ret

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

(new):
EQ  /home/acm/emacs/emacs.git/sub-master-2/src/emacs [Percent: local period]
Samples│
       │
       │
       │    Disassembly of section .text:
       │
       │    000000000006d150 <EQ>:
       │    EQ():
       │    no-ops.  */
       │
       │    INLINE EMACS_INT
       │    (XLI) (Lisp_Object o)
       │    {
       │    return lisp_h_XLI (o);
  1230 │      mov    $0x1,%eax
       │
       │    /* STOUGH, 2022-02-19 */
       │    bool
       │    EQ (Lisp_Object x, Lisp_Object y)
       │    {
       │    return lisp_h_EQ (x, y);
   311 │      cmp    %rsi,%rdi
   617 │    ↓ je     76
       │      movzbl globals+0xfee,%eax
  1340 │      test   %al,%al
    81 │    ↓ je     76
       │    TAGGEDP():
       │    Equivalent to XTYPE (a) == TAG, but often faster.  */
       │
       │    INLINE bool
       │    (TAGGEDP) (Lisp_Object a, enum Lisp_Type tag)
       │    {
       │    return lisp_h_TAGGEDP (a, tag);
       │      lea    -0x5(%rdi),%eax
       │    PSEUDOVECTORP():
       │    #define MOST_NEGATIVE_FIXNUM (-1 - MOST_POSITIVE_FIXNUM)
       │
       │    INLINE bool
       │    PSEUDOVECTORP (Lisp_Object a, int code)
       │    {
       │    return lisp_h_PSEUDOVECTORP (a, code);
     5 │      test   $0x7,%al
     9 │    ↓ jne    50
       │      movabs $0x400000003f000000,%rdx
       │      mov    -0x5(%rdi),%rcx
     1 │      movabs $0x4000000006000000,%rax
       │      and    %rdx,%rcx
       │      cmp    %rax,%rcx
       │    ↓ jne    50
       │    EQ():
       │      test   $0x7,%sil
       │    ↓ jne    90
       │      cmp    %rsi,0x3(%rdi)
       │      sete   %al
       │    ← ret
       │      nop
       │    TAGGEDP():
       │    return lisp_h_TAGGEDP (a, tag);
       │50:   lea    -0x5(%rsi),%eax
       │    PSEUDOVECTORP():
       │    return lisp_h_PSEUDOVECTORP (a, code);
     4 │      test   $0x7,%al
     5 │    ↓ jne    74
       │      movabs $0x400000003f000000,%rax
       │      and    -0x5(%rsi),%rax
     1 │      movabs $0x4000000006000000,%rdx
       │      cmp    %rdx,%rax
       │    ↓ je     80
       │74:   xor    %eax,%eax
       │    EQ():
       │    }
     1 │76: ← ret
       │      nop
       │    return lisp_h_EQ (x, y);
       │80:   test   $0x7,%dil
       │    ↑ jne    74
       │      cmp    %rdi,0x3(%rsi)
       │      sete   %al
       │    ← ret
       │      xchg   %ax,%ax
       │    TAGGEDP():
       │    return lisp_h_TAGGEDP (a, tag);
       │90:   lea    -0x5(%rsi),%r8d
       │    PSEUDOVECTORP():
       │      xor    %eax,%eax
       │    return lisp_h_PSEUDOVECTORP (a, code);
       │      and    $0x7,%r8d
       │    ↑ jne    76
       │      and    -0x5(%rsi),%rdx
       │      cmp    %rcx,%rdx
       │    ↑ jne    76
       │    EQ():
       │      mov    0x3(%rdi),%rax
       │      cmp    %rax,0x3(%rsi)
       │      sete   %al
       │    ← ret
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
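
As a reading aid for the new listing, the control flow can be modelled roughly as below.  This is a toy sketch, not the lisp_h_EQ macro itself: the struct-based objects, the toy_ names and the toy_pos_enabled flag (standing in for the byte loaded from globals+0xfee) are all assumptions made for illustration.  It shows the two early exits where almost all of the samples land (identical words, then the flag being off), followed by the rarely reached path that looks through a symbol's position wrapper.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy object model (assumption): a real Lisp_Object is a tagged word,
   not a struct like this.  */
enum toy_kind { TOY_BARE_SYMBOL, TOY_SYMBOL_WITH_POS, TOY_OTHER };

struct toy_object
{
  enum toy_kind kind;
  const struct toy_object *sym; /* Wrapped bare symbol; used only for
                                   TOY_SYMBOL_WITH_POS.  */
};

/* Stand-in for the global flag tested near the top of the listing.  */
static bool toy_pos_enabled = false;

/* Return the bare symbol behind O, or NULL if O is not a symbol.  */
static const struct toy_object *
toy_bare (const struct toy_object *o)
{
  if (o->kind == TOY_SYMBOL_WITH_POS)
    return o->sym;
  if (o->kind == TOY_BARE_SYMBOL)
    return o;
  return NULL;
}

static bool
toy_EQ (const struct toy_object *x, const struct toy_object *y)
{
  if (x == y)                   /* First early exit: same object.  */
    return true;
  if (!toy_pos_enabled)         /* Second early exit: flag is off.  */
    return false;
  /* Slow path: only relevant if at least one side carries a position.  */
  if (x->kind != TOY_SYMBOL_WITH_POS && y->kind != TOY_SYMBOL_WITH_POS)
    return false;
  const struct toy_object *bx = toy_bare (x);
  const struct toy_object *by = toy_bare (y);
  return bx != NULL && bx == by;
}
```

In the listing, 3579 of the 3605 samples fall on the first five instructions (the mov, the cmp/je pair and the flag test with its je), which correspond to the two early exits in this model.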

As yet, I don't know quite what to make of these numbers.

-- 
Alan Mackenzie (Nuremberg, Germany).


