Re: Native compilation - specific optimisation surely possible?
From: Alan Mackenzie
Subject: Re: Native compilation - specific optimisation surely possible?
Date: Mon, 3 Jan 2022 11:49:17 +0000
On Sun, Jan 02, 2022 at 22:27:37 +0000, Andrea Corallo wrote:
> Alan Mackenzie <acm@muc.de> writes:
> > Hello, Emacs.
> > The following very short function:
> > ;; -*- lexical-binding: t -*-
> > (defun comp-test-55 (x)
> > (unless (integerp x)
> > x))
> > byte compiles to:
> > byte code for comp-test-55:
> > doc: ...
> > args: (arg1)
> > 0 dup
> > 1 integerp
> > 2 not
> > 3 goto-if-nil-else-pop 1
> > 6 dup
> > 7:1 return
> > , then on an amd-64 machine, native compiles to (annotation added by
> > me):
> > 00000000000012c0 <F636f6d702d746573742d3535_comp_test_55_0>:
> > Setup of the function:
> > 12c0: 55 push %rbp
> > 12c1: 53 push %rbx
> > 12c2: 48 89 fb mov %rdi,%rbx
> > 12c5: 48 83 ec 08 sub $0x8,%rsp
> > 12c9: 48 8b 05 18 2d 00 00 mov 0x2d18(%rip),%rax # 3fe8 <freloc_link_table@@Base-0x240>
> > 12d0: 48 8b 28 mov (%rax),%rbp
> > fixnump:
> > 12d3: 8d 47 fe lea -0x2(%rdi),%eax
> > 12d6: a8 03 test $0x3,%al
> > 12d8: 75 26 jne 1300 <F636f6d702d746573742d3535_comp_test_55_0+0x40>
> > 12da: 48 8b 05 ff 2c 00 00 mov 0x2cff(%rip),%rax # 3fe0 <d_reloc@@Base-0x220>
> > 12e1: 48 8b 78 10 mov 0x10(%rax),%rdi
> > Nil in %rdi?:
> > 12e5: 31 f6 xor %esi,%esi
> > 12e7: ff 95 c0 27 00 00 call *0x27c0(%rbp) `eq' <========================
> > 12ed: 48 85 c0 test %rax,%rax
> > 12f0: 48 0f 45 c3 cmovne %rbx,%rax
> > Tear down of the function:
> > 12f4: 48 83 c4 08 add $0x8,%rsp
> > 12f8: 5b pop %rbx
> > 12f9: 5d pop %rbp
> > 12fa: c3 ret
> > 12fb: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> > bignump:
> > 1300: 8d 47 fb lea -0x5(%rdi),%eax
> > 1303: a8 07 test $0x7,%al
> > 1305: 74 09 je 1310 <F636f6d702d746573742d3535_comp_test_55_0+0x50>
> > 1307: 31 ff xor %edi,%edi
> > 1309: eb da jmp 12e5 <F636f6d702d746573742d3535_comp_test_55_0+0x25>
> > 130b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> > pseudovectorp:
> > 1310: be 02 00 00 00 mov $0x2,%esi
> > 1315: ff 55 08 call *0x8(%rbp)
> > 1318: 84 c0 test %al,%al
> > 131a: 75 be jne 12da <F636f6d702d746573742d3535_comp_test_55_0+0x1a>
> > 131c: 31 ff xor %edi,%edi
> > 131e: eb c5 jmp 12e5 <F636f6d702d746573742d3535_comp_test_55_0+0x25>
> > .. The input parameter x (or arg1) is passed into the function in the
> > register %rdi. integerp is coded successively as fixnump followed (if
> > necessary) by bignump. The fixnump is coded beautifully in three
> > instructions.
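For readers who prefer C to assembly, here is a minimal sketch of what the tag tests at 12d3 and 1300 appear to compute. It is read off the listing only: the names lisp_word, looks_like_fixnum, has_vectorlike_tag and make_fixnum_word are hypothetical and are not Emacs's real definitions, and the fixnum encoding in make_fixnum_word is an assumption made so the example can run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a tagged Lisp word; not Emacs's Lisp_Object. */
typedef uintptr_t lisp_word;

/* Mirrors "lea -0x2(%rdi),%eax; test $0x3,%al":
   true when the low three bits of the word are 2 or 6. */
static bool looks_like_fixnum(lisp_word w)
{
    return ((uint32_t)(w - 2) & 0x3) == 0;
}

/* Mirrors "lea -0x5(%rdi),%eax; test $0x7,%al":
   true when the low three bits of the word are exactly 5. */
static bool has_vectorlike_tag(lisp_word w)
{
    return ((uint32_t)(w - 5) & 0x7) == 0;
}

/* Assumed encoding for illustration: value shifted above two tag bits "10". */
static lisp_word make_fixnum_word(intptr_t n)
{
    return ((lisp_word)n << 2) | 2;
}

/* A few sanity checks on the sketch above. */
static void tag_check_examples(void)
{
    assert(looks_like_fixnum(make_fixnum_word(42)));
    assert(!looks_like_fixnum((lisp_word)0));   /* the all-zero word (nil) */
    assert(!has_vectorlike_tag(make_fixnum_word(7)));
}
```

Subtracting the expected tag first and then masking lets a single test accept both fixnum tags, which is why the check fits in one lea and one test before the branch.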
> > I don't understand what's happening at 12da. It seems that the address
> > of a stack pointer is being loaded into %rax, from which the result of
> > `fixnump' (which was already in %rax) is loaded into %rdi.
> > But my main point is the compilation of the `not' instruction at 12e5.
> > The operand to `not' is in %rdi. It is coded up as (eq %rdi nil) by
> > loading 0 (nil) into %rsi at 12e5, then making a function call to `eq'
> > at 12e7.
> > Surely the overhead of the function call for `eq' makes this a candidate
> > for optimisation? `not' could be coded up in two instructions (test
> > %rdi,%rdi followed by a conditional jump or (faster) the cmovne which is
> > already there).
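To make the proposal concrete, here is a rough C sketch contrasting the two shapes. It assumes only what the listing shows, namely that nil is the all-zero word (the xor %esi,%esi at 12e5). The names lisp_word, WORD_NIL, WORD_T, eq_via_call, unless_called and unless_inlined are hypothetical, and WORD_T is an arbitrary non-zero placeholder, not the real representation of t.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t lisp_word;            /* hypothetical tagged word */
#define WORD_NIL ((lisp_word)0)         /* nil as the all-zero word, per the listing */
#define WORD_T   ((lisp_word)0x30)      /* placeholder: any non-zero value will do */

/* Stands in for the `eq' reached through the call at 12e7. */
static lisp_word eq_via_call(lisp_word a, lisp_word b)
{
    return a == b ? WORD_T : WORD_NIL;
}

/* Shape of the current code: call eq against nil, then select x or nil. */
static lisp_word unless_called(lisp_word v, lisp_word x)
{
    return eq_via_call(v, WORD_NIL) != WORD_NIL ? x : WORD_NIL;
}

/* Shape of the proposal: compare against zero directly, no call. */
static lisp_word unless_inlined(lisp_word v, lisp_word x)
{
    return v == WORD_NIL ? x : WORD_NIL;
}

/* Both shapes agree for a nil and a non-nil predicate result. */
static void not_examples(void)
{
    assert(unless_called(WORD_NIL, 0x1234) == unless_inlined(WORD_NIL, 0x1234));
    assert(unless_called(WORD_T, 0x1234) == unless_inlined(WORD_T, 0x1234));
}
```

Here v plays the role of the integerp result in %rdi and x the saved argument in %rbx. A C compiler would typically turn unless_inlined into a test plus a conditional move, though that depends on the compiler and its flags.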
> > `not' is presumably a common opcode in byte compiled functions. `eq'
> > surely more so. So why are we coding these up as function calls?
> > Andrea?
> Hi Alan,
> could you attach the .c file produced with `native-comp-debug' >= 2?
> Thanks
OK, here it is.
> Andrea
> PS I might be a little slow answering mails for the coming week as I'm
> on holiday :)
Not a problem - Enjoy the holiday!
--
Alan Mackenzie (Nuremberg, Germany).
Attachment: freefn-636f6d702d746573742d3535_comp_test_55_0eiGSrI.c
Description: Text Data