Re: Using __builtin_expect (likely/unlikely macros)
From: Alex Gramiak
Subject: Re: Using __builtin_expect (likely/unlikely macros)
Date: Tue, 16 Apr 2019 14:50:40 -0600
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.2 (gnu/linux)
Paul Eggert <address@hidden> writes:
> That being said, it might make sense for a few obviously-rarely-called
> functions like 'emacs-abort' to be marked with __attribute__ ((cold)),
> so long as we don't turn this into a mission to mark all cold functions
> (which would cost us more than it would benefit). That is what GCC
> itself does, with its own functions. However, I'd like to see
> performance figures. Could you try it out on the benchmark of 'cd lisp
> && time make compile-always'?
Right, I agree that if used, they should be used sparingly. I tested
three versions a few times each with both 'make' and 'make -j4':
a) Regular Emacs master.
b) The below diff with only the _Cold attribute.
c) The below diff with both the _Cold and _Hot attributes.
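For readers without the attached diff at hand, here is a minimal sketch of what such markers typically expand to with GCC. It is only an illustration under stated assumptions: the macro names SKETCH_COLD and SKETCH_HOT are stand-ins for _Cold and _Hot, and die_cold, sum_hot and checked_sum are hypothetical functions; none of this is taken from hot-cold.diff.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the _Cold and _Hot markers; the real
   definitions are in the attached diff.  Expand to nothing on
   compilers without GNU-style attributes.  */
#if defined __GNUC__
# define SKETCH_COLD __attribute__ ((cold))
# define SKETCH_HOT __attribute__ ((hot))
#else
# define SKETCH_COLD
# define SKETCH_HOT
#endif

/* Rarely-called error path (hypothetical).  'cold' asks GCC to
   optimize the function for size and to treat branches leading to
   calls of it as unlikely.  */
SKETCH_COLD void
die_cold (const char *msg)
{
  fprintf (stderr, "fatal: %s\n", msg);
  abort ();
}

/* Frequently-called path (hypothetical).  'hot' asks GCC to optimize
   the function more aggressively.  */
SKETCH_HOT long
sum_hot (const int *v, int n)
{
  assert (n >= 0);
  long total = 0;
  for (int i = 0; i < n; i++)
    total += v[i];
  return total;
}

/* A caller whose error branch reaches the cold function, so the
   compiler can lay out the normal path as the fall-through.  */
long
checked_sum (const int *v, int n)
{
  if (!v)
    die_cold ("null vector");
  return sum_hot (v, n);
}
```

Because a call to a cold function already marks the branch leading to it as unlikely, tagging a handful of abort-like functions can stand in for many individual __builtin_expect annotations at their call sites.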
Variant        Command     real        user        sys
a) Normal      make        4:17.97s    3:57.18s    20.394s
a) Normal      make -j4    1:17.67s    4:23.78s    18.888s
b) Cold        make        4:10.92s    3:50.34s    20.178s
b) Cold        make -j4    1:15.77s    4:16.73s    18.943s
c) Hot/Cold    make        4:11.43s    3:51.07s    19.961s
c) Hot/Cold    make -j4    1:16.01s    4:17.63s    18.662s
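So not much of a difference: both patched builds finish roughly 2 to 3%
sooner in wall-clock time than master. For some reason the Hot/Cold build
performed consistently worse than the Cold-only build, though only slightly.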
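To put a number on "not much of a difference", the wall-clock ("real") figures above can be turned into relative speedups. The following is a small sketch of that arithmetic only; the helper names to_seconds, percent_faster and print_build_comparison are made up for this illustration, and the pairing of each first reading with 'make' and each second reading with 'make -j4' is an assumption based on the order in which the two commands are mentioned.

```c
#include <assert.h>
#include <stdio.h>

/* Convert a "minutes:seconds" reading such as 4:17.97 to seconds.  */
double
to_seconds (int minutes, double seconds)
{
  assert (minutes >= 0 && seconds >= 0);
  return minutes * 60 + seconds;
}

/* Percentage by which VARIANT is faster than BASELINE
   (negative if it is slower).  */
double
percent_faster (double baseline, double variant)
{
  assert (baseline > 0);
  return 100.0 * (baseline - variant) / baseline;
}

/* Print the wall-clock comparison for the "real" figures quoted in
   the message.  */
void
print_build_comparison (void)
{
  double normal = to_seconds (4, 17.97), normal_j4 = to_seconds (1, 17.67);
  double cold = to_seconds (4, 10.92), cold_j4 = to_seconds (1, 15.77);
  double hotcold = to_seconds (4, 11.43), hotcold_j4 = to_seconds (1, 16.01);

  printf ("Cold:     %.2f%% (make)  %.2f%% (make -j4)\n",
          percent_faster (normal, cold),
          percent_faster (normal_j4, cold_j4));
  printf ("Hot/Cold: %.2f%% (make)  %.2f%% (make -j4)\n",
          percent_faster (normal, hotcold),
          percent_faster (normal_j4, hotcold_j4));
}
```

Calling print_build_comparison () from a main function shows every patched configuration landing between roughly 2% and 3% ahead of master, with Hot/Cold a fraction of a percent behind Cold in both modes.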
I also tested startup/shutdown with perf:
Performance counter stats for '../emacs-normal -f kill-emacs' (20 runs):

          762.17 msec task-clock:u               #    0.844 CPUs utilized             ( +- 0.23% )
               0      context-switches:u         #    0.000 K/sec
               0      cpu-migrations:u           #    0.000 K/sec
          12,941      page-faults:u              #    0.017 M/sec                     ( +- 0.01% )
   2,998,322,125      cycles:u                   #    3.934 GHz                       ( +- 0.06% )
   1,392,869,413      stalled-cycles-frontend:u  #   46.45% frontend cycles idle      ( +- 0.15% )
     982,206,843      stalled-cycles-backend:u   #   32.76% backend cycles idle       ( +- 0.18% )
   4,874,186,825      instructions:u             #    1.63 insn per cycle
                                                 #    0.29 stalled cycles per insn    ( +- 0.01% )
   1,037,929,374      branches:u                 # 1361.802 M/sec                     ( +- 0.01% )
      17,930,471      branch-misses:u            #    1.73% of all branches           ( +- 0.16% )
   1,209,539,215      L1-dcache-loads:u          # 1586.960 M/sec                     ( +- 0.01% )
      42,346,229      L1-dcache-load-misses:u    #    3.50% of all L1-dcache hits     ( +- 0.05% )
       9,088,647      LLC-loads:u                #   11.925 M/sec                     ( +- 0.29% )
   <not supported>    LLC-load-misses:u

         0.90325 +- 0.00441 seconds time elapsed  ( +- 0.49% )
Performance counter stats for '../emacs.cold -f kill-emacs' (20 runs):

          755.94 msec task-clock:u               #    0.845 CPUs utilized             ( +- 0.24% )
               0      context-switches:u         #    0.000 K/sec
               0      cpu-migrations:u           #    0.000 K/sec
          12,941      page-faults:u              #    0.017 M/sec                     ( +- 0.01% )
   2,976,036,365      cycles:u                   #    3.937 GHz                       ( +- 0.06% )
   1,374,451,779      stalled-cycles-frontend:u  #   46.18% frontend cycles idle      ( +- 0.14% )
     990,227,732      stalled-cycles-backend:u   #   33.27% backend cycles idle       ( +- 0.18% )
   4,878,661,927      instructions:u             #    1.64 insn per cycle
                                                 #    0.28 stalled cycles per insn    ( +- 0.00% )
   1,038,495,525      branches:u                 # 1373.782 M/sec                     ( +- 0.00% )
      17,859,906      branch-misses:u            #    1.72% of all branches           ( +- 0.16% )
   1,209,345,531      L1-dcache-loads:u          # 1599.792 M/sec                     ( +- 0.00% )
      42,444,358      L1-dcache-load-misses:u    #    3.51% of all L1-dcache hits     ( +- 0.06% )
       9,204,368      LLC-loads:u                #   12.176 M/sec                     ( +- 0.41% )
   <not supported>    LLC-load-misses:u

         0.89430 +- 0.00217 seconds time elapsed  ( +- 0.24% )
Performance counter stats for '../emacs.hot-cold -f kill-emacs' (20 runs):

          761.97 msec task-clock:u               #    0.845 CPUs utilized             ( +- 0.20% )
               0      context-switches:u         #    0.000 K/sec
               0      cpu-migrations:u           #    0.000 K/sec
          12,947      page-faults:u              #    0.017 M/sec                     ( +- 0.01% )
   2,989,750,359      cycles:u                   #    3.924 GHz                       ( +- 0.04% )
   1,383,312,275      stalled-cycles-frontend:u  #   46.27% frontend cycles idle      ( +- 0.12% )
     994,643,853      stalled-cycles-backend:u   #   33.27% backend cycles idle       ( +- 0.13% )
   4,879,318,990      instructions:u             #    1.63 insn per cycle
                                                 #    0.28 stalled cycles per insn    ( +- 0.00% )
   1,038,584,045      branches:u                 # 1363.022 M/sec                     ( +- 0.00% )
      17,863,736      branch-misses:u            #    1.72% of all branches           ( +- 0.13% )
   1,209,327,347      L1-dcache-loads:u          # 1587.103 M/sec                     ( +- 0.00% )
      42,501,374      L1-dcache-load-misses:u    #    3.51% of all L1-dcache hits     ( +- 0.05% )
       9,201,311      LLC-loads:u                #   12.076 M/sec                     ( +- 0.28% )
   <not supported>    LLC-load-misses:u

         0.90132 +- 0.00201 seconds time elapsed  ( +- 0.22% )
This again shows a slight improvement with the Cold attributes, and
still shows the Hot attributes degrading performance relative to Cold
alone. Perhaps I was overzealous with the hot tagging?
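One crude way to read the "seconds time elapsed" lines above is to ask whether the reported ranges (mean plus or minus the "+-" figure perf prints) overlap at all. The sketch below does only that; it is not a proper significance test, and the struct and function names are invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* A mean together with the +- spread that perf prints for it.  */
struct measurement
{
  double mean;
  double spread;
};

/* Return true if [mean - spread, mean + spread] of A and of B
   overlap.  */
bool
intervals_overlap (struct measurement a, struct measurement b)
{
  assert (a.spread >= 0 && b.spread >= 0);
  return (a.mean - a.spread <= b.mean + b.spread
          && b.mean - b.spread <= a.mean + a.spread);
}

/* Elapsed-time figures copied from the three perf runs in the
   message.  */
const struct measurement elapsed_normal = { 0.90325, 0.00441 };
const struct measurement elapsed_cold = { 0.89430, 0.00217 };
const struct measurement elapsed_hot_cold = { 0.90132, 0.00201 };
```

With these figures, intervals_overlap (elapsed_normal, elapsed_cold) is false while intervals_overlap (elapsed_normal, elapsed_hot_cold) is true, which is consistent with the Cold build being measurably, if only slightly, faster and the Hot/Cold build being indistinguishable from master on this test.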
Attachment: hot-cold.diff
Description: hot/cold