Re: profiling results
From: Rik
Subject: Re: profiling results
Date: Tue, 13 Aug 2019 10:11:23 -0700

I finally got one of the profiling tools to work well enough to identify
some of the hotspots. I ended up using the Linux kernel tool 'perf':

    perf record -g -p <PID>

When running the benchmark bm.toeplitz.orig.m, I find that tree_evaluator
and tree_index_expression::lvalue seem to be time-consuming routines.

  Children   Self  Samples  Command  Shared Object          Symbol
-   81.31%  0.27%      160  QThread  liboctinterp.so.7.0.0  [.] octave::tree_evaluator::visit_simple_assignment
   - 81.04% octave::tree_evaluator::visit_simple_assignment
      + 50.98% octave::tree_evaluator::evaluate
      + 19.84% octave::tree_index_expression::lvalue
      + 6.86% octave::octave_lvalue::assign
      + 0.72% octave::octave_lvalue::~octave_lvalue
        0.51% octave::octave_lvalue::octave_lvalue

If, instead of a callgraph, I look directly at which functions are
consuming the most time, it does seem that a lot of time is spent
allocating/freeing memory and creating/destroying class objects.

  Overhead  Samples  Command  Shared Object          Symbol
+    8.87%     5253  QThread  libc-2.27.so           [.] cfree@GLIBC_2.2.5
+    5.44%     3217  QThread  libc-2.27.so           [.] malloc
+    5.31%     3140  QThread  liboctinterp.so.7.0.0  [.] octave_value::operator=
+    4.95%     2926  QThread  liboctgui.so.5.0.0     [.] octave_value::~octave_value
+    3.05%     1804  QThread  libc-2.27.so           [.] _int_malloc
+    2.61%     1543  QThread  liboctgui.so.5.0.0     [.] Array<std::__cxx11::basic_string<char, std::char_traits<char>, std:
+    2.41%     1427  QThread  liboctgui.so.5.0.0     [.] octave_value_list::octave_value_list
+    2.32%     1365  QThread  liboctgui.so.5.0.0     [.] Array<octave_value>::~Array

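
To put a rough number on the allocation share, the rows of the flat profile above can be summed per shared object. The following is only a sketch in Python: the row data is copied from the listing, while the helper names (overhead_by_object, allocator_overhead) and the choice to treat every libc entry as allocator time are assumptions made for illustration, not part of perf or Octave.

```python
# Flat-profile rows copied from the perf listing: (overhead %, shared object, symbol).
# The Array<...> symbol is truncated in the perf output and is left as-is.
ROWS = [
    (8.87, "libc-2.27.so", "cfree@GLIBC_2.2.5"),
    (5.44, "libc-2.27.so", "malloc"),
    (5.31, "liboctinterp.so.7.0.0", "octave_value::operator="),
    (4.95, "liboctgui.so.5.0.0", "octave_value::~octave_value"),
    (3.05, "libc-2.27.so", "_int_malloc"),
    (2.61, "liboctgui.so.5.0.0",
     "Array<std::__cxx11::basic_string<char, std::char_traits<char>, std:"),
    (2.41, "liboctgui.so.5.0.0", "octave_value_list::octave_value_list"),
    (2.32, "liboctgui.so.5.0.0", "Array<octave_value>::~Array"),
]


def overhead_by_object(rows):
    """Sum the overhead percentages per shared object (hypothetical helper)."""
    totals = {}
    for pct, dso, _symbol in rows:
        totals[dso] = totals.get(dso, 0.0) + pct
    return {dso: round(total, 2) for dso, total in totals.items()}


def allocator_overhead(rows):
    """Sum the libc rows, assumed here to all be malloc/free work."""
    return round(sum(pct for pct, dso, _symbol in rows if dso.startswith("libc-")), 2)


if __name__ == "__main__":
    for dso, total in overhead_by_object(ROWS).items():
        print(f"{total:6.2f}%  {dso}")
    print(f"{allocator_overhead(ROWS):6.2f}%  allocator (libc rows only)")
```

Only the eight listed symbols are counted, so the totals are a lower bound on each category rather than a full breakdown of the run.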
--Rik