--- Begin Message ---
Subject: |
[Bug c++/14563] octave built under Cygwin very slow |
Date: |
24 Mar 2004 09:52:20 -0000 |
------- Additional Comments From paulthomas2 at wanadoo dot fr 2004-03-24
09:52 -------
Subject: Re: octave built under Cygwin very slow
bangerth at dealii dot org wrote:
>------- Additional Comments From bangerth at dealii dot org 2004-03-14 20:33
>-------
>One of the things you could do is to compile the octave version you
>use with both 3.2.3 and 3.3.3, in each case with profiling information
>(i.e. with -pg) and check the output to see whether there is one single
>function that now takes significantly longer (and thus moved up in
>the output of gprof, in the list of functions sorted by execution time).
>
>If there should be one such function, it might be worth to compile just this
>one function (or file) with the other compiler, to verify that it indeed
>is the problem. This way we could at least localize the problem a little
>better. This would also be of great value to us, since we have no clue
>about the octave functions, and it is hard for us to look at profiling
>output without knowing what all these functions do etc.
>
>If you have identified the function that has the slowdown (assuming that it
>is a single function), then one would rip it out of the program and place
>it (and whatever else it needs) into a small file where main() simply calls
>this function a number of times with dummy arguments. This way we would
>have a simpler and much smaller testcase.
>
>W.
>
>
>
Ben Diedrich and I have done as you suggested. It has to be said that
neither of us have tried profiling anything, let alone a code as large
and as complicated as octave. I did the gcc-3.2 and Ben did the
gcc-3.3. Our machines are nearly identical, both with Windows 2000 and
with the same version of Cygwin.
We ran a small test programme in octave that shows a factor of
approximately 6 difference in execution time, when octave is built with
gcc-3.2 (~1.8-2s) or gcc-3.3 (10-12s). When compiled and linked with
-pg set, the execution time increased to 9s with gcc-3.2 (Ben, what is
the corresponding figure for gcc-3.3?). Switching off the -O2 flags,
increased the time by a further 3s but resulted in a much more detailed
profile. In the enclosed, we increased the number of loops by a factor
of 10, in order to get a reasonable resolution on the less frequently
visited functions. So remember that the corresponding wall-clock time
is 90s for cycle 2.
I have added the most pertinent part of the graphical output. If you
want to see the entirity of the gprof output, I can easily forward it to
you for both builds.
To first order, there is an enormous amount of unaccounted time but both
builds are more or less identical in their time in octave functions.
The only significant difference that I can see is the appearance in the
gcc-3.3 version of
<spontaneous>
[7] 22.3 5.62 0.00 _Unwind_SjLj_Register [7]
<spontaneous>
[9] 12.4 3.11 0.00 _Unwind_SjLj_Unregister [9]
which together take a significant amout of time. What are these calls
and could they be the culprits?
I will be spending a bit of time over the next few days to get a better
feel for profiling and its relationship to the execution times without
-pg set. Whilst the time spent in various routines does change in
correct proportion to the content of the octave code, the absolute
magnitudes are way out.
I hope that this helps to give you a clue. Any advice that you can
offer on the profiling would be gratefully received.
Paul Thomas
Profiles for octave-2.1.50 built using (i) gcc-3.2 and (ii) gcc-3.3
Paul Thomas and Ben Diedrich 03/23/04
Both run with test programme, entered from octave command line:
a = cputime ; tot = 0 ; x = [1:1e6] ; for i = 1:1e6 ; tot = tot + x(i) ; end ;
disp(cputime-a)
The individual lines in the graphical output can be identified with features in
the test programme and the times correspond
well, in proportion, with timing tests done in octave. The absolute times are
way out.
granularity: each sample hit covers 4 byte(s) for 0.04% of 25.18 seconds
index % time self children called name
>>>>>Built with gcc-3.2
[3] 70.7 2.33 6.14 4+16000183 <cycle 2 as a whole> [3]
0.27 2.15 1000017
tree_index_expression::rvalue(int) <cycle 2> [8]
0.29 1.06 3000033 tree_identifier::rvalue(int)
<cycle 2> [12]
0.24 0.58 1000015
tree_simple_assignment::rvalue() <cycle 2> [17]
0.22 0.51 1000022 tree_statement::eval(bool,
int, bool) <cycle 2> [19]
0.19 0.47 1000009+4
tree_binary_expression::rvalue() <cycle 2> [20]
0.28 0.24 1000003
tree_argument_list::convert_to_const_vector(octave_value const*) <cycle 2> [23]
0.21 0.26 3000032 tree_identifier::rvalue()
<cycle 2> [27]
0.14 0.23 1000015
tree_simple_assignment::rvalue(int) <cycle 2> [32]
0.17 0.13 1000006
tree_statement_list::eval(bool, int) <cycle 2> [36]
0.09 0.17 1000005
make_value_list(tree_argument_list*, string_vector const&, octave_value const*)
<cycle 2> [40]
0.15 0.09 1000014
tree_index_expression::rvalue() <cycle 2> [45]
0.02 0.19 1000000
tree_simple_for_command::do_for_loop_once(octave_lvalue&, octave_value const&,
bool&) <cycle 2> [47]
0.06 0.06 1
tree_simple_for_command::eval() <cycle 2> [64]
0.00 0.00 2
octave_user_function::do_multi_index_op(int, octave_value_list const&) <cycle
2> [184]
0.00 0.00 3
octave_value::do_multi_index_op(int, octave_value_list const&) <cycle 2> [208]
0.00 0.00 2
tree_parameter_list::convert_to_const_vector(tree_va_return_list*) <cycle 2>
[242]
0.00 0.00 2 tree_if_command::eval() <cycle
2> [1079]
0.00 0.00 2 tree_if_command_list::eval()
<cycle 2> [1084]
0.00 0.00 2 tree_if_clause::eval() <cycle
2> [1077]
---
>>>>>Built with gcc-3.3
[3] 42.7 3.07 7.68 3+16000180 <cycle 2 as a whole> [3]
0.36 2.66 1000017
tree_index_expression::rvalue(int) <cycle 2> [10]
0.36 1.27 3000032 tree_identifier::rvalue(int)
<cycle 2> [14]
0.41 0.88 1000015
tree_simple_assignment::rvalue() <cycle 2> [17]
0.25 0.66 1000021 tree_statement::eval(bool,
int, bool) <cycle 2> [21]
0.13 0.64 1000009+4
tree_binary_expression::rvalue() <cycle 2> [25]
0.26 0.31 1000015
tree_simple_assignment::rvalue(int) <cycle 2> [29]
0.29 0.28 1000003
tree_argument_list::convert_to_const_vector(octave_value const*) <cycle 2> [30]
0.27 0.29 3000032 tree_identifier::rvalue()
<cycle 2> [31]
0.30 0.13 1000005
tree_statement_list::eval(bool, int) <cycle 2> [35]
0.17 0.17 1000005
make_value_list(tree_argument_list*, string_vector const&, octave_value const*)
<cycle 2> [44]
0.05 0.24 1000000
tree_simple_for_command::do_for_loop_once(octave_lvalue&, octave_value const&,
bool&) <cycle 2> [49]
0.10 0.10 1000014
tree_index_expression::rvalue() <cycle 2> [63]
0.12 0.06 1
tree_simple_for_command::eval() <cycle 2> [69]
0.00 0.00 2
octave_user_function::do_multi_index_op(int, octave_value_list const&) <cycle
2> [186]
0.00 0.00 2
tree_parameter_list::convert_to_const_vector(tree_va_return_list*) <cycle 2>
[233]
0.00 0.00 2
octave_value::do_multi_index_op(int, octave_value_list const&) <cycle 2> [1058]
0.00 0.00 2 tree_if_command::eval() <cycle
2> [1068]
0.00 0.00 2 tree_if_command_list::eval()
<cycle 2> [1074]
0.00 0.00 2 tree_if_clause::eval() <cycle
2> [1066]
>>>>Lines that do not appear in the version built with gcc-3.2
<spontaneous>
[7] 22.3 5.62 0.00 _Unwind_SjLj_Register [7]
<spontaneous>
[9] 12.4 3.11 0.00 _Unwind_SjLj_Unregister [9]
<spontaneous>
[11] 10.2 2.56 0.00 operator new(unsigned int) [11]
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14563
------- You are receiving this mail because: -------
You reported the bug, or are watching the reporter.
--- End Message ---