exporting patch: # HG changeset patch # User Daniel Kraft # Date 1313764807 -7200 # Node ID fb0f40cd8752b86dee500338bf8f447f7be8856a # Parent 9db4e9b352d6a3e158f26503723def0ff907e77d Add an example profiler session to the manual. debug.txi: New section which walks through an example profiler session. diff -r 9db4e9b352d6 -r fb0f40cd8752 doc/interpreter/debug.txi --- a/doc/interpreter/debug.txi Fri Aug 19 16:39:09 2011 +0200 +++ b/doc/interpreter/debug.txi Fri Aug 19 16:40:07 2011 +0200 @@ -36,6 +36,7 @@ * Debug Mode:: * Call Stack:: * Profiling:: +* Profiler Example:: @end menu @node Entering Debug Mode @@ -224,3 +225,195 @@ also found out that it is recursive. @DOCSTRING(profshow) + address@hidden Profiler Example address@hidden Profiler Example + +Below, we will give a short example of a profiler session. +See also @ref{Profiling} +for the documentation of the profiler functions in detail. +Consider the code: + address@hidden +global N A; + +N = 300; +A = rand (N, N); + +function xt = timesteps (steps, x0, expM) + global N; + + if (steps == 0) + xt = NA (N, 0); + else + xt = NA (N, steps); + x1 = expM * x0; + xt(:, 1) = x1; + xt(:, 2 : end) = timesteps (steps - 1, x1, expM); + endif +endfunction + +function foo () + global N A; + + initial = @@(x) sin (x); + x0 = (initial (linspace (0, 2 * pi, N)))'; + + expA = expm (A); + xt = timesteps (100, x0, expA); +endfunction + +function fib = bar (N) + if (N <= 2) + fib = 1; + else + fib = bar (N - 1) + bar (N - 2); + endif +endfunction address@hidden example + +If we execute the two ``main'' functions, we get: + address@hidden +tic; foo; toc; address@hidden Elapsed time is 2.37338 seconds. + +tic; bar (20); toc; address@hidden Elapsed time is 2.04952 seconds. address@hidden example + +But this does not give much information about where this time is spent; +for instance, whether the single call to @code{expm} is more expensive or +the recursive time-stepping itself. To get a more detailed picture, +we can use the profiler. + address@hidden +profile on; +foo; +profile off; + +data = profile ('info'); +profshow (data, 10); address@hidden example + +This prints a table like: + address@hidden + # Function Attr Time (s) Calls +--------------------------------------------- + 7 expm 1.034 1 + 3 binary * 0.823 117 + 41 binary \ 0.188 1 + 38 binary ^ 0.126 2 + 43 timesteps R 0.111 101 + 44 NA 0.029 101 + 39 binary + 0.024 8 + 34 norm 0.011 1 + 40 binary - 0.004 101 + 33 balance 0.003 1 address@hidden example + +The entries are the individual functions which have been executed (only the +10 most important ones), together with some information for each of them. +The entries like @samp{binary *} denote operators, while other +entries are ordinary functions --- they include both built-ins like @code{expm} +and our own routines (for instance @code{timesteps}). From this profile, we can +immediately deduce that @code{expm} uses up the largest proportion of the +processing time, even though it is only called once. The second expensive +operation is the matrix-vector product in the routine @code{timesteps}. +(We only know it is the binary multiplication operator, but fortunately this +operator appears only at one place in the code and thus we know which +occurance takes so much time. If there were multiple places, we would have +to use the hierarchical profile to find out the exact place which +uses up the time --- which is not covered in this example.) + +But timing is not the only information available from the profile --- the +attribute column shows us that @code{timesteps} calls itself recursively. +This may not be that exciting in this example (since it's clear anyway), +but could be helpful in a more complex setting. If you wonder why +there appears a @samp{binary \} in the output, we can easily shed some light +on that, too. Note that @code{data} is an Octave structure which contains +the field @code{FunctionTable}. This stores the ``raw data'' for the profile +shown. The number in the first column of the table gives the index under which +the shown function can be found there. Looking up @code{data.FunctionTable(41)} +gives: + address@hidden + scalar structure containing the fields: + + FunctionName = binary \ + TotalTime = 0.18765 + NumCalls = 1 + IsRecursive = 0 + Parents = 7 + Children = [](1x0) address@hidden example + +Here we see the information from the table again, but have additional fields address@hidden and @code{Children}. Those are both arrays, which contain +the indices of functions which have directly called the function in question +(which is entry 7, @code{expm}, in this case) or been called by it (no +functions). Hence, the backslash operator has been used internally +by @code{expm}. + +Ok, now let's take a look at @code{bar}. For this, we start a fresh +profiling session (@code{profile on} does this, the old data is removed before +the profiler is restarted): + address@hidden +profile on; +bar (20); +profile off; + +profshow (profile ('info')); address@hidden example + +This gives: + address@hidden + # Function Attr Time (s) Calls +------------------------------------------------------- + 1 bar R 2.091 13529 + 2 binary <= 0.062 13529 + 3 binary - 0.042 13528 + 4 binary + 0.023 6764 + 5 profile 0.000 1 + 8 false 0.000 1 + 6 nargin 0.000 1 + 7 binary != 0.000 1 + 9 __profiler_enable__ 0.000 1 address@hidden example + +Unsurprisingly, @code{bar} is also recursive. It has been called 13,529 times +in the course of recursively calculating the Fibonacci number in a +not-that-optimal way, and most of the time was spent in @code{bar} itself. + +Finally, let's say we want to profile the execution of both @code{foo} +and @code{bar} together. Since we already have the runtime data collected +for @code{bar}, we can restart the profiler without clearing the existing +data and collect the missing statistics about @code{foo}. This is done by: + address@hidden +profile resume; +foo; +profile off; + +profshow (profile ('info'), 10); address@hidden example + +As you can see in the table below, now we have both profiles ``mixed together''. + address@hidden + # Function Attr Time (s) Calls +--------------------------------------------- + 1 bar R 2.091 13529 + 16 expm 1.122 1 + 12 binary * 0.798 117 + 46 binary \ 0.185 1 + 45 binary ^ 0.124 2 + 48 timesteps R 0.115 101 + 2 binary <= 0.062 13529 + 3 binary - 0.045 13629 + 4 binary + 0.041 6772 + 49 NA 0.036 101 address@hidden example