Re: [Help-gsl] gsl performance


From: Simon Zehnder
Subject: Re: [Help-gsl] gsl performance
Date: Sun, 6 Oct 2013 22:38:33 +0200

I will reply between the lines.

On Oct 6, 2013, at 10:04 PM, onefire <address@hidden> wrote:

> I am aware of those. I tried the following: 
> 1) Call gsl_set_error_handler_off() to turn off the error handler
> 2) Compile with -DHAVE_INLINE  and -DGSL_RANGE_CHECK_OFF
> 3) Link to a different cblas (actually I tried openblas, mkl and gslcblas).
> 
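Just to make that setup concrete, this is roughly how those pieces fit
together (a minimal sketch; the empty main body is only a placeholder for
whatever is being benchmarked):

    /* Compile with e.g. -O2 -DHAVE_INLINE -DGSL_RANGE_CHECK_OFF so that
       gsl_vector_get/gsl_vector_set are inlined and range checks vanish. */
    #include <gsl/gsl_errno.h>
    #include <gsl/gsl_vector.h>

    int main()
    {
        /* Returns the previous handler; afterwards GSL routines report
           problems through their return codes instead of aborting. */
        gsl_error_handler_t *old_handler = gsl_set_error_handler_off();
        (void) old_handler;

        /* ... run the minimization benchmark here ... */
        return 0;
    }
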
> However, my most interesting experiment was to modify my minimization 
> algorithms to accept dynamic arrays. Originally, my function was using 
> std::array to store the values of the independent variables when minimizing a 
> multidimensional function. My experiment mimicked my use case: solve the 
> same problem millions of times (to be precise, a million times). When I used 
> dynamic arrays, my algorithm became much slower because of all the calls to 
> malloc/new and free/delete. So I modified my algorithm to accept a 
> gsl_vector. Guess what happened? My library became slower than gsl! I am not 
> sure if this is because my Nelder-Mead implementation is different from gsl's 
> (which I did not look at) or if it's just the extra overhead of making the 
> wrapper.
> 
Here, profiling - if that is possible - could give clearer insights into which 
gsl functions take the most time.
> My point is: the huge differences that I observed were because of gsl_vector, 
> which is not efficient for small arrays (stack memory is faster and you avoid 
> the cost of malloc/free). So I think that the gsl minimization algorithms 
> could have a version that uses stack arrays. I should not have to pay the 
> cost of dynamic allocation if I am minimizing a function of 5 variables (a 
> typical Nelder-Mead implementation needs a simplex of 6 points, which, even 
> including the function values, fits in an array of 36 elements - fine for the 
> stack). I think that gsl could have alternative implementations that use 
> stack memory (with a warning not to use them when the problem is big).
> 
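One thing that already works with GSL as it is today is to wrap stack (or
otherwise pre-allocated) memory in a view, so that at least no malloc/free
happens per solve. A minimal sketch, assuming a 5-variable problem (the
minimizer's own workspace is still allocated once on the heap, but the
vectors you pass in need not be):

    #include <gsl/gsl_vector.h>

    void demo()
    {
        double buf[5] = { 1.0, 1.0, 1.0, 1.0, 1.0 };  /* lives on the stack */

        /* gsl_vector_view_array builds a gsl_vector header around existing
           memory; no gsl_block and no heap allocation are involved. */
        gsl_vector_view x = gsl_vector_view_array(buf, 5);

        /* &x.vector can be passed to any routine expecting a gsl_vector *,
           e.g. as the starting point in gsl_multimin_fminimizer_set(). */
        gsl_vector_set(&x.vector, 0, 2.0);
    }
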
Interesting! Indeed, in most statistical problems I work on, around 6 
parameters is also a good average; physicists and engineers may have larger 
parameter vectors. The main goal of a library like GSL is internal consistency 
of its objects: it defines the gsl_vector and uses it everywhere in its 
methods, and that is how consistency is achieved. Performance, I would guess, 
is not its main goal. Still, providing functions that work with STL objects - 
which C++ developers use all the time - is not a bad idea, and, as you report, 
it improves performance. It would also make it easier to embed GSL into 
self-developed code, as the step from your own code to the GSL methods would 
be smoother. For another performance benchmark you could try the Nelder-Mead 
implementation from nlopt, which works with std::vector.
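Its C++ interface looks roughly like this (LN_NELDERMEAD is nlopt's
derivative-free Nelder-Mead variant; the sphere objective is only a dummy):

    #include <nlopt.hpp>
    #include <vector>

    /* nlopt's C++ objective signature; grad stays empty for derivative-free
       algorithms such as Nelder-Mead. */
    double sphere(const std::vector<double> &x, std::vector<double> &grad,
                  void *data)
    {
        double s = 0.0;
        for (double xi : x) s += xi * xi;
        return s;
    }

    int main()
    {
        nlopt::opt opt(nlopt::LN_NELDERMEAD, 5);
        opt.set_min_objective(sphere, nullptr);
        opt.set_xtol_rel(1e-8);

        std::vector<double> x(5, 1.0);   /* starting point */
        double minf = 0.0;
        opt.optimize(x, minf);           /* x and minf hold the result */
        return 0;
    }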

> The other issue is: the implementation of gsl_vector just seems inefficient 
> to me. Looking at the code, it seems that a single vector requires three 
> calls to malloc and free (one for the data, one for the gsl_block, and one 
> for the gsl_vector itself). The manual states that the block is there for 
> "consistency", and I can see how memory management becomes easier with it. 
> But it seems to be a case of generality at the expense of performance. Also, 
> the stride and the owner flag are part of the gsl_vector object to make it 
> work with vector views, but then people who never need views pay the 
> performance price anyway. 
> 
Again, I would assume the main development goal was internal consistency.
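For reference, the vector type itself looks roughly like this (paraphrased
from gsl/gsl_vector_double.h, so treat it as an excerpt rather than exact
code); it shows where the stride, block and owner overhead you mention lives:

    typedef struct
    {
        size_t size;       /* number of elements */
        size_t stride;     /* step between elements, needed for views */
        double *data;      /* points into block->data (or foreign memory) */
        gsl_block *block;  /* owning block, allocated separately */
        int owner;         /* non-zero if the vector should free the block */
    } gsl_vector;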

> All of these issues are not likely to matter for big vectors, but they do 
> make a large difference when you are dealing with small objects.
> 
> Am I the only one who thinks like this?
> 
> Gilberto  
> 
> PS: As a side note, I prefer Eigen over Armadillo. In my experience, the 
> former is almost always faster.   
> 
Thanks for the suggestion; I have heard about it but never tried it. I will 
give it a shot.
> 
> On Sun, Oct 6, 2013 at 5:09 AM, Simon Zehnder <address@hidden> wrote:
> Hi Gilberto,
> 
> congrats on your performance results!
> 
> A first guess at why the gsl library performs worse would be its error 
> handling. The GSL is made for 'users', and this means that a user has to be 
> informed about the kind of error he encounters. This can be left out if you 
> have your own code and know your own code: if something goes wrong, you know 
> where to look for the place the error occurred and what the reasons could 
> be. A 'user' who is just using a library could be helpless when he 
> encounters an error and does not know where it occurred. So checking and 
> error catching could be a reason that makes the gsl less performant but more 
> appropriate for 'easy' usage.
> 
> I use the C++ linear algebra library Armadillo a lot. This library has, for 
> example, two different element accessors: '()', which does bounds checking, 
> and 'at()', which does not. Performance sometimes increases tremendously 
> when using the 'at()' method.
> 
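To illustrate (arma::vec and the two accessors are standard Armadillo; the
loop is only a toy):

    #include <armadillo>

    double sum_twice(const arma::vec &v)
    {
        double s = 0.0;
        for (arma::uword i = 0; i < v.n_elem; ++i)
        {
            s += v(i);     /* operator(): bounds-checked unless ARMA_NO_DEBUG */
            s += v.at(i);  /* .at(): no bounds check, cheaper in tight loops */
        }
        return s;
    }
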
> Best
> 
> Simon
> 
> 
> On Oct 6, 2013, at 12:44 AM, onefire <address@hidden> wrote:
> 
> > Hi all,
> >
> > I am creating a personal library for my C++ projects and to evaluate its
> > performance I decided to compare it to gsl and, to my surprise, my library
> > is much faster than gsl in many cases. For example, my Nelder-Mead
> > implementation can be 5 or 10 times faster for some functions.
> >
> > This has been bothering me a lot because:
> > 1) My tests are correct, and I checked the results against output from
> > Scipy and Octave.
> > 2) I am certainly not smarter than the many people who worked on gsl over
> > the years.
> > 3) I think I know how to use gsl "properly".
> >
> > Sorry I do not have an example, but my library is not polished enough for
> > me to publish it yet.
> >
> > What I am really intrigued about is that I googled around and it seems
> > that many people noticed (before me) that many gsl implementations are
> > inefficient. I also found some posts on the mailing lists with things like
> > "gsl is much slower than Matlab" and responses like "gsl is not meant to be
> > the fastest".
> >
> > These things lead me to believe that gsl is "slow" by design. If this is
> > the case, can someone explain to me why this is so?
> >
> > Gilberto
> >
> > PS: Much of the gain with C++ comes from better inlining, which can be
> > especially important with things like function optimization (it is hard, if
> > not impossible, to inline when there is a function pointer passed around).
> > This is why std::sort can be much faster than qsort. However, I am
> > confident that it is possible to write faster implementations (in C) than
> > some of the ones in gsl.
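
The inlining point in the PS can be made concrete with a small sketch:
std::qsort only receives an opaque function pointer, while std::sort sees the
lambda's concrete type as part of its template instantiation and can inline
the comparison into the sorting loop:

    #include <algorithm>
    #include <cstdlib>
    #include <vector>

    /* qsort comparator: called through a function pointer, so the compiler
       usually cannot inline it into the sorting loop. */
    static int cmp_double(const void *a, const void *b)
    {
        const double x = *static_cast<const double *>(a);
        const double y = *static_cast<const double *>(b);
        return (x > y) - (x < y);
    }

    void sort_both(std::vector<double> &v, std::vector<double> &w)
    {
        std::qsort(v.data(), v.size(), sizeof(double), cmp_double);

        /* The comparison lambda is typically inlined here. */
        std::sort(w.begin(), w.end(), [](double a, double b) { return a < b; });
    }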
> 
> 



