
From: Max Belushkin
Subject: Re: SV: [Help-gsl] On conjugate gradient algorithms in multidimensional minimisation problems.
Date: Tue, 29 Nov 2005 23:47:23 +0100
User-agent: Thunderbird 1.5 (X11/20051025)

James, thank you, I will certainly give it a go in the next couple of days and will let you know how it works out. Most of the minimization problems in my field (theoretical nuclear physics) are very sensitive to some parameters and very insensitive to others - and that can "switch around" at some points in parameter space: parameters that "used to be important" become next to irrelevant in that region, while others pick up in strength. Thank you very much for the link, and, as mentioned, I will update you on the results!

James Bergstra wrote:
Two things:

Hmm, from what I read on the conjugate gradient algorithms, it's supposed to [basically] be a non-orthogonal version of the steepest descent. But that picks the direction just based on the current gradient, which is usually bad. Conjugate gradient picks the direction based on the current gradient AND the previous gradient. Which means it

CG algorithms work by repeated line-minimization in R^n.  The minimizer picks a
point, your function comes back with the gradient at that point, and then the
minimizer finds a step-size that minimizes the value of your function along that
line.  If your function is more nonlinear in some directions than in others,
then this algorithm might not work very well.


However, I have never heard that it varies only one parameter at a time - if that is the case, and it is the reason for the plateaus, I really need to go searching for a better minimization routine, as that would not be suitable with so many parameters, which, yes, have very different influences on the overall chi squared.

If this is the case with your function (or even if it is not) I'd appreciate it
if you tried out the minimization routine called

that is implemented in my (recently-cleaned) libmsl

I've only ever used it on a single type of minimization (neural-network gradient descent), and I'd appreciate it if you let me know whether it works well on your function too. It is designed to simultaneously optimize many dimensions whose degrees of non-linearity differ greatly. It does this by continually estimating the diagonal of the Hessian, to make more efficient use of the gradient.

On Tue, Nov 29, 2005 at 10:47:17PM +0100, Max Belushkin wrote:
"knows" about the first step direction at any step, which, I guess, is why it's good to restart it from time to time.

Søren Christensen wrote:
Hello Max,
Regarding part two of your question: I am no mathematician either, but could it be that the plateau you see is not actually a plateau, but that one (or more) of your parameters has a very small influence on the RSQ - e.g. several orders of magnitude smaller than the others? So that

19.9809 -> ... <stays basically idle for some hundreds of iterations> -> 19.97 -> 19.5 -> ...

is really

19.9809 -> ... -> 19.9809000040 -> 19.9809000036 -> 19.9809000003 -> ...

I am not too familiar with the details of the minimizer, but I seem to recall it minimizes over one parameter at a time.

Is that a possibility?


From: Max Belushkin
Sent: Wed 30-11-2005 06:50
To: address@hidden
Subject: [Help-gsl] On conjugate gradient algorithms in multidimensional minimisation problems.

   Hello list,

I apologize in advance for this rather long email; it is a collected description of a common problem that I have failed to understand, having tried for many months now.

Over the last couple of years, I have made some observations on conjugate gradient algorithms that I would like to seek some advice about. This experience relates to gsl-1.6 and, lately, gsl-1.7.

-- Part 1 --

The typical problem at hand has a parameter space of 20 dimensions or more. That is, a chi-squared function is built that depends on 20 or more independent parameters passed to the minimization routine, and this function is to be minimized.

The most common problem is the minimizer eventually bailing out with an error of "iterations not progressing toward a solution" (the exact message may be slightly different, I do not remember it off-hand). Thus, I took to restarting the minimizer with gsl_multimin_fdfminimizer_restart. However, that still does not help. Here is what happens. If we take a toy sample from the examples and modify it a little, we get:

tryagain:
do {
    iter++;
    status = gsl_multimin_fdfminimizer_iterate (s);
    if (status) {
        printf ("Restarting minimizer point 1\n");
        gsl_multimin_fdfminimizer_restart (s);
        goto tryagain;
    }
    status = gsl_multimin_test_gradient (s->gradient, 1e-3);
    if (status == GSL_SUCCESS) {
        printf ("converged to minimum\n");
        break;
    }
    if (status != GSL_CONTINUE) {
        gsl_multimin_fdfminimizer_restart (s);
        printf ("Restarted minimizer point 2\n");
        status = GSL_CONTINUE;
    }
} while (status == GSL_CONTINUE && iter < 10000);

What happens is that we always end up at "Restarting minimizer point 1", and status is always "iterations not progressing toward a solution". Removing the goto has the same effect, although I suspect [without any proof, just a feeling] that it also interferes with the calls to test_gradient in some way.

Now, if we remove the restart at point 1 and break there, then take the parameter vector we ended up with, plug it in, and start the program all over again, it will run for another 5000 iterations or so, then bail out with the same error at the same point. Repeat. And repeat...

At this point, I was somehow thinking it may be an issue with my parameter space. However, recently, my colleague, who has some code in Fortran, asked me to come up with a C library employing GSL routines that would minimize his problem, which is completely different from mine, but the number of parameters is also large, around 15. So I came up with a module that could be linked with a Fortran program, and the exact same phenomenon appears there.

I must admit I am no great expert in the mathematics underlying these methods (Fletcher-Reeves, Polak-Ribiere and Broyden-Fletcher-Goldfarb-Shanno give exactly the same results, up to an arbitrary constant in the number of iterations), but I do know they have to be restarted from time to time. I tried things like "if (iter % 50) gsl_multimin_fdfminimizer_restart(s);". No effect. So why does restarting the iteration all over (restarting the program) from that point work, and sometimes reduce the chi squared quite significantly over several such runs? And is there any way to force the minimizer to keep going, or to get more control over the convergence process?

-- Part 2 --

This is somewhat related to Part 1. If I look at the convergence by the value chi squared takes per iteration, a typical [example] picture looks like this:
50 -> 30 -> 20.1 -> 20.0 -> 19.99 -> 19.981 -> 19.9809 -> ...
<stays basically idle for some hundreds of iterations>
-> 19.97 -> 19.5 -> 19.0 -> 18.3 -> 17.8 -> 17.79 -> ...
<hangs here for another century> -> ...

The iteration process shows huge plateaus. What can this be an indication of? It is very common and appears on a wide range of problems, so I assume it is a well-known phenomenon. Perhaps someone has a way of speeding up the iteration so that it does not sit on those plateaus where nothing much seems to be happening?

I apologize once again for the long email, and would be most grateful for any help on this subject.


Help-gsl mailing list
