[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: SV: [Help-gsl] On conjugate gradient algorithms in multidimentional
From: |
James Bergstra |
Subject: |
Re: SV: [Help-gsl] On conjugate gradient algorithms in multidimentional minimisation problems. |
Date: |
Tue, 29 Nov 2005 17:33:52 -0500 |
User-agent: |
Mutt/1.4.2.1i |
Two things:
1.
> Hmm, from what I read on the conjugate gradient algorithms, it's
> supposed to [basically] be a non-orthogonal version of the steepest
> descent. But that picks the direction just based on the current
> gradient, which is usually bad. Conjugate gradient picks the direction
> based on the current gradient AND the previous gradient. Which means it
CG algorithms work by repeated line-minimization in R^n. The minimizer picks a
point, your function comes back with the gradient at that point, and then the
minimizer finds a step-size that minimizes the value of your function along that
line. If your function is more nonlinear in some directions than in others,
then this algorithm might not work very well.
2.
> However, I never heard it only varies one parameter at a time - if
> that is the case, and is the reason for the plateaus, I really need to
> go searching for a better minimization routine, as this would not be
> suitable with so many parameters, which, yes, have very different
> influence on the overall chi squared.
If this is the case with your function (or even if it is not) I'd appreciate it
if you tried out the minimization routine called
msl_multimin_fdfminimizer_deltabardelta
that is implemented in my (recently-cleaned) libmsl
http://savannah.nongnu.org/projects/libmsl
I've only ever used it on a single type of minimization (neural network gradient
descent) and I'd appreciate it if you let me know if it works well on your
function too. It is designed to simultaneously optimize many dimensions in
which the non-linearity in different dimensions are very different. It does
this by continually estimating the diagonal of the Hessian, to make more
efficient use of the gradient.
On Tue, Nov 29, 2005 at 10:47:17PM +0100, Max Belushkin wrote:
> "knows" about the first step direction at any step, which, I guess, is
> why it's good to restart it from time to time.
>
>
> SÃ¸ren Christensen wrote:
> >Hello Max,
> >To the part two of your question.
> >I am no matematician either, but could it be that when you see the
> >mentioned plateau it is not actually a plateau, but one (or more) of
> >your parameters has a very small influence on the RSQ? Eg several order
> >of magnitudes smaller than others? 19.9809->....<keep basically idle
> >for some 100s of iterations>->19.97->19.5->
> >
> >is really
> >19.9809->....<19.9809000040 -> 19.9809000036 -> 19.9809000003 .....
> >->19.97->19.5->
> >
> >I am not too much into the details of the minimizing, but i seem to
> >recall it minimizes over one parameter at a time.
> >
> >Is that a possibility?
> >
> >Cheers
> >Soren
> >
> >
> >
> >Fra: Max Belushkin
> >Sendt: on 30-11-2005 06:50
> >Til: address@hidden
> >Emne: [Help-gsl] On conjugate gradient algorithms in multidimentional
> >minimisation problems.
> >
> >
> > Hello list,
> >
> > I apologize in advance for this rather long email, it is a collective
> >description of the common problem I have failed to gain an understanding
> >of, having tried for many months now.
> >
> > Over the last couple of years, I have made some observations on
> >conjugate gradient algorithms that I would like to seek some advice
> >about. This experience is related to gsl-1.6 and gsl-1.7, as of late.
> >
> >-- Part 1 --
> >
> > The typical problem at hand is a parameter space of 20 dimensions or
> >more. That is, a chi-squared function is built, which is to be
> >minimized, that depends on 20 or more independent parameters passed to
> >the minimization routine.
> >
> > The most common problem is the minimizer eventually bailing out with
> >an error of "iterations not progressing toward a solution" (the exact
> >message may be a bit different, I do not remember off-hand). Thus, I
> >took up to restarting the minimizer with
> >gsl_multimin_fdfminimizer_restart. However, it still does not help. Here
> >is what happens. If we take a toy sample from the examples, and modify
> >it a little, we get:
> >do {
> >tryagain:
> > iter++;
> > status = gsl_multimin_fdfminimizer_iterate(s);
> > if (status) {printf ("Restarting minimizer point 1\n");
> >gsl_fdfminimizer_restart(s); goto tryagain;};
> > status = gsl_multimin_test_gradient (s->gradient, 1e-3);
> > if (status == GSL_SUCCESS) { printf ("converged to minimum\n"); break;};
> > if (status!=GSL_CONTINUE) { gsl_fdfminimizer_restart(s); printf
> >("Restarted minimizer point 2\n"); status = GSL_CONTINUE;};
> >} while (status == GSL_CONTINUE && iter < 10000);
> >
> > what happens is that we will always sit on "Restarting minimizer point
> >1", and status will always be "iterations not progressing toward a
> >solution". Removing the goto has the same effect, only I [without any
> >proof, just a feeling] believe it should also ruin the calls to
> >test_gradient in some way.
> >
> > Now, if we remove the restart at point 1 and break there, then take
> >the parameter vector we ended up with, plug it in, and start the program
> >all over again, it will run for another 5000 iterations or so, then bail
> >out with the same error at the same point. Repeat. And repeat...
> >
> > At this point, I was somehow thinking it may be an issue with my
> >parameter space. However, recently, my colleague, who has some code in
> >Fortran, asked me to come up with a C library employing GSL routines
> >that would minimize his problem, which is completely different from
> >mine, but the number of parameters is also large, around 15. So I came
> >up with a module that could be linked with a Fortran program, and the
> >exact same phenomenon appears there.
> >
> > I must admit I am no huge expert in the mathematics underlying the
> >methods (Fletcher-Reeves, Polak-Ribiere and
> >Broyden-Fletcher-Goldfarb-Shanno give the exact same results, up to an
> >arbitrary constant in terms of the number of iterations), but I do know
> >they have to be restarted from time to time. I did things like "if (iter
> >% 50) gsl_fdfminimizer_restart(s)". No effect. So why does restarting
> >the iteration all over (restarting the program) from that point works,
> >and sometimes reduces the chi squared quite significantly over several
> >such runs? And is there any way to force the minimizer to keep going, or
> >have any more control over the convergence process?
> >
> >-- Part 2 --
> >
> > This is somewhat related to Part 1. If I look at the convergence by
> >the value chi squared takes per iteration, a typical [example] picture
> >looks like this:
> >50->30->20.1->20.0->19.99->19.981->19.9809->....<keep basically idle
> >for some 100s of iterations>->19.97->19.5->19.0->18.3->
> >17.8->17.79->....<hang here for another century>...->...
> >
> > The iteration process shows huge plateaus. What can this be an
> >indication of? This is also very common and is seen on a wide range of
> >problems, thus I am assuming this is a well known fact. Perhaps someone
> >has a solution of speeding up the iteration and not sitting on those
> >plateaus where nothing seems to be happening much?
> >
> > I apologize once again for the long email, and would be most grateful
> >for any help on this subject.
> >
> > Thanks!
> >
> >
> >
> >_______________________________________________
> >Help-gsl mailing list
> >address@hidden
> >http://lists.gnu.org/mailman/listinfo/help-gsl
> >
>
>
> _______________________________________________
> Help-gsl mailing list
> address@hidden
> http://lists.gnu.org/mailman/listinfo/help-gsl
--
james bergstra
http://www-etud.iro.umontreal.ca/~bergstrj