help-gsl
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Help-gsl] 1st attempt at fitting a Gaussian function with GSL.


From: David Eric Miller
Subject: Re: [Help-gsl] 1st attempt at fitting a Gaussian function with GSL.
Date: Wed, 21 Dec 2005 16:16:03 -0600
User-agent: Opera M2/8.50 (Linux, build 1358)

Thank you Joakim,

Let me begin by restating your instructions as I understood them.
(These are parts A and B below.)  And then attempt to continue with
the fit.

  Please note that the "code" below is full of syntax errors.
The attachment is much cleaner.  It nearly complies except for the
error:

 g++ -Wall -L/usr/local/lib testGauss.cc -lgsl -lgslcblas -lm -o FitGauss

testGauss.cc: In function `int FitGauss(data*, double*, double*, double&)':
 testGauss.cc:194: `gsl_matirx_get' undeclared (first use this
function)
 testGauss.cc:194: (Each undeclared identifier is reported only
once for each function it appears in.)


Which I do not understand.




#include <stdlib.h>
#include <stdio.h>
#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>
#include <gsl/gsl_vector.h>
#include <gsl/gsl_blas.h>
#include <gsl/gsl_multifit_nlin.h>

The fitting setup:
  A.) uses a data structure of the following form:

struct data {
  size_t n;             // Number of data points.
  double *xPos;        // The {x,y,y error} data set.
  double *yVal;
  double *yErrors;
};


  B.) uses a gaussian function of form:  //gsl_multifit_function
    1.)
int gaussian_f (
  const gsl_vector *x, // Vector of fit parameters.
  void *params, // Pointers to the data struct members.
  gsl_vector *differences  // Pointer to fit residuals.
  )

    2.) The body of which calculates the residuals.

{
  size_t n = ((struct data *)params)->n;
  double *xPos = ((struct data *)params)->xPos;
  double *yVal = ((struct data *)params)->yVal;
  double *sigma = ((struct data *) params)->yErrors;

  double Amplitude = gsl_vector_get (x, 0);
  double mean = gsl_vector_get (x, 1);
  double std_dev = gsl_vector_get (x, 2);
  double scale = -0.5 / (std_dev * std_dev); //scale factor in exp.

  for (size_t i = 0; i < n; i++)
    { // My Gaussian distribution is NOT normalized.
      /* Model Yi =
         Amplitude * exp(-(t - mean)*(t - mean) / (2. * variance)) */
      double t = xPos[i] - mean;
      double Yi = Amplitude * exp (scale * t * t);
      gsl_vector_set (differences, i, (Yi - yVal[i])/sigma[i]);
    }

  return GSL_SUCCESS;
}





  C.) So, then is it correct that to use derivatives I need to
define a function of type "gsl_multifit_function_df"?
    1.) Where gsl_multifit_function_df is of following form:

int gaussian_df (
  const gsl_vector *x, // Vector of fit parameters.
  void *params, // Pointers to the data struct members.
  gsl_matrix *J  // Pointer to returned Jacobian Matrix.
  )

    2.) The body returns the Jacobian Matrix as below.
{
  size_t n = ((struct data *)params)->n;
  double *yVal = ((struct data *)params)->yVal;
  double *sigma = ((struct data *) params)->yErrors;

  double Amplitude = gsl_vector_get (x, 0);
  double mean = gsl_vector_get (x, 1);
  double std_dev = gsl_vector_get (x, 2);
  double scale = -0.5 / (std_dev * std_dev); // scale factor in exp.

  for (size_t i = 0; i < n; i++)
  {
    double t = yVal[i] - mean;
    double df_dA = exp( t * t / scale ) / sigma[i];
    double df_dMean =
      Amplitude * t * df_dA / std_dev / std_dev ;
    double df_dStd_dev =
      df_dMean * t / std_dev;
    gsl_matrix_set (J, i, 0, df_dA);
    gsl_matrix_set (J, i, 1, df_dMean);
    gsl_matrix_set (J, i, 2, df_dStd_dev);
  }
  return GSL_SUCCESS;
}


  D.) And finally, I need a gsl_multifit_function_fdf function
    which wraps up the two functions above.

int gaussian_fdf (const gsl_vector * x, void *params,
                 gsl_vector * f, gsl_matrix * J)
{
  gaussian_f (x, params, f);
  gaussian_df (x, params, J);

  return GSL_SUCCESS;
}


  E.) I can now try to set up the actual fit minimization block.
    This block is unclear to me.  But I'll make a rough stab at it.

int FitGauss (data *d, double fitParameter[3],
              double fitParamError[3], double &chi2) {
  const gsl_multifit_fdfsolver_type *T;
  gsl_multifit_fdfsolver *s;

  int status;
  size_t iter = 0;
  const size_t n =  ((struct data *)d)->n;
  const size_t p = 3;

  gsl_matrix *covar = gsl_matrix_alloc (p, p);
  double *xPos=((struct data *)d)->xPos;
  double *yVal=((struct data *)d)->yVal;
  gsl_multifit_function_fdf f;

  // Calculate good parameter guess.
  // guessMean = Data Average.
  // guessStd_dev = rms.
  // guessAmplitude = Sum / (Sqrt(pi) * rms).
  double sumXY, sumXXY, integral, XY;

  for ( size_t i = 0; i < n ; i++){
    sumXY   += XY = xPos[i] * yVal[i];
    sumXXY  += xPos[i] * XY;
    integral += yVal[i];
  }
  double guessMean = sumXY / integral;
  double guessStd_dev = sqrt((sumXXY / integral)
                        - guessMean*guessMean);
  double guessAmplitude = integral * (M_2_SQRTPI * 0.5)
                          / guessStd_dev;

  double x_init[3] = { guessAmplitude, guessMean, guessStd_dev };

  gsl_vector_view x = gsl_vector_view_array (x_init, p);
  const gsl_rng_type * type;
  gsl_rng * r;

  gsl_rng_env_setup();

  type = gsl_rng_default;
  r = gsl_rng_alloc (type);

  f.f = &gaussian_f;
  f.df = &gaussian_df;
  f.fdf = &gaussian_fdf;
  f.n = n;
  f.p = p;
  f.params = &d;

  T = gsl_multifit_fdfsolver_lmsder;
  s = gsl_multifit_fdfsolver_alloc (T, n, p);
  gsl_multifit_fdfsolver_set (s, &f, &x.vector);

  print_state (iter, s);

  do {
      iter++;
      status = gsl_multifit_fdfsolver_iterate (s);

      printf ("status = %s\n", gsl_strerror (status));

      print_state (iter, s);

      if (status)
        break;

      status = gsl_multifit_test_delta (s->dx, s->x, 1.e-4, 1.e-4);
  }
  while (status == GSL_CONTINUE && iter < 500);

  gsl_multifit_covar (s->J, 0.0, covar);

#define FIT(i) gsl_vector_get(s->x, i)
#define ERR(i) sqrt(gsl_matrix_get(covar,i,i))

  printf ("Amplitude= %.5f +/- %.5f\n", FIT(0), ERR(0));
  printf ("Mean     = %.5f +/- %.5f\n", FIT(1), ERR(1));
  printf ("Std_Dev  = %.4f +/- %.5f\n", FIT(2), ERR(2));

  double chi = gsl_blas_dnrm2(s->f);
  chi2 = chi * chi / static_cast<double>(n - p);
  printf("chisq/dof = %g\n",  chi2);

  printf ("status = %s\n", gsl_strerror (status));

  fitParameter[0] = gsl_vector_get(s->x, 0);
  fitParameter[1] = gsl_vector_get(s->x, 1);
  fitParameter[2] = gsl_vector_get(s->x, 2);

  //fitParamError[0] = sqrt(gsl_matirx_get(covar, 0, 0));
  //fitParamError[1] = sqrt(gsl_matirx_get(covar, 1, 1));
  //fitParamError[2] = sqrt(gsl_matirx_get(covar, 2, 2));

  gsl_multifit_fdfsolver_free (s);
  return 0;
}


  F.) And a simple test program which fills the data struct
and then calls FitGauss.

// Test program.
#include <iostream>
#define N_POINTS (5)
int main(void){

  data d;
  double fitParameter[3];
  double fitParamError[3];
  double chi2;

  size_t n = N_POINTS;
  double xPos[N_POINTS] = {0.,1.,2.,3.,4.};
  double yVal[N_POINTS] = {0.,1.,2.,1.,0.};
  double yErrors[N_POINTS] = {.1,.2,.3,.2,.1};

  d.n = n;
  d.xPos = xPos;
  d.yVal = yVal;
  d.yErrors = yErrors;

  FitGauss(&d,fitParameter,fitParamError,chi2);

  std::cout<<fitParameter[0]<<std::endl;
  std::cout<<fitParameter[1]<<std::endl;
  std::cout<<fitParameter[2]<<std::endl;
  std::cout<<fitParamError[0]<<std::endl;
  std::cout<<fitParamError[1]<<std::endl;
  std::cout<<fitParamError[2]<<std::endl;
  std::cout<<chi2<<std::endl;

}


I am compiling with the command:
g++ -L/usr/local/lib -ggdb -Wall testGauss.cc -lgsl -lgslcblas -lm -o FitGauss


So, my problems:

Problem #1:
I am having trouble passing and recasting the data struct.
When main passes &d into FitGauss(*d,...) all works well.

But when FitGauss passes the data through gsl_multifit_fdfsolver_set
into gaussian_f(..., void * params,...) all of the values are wrong.

I am not familiar with the internals of the gsl calls so I need
some help understanding how to make this work.


Problem #2:
If I uncomment the lines:

  //fitParamError[0] = sqrt(gsl_matirx_get(covar, 0, 0));
  //fitParamError[1] = sqrt(gsl_matirx_get(covar, 1, 1));
  //fitParamError[2] = sqrt(gsl_matirx_get(covar, 2, 2));


Then I get the following compiler error, which I don't understand.

testGauss.cc: In function `int FitGauss(data*, double*, double*, double&)':
testGauss.cc:187: `gsl_matirx_get' undeclared (first use this function)
testGauss.cc:187: (Each undeclared identifier is reported only once for each
   function it appears in.)


Problem #3:
I'm sure there are several more problems I haven't discovered
yet.  If you spot any please let me know.


Thank you yet again!

-dave miller








On Mon, 19 Dec 2005 02:44:20 -0600, Joakim Hove <address@hidden> wrote:

"David Eric Miller" <address@hidden> writes:

Hello.

I need to fit a Gaussian function to an array of data points.

The example given in the gsl manual is a bit too advanced for
me to bootstrap (although I'm sure it will help after I get
started).

Any help is very much appreciated.

Hello,

by looking at your code it seems to me that yoy have done the
"classical" mistake of confusing the variable 'x' in the gsl example
with the argument variable x. The variable denoted 'x' in the gsl
example is 'the unknown', and in this case that is the parameters you
want to find.


/*start of quick and dirty gaussian example ....*/

struct data {
   double *x;
   double *y;
   double *error;
}


/*
   Gaussian function form:

   Y(x) = 1/sqrt(2*pi*sigma)*exp(-(x - mu)*(x - mu)/(2*sigma*sigma))

   With parmeters mu and sigma.

*/


/*
   This function has three arguments:

   1. The first argument x is the vector of unknown parameters mu and
      sigma, which we are trying to determine. The first thing we do
      with this vector is to extract the values and store them in
      local variables mu and sigma (purely for convenience).

   2. The second argument params, is a pointer to user data. This can
      in principle point to *anyting*, but you will typically use it
      to pass in the x and y (and error?) values used when fitting.

      Since this is a void pointer, you will typically have to cast it
      to something else before extracting the relevant data.

   3. The third argument is actually output, this is a vector, where
      each element is the difference between the "measured" value of
      y, and the value estimated with the current estimate of the
      parameters.

*/

int gaussian_f(const gsl_vector *x, void *params, gsl_vector *f) {
   const double mu    = gsl_vector_get(x , 0);
   const double sigma = gsl_vector_get(x , 1);
   const  size_t = f->size;
   double *xdata = ((struct data *)params)->x;
   double *ydata = ((struct data *)params)->y;
   double *error = ((struct data *)params)->error;
  size_t i;

   for (i=0; i < n; i++)
gsl_vector_set(f , i , (ydata[i] - 1/(sqrt(2*pi*sigma))*exp(-0.5*gsl_pow_2((xdata[i] - mu)/sigma))));
}


...
...
...

OK - I hope this was of some help. Feel free to ask further questions.


Joakim




Attachment: test_gauss.cc
Description: Binary data


reply via email to

[Prev in Thread] Current Thread [Next in Thread]