[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Help-gsl] Multidimensional linear fit (and principal component anal

From: Philipp Klaus Krause
Subject: Re: [Help-gsl] Multidimensional linear fit (and principal component analysis, covariance matrix)
Date: Sun, 14 Dec 2008 14:16:07 +0100
User-agent: Mozilla-Thunderbird (X11/20081018)

Liam Healy schrieb:
> On Sat, Dec 13, 2008 at 3:53 PM, Philipp Klaus Krause <address@hidden> wrote:
>> I want to do a least squares fit of a line in 3 or 4-dimensional space
>> to 16 data points.
>> I looked at the manual, it seems gsl provides least squares linear fits
>> only for onedimensional stuff.
> ??

Somehow I can't see how I could use that for my problem. The y there is
still a vector of doubles, while I have points in three- or
fourdimensional space. There is that matrix X which allows fitting to
any number of functions, while I just want a line.

Thus I've tried implementing the PCA using gsl. It works fine, but
currently takes a bit more than 7 seconds to call pca() a million times
on my system. Since I'm a newbie to gsl I suppose there is a way to
improve performance. What would you suggest? Dimension will typically be
3 or 4.

struct odata
        int dimension;
        gsl_matrix *covariance_matrix;
        gsl_eigen_symmv_workspace *workspace;
        gsl_vector *eigenvalues;
        gsl_matrix *eigenvectors;
        gsl_vector *mean;
        gsl_matrix *mean_substracted_points;

static void create_pca(struct odata *data, const int dimension)
        #warning TODO: Check for lack of memory in this function.
        data->dimension = dimension;
        data->covariance_matrix = gsl_matrix_alloc(dimension, dimension);
        data->eigenvalues = gsl_vector_alloc(dimension);
        data->eigenvectors = gsl_matrix_alloc(dimension, dimension);
        data->mean = gsl_vector_alloc(dimension);
        data->mean_substracted_points = gsl_matrix_alloc(dimension, 16);
        data->workspace = gsl_eigen_symmv_alloc(dimension);

static void destroy_pca(struct odata *data)

// Do PCA, principal component can be found in first column of
static void pca(const double *points, struct odata *data)
        int i;

        for(i = 0; i < data->dimension; i++)
                gsl_vector_set(data->mean, i, gsl_stats_mean(points + i,
data->dimension, 16));

        // Get mean-substracted data into matrix mean_substracted_points.
        for(i = 0; i < 16; i++)
                gsl_vector_const_view point_view = 
+ i * data->dimension, data->dimension);
                gsl_vector_view mean_substracted_point_view =
gsl_matrix_column(data->mean_substracted_points, i);
                gsl_vector_sub(&mean_substracted_point_view.vector, data->mean);

        // Compute Covariance matrix
        gsl_blas_dgemm(CblasNoTrans, CblasTrans, 1.0 / 16.0,
data->mean_substracted_points, data->mean_substracted_points, 0.0,

        // Get eigenvectors, sort by eigenvalue.
        gsl_eigen_symmv(data->covariance_matrix, data->eigenvalues,
data->eigenvectors, data->workspace);
        gsl_eigen_symmv_sort(data->eigenvalues, data->eigenvectors,


reply via email to

[Prev in Thread] Current Thread [Next in Thread]