Re: [Help-gsl] Questions about the code of some functions in cblas imple


From: Brian Gough
Subject: Re: [Help-gsl] Questions about the code of some functions in cblas implementation
Date: Tue, 23 Jun 2009 17:40:23 +0100
User-agent: Wanderlust/2.14.0 (Africa) Emacs/22.1 Mule/5.0 (SAKAKI)

At Thu, 18 Jun 2009 23:42:57 +0200,
José Luis García Pallero wrote:
 
> No loop unrolling: 0.005 s
> Loop unrolling: 0.6 s
>
>         for(i=0;i<n;i++)
>         {
>             a = i*i+i;
>         }
>     }

I think the program below is probably more realistic for this
case. Given the huge difference between the two results, I suspect
that the compiler is able to optimise away most of the work in the
simple case above. Maybe you could time this program, or the actual
function, instead.

#include <stdlib.h>
#include <time.h>
#include <stdio.h>

int
main (int argc, char *argv[])
{
  int n = 0, i = 0, j, m;
  double *a, *x;
  double t0, t1, t2;
  double A = 3, B = 2;

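  if (argc < 2)
    {
      fprintf (stderr, "usage: %s n [m]\n", argv[0]);
      return 1;
    }

  n = atoi (argv[1]);
  /* Number of repetitions: second argument if given, otherwise n. */
  m = (argc > 2) ? atoi (argv[2]) : n;
  if (n <= 0)
    {
      fprintf (stderr, "n must be positive\n");
      return 1;
    }

  a = malloc (sizeof (double) * n);
  x = malloc (sizeof (double) * n);
  if (a == NULL || x == NULL)
    {
      fprintf (stderr, "out of memory\n");
      return 1;
    }

  /* Initialise the input so the loops do not read indeterminate values. */
  for (i = 0; i < n; i++)
    x[i] = i;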

  t0 = clock ();
  {
    for (j = 0; j < m; j++)
      for (i = 0; i < n; i++)
        {
          a[i] = A * x[i] + B;
        }
  }

  t1 = clock ();

  {
    for (j = 0; j < m; j++)
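      {
        /* Unrolled by 4; only run while a full group of 4 still fits. */
        for (i = 0; i + 3 < n; i += 4)
          {
            a[i] = A * x[i] + B;
            a[i + 1] = A * x[i + 1] + B;
            a[i + 2] = A * x[i + 2] + B;
            a[i + 3] = A * x[i + 3] + B;
          }
        /* Handle the remaining n % 4 elements. */
        for (; i < n; i++)
          a[i] = A * x[i] + B;
      }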
  }

  t2 = clock ();
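  /* clock () counts in ticks; divide by CLOCKS_PER_SEC to get seconds. */
  printf ("operations = %g\n", (double) n * (double) m);
  printf ("plain loop = %g s\n", (t1 - t0) / CLOCKS_PER_SEC);
  printf ("fancy loop = %g s\n", (t2 - t1) / CLOCKS_PER_SEC);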
  return 0;
}
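
One way the comparison might be made a little more robust is to put
each loop in its own function and to use the output afterwards, which
makes it harder for the compiler to discard the work as dead code.
The following is only a sketch under those assumptions: the names
axpb_plain, axpb_unrolled, checksum and time_axpb are invented for
illustration and are not part of GSL or of the program above, and an
aggressive optimiser may still transform the loops in ways that affect
the timings.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Plain loop: a[i] = A * x[i] + B for every element. */
static void
axpb_plain (double *a, const double *x, size_t n, double A, double B)
{
  size_t i;
  for (i = 0; i < n; i++)
    a[i] = A * x[i] + B;
}

/* Same computation unrolled by 4, with a cleanup loop for the
   n % 4 elements left over at the end. */
static void
axpb_unrolled (double *a, const double *x, size_t n, double A, double B)
{
  size_t i;
  for (i = 0; i + 3 < n; i += 4)
    {
      a[i] = A * x[i] + B;
      a[i + 1] = A * x[i + 1] + B;
      a[i + 2] = A * x[i + 2] + B;
      a[i + 3] = A * x[i + 3] + B;
    }
  for (; i < n; i++)
    a[i] = A * x[i] + B;
}

/* Sum the output array so that the result of the loops is used. */
static double
checksum (const double *a, size_t n)
{
  double s = 0.0;
  size_t i;
  for (i = 0; i < n; i++)
    s += a[i];
  return s;
}

typedef void (*axpb_fn) (double *, const double *, size_t, double, double);

/* Run f m times with A = 3, B = 2 and return the CPU time in seconds.
   A checksum of the output is stored in *sink. */
static double
time_axpb (axpb_fn f, double *a, const double *x, size_t n, int m,
           double *sink)
{
  clock_t t0, t1;
  int j;

  assert (n > 0 && m > 0);

  t0 = clock ();
  for (j = 0; j < m; j++)
    f (a, x, n, 3.0, 2.0);
  t1 = clock ();

  *sink = checksum (a, n);
  return (double) (t1 - t0) / CLOCKS_PER_SEC;
}
```

Calling time_axpb once with axpb_plain and once with axpb_unrolled on
the same input gives two timings to compare, and the two checksums
should agree if the unrolled version is correct, including for sizes
that are not a multiple of 4.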



