[Discuss-gnuradio] *much* faster filtering


From: Eric Blossom
Subject: [Discuss-gnuradio] *much* faster filtering
Date: Tue, 10 May 2005 15:41:13 -0700
User-agent: Mutt/1.5.6i

Thanks to some fine assembly language hacking by Stephane Fillod, we now
have SSE and 3DNow! versions of the guts of the "fcc" and "ccf" FIR
filters.  "fcc" is float input, complex output, complex taps.  "ccf"
is complex input, complex output, float taps.  The "ccf" variant is
especially handy when working with the USRP, since we're generally
dealing with complex baseband data.
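
For readers new to the naming, here is a minimal Python sketch of what the
two variants compute.  The function names are hypothetical and chosen only
for illustration; this is not the GNU Radio API, and the kernels benchmarked
below are hand-written assembly, not Python.  The tap reversal in
fir_ccf() is the textbook convolution convention and is an assumption
about how the dot product would be used.

```python
def dotprod_fcc(samples, taps):
    """Float input, complex taps -> complex output (hypothetical name)."""
    return sum((complex(t) * float(x) for x, t in zip(samples, taps)), 0j)


def dotprod_ccf(samples, taps):
    """Complex input, float taps -> complex output (hypothetical name)."""
    return sum((complex(x) * float(t) for x, t in zip(samples, taps)), 0j)


def fir_ccf(samples, taps):
    """Textbook FIR filter: one dot product per output sample.

    The taps are reversed so that the sliding dot product is a
    convolution, y[n] = sum_k taps[k] * samples[n - k].
    Only fully overlapped outputs are produced.
    """
    n = len(taps)
    reversed_taps = taps[::-1]
    return [dotprod_ccf(samples[i:i + n], reversed_taps)
            for i in range(len(samples) - n + 1)]


if __name__ == "__main__":
    baseband = [1 + 2j, 3 - 1j, 0.5 + 0.5j]
    print(dotprod_ccf(baseband, [0.5, 2.0, 1.0]))
    print(fir_ccf(baseband, [1.0, -1.0]))
```

Each complex-by-float product in the "ccf" case costs two real multiplies
(one for the real part, one for the imaginary part), which is why the dot
product is the inner loop worth optimizing.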

The new "ccf" code is more than 8 times faster than the generic version
on the P4, and "fcc" is about 7.6 times faster!

----------------------------------------------------------------


Pentium M (1.4 GHz)

$ ./benchmark_dotprod_fcc
   generic: taps:  256  input: 4e+07  cpu: 110.310  taps/sec:  9.283e+07
       SSE: taps:  256  input: 4e+07  cpu:  22.379  taps/sec:  4.576e+08
$ ./benchmark_dotprod_ccf
   generic: taps:  256  input: 4e+07  cpu: 118.765  taps/sec:  8.622e+07
       SSE: taps:  256  input: 4e+07  cpu:  22.093  taps/sec:  4.635e+08
$ ./benchmark_dotprod_fff
   generic: taps:  256  input: 4e+07  cpu:  16.966  taps/sec:  6.035e+08
       SSE: taps:  256  input: 4e+07  cpu:  11.194  taps/sec:  9.148e+08

Athlon 1800+ MP (1.5 GHz)

$ ./benchmark_dotprod_fcc
   generic: taps:  256  input: 4e+07  cpu: 106.544  taps/sec:  9.611e+07
    3DNow!: taps:  256  input: 4e+07  cpu:  17.698  taps/sec:  5.786e+08
       SSE: taps:  256  input: 4e+07  cpu:  21.805  taps/sec:  4.696e+08
$ ./benchmark_dotprod_ccf
   generic: taps:  256  input: 4e+07  cpu: 102.456  taps/sec:  9.994e+07
    3DNow!: taps:  256  input: 4e+07  cpu:  16.247  taps/sec:  6.303e+08
       SSE: taps:  256  input: 4e+07  cpu:  21.743  taps/sec:   4.71e+08
$ ./benchmark_dotprod_fff
   generic: taps:  256  input: 4e+07  cpu:  13.662  taps/sec:  7.495e+08
    3DNow!: taps:  256  input: 4e+07  cpu:   8.252  taps/sec:  1.241e+09
       SSE: taps:  256  input: 4e+07  cpu:   9.982  taps/sec:  1.026e+09


P4 (1.7 GHz)

$ ./benchmark_dotprod_fcc
   generic: taps:  256  input: 4e+07  cpu: 144.956  taps/sec:  7.064e+07
       SSE: taps:  256  input: 4e+07  cpu:  18.968  taps/sec:  5.399e+08
$ ./benchmark_dotprod_ccf
   generic: taps:  256  input: 4e+07  cpu: 152.732  taps/sec:  6.705e+07
       SSE: taps:  256  input: 4e+07  cpu:  18.525  taps/sec:  5.528e+08
$ ./benchmark_dotprod_fff
   generic: taps:  256  input: 4e+07  cpu:  18.059  taps/sec:   5.67e+08
       SSE: taps:  256  input: 4e+07  cpu:   6.792  taps/sec:  1.508e+09
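
The taps/sec column in the output above is consistent with
taps * input / cpu, assuming "cpu" is CPU time in seconds.  The short
Python sketch below recomputes that column and the generic-to-SSE speedup
for the P4 rows.  The helper names are hypothetical and are not part of
the benchmark programs; the numbers are copied from the P4 output.

```python
def taps_per_sec(ntaps, ninput, cpu_seconds):
    """Throughput: total tap multiply-accumulates per CPU second."""
    return ntaps * ninput / cpu_seconds


def speedup(generic_cpu, optimized_cpu):
    """Ratio of generic CPU time to optimized CPU time."""
    return generic_cpu / optimized_cpu


# (generic cpu, SSE cpu) pairs copied from the P4 (1.7 GHz) results.
P4_CPU_TIMES = {
    "fcc": (144.956, 18.968),
    "ccf": (152.732, 18.525),
    "fff": (18.059, 6.792),
}

if __name__ == "__main__":
    for name, (generic, sse) in P4_CPU_TIMES.items():
        print(f"{name}: speedup {speedup(generic, sse):.2f}x, "
              f"SSE taps/sec {taps_per_sec(256, 4e7, sse):.3e}")
```

The same two helpers can be applied to the Pentium M and Athlon rows by
substituting their cpu values.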



