
Re: [Discuss-gnuradio] Re: FM data available


From: Eric Blossom
Subject: Re: [Discuss-gnuradio] Re: FM data available
Date: Wed, 19 Feb 2003 16:56:03 -0800
User-agent: Mutt/1.4i

> Some new numbers (SSE still WIP):
> 
> (address@hidden:examples)$ ./benchmark_dotprod_SCC
>    generic: taps:  256  input: 4e+07 cpu: 107.360  taps/sec:  9.538e+07
>  3DNow!Ext: taps:  256  input: 4e+07  cpu: 28.640  taps/sec:  3.575e+08
>     3DNow!: taps:  256  input: 4e+07  cpu: 50.920  taps/sec:  2.011e+08
> 
> Remark: SCC is half the speed of FFF dotprod because Complex taps are 
> involved.

Thanks!

> The best solution, which I implemented already, is to not use conver_FS,
> but make use of float_dotprod_xxx, and cast the result to a short.
> Unfortunately, the GrRealFIRfilter speed boost (even with 529 coeffs)
> is not as noticeable as GrFreqXlatingFIRfilter (which runs at high
> sampling rate).

I'm not sure if you're using the CVS tree or the tarball, but in any
case, the GrRealFIRfilter interface is deprecated.  We're on the path
to minimize the use of templates, and instead are using routines which
have names of the form GrFIRfilterXXX where XXX encodes the input
type, output type and tap type.  The letters are drawn from 'C' complex,
'F' float, 'S' short and 'I' int.  E.g., GrFIRfilterFFF has float input,
output and taps.  It already has go-fast code for 3DNow! and SSE, and
the right version is automatically selected for you.
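
To make the naming scheme concrete, here is a minimal sketch that
decodes a three-letter suffix in the order given above (input, output,
taps).  The names decode_suffix, type_name and FilterTypes are
hypothetical helpers made up for this illustration; they are not part
of the GNU Radio tree.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: map one suffix letter to the type it stands for.
inline std::string type_name(char code) {
  switch (code) {
    case 'C': return "complex";
    case 'F': return "float";
    case 'S': return "short";
    case 'I': return "int";
    default:  throw std::invalid_argument("unknown type letter");
  }
}

// Hypothetical record of the three types encoded in a suffix.
struct FilterTypes {
  std::string input;
  std::string output;
  std::string taps;
};

// Decode a suffix such as "FFF" or "SCC": input, output, then taps.
inline FilterTypes decode_suffix(const std::string &suffix) {
  if (suffix.size() != 3)
    throw std::invalid_argument("suffix must have exactly three letters");
  return FilterTypes{type_name(suffix[0]), type_name(suffix[1]),
                     type_name(suffix[2])};
}
```

So decode_suffix("SCC") would describe short input with complex output
and complex taps, which matches the remark about the SCC benchmark
above.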

I'm sure we'll be able to integrate your work.  Keep on charging on.

We're minimizing the use of templates around the filter code because
when using them it's *very* hard to come up with a clean way to add
machine specific speed ups.  With the new way, there's an abstract
class that specs the interface, a generic (machine independent)
concrete class that always works, and optionally other concrete
implementations that are selected based on the machine you're on.
Some of the selection will be at compile time (e.g., x86 vs PPC),
while other choices are checked at runtime (e.g., Athlon vs P4).
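
As a rough illustration of that structure (not the actual GNU Radio
code), here is a small sketch.  Every name in it (FirFFF,
FirFFFGeneric, FirFFFUnrolled, CpuFeatures, make_fir_fff) is
hypothetical, and the "fast" class is only a plain C++ unrolled loop
standing in for real 3DNow!/SSE code.  The CpuFeatures flag stands in
for whatever runtime CPU detection is really used.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Abstract class: specifies the interface and nothing else.
class FirFFF {
public:
  virtual ~FirFFF() = default;
  // Dot product of the taps with ntaps input samples.
  virtual float filter(const float *input) const = 0;
  virtual std::string name() const = 0;
};

// Generic, machine-independent implementation that always works.
class FirFFFGeneric : public FirFFF {
public:
  explicit FirFFFGeneric(std::vector<float> taps) : taps_(std::move(taps)) {}
  float filter(const float *input) const override {
    float acc = 0.0f;
    for (std::size_t i = 0; i < taps_.size(); ++i)
      acc += taps_[i] * input[i];
    return acc;
  }
  std::string name() const override { return "generic"; }

protected:
  std::vector<float> taps_;
};

// Stand-in for a machine-specific version: four independent accumulators.
class FirFFFUnrolled : public FirFFFGeneric {
public:
  using FirFFFGeneric::FirFFFGeneric;
  float filter(const float *input) const override {
    float a0 = 0.0f, a1 = 0.0f, a2 = 0.0f, a3 = 0.0f;
    const std::size_t n = taps_.size();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
      a0 += taps_[i] * input[i];
      a1 += taps_[i + 1] * input[i + 1];
      a2 += taps_[i + 2] * input[i + 2];
      a3 += taps_[i + 3] * input[i + 3];
    }
    for (; i < n; ++i)  // leftover taps when n is not a multiple of 4
      a0 += taps_[i] * input[i];
    return (a0 + a1) + (a2 + a3);
  }
  std::string name() const override { return "unrolled4"; }
};

// Hypothetical result of a runtime CPU check (e.g., Athlon vs P4).
struct CpuFeatures {
  bool has_fast_path;
};

// Factory: hand back the fast version if the CPU check allows it,
// otherwise fall back to the generic one.  A compile-time choice
// (e.g., x86 vs PPC) would wrap the fast branch in #if blocks.
inline std::unique_ptr<FirFFF> make_fir_fff(std::vector<float> taps,
                                            const CpuFeatures &cpu) {
  if (cpu.has_fast_path)
    return std::make_unique<FirFFFUnrolled>(std::move(taps));
  return std::make_unique<FirFFFGeneric>(std::move(taps));
}
```

Callers only ever see the abstract interface and the factory, so a new
machine-specific version can be added without touching them.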

I'll write all this up in the next week or so.  Right now I'm actually
reworking this code to clean it up, and am machine generating all the
generic cases, as well as all the common glue that holds it all
together.

Likewise, I envision a similar change with GrFreqXlatingFIRfilter
where it changes from being a template to something that looks like
GrFreqXlatingFIRfilterS or GrFreqXlatingFIRfilterC for versions that
take short or complex inputs.  Internally they always have complex
taps, and the output is always complex.

FYI, the reason GrFreqXlatingFIRfilter hasn't been fixed yet is that
it wasn't needed in the HDTV code.  Squeaky wheel...  In any event, I'm
putting together some demos for CodeCon this weekend, so now the
narrow band stuff is higher on the priority list.

> BTW, is there a means to gather some stats about the various sig
> process figures in the chain? This would help to know where most of the
> CPU cycles are spent.

There's nothing currently that directly attributes CPU time to each
module.  However, I've used oprofile with good success.  You can tell,
for instance, how much CPU is being consumed by all the FIRs combined;
then it's a matter of computing the bandwidth * taps product for each
one and allocating the appropriate piece of the CPU time.
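
The arithmetic for that split might look like the sketch below.  It
assumes "bandwidth" means the sample rate at which each filter does
its work, and FirStage and allocate_fir_cpu are hypothetical names
invented for this example.  The total FIR CPU time would come from the
profiler.

```cpp
#include <string>
#include <vector>

// Hypothetical description of one FIR in the chain.
struct FirStage {
  std::string name;
  double sample_rate_hz;  // assumed: rate at which the filter computes
  int ntaps;
};

// Split the combined FIR CPU time in proportion to sample_rate * ntaps.
// Returns one CPU-seconds figure per stage, in the same order.
inline std::vector<double> allocate_fir_cpu(const std::vector<FirStage> &stages,
                                            double total_fir_cpu_sec) {
  double total_work = 0.0;
  for (const FirStage &s : stages)
    total_work += s.sample_rate_hz * s.ntaps;

  std::vector<double> shares;
  shares.reserve(stages.size());
  for (const FirStage &s : stages) {
    const double work = s.sample_rate_hz * s.ntaps;
    shares.push_back(total_work > 0.0 ? total_fir_cpu_sec * work / total_work
                                      : 0.0);
  }
  return shares;
}
```

For example, two filters running at the same rate with 100 and 300
taps would get one quarter and three quarters of the measured FIR time
respectively.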

> Wouldn't it be possible for anyone with a microtune board to tune to 144MHz
> and then sweep the ham band?  Some samples around 125MHz would be neat too.

I can do this, but I don't have much of an antenna.  It'll probably
have to wait till next week until I get the demos under control.

Eric



