From: GhostOp14
Subject: Re: [Discuss-gnuradio] [ghostop14/gr-lfast] Is gr-lfast now faster than gr-clenabled? (#1)
Date: Thu, 27 Apr 2017 15:56:13 -0400


---------- Forwarded message ----------
From: GhostOp14 <address@hidden>
Date: Thu, Apr 27, 2017 at 3:52 PM
Subject: Re: [ghostop14/gr-lfast] Is gr-lfast now faster than gr-clenabled? (#1)
To: ghostop14/gr-lfast <address@hidden>


No, the Costas loop actually performs pretty poorly on GPUs because the algorithm's calculations are sequential.  The only way I've found to code it for OpenCL is as a single task-based kernel call, so it really only executes like a standard CPU routine on one GPU core, not in parallel as one would like.  Performance for that block is therefore pretty low: it drops to less than 2 Msps even on an NVIDIA 1070 card, versus 34+ Msps for the gr-lfast version on an i7-6700, and the OpenCL performance didn't change much as the data size varied.  So far the best performance I've gotten out of the Costas loops is in gr-lfast with the optimized code.
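
To illustrate the sequential dependency, here is a minimal Python sketch of a second-order BPSK Costas loop. It is not the gr-lfast or gr-clenabled code: the function name, the default loop bandwidth and the damping factor are assumptions chosen for illustration, and the gain formulas are one common second-order form. The point is that each sample is derotated with the phase estimate produced by the previous sample, so the iterations cannot simply be spread across GPU work-items.

```python
import cmath
import math


def costas_loop_bpsk(samples, loop_bw=0.0628, damping=math.sqrt(2) / 2):
    """Illustrative second-order BPSK Costas loop (hypothetical helper).

    Returns (phase-corrected samples, final frequency estimate in rad/sample).
    """
    # Loop-filter gains derived from bandwidth and damping (assumed form).
    denom = 1.0 + 2.0 * damping * loop_bw + loop_bw * loop_bw
    alpha = (4.0 * damping * loop_bw) / denom
    beta = (4.0 * loop_bw * loop_bw) / denom

    phase = 0.0  # current phase estimate, carried between iterations
    freq = 0.0   # current frequency estimate, carried between iterations
    out = []
    for x in samples:
        # Derotate using the phase left behind by the PREVIOUS iteration.
        y = x * cmath.exp(-1j * phase)
        out.append(y)

        # BPSK phase detector, clipped to keep the loop well behaved.
        error = max(-1.0, min(1.0, y.real * y.imag))

        # Update the loop state; the next sample depends on these values.
        freq += beta * error
        phase += freq + alpha * error

        # Wrap the phase into [-pi, pi).
        phase = (phase + math.pi) % (2.0 * math.pi) - math.pi
    return out, freq
```

Because `phase` and `freq` are read and written on every pass, sample n cannot be processed until sample n-1 is done, which is consistent with the block running as a single task on one GPU core.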

gr-clenabled installs a tool called test-clenabled.  You can pass it a parameter for the data size, and it'll take timing measurements for both the OpenCL version and the CPU version, so you can run tests on your own hardware with any sizes you'd like.
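
For a rough idea of what such a measurement involves, the following Python sketch times a processing function at a given block size and reports throughput in Msps. It is only an illustration of the idea, not how test-clenabled works: `measure_msps`, its parameters and the dummy workload are all hypothetical.

```python
import time


def measure_msps(process, block_size, repeats=5):
    """Return the best observed throughput of process() in Msps.

    process: any callable that accepts a list of complex samples
             (hypothetical stand-in for a CPU or OpenCL code path).
    """
    data = [complex(1.0, 0.0)] * block_size
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        process(data)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    # Guard against a zero reading from a coarse timer.
    best = max(best, 1e-12)
    return block_size / best / 1e6


if __name__ == "__main__":
    # Dummy workload; substitute the routine under test.
    def workload(block):
        return [x * 2 for x in block]

    for size in (8192, 2 ** 16, 2 ** 20):
        print(size, "samples:", round(measure_msps(workload, size), 2), "Msps")
```

Taking the best of several runs reduces the influence of one-off delays such as cache warm-up; for a GPU path the timed call would also need to include the host-to-device transfers to be comparable.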

Also, once you get gr-clenabled running, it'll create 2 separate gnuradio groups.  The OpenCL-Accelerated group contains the blocks that actually run faster on GPUs, since their calculations can be done in parallel.  The blocks in the OpenCL-Enabled group work in OpenCL, but their performance is generally worse than that of the native CPU blocks.

I'm also pushing some updates to it tonight to clean up some of the processing, but there are no major performance changes in this pass.

On Thu, Apr 27, 2017 at 1:52 PM, kurtulmehtap <address@hidden> wrote:

With the new improvements, is gr-lfast now faster than gr-clenabled for the Costas loop for block sizes larger than 8192?
Can you add performance measurements for extremely large blocks (like 2^20)?

