
Re: [Discuss-gnuradio] Speed Optimization and Application for ATSC Receivers


From: Joshua Lilly
Subject: Re: [Discuss-gnuradio] Speed Optimization and Application for ATSC Receivers
Date: Fri, 11 Mar 2016 22:56:15 -0500

Hey Andy, 
Thanks for the reply. I will take another look at the code; I think I know
what to do now. I will make sure the mailing list is included from now on.

Thanks again.
Josh

> On Mar 11, 2016, at 10:20 AM, Andy Walls <address@hidden> wrote:
> 
> Hi Josh:
> 
> I misread your question.  See my additional answer below.
> 
>> On Fri, 2016-03-11 at 02:34 +0000, Joshua Lilly wrote:
>> Hey Andy,
>> 
>> Just had a quick question about item number two on this list.
>> 
>> 
>> 
>> 2. For an immediate performance increase for most users, add a new
>> gnuradio/gr-blocks/grc/blocks_add_const_xx.xml to the build that
>> allows users to select the faster, non-vector version of the add
>> const block from the GUI.
>> 
>> 
>> After reading through the tweaked Python script, it looked like the
>> add_const_xx block should consist of the add_const_ss block. However,
>> if that is the case, isn't this already taken care of by the add_xx
>> block?
> 
> No.  add_xx adds multiple input streams together.  add_const_vxx adds a
> constant to the input stream.
> 
> Drop both types of add blocks in the flowgraph within the GRC GUI, and
> you will immediately see the difference.
> 
> Regards,
> Andy
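
The distinction Andy draws can be illustrated with a minimal pure-Python model. This is only a sketch of the semantics, not GNU Radio code: the function names `add_streams` and `add_const` are hypothetical stand-ins for the add_xx and add_const_xx behaviour described above.

```python
def add_streams(*streams):
    """Model of add_xx: sum N input streams item by item."""
    return [sum(items) for items in zip(*streams)]


def add_const(stream, k):
    """Model of add_const_xx: add one constant to every item of one stream."""
    return [x + k for x in stream]


a = [1, 2, 3]
b = [10, 20, 30]

print(add_streams(a, b))  # [11, 22, 33]
print(add_const(a, 5))    # [6, 7, 8]
```

The first function needs at least two inputs to be useful, while the second takes a single input plus a parameter, which is the difference that shows up in the GRC GUI as the number of input ports.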
> 
>> 
>> 
>> Thanks for your help,
>> 
>> Josh
> 
> 
> 
>> 
>> On Mar 06, 2016, at 01:08 PM, Andy Walls <address@hidden> wrote:
>> 
>>> On Sun, 2016-03-06 at 08:49 -0500, address@hidden wrote:
>>>> Message: 5
>>>> Date: Sun, 06 Mar 2016 06:45:13 +0000 (GMT)
>>>> From: Joshua Lilly
>>> 
>>> 
>>>> Hello,
>>>> My name is Josh and I am interested in getting involved in GNU
>>>> Radio. Specifically, I would like to work on the above project idea
>>>> for Google Summer of Code 2016 by implementing Viterbi and demux
>>>> algorithms in volk and testing the speed improvements. I have
>>>> experience with Python, C/C++, Boost, and profiling with valgrind.
>>>> So far I have read the getting involved page, compiled the code,
>>>> and read through the code in volk, and I am working my way through
>>>> some of the tutorials. Even if I don't get accepted to Google
>>>> Summer of Code, I would still like to get involved in fixing bugs
>>>> or something similar, since this seems like a really awesome
>>>> project.
>>> 
>>> Hi Josh:
>>> 
>>> I'm only a kibitzer when it comes to the project, so I can't say
>>> anything about GSoC acceptance.
>>> 
>>> 
>>>> If it isn't too much to ask, could someone point me to a nice
>>>> beginner bug to fix in order to get my hands in the code?
>>> 
>>> However, I can give you (and anyone who wants it) a relevant
>>> beginner+intermediate thing to get your hands in the code. The
>>> "intermediate" part comes from your request to play in volk, which
>>> I don't consider stuff for beginners.
>>> 
>>> So we'll start with a very conceptually simple thing to improve:
>>> adding constant(s) to a sample stream. Specifically, measuring and
>>> improving the performance of the add_const_vXX and add_const_XX
>>> blocks in gnuradio/gr-blocks/lib.
>>> 
>>> See the attached GRC flowgraph and hand-tweaked
>>> add_const_performance.py Python script.
>>> 
>>> 
>>> 1. Measure the baseline performance of both the add_const_vss and
>>> add_const_ss blocks at the high sample rate of 160 Msps.
>>> 
>>> $ ps -eLo pcpu,pid,tid,cls,rtprio,pcpu,comm
>>> 
>>> shows the add_const_vss and add_const_ss threads hovering around 70%
>>> and 57% CPU, respectively.
>>> 
>>> For meaningful measurements you must run the flowgraph at RT priority.
>>> 
>>> 
>>> 2. For an immediate performance increase for most users, add a new
>>> gnuradio/gr-blocks/grc/blocks_add_const_xx.xml to the build that
>>> allows users to select the faster, non-vector version of the add
>>> const block from the GUI.
>>> 
>>> 
>>> 3. Measure the baseline of where the most CPU is being consumed in
>>> these blocks.
>>> You can use perf tools or oprofile tools or whatever works for you.
>>> For meaningful measurements you must run the flowgraph at RT priority.
>>> Odds are, it's the block's work() function that is consuming most of
>>> the CPU.
>>> 
>>> 
>>> 4. Create volk kernels to replace the main operations in the work()
>>> functions of these blocks, if you can. Since adding a constant is so
>>> simple, and ORC is very good at optimizing simple things, the volk
>>> implementations should include an ORC implementation if possible.
>>> Odds are the ORC implementation will beat hand-written SIMD versions
>>> for x86 processors. Use volk_profile to prove my guess about ORC
>>> right or wrong. :)
>>> 
>>> 
>>> 5. Create volk-ized versions of the add_const blocks and remeasure
>>> their performance. How much improvement did you get?
>>> 
>>> 
>>> 6. Don't forget to add QA tests for the new volk functions.
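
The per-thread measurement in step 1 can be scripted so that the before and after numbers in steps 1 and 5 are collected the same way. The following is a rough sketch that parses the output of the `ps` command quoted above. The sample output and the thread names in it are invented for illustration and are not taken from a real run; the helper names `parse_ps`, `threads_matching`, and `snapshot` are likewise hypothetical.

```python
import subprocess

# The exact column list from step 1 (note that pcpu appears twice).
PS_CMD = ["ps", "-eLo", "pcpu,pid,tid,cls,rtprio,pcpu,comm"]


def parse_ps(text):
    """Turn the ps table into a list of dicts, one per thread."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 6)    # comm is last and may contain spaces
        if len(fields) < 7:
            continue
        pcpu, pid, tid, cls, rtprio, _pcpu_again, comm = fields
        rows.append({
            "pcpu": float(pcpu),
            "pid": int(pid),
            "tid": int(tid),
            "cls": cls,        # scheduling class, e.g. TS, RR or FF
            "rtprio": rtprio,  # "-" when the thread has no RT priority
            "comm": comm.strip(),
        })
    return rows


def threads_matching(rows, needle):
    """Select the threads whose command name contains `needle`."""
    return [r for r in rows if needle in r["comm"]]


def snapshot():
    """Run ps on a live system and return the parsed rows."""
    out = subprocess.run(PS_CMD, capture_output=True, text=True, check=True)
    return parse_ps(out.stdout)


# Invented sample output, used here so the parser can be tried offline.
SAMPLE = """\
%CPU   PID   TID CLS RTPRIO %CPU COMMAND
 0.0     1     1  TS      -  0.0 systemd
70.1  4242  4250  RR     49 70.1 add_const_vss1
57.3  4242  4251  RR     49 57.3 add_const_ss1
"""

for row in threads_matching(parse_ps(SAMPLE), "add_const"):
    print(row["comm"], row["pcpu"], row["cls"], row["rtprio"])
```

Checking the `cls` and `rtprio` columns alongside the CPU figure is one way to confirm that the flowgraph really was running at RT priority when the numbers were taken.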
>>> 
>>> 
>>> As an alternative to the above:
>>> 
>>> 1. Improve the performance of the nlog10_ff block by using log2,
>>> algebra, volk, and skipping the add of k at the end, if k == 0.0.
>>> 
>>> 2. Create a new approx_nlog10_ff block by taking advantage of the
>>> fact that the log2 exponent in IEEE floats can be obtained with a
>>> mask and shift operation. Don't forget to add a GRC .xml file for
>>> the block and QA test code.
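
Both alternative tasks rest on the identity n*log10(x) + k = (n*log10(2))*log2(x) + k. The sketch below illustrates that identity and the mask-and-shift trick in plain Python. It is not the block's actual implementation: the function names are hypothetical, it assumes positive, normal IEEE 754 single-precision inputs, and the linear use of the mantissa bits is an added assumption that goes slightly beyond the exponent-only trick described above.

```python
import math
import struct

LOG10_2 = math.log10(2.0)  # constant folded into n once, outside any loop


def nlog10(x, n=1.0, k=0.0):
    """n*log10(x) + k computed through log2, skipping the add when k == 0."""
    y = (n * LOG10_2) * math.log2(x)
    return y if k == 0.0 else y + k


def float32_bits(x):
    """Reinterpret x, rounded to float32, as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def approx_log2(x):
    """Approximate log2(x) for positive, normal float32 values.

    The biased exponent comes from a shift and a mask. The mantissa
    fraction is added as a crude linear interpolation between powers of
    two (an assumption of this sketch, not something the thread states).
    """
    bits = float32_bits(x)
    exponent = ((bits >> 23) & 0xFF) - 127      # remove the IEEE bias
    mantissa = (bits & 0x7FFFFF) / float(1 << 23)
    return exponent + mantissa


def approx_nlog10(x, n=1.0, k=0.0):
    """Approximate n*log10(x) + k using approx_log2."""
    y = (n * LOG10_2) * approx_log2(x)
    return y if k == 0.0 else y + k


print(approx_log2(8.0))  # 3.0
print(nlog10(100.0, n=10.0), approx_nlog10(100.0, n=10.0))
```

For exact powers of two the approximation is exact. In between, the linear mantissa term leaves an error of a little under 0.09 in log2, so whether that is acceptable depends on how the block's output is used.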
>>> 
>>>> Thank you,
>>>> Josh
>>> 
>>> 
>>> Regards,
>>> Andy
> 
> 


