[Discuss-gnuradio] Re: TPB update
From: Eric Blossom
Subject: [Discuss-gnuradio] Re: TPB update
Date: Thu, 11 Nov 2010 18:40:51 -0800
User-agent: Mutt/1.5.21 (2010-09-15)
On Fri, Nov 12, 2010 at 10:05:28AM +1100, Balint Seeber wrote:
> Dear Eric,
>
> I realised I was actually getting ahead of myself regarding scenario (1),
> because - of course - the sample rate means nothing in terms of timing if it
> is not a synchronous graph, and as I stated I didn't use Throttle. So the
> behaviour in (1) is expected. Would you agree?
Yes.
> Still not sure about (3) though. Did the graph make it through okay?
>
> Thanks very much once again,
>
> Balint
>
Using the single graph (the one you sent me):
Running case (1):
htop shows it burning 95% of one core and 25% of another.
Seems reasonable to me. (On my 8-core Xeon)
I started oprofile, ran the flow graph for a while (> 10s), then
looked at the output of opreport:
$ opreport --long-filenames --symbols -t 0.5 >/tmp/report
It gives the report below, which isn't surprising: 57% of the samples
are in ccomplex_dotprod_sse (the inner loop of gr_fir_ccc_simd::filter,
used by the resampler), and 16% are in gr_sig_source_c::work
(generating the complex sinusoid).
The cycles chargeable to the resampler include ccomplex_dotprod_sse,
gr_fir_ccc_simd::filter, and gr_rational_resampler_base_ccc::general_work,
which together come to ~69%.
(Percentages are normalized to the total samples counted.)
CPU: Core 2, speed 3000.07 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
samples   %        app name                                             symbol name
17535154  57.4244  /usr/local/lib64/libgnuradio-core-3.3.1git.so.0.0.0  ccomplex_dotprod_sse
4966060   16.2629  /usr/local/lib64/libgnuradio-core-3.3.1git.so.0.0.0  gr_sig_source_c::work(int, std::vector<void const*, std::allocator<void const*> >&, std::vector<void*, std::allocator<void*> >&)
2909663   9.5286   /no-vmlinux                                          /no-vmlinux
2490431   8.1557   /usr/local/lib64/libgnuradio-core-3.3.1git.so.0.0.0  gr_fir_ccc_simd::filter(std::complex<float> const*)
1094391   3.5839   /usr/local/lib64/libgnuradio-core-3.3.1git.so.0.0.0  gr_rational_resampler_base_ccc::general_work(int, std::vector<int, std::allocator<int> >&, std::vector<void const*, std::allocator<void const*> >&, std::vector<void*, std::allocator<void*> >&)
235207    0.7703   /lib64/libpthread-2.12.1.so                          pthread_mutex_lock
Running case (3):
htop shows it burning 95% of TWO cores and 25% of another.
Also seems reasonable to me. One core for each of the two rational
resamplers, and 25% for the rest.
$ opreport --long-filenames --symbols -t 0.5 >/tmp/report3
CPU: Core 2, speed 3000.07 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
samples   %        app name                                             symbol name
3931690   63.0917  /usr/local/lib64/libgnuradio-core-3.3.1git.so.0.0.0  ccomplex_dotprod_sse
611059    9.8056   /no-vmlinux                                          /no-vmlinux
557861    8.9520   /usr/local/lib64/libgnuradio-core-3.3.1git.so.0.0.0  gr_fir_ccc_simd::filter(std::complex<float> const*)
550223    8.8294   /usr/local/lib64/libgnuradio-core-3.3.1git.so.0.0.0  gr_sig_source_c::work(int, std::vector<void const*, std::allocator<void const*> >&, std::vector<void*, std::allocator<void*> >&)
248420    3.9864   /usr/local/lib64/libgnuradio-core-3.3.1git.so.0.0.0  gr_rational_resampler_base_ccc::general_work(int, std::vector<int, std::allocator<int> >&, std::vector<void const*, std::allocator<void const*> >&, std::vector<void*, std::allocator<void*> >&)
55851     0.8962   /lib64/libpthread-2.12.1.so                          pthread_mutex_lock
31423     0.5042   /usr/local/lib64/libgnuradio-core-3.3.1git.so.0.0.0  gr_tpb_detail::notify_upstream(gr_block_detail*)
In this case, it's about 76% for the two rational resamplers, 9% for
the sig gen, and about 1.4% scheduler overhead (pthread_mutex_lock and
notify_upstream). In reality, the ticks in the kernel should be
charged towards overhead too.
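For anyone who wants to redo this bookkeeping on their own reports, here
is a rough Python sketch of how the percentages can be charged to each
part of the graph. The numbers are copied from the two opreport listings
in this message; the GROUPS table and the charge() helper are made up for
illustration and are not part of oprofile or GNU Radio.

```python
# Percent of samples per symbol, copied from the opreport listings above.
CASE_1 = {
    "ccomplex_dotprod_sse": 57.4244,
    "gr_sig_source_c::work": 16.2629,
    "/no-vmlinux": 9.5286,
    "gr_fir_ccc_simd::filter": 8.1557,
    "gr_rational_resampler_base_ccc::general_work": 3.5839,
    "pthread_mutex_lock": 0.7703,
}
CASE_3 = {
    "ccomplex_dotprod_sse": 63.0917,
    "/no-vmlinux": 9.8056,
    "gr_fir_ccc_simd::filter": 8.9520,
    "gr_sig_source_c::work": 8.8294,
    "gr_rational_resampler_base_ccc::general_work": 3.9864,
    "pthread_mutex_lock": 0.8962,
    "gr_tpb_detail::notify_upstream": 0.5042,
}

# Hypothetical grouping of symbols by who should be "charged" for them.
GROUPS = {
    "resampler": (
        "ccomplex_dotprod_sse",
        "gr_fir_ccc_simd::filter",
        "gr_rational_resampler_base_ccc::general_work",
    ),
    "sig source": ("gr_sig_source_c::work",),
    "scheduler": ("pthread_mutex_lock", "gr_tpb_detail::notify_upstream"),
    "kernel": ("/no-vmlinux",),
}


def charge(report):
    """Sum the sample percentages of a report by group."""
    return {
        group: round(sum(report.get(sym, 0.0) for sym in symbols), 4)
        for group, symbols in GROUPS.items()
    }


if __name__ == "__main__":
    for name, report in (("case 1", CASE_1), ("case 3", CASE_3)):
        totals = charge(report)
        print(name, totals)
        # Scheduler overhead if the kernel ticks are charged to it as well.
        print(name, "scheduler + kernel:",
              round(totals["scheduler"] + totals["kernel"], 4))
```

Symbols below the -t 0.5 threshold never show up in the report, so the
groups will not add up to 100%.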
Is there any chance that you had some kind of power control or
frequency scaling going on? If it's a laptop, be sure that it's in
"performance" mode and not "I want the battery to last a long time"
mode.
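One way to check for frequency scaling on Linux is to look at the cpufreq
governor of each core. The sketch below assumes the kernel exposes the
usual sysfs files under /sys/devices/system/cpu/cpuN/cpufreq/; on machines
without cpufreq support it simply finds nothing. The function names are
made up for this example.

```python
from pathlib import Path


def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu name: governor} for every core that has a cpufreq entry."""
    governors = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for gov_file in sorted(Path(cpu_root).glob(pattern)):
        cpu = gov_file.parent.parent.name  # e.g. "cpu0"
        governors[cpu] = gov_file.read_text().strip()
    return governors


def not_performance(governors):
    """List the cores whose governor is something other than 'performance'."""
    return sorted(cpu for cpu, gov in governors.items() if gov != "performance")
```

Calling not_performance(read_governors()) before profiling lists any cores
that may be clocked down while the flow graph runs.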
Remember that Amdahl's Law gives the maximum speedup within a given
graph. https://secure.wikimedia.org/wikipedia/en/wiki/Amdahl%27s_law
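As a reminder of what the law says, here is a small Python sketch of the
standard formula, speedup = 1 / ((1 - p) + p / n), where p is the fraction
of the work that can run in parallel and n is the number of cores. How
you estimate p for a particular flow graph is an assumption you have to
make yourself; the function names are only illustrative.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when `parallel_fraction` of the work
    is spread evenly over `cores` processors."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def amdahl_limit(parallel_fraction):
    """Speedup limit as the number of cores goes to infinity."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if parallel_fraction == 1.0:
        return float("inf")
    return 1.0 / (1.0 - parallel_fraction)
```

For example, amdahl_speedup(0.5, 2) shows how little a second core buys
when half of the work is serial, and amdahl_limit(0.5) shows that no
number of cores gets past 2x in that situation.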
In any case, I think that you'll find a combination of htop and
oprofile should help shed some light on where the cycles are being
burned.
Eric