From: devin kelly
Subject: Re: [Discuss-gnuradio] Segfault with volk_32fc_32f_dot_prod_32fc_a_avx
Date: Tue, 6 Dec 2016 13:06:42 -0500
Honestly, my money would be on GCC 4.8.5 being less buggy than 6.2, but that's a separate topic.

You can configure VOLK to not use this protokernel and there's some documentation on how to do it here: http://gnuradio.org/doc/doxygen/volk_guide.html#volk_tuning

This is fairly concerning though... are you able to consistently trigger a segfault or is it a seemingly random event that you can't trigger?

On Tue, Dec 6, 2016 at 11:48 AM, devin kelly <address@hidden> wrote:

OK, I tried a brand new GR/Volk install and still had the same problem. So no problem with re-linking Volk to GR. Could this be an issue with Volk on GCC 4.8.5? The newest GCC is 6.2 so 4.8.5 is pretty old, though the newest for Red Hat 7. Is there any way to reconfigure volk to not use volk_32fc_32f_dot_prod_32fc_a_avx? Here's volk-config-info:
$ volk-config-info --all --prefix --cc --cflags --avail-machines --machine --alignment --malloc
/local_disk/spectrum_challenge
cc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11)
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
/usr/bin/cc::: -Wall
/usr/bin/c++::: -Wall
generic_orc:::GNU:::-g -Wall
sse2_64_mmx_orc:::GNU:::-g -Wall -m64 -mmmx -msse -msse2
sse3_64_mmx_orc:::GNU:::-g -Wall -m64 -mmmx -msse -msse2 -msse3
ssse3_64_mmx_orc:::GNU:::-g -Wall -m64 -mmmx -msse -msse2 -msse3 -mssse3
sse4_a_64_mmx_orc:::GNU:::-g -Wall -m64 -mmmx -msse -msse2 -msse3 -msse4a -mpopcnt
sse4_1_64_mmx_orc:::GNU:::-g -Wall -m64 -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1
sse4_2_64_mmx_orc:::GNU:::-g -Wall -m64 -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -mpopcnt
avx_64_mmx_orc:::GNU:::-g -Wall -m64 -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -mpopcnt -mavx
avx2_64_mmx_orc:::GNU:::-g -Wall -m64 -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -mpopcnt -mavx -mfma -mavx2
generic_orc;sse2_64_mmx_orc;sse3_64_mmx_orc;ssse3_64_mmx_orc;sse4_a_64_mmx_orc;sse4_1_64_mmx_orc;sse4_2_64_mmx_orc;avx_64_mmx_orc;avx2_64_mmx_orc;
generic_orc;sse2_64_mmx_orc;sse3_64_mmx_orc;ssse3_64_mmx_orc;sse4_1_64_mmx_orc;sse4_2_64_mmx_orc;avx_64_mmx_orc;avx2_64_mmx_orc;
avx2_64_mmx_orc
Alignment in bytes: 32
Used malloc implementation: posix_memalign

Thanks again for any help,
Devin

On Fri, Dec 2, 2016 at 10:04 AM, Marcus Müller <address@hidden> wrote:

Oh, that's pretty interesting! Well, no misconfiguration should segfault, so I'm a bit stumped at the moment.
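For the question above about steering VOLK away from one protokernel, here is a minimal Python sketch of editing a preferences file. It assumes the three-column layout described in the VOLK guide (kernel name, aligned implementation, unaligned implementation), normally kept in ~/.volk/volk_config. The helper name `prefer_impl` and the sample lines are hypothetical, so check the implementation names against what volk_profile writes on your system.

```python
def prefer_impl(config_text, kernel, aligned, unaligned):
    """Return config_text with the entry for `kernel` replaced, or appended.

    Assumes one entry per line: "<kernel> <aligned_impl> <unaligned_impl>".
    """
    new_line = f"{kernel} {aligned} {unaligned}"
    out = []
    found = False
    for line in config_text.splitlines():
        fields = line.split()
        if fields and fields[0] == kernel:
            out.append(new_line)  # override the existing preference
            found = True
        else:
            out.append(line)      # leave every other line untouched
    if not found:
        out.append(new_line)
    return "\n".join(out) + "\n"


# Hypothetical sample contents, not taken from the machine in this thread.
sample = (
    "volk_32fc_32f_dot_prod_32fc a_avx u_avx\n"
    "volk_32f_x2_add_32f a_sse u_sse\n"
)
updated = prefer_impl(sample, "volk_32fc_32f_dot_prod_32fc", "generic", "generic")
print(updated, end="")
```

Falling back to "generic" is only a way to test whether the crash follows the AVX implementation. If the input pointer itself is bad, the generic code would be expected to fault on it as well.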
On 12/01/2016 06:14 PM, devin kelly wrote:
I also should have mentioned that the filter works OK for a while then segfaults. A couple of packets always pass through the clock sync block I'm using before I get the segfault. Finally, the segfault occurs in the polyphase clock sync block, do you think I could have mis-configured the block in some way that will get me this error? I think the PF clock sync block is pretty popular so if there's a bug in that block that's causing this I'd be surprised.

Devin

Marcus,

Thanks for taking the time. It is possible I re-installed a new version of VOLK. I'll try a fresh build and see what that gets me.
On Thu, Dec 1, 2016 at 11:47 AM, Marcus Müller <address@hidden> wrote:
Hi Devin,
I don't think it's a kernel problem – all your calculations happen in userland, and the kernel has not much to say with respect to the instructions used.
The most common reason for this kind of misbehaviour is in fact a problem with how the application (in this case, your GNU Radio application's block) calls into the library function (in this case the VOLK dot product).
Is it possible that for some reason, GNU Radio used a previous version of VOLK when you linked it, and then the new version of VOLK was installed?
Best regards,
Marcus
On 12/01/2016 05:23 PM, devin kelly wrote:
It looks like aPtr (0x7fea5c3014c0) is somehow not valid. GR passes this pointer to VOLK so maybe it's a GR problem?

Hello,

I'm having a problem with the above VOLK function segfaulting. I don't think I'm passing any incorrect values to VOLK. My problem could be that I'm on RHEL7 with (obviously) an older kernel:
$ uname -a
Linux 520842-mitll 3.10.0-327.10.1.el7.x86_64 #1 SMP Sat Jan 23 04:54:55 EST 2016 x86_64 x86_64 x86_64 GNU/Linux
I'm on VOLK 1.3 and GR 3.7.10.1.
It segfaults here:
https://github.com/gnuradio/volk/blob/maint/kernels/volk/volk_32fc_32f_dot_prod_32fc.h#L119
I've copied the output of a GDB session and my CPU info below.
Thanks for any help,
Devin
Program terminated with signal 11, Segmentation fault.
#0 0x00007fea7b1bd8b7 in _mm256_load_ps (__P=0x7fea5c3014c0) at /usr/lib/gcc/x86_64-redhat-linux/4.8.5/include/avxintrin.h:835
835 return *(__m256 *)__P;
Missing separate debuginfos, use: debuginfo-install python-2.7.5-48.el7.x86_64
(gdb) bt
#0 0x00007fea7b1bd8b7 in volk_32fc_32f_dot_prod_32fc_a_avx (__P=0x7fea5c3014c0) at /usr/lib/gcc/x86_64-redhat-linux/4.8.5/include/avxintrin.h:835
#1 0x00007fea7b1bd8b7 in volk_32fc_32f_dot_prod_32fc_a_avx (result=0x3665160, input=0x7fea5c3014c0, taps=0x3671a00, num_points=47) at /local_disk/gr_3.7.10.1_src/volk/kernels/volk/volk_32fc_32f_dot_prod_32fc.h:119
#2 0x00007fea6661d88f in gr::filter::kernel::fir_filter_ccf::filter(std::complex<float> const*) () at /local_disk/gr_3.7.10.1/lib64/libgnuradio-filter-3.7.10.1.so.0.0.0
#3 0x00007fea66c01d01 in gr::digital::pfb_clock_sync_ccf_impl::general_work(int, std::vector<int, std::allocator<int> >&, std::vector<void const*, std::allocator<void const*> >&, std::vector<void*, std::allocator<void*> >&) ()
at /local_disk/gr_3.7.10.1/lib64/libgnuradio-digital-3.7.10.1.so.0.0.0
#4 0x00007fea7b73fe10 in gr::block_executor::run_one_iteration() () at /local_disk/gr_3.7.10.1/lib64/libgnuradio-runtime-3.7.10.1.so.0.0.0
#5 0x00007fea7b781120 in gr::tpb_thread_body::tpb_thread_body(boost::shared_ptr<gr::block>, int) () at /local_disk/gr_3.7.10.1/lib64/libgnuradio-runtime-3.7.10.1.so.0.0.0
#6 0x00007fea7b774821 in boost::detail::function::void_function_obj_invoker0<gr::thread::thread_body_wrapper<gr::tpb_container>, void>::invoke(boost::detail::function::function_buffer&) () at /local_disk/gr_3.7.10.1/lib64/libgnuradio-runtime-3.7.10.1.so.0.0.0
#7 0x00007fea7b725ef0 in boost::detail::thread_data<boost::function0<void> >::run() () at /local_disk/gr_3.7.10.1/lib64/libgnuradio-runtime-3.7.10.1.so.0.0.0
#8 0x00007fea7a22427a in thread_proxy () at /lib64/libboost_thread-mt.so.1.53.0
#9 0x00007fea960f3dc5 in start_thread () at /lib64/libpthread.so.0
#10 0x00007fea9571973d in clone () at /lib64/libc.so.6
(gdb) print __P
$1 = (const float *) 0x7fea5c3014c0
(gdb) print *__P
Cannot access memory at address 0x7fea5c3014c0
(gdb) print *(__m256 *)__P
Cannot access memory at address 0x7fea5c3014c0
(gdb) f 1
#1 volk_32fc_32f_dot_prod_32fc_a_avx (result=0x3665160, input=0x7fea5c3014c0, taps=0x3671a00, num_points=47) at /local_disk/gr_3.7.10.1_src/volk/kernels/volk/volk_32fc_32f_dot_prod_32fc.h:119
119 a0Val = _mm256_load_ps(aPtr);
(gdb) info locals
number = 0
sixteenthPoints = 2
res = {-1.30492652e+29, 0.0779444203}
realpt = 0x7fea57ffde50
imagpt = 0x7fea57ffde54
aPtr = 0x7fea5c3014c0
bPtr = 0x3671a00
a0Val = {-0.656753004, -0.658071458, -0.760932922, -0.762304127, -0.869615495, -0.869560063, -0.887507021, -0.885902643}
a1Val = {-0.744178772, -0.742508531, -0.437728733, -0.437706977, -0.0328192525, -0.0346645005, 0.376206338, 0.374125361}
a2Val = {0.711783648, 0.711464763, 0.931477308, 0.933318734, 1.01744843, 1.01973152, 0.954917312, 0.955377996}
a3Val = {0.734342158, 0.732418418, 0.374049634, 0.371605545, -0.0585254543, -0.0588675328, -0.461206883, -0.458686352}
b0Val = {0.0023738991, 0.0023738991, -0.00534401694, -0.00534401694, 0.00242348039, 0.00242348039, 0.00727195293, 0.00727195293}
b1Val = {-0.0158917159, -0.0158917159, 0.00614725193, 0.00614725193, 0.0485430211, 0.0485430211, -0.22138992, -0.22138992}
b2Val = {0, 0, 0.22138992, 0.22138992, -0.0485430211, -0.0485430211, -0.00614725193, -0.00614725193}
b3Val = {0.0158917159, 0.0158917159, -0.00727195293, -0.00727195293, -0.00242348039, -0.00242348039, 0.00534401694, 0.00534401694}
x0Val = {0.0023738991, -0.00534401694, 0.00242348039, 0.00727195293, -0.0158917159, 0.00614725193, 0.0485430211, -0.22138992}
x1Val = {0, 0.22138992, -0.0485430211, -0.00614725193, 0.0158917159, -0.00727195293, -0.00242348039, 0.00534401694}
x0loVal = {0.0023738991, 0.0023738991, -0.00534401694, -0.00534401694, -0.0158917159, -0.0158917159, 0.00614725193, 0.00614725193}
x0hiVal = {0.00242348039, 0.00242348039, 0.00727195293, 0.00727195293, 0.0485430211, 0.0485430211, -0.22138992, -0.22138992}
x1loVal = {0, 0, 0.22138992, 0.22138992, 0.0158917159, 0.0158917159, -0.00727195293, -0.00727195293}
x1hiVal = {-0.0485430211, -0.0485430211, -0.00614725193, -0.00614725193, -0.00242348039, -0.00242348039, 0.00534401694, 0.00534401694}
c0Val = {-0.00155906542, -0.00156219525, 0.00406643841, 0.00407376606, -0.00210749614, -0.0021073618, -0.00645390945, -0.0064422423}
c1Val = {0.0118262777, 0.011799735, -0.00269082887, -0.00269069499, -0.00159314566, -0.00168271956, -0.0832882896, -0.082827583}
c2Val = {0, 0, 0.206219688, 0.206627354, -0.0493900217, -0.0495008491, -0.00587011734, -0.00587294903}
c3Val = {0.0116699571, 0.0116393855, -0.00272007124, -0.00270229811, 0.000141835291, 0.000142664314, -0.00246469746, -0.00245122775}
dotProdVal0 = {0, 0, 0, 0, 0, 0, 0, 0}
dotProdVal1 = {0, 0, 0, 0, 0, 0, 0, 0}
dotProdVal2 = {0, 0, 0, 0, 0, 0, 0, 0}
dotProdVal3 = {0, 0, 0, 0, 0, 0, 0, 0}
dotProductVector = {0.0218032673, 0.0217418969, 0.204074427, 0.204509094, -0.0519821495, -0.0521854945, -0.0983558819, -0.097870864}
(gdb) print *aPtr
Cannot access memory at address 0x7fea5c3014c0
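As a sanity check on the GDB values above, the pointer arithmetic can be sketched in Python. This is an illustrative sketch and not part of VOLK. It assumes 8 bytes per complex sample (two 32-bit floats, as the 32fc name suggests) and 4 KiB pages, which is typical but not confirmed for this machine. The helper and variable names are hypothetical.

```python
ALIGNMENT = 32          # "Alignment in bytes: 32" from volk-config-info
PAGE_SIZE = 4096        # assumption: 4 KiB pages
BYTES_PER_COMPLEX = 8   # assumption: two 32-bit floats per sample

A_PTR = 0x7FEA5C3014C0  # aPtr / input in GDB frame 1
NUM_POINTS = 47         # num_points in GDB frame 1


def is_aligned(addr, alignment=ALIGNMENT):
    """True if addr is a multiple of alignment."""
    return addr % alignment == 0


def page_base(addr, page_size=PAGE_SIZE):
    """Start address of the page that contains addr."""
    return addr - (addr % page_size)


sixteenth_points = NUM_POINTS // 16            # compare with "info locals"
total_bytes = NUM_POINTS * BYTES_PER_COMPLEX   # bytes the kernel reads from input
end_addr = A_PTR + total_bytes                 # one past the last byte read
same_page = page_base(A_PTR) == page_base(end_addr - 1)

print("aligned to 32 bytes:", is_aligned(A_PTR))  # True
print("sixteenthPoints:", sixteenth_points)       # 2
print("input spans", hex(A_PTR), "to", hex(end_addr))
print("whole read within one page:", same_page)   # True
```

If those assumptions hold, the pointer is 32-byte aligned and the whole 376-byte read stays inside one page. That makes an alignment fault or a short overrun past the end of a mapped buffer look unlikely, and is consistent with the input pointer referring to memory that is not mapped at all.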
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 61
Model name: Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
Stepping: 4
CPU MHz: 2038.664
BogoMIPS: 5187.61
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 4096K
NUMA node0 CPU(s): 0-3
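The lscpu listing above does not include the CPU flags line, so it cannot confirm on its own that the processor advertises the extensions that the selected avx2_64_mmx_orc machine was built with (-mavx, -mfma and -mavx2 per volk-config-info). A minimal Python sketch for checking that from /proc/cpuinfo-style text follows. The function name `cpu_flags` is hypothetical and the sample text is invented for illustration, not taken from the machine in this thread.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags on the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


# Hypothetical excerpt for illustration only.
sample = (
    "processor\t: 0\n"
    "flags\t\t: fpu sse sse2 ssse3 sse4_1 sse4_2 popcnt avx fma avx2\n"
)

needed = {"avx", "fma", "avx2"}
flags = cpu_flags(sample)
missing = needed - flags
print("missing extensions:", sorted(missing))
```

On a real system the same function can be fed the contents of /proc/cpuinfo. An empty result for the missing set only rules out an unsupported-instruction problem; it says nothing about whether the pointer passed to the kernel is valid.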
_______________________________________________
Discuss-gnuradio mailing list
address@hidden
https://lists.gnu.org/mailman/listinfo/discuss-gnuradio