guile-user

Re: guile 2.2.3 crashing on osx 10.11?


From: Dan Kegel
Subject: Re: guile 2.2.3 crashing on osx 10.11?
Date: Mon, 1 Jan 2018 10:25:46 -0800

So, for completeness, here's why guile was crashing instantly in gmp
on some machines for me.

If you build and run everything on the same machine, none of this is
likely to affect you; this is only for the case where each library is
built by a random machine from a buildbot pool.

I had been paying attention to the AVX1.0 divide across our buildbot fleet,
and had arranged to segregate the MacPro5,1 machines into their own pool
to avoid crashes from vxorps (an AVX instruction) in ImageMagick.
Here's the output of
    sysctl -n hw.model machdep.cpu.brand_string machdep.cpu.features
on our machines, with boring bits removed:

MacPro5,1      E5620       SMX     SSE4.2
MacBookPro8,2  i7-2720QM   SMX     SSE4.2 x2APIC       XSAVE OSXSAVE          TSCTMR AVX1.0
Macmini6,2     i7-3615QM           SSE4.2 x2APIC       XSAVE OSXSAVE          TSCTMR AVX1.0 RDRAND F16C
MacBookPro9,1  i7-3720QM   SMX     SSE4.2 x2APIC       XSAVE OSXSAVE          TSCTMR AVX1.0 RDRAND F16C
MacBookPro11,2 i7-4870HQ   SMX FMA SSE4.2 x2APIC MOVBE XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C
MacBookPro11,2 i7-4960HQ   SMX FMA SSE4.2 x2APIC MOVBE XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C
MacBookPro11,5 i7-4980HQ   SMX FMA SSE4.2 x2APIC MOVBE XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C

MULX is a nice Intel instruction that was introduced in 2013 in
non-low-end Haswell chips.
https://software.intel.com/sites/default/files/m/f/7/c/36945 says one
detects it like this:
CPUID.(EAX=07H, ECX=0H):EBX.BMI2[bit 8]: if 1 indicates the processor
supports the second group of advanced bit manipulation extensions
(BZHI, MULX, PDEP, PEXT, RORX, SARX, SHLX, SHRX);
http://publicclu2.blogspot.com/2013/05/flags-in-x86-linuxs-proccpuinfo.html
clarifies that, on Linux, the flags line of /proc/cpuinfo will contain
bmi2 if MULX is present.  The same evidently holds for sysctl -n
machdep.cpu.leaf7_features on the Mac (where it shows up as BMI2), which says:

MacPro5,1      E5620
MacBookPro8,2  i7-2720QM
Macmini6,2     i7-3615QM  SMEP ERMS RDWRFSGS
MacBookPro9,1  i7-3720QM  SMEP ERMS RDWRFSGS
MacBookPro11,2 i7-4870HQ  SMEP ERMS RDWRFSGS TSC_THREAD_OFFSET BMI1 HLE AVX2 BMI2 INVPCID RTM
MacBookPro11,2 i7-4960HQ  SMEP ERMS RDWRFSGS TSC_THREAD_OFFSET BMI1 HLE AVX2 BMI2 INVPCID RTM FPU_CSDS
MacBookPro11,5 i7-4980HQ  SMEP ERMS RDWRFSGS TSC_THREAD_OFFSET BMI1     AVX2 BMI2 INVPCID     FPU_CSDS

which means I have three basic groups:
1) no AVX (MacPro5,1; circa 2010)
2) AVX but no BMI2 (MacBookPro8,2, Macmini6,2, MacBookPro9,1; circa 2011-2012)
3) AVX and BMI2 (MacBookPro11,2 and 11,5; circa 2013-2015)
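
For anyone who wants to automate that sorting, here is a minimal sketch in
Python.  It assumes you have already captured the two sysctl strings for each
machine; the function name and the pool names are made up for illustration
and are not anything buildbot itself defines.

```python
def pool_for(features: str, leaf7_features: str) -> str:
    """Map the two sysctl feature strings to one of the three groups.

    features       -- output of: sysctl -n machdep.cpu.features
    leaf7_features -- output of: sysctl -n machdep.cpu.leaf7_features
    The returned pool names are hypothetical labels.
    """
    # Compare whole tokens so that e.g. "AVX2" is never mistaken for "AVX1.0".
    feats = set(features.upper().split())
    leaf7 = set(leaf7_features.upper().split())
    if "AVX1.0" not in feats:
        return "no-avx"
    if "BMI2" not in leaf7:
        return "avx"
    return "avx-bmi2"


if __name__ == "__main__":
    # Feature strings abbreviated from the sysctl output quoted above.
    machines = {
        "MacPro5,1": ("SMX SSE4.2", ""),
        "MacBookPro9,1": ("SMX SSE4.2 XSAVE OSXSAVE AVX1.0 RDRAND F16C",
                          "SMEP ERMS RDWRFSGS"),
        "MacBookPro11,5": ("SMX FMA SSE4.2 XSAVE OSXSAVE AVX1.0 RDRAND F16C",
                           "SMEP ERMS RDWRFSGS BMI1 AVX2 BMI2 INVPCID"),
    }
    for model, (feats, leaf7) in machines.items():
        print(model, pool_for(feats, leaf7))
```

The check is deliberately coarse: it only looks at the two features that
caused crashes here (AVX and BMI2), so a fleet with other instruction-set
splits would need more cases.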

I'm sure other people will have different needs, but for me,
segregating shared build machines into those three pools -- and/or
sticking with two pools and disabling use of MULX in gmp -- should
avoid the crashes I saw due to ImageMagick's and GMP's CPU-specific
instruction assumptions.

It'd be nice if gmp and imagemagick were more agile about cpu feature
detection, and did (more of) it at runtime, but that's life.
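
To illustrate what a runtime check amounts to, here is a rough Python sketch
that asks the OS which features the host CPU reports before deciding whether
MULX code is safe to run.  It is not how gmp or ImageMagick do (or would do)
it; a C library would query CPUID directly.  The function names are made up,
and the sysctl keys and /proc/cpuinfo layout are the ones discussed above.

```python
import platform
import subprocess


def parse_flags(text: str) -> set:
    """Split a whitespace-separated feature list into upper-cased tokens."""
    return set(text.upper().split())


def cpu_flags() -> set:
    """Return the feature names the OS reports for the host CPU."""
    system = platform.system()
    if system == "Darwin":
        # These keys exist on Intel Macs; on other Macs sysctl fails and the
        # result is simply an empty set.
        result = subprocess.run(
            ["sysctl", "-n", "machdep.cpu.features",
             "machdep.cpu.leaf7_features"],
            capture_output=True, text=True, check=False)
        return parse_flags(result.stdout)
    if system == "Linux":
        with open("/proc/cpuinfo") as f:
            for line in f:
                # x86 kernels list features on a line starting with "flags".
                if line.startswith("flags"):
                    return parse_flags(line.split(":", 1)[1])
    return set()  # unknown platform: assume nothing


def can_use_mulx(flags: set) -> bool:
    """MULX is part of BMI2, so BMI2 support implies MULX support."""
    return "BMI2" in flags


if __name__ == "__main__":
    print("MULX usable:", can_use_mulx(cpu_flags()))
```

A build or test harness could run something like this on each worker and
refuse to install a library built with MULX where the check fails.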
- Dan

On Sun, Dec 31, 2017 at 5:53 PM, Dan Kegel <address@hidden> wrote:
> On Sat, Dec 30, 2017 at 3:31 PM, Matt Wette <address@hidden> wrote:
>>> On Dec 30, 2017, at 2:32 PM, Dan Kegel <address@hidden> wrote:
>>> osx 10.11, though, crashes when I just evaluate (display (version)),
>>> or sometimes while building.
>>
>> I have not seen that on macOS before,  but previously ran into other issues. 
>>  This may help to chase it down:
>>
>> build with
>>         CFLAGS=-g LDFLAGS=-g ./configure --disable-shared --prefix=/opt/local
>>
>> in meta/gdb-uninstalled-guile, change:
>>     gdb --args ${top_builddir}/libguile/guile "$@"
>> to
>>     lldb -- ${top_builddir}/libguile/guile "$@"
>>
>> and, IIRC, run meta/gdb-installed-guile
>
> Thanks.  Also had to do
>    sudo /usr/sbin/DevToolsSecurity --enable
>
> Here's a backtrace:
>
> * thread #1: tid = 0x1628f8, 0x00000001003e95be
> libgmp.10.dylib`__gmpn_mul_1 + 94, queue = 'com.apple.main-thread',
> stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
>     frame #0: 0x00000001003e95be libgmp.10.dylib`__gmpn_mul_1 + 94
> libgmp.10.dylib`__gmpn_mul_1:
> ->  0x1003e95be <+94>:  mulxq  (%rsi), %rbx, %rax
>
> Guess what?  This machine is an i7-3720QM (in a MacBook Pro 9,1), which
> doesn't support MULX.  (It's 2012 Ivy Bridge, which is just
> pre-Haswell.)
>
> $ gobjdump -d libgmp.dylib | grep mulx
> confirms the presence of the mulx instruction.
>
> So my gmp was built wrong for this machine.  (There was a related
> bugfix for low-end cpus in gmp 6.1.1, but I've got 6.1.2, and no
> low-end cpus.)
>
> Bit of a mystery, then, but nothing to do with guile.
> - Dan


