bug-guile

bug#26854: guile 2.2.x has broken prebuilt/32-bit-big-endian


From: George Koehler
Subject: bug#26854: guile 2.2.x has broken prebuilt/32-bit-big-endian
Date: Wed, 11 Dec 2019 18:34:15 -0500

Hello, GNU Guile community.

I believe that the files in guile-2.2.x/prebuilt/32-bit-big-endian are
broken.  This causes a reproducible crash when a 32-bit-big-endian
system tries to build guile 2.2.x from source code.  Because of this
crash, OpenBSD powerpc has no guile2 package.

Matthew Hull started a discussion on the OpenBSD ports list:
https://marc.info/?l=openbsd-ports&m=157550856819188&w=2

We have PowerPC Macintosh hardware running OpenBSD.  This seems to be
the same bug as #26854, which had PowerPC hardware running Mac OS X.

I worked around the problem in Guile 2.2.6 by moving
prebuilt/32-bit-big-endian out of the way so that the build doesn't use
the prebuilt files, but the "bootstrap" part of the build is then slow.
I suspect that a little-endian system wrote the prebuilt files and that
module/system/vm/assembler.scm is missing a byte-swap.

The crash is in guile-2.2.6/libguile/vm-engine.c "call":

=lines 566 to 573
      if (SCM_LIKELY (SCM_PROGRAM_P (FP_REF (0))))
        ip = SCM_PROGRAM_CODE (FP_REF (0));
      else
        ip = (scm_t_uint32 *) vm_apply_non_program_code;

      APPLY_HOOK ();

      NEXT (0);
=end

`ip` gets a bad pointer to unmapped memory from SCM_PROGRAM_CODE, then
"NEXT (0);" tries to read ip[0] and crashes with SIGSEGV.

I found code that puts a bad pointer in the program object, in
vm-engine.c "make-closure":

=lines 1652 to 1654
      closure = scm_inline_words (thread, scm_tc7_program | (nfree << 16),
                                  nfree + 2);
      SCM_SET_CELL_WORD_1 (closure, ip + offset);
=end

I modified the code to read *(ip + offset) right here, so that it
crashed at this point.  Then I loaded the core dump in GDB.  `ip` was
(scm_t_uint32 *) 0xcf1ea3b8 and
`offset` was -1005191168.  GDB can't access *0xcf1ea3b8 because it was
in an mmap(2) file, and the core dump didn't include this mapping.
In ktrace(1), the file was somewhere under prebuilt/32-bit-big-endian.

`offset` -1005191168 is 0xc4160000.  This looks like the wrong byte
order.  The correct value might be 0x000016c4 = 5828.  This would make
more sense, if ip + offset should be inside the file!
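
The arithmetic above can be checked with a small C sketch.  This is
not Guile code: swap32, target_address and check_report_values are
hypothetical helpers written for this illustration.  It assumes that
`ip` points to 4-byte words and that addresses wrap modulo 2^32, as
they would on a 32-bit machine.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the four bytes of a 32-bit word. */
static uint32_t swap32(uint32_t x)
{
    return (x >> 24)
         | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u)
         | (x << 24);
}

/* Address that `ip + offset` would yield when ip points to 4-byte
   words.  Assumption: 32-bit address space, so the sum wraps
   modulo 2^32. */
static uint32_t target_address(uint32_t ip_addr, int32_t offset)
{
    return ip_addr + 4u * (uint32_t) offset;
}

/* Re-derive the numbers quoted in the report. */
static void check_report_values(void)
{
    int32_t offset = -1005191168;       /* value seen in GDB */
    uint32_t raw = (uint32_t) offset;   /* same bits, unsigned */

    assert(raw == 0xc4160000u);
    assert(swap32(raw) == 0x000016c4u); /* 5828 in decimal */

    /* With the value as found, the target is far away from ip. */
    assert(target_address(0xcf1ea3b8u, offset) == 0xdf76a3b8u);

    /* With the swapped value, the target is 23312 bytes after ip. */
    assert(target_address(0xcf1ea3b8u, 5828) == 0xcf1efec8u);
}
```

Calling check_report_values() passes silently.  Under these
assumptions the byte-swapped offset keeps ip + offset close to `ip`,
while the offset as found in the core dump points far outside of it.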

module/system/vm/assembler.scm can byte-swap values when it emits
bytecode for a different-endian machine.  If a little-endian machine
wrote the prebuilt/32-bit-big-endian files, and assembler.scm failed
to swap `offset`, then it would cause this bug.
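
To illustrate the suspected failure mode, here is a minimal C sketch
of storing a 32-bit word in an explicit byte order.  Guile's real
assembler is written in Scheme; put_u32, get_u32 and enum byte_order
are hypothetical names used only for this example.

```c
#include <stdint.h>

enum byte_order { ORDER_LITTLE, ORDER_BIG };

/* Store v into buf in the given byte order, whatever the host is. */
static void put_u32(uint8_t buf[4], uint32_t v, enum byte_order order)
{
    if (order == ORDER_BIG) {
        buf[0] = (uint8_t) (v >> 24);
        buf[1] = (uint8_t) (v >> 16);
        buf[2] = (uint8_t) (v >> 8);
        buf[3] = (uint8_t) v;
    } else {
        buf[0] = (uint8_t) v;
        buf[1] = (uint8_t) (v >> 8);
        buf[2] = (uint8_t) (v >> 16);
        buf[3] = (uint8_t) (v >> 24);
    }
}

/* Load a word from buf the way a machine of the given order would. */
static uint32_t get_u32(const uint8_t buf[4], enum byte_order order)
{
    if (order == ORDER_BIG)
        return ((uint32_t) buf[0] << 24) | ((uint32_t) buf[1] << 16)
             | ((uint32_t) buf[2] << 8)  |  (uint32_t) buf[3];
    return ((uint32_t) buf[3] << 24) | ((uint32_t) buf[2] << 16)
         | ((uint32_t) buf[1] << 8)  |  (uint32_t) buf[0];
}
```

If a writer stores 5828 with ORDER_LITTLE and a big-endian reader
loads it with ORDER_BIG, the reader sees 0xc4160000, which matches the
`offset` from the core dump.  Storing with ORDER_BIG gives the reader
5828 again.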

I moved away the prebuilt/32-bit-big-endian files and started a new
build without these prebuilt files.  The build ran some slow
"bootstrap" commands on my 666 MHz CPU.  The first bootstrap command
took more than 100 minutes.  The second command took just over
4 hours.  The next commands continued overnight, and the whole build
might have taken almost 24 hours.  The build passes most tests:

SKIP: test-pthread-create-secondary
FAIL: test-stack-overflow
FAIL: test-out-of-memory
==================================
2 of 38 tests failed
(1 test was not run)

Because the bootstrap is so slow, I would like future versions of
Guile to include correct prebuilt/32-bit-big-endian files, but I don't
know how to make such files.

-- 
George Koehler <address@hidden>




