Re: [Gcl-devel] Re: possible GCL/Windows compiler bug
From: Matt Kaufmann
Subject: Re: [Gcl-devel] Re: possible GCL/Windows compiler bug
Date: Wed, 13 Oct 2004 11:39:59 -0500
Thanks. I basically understand, but please see questions following "I can do
this", below.
Cc: address@hidden, address@hidden
From: Camm Maguire <address@hidden>
Date: 13 Oct 2004 12:14:53 -0400
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2
Content-Type: text/plain; charset=us-ascii
X-SpamAssassin-Status: No, hits=-2.6 required=5.0
X-UTCS-Spam-Status: No, hits=-332 required=180
Greetings!
Matt Kaufmann <address@hidden> writes:
> Hi
>
> Here is some information on the bug. I started up ACL2 under gdb as follows,
> using an ACL2 build in which SGC was never turned on.
>
> gdb ../../nosgc-saved_acl2.gcl.exe
> r
>
> Then I submitted the following three commands (these are close to the simplest
> ones I can imagine that invoke the bug):
>
> (f-put-global 'safe-mode t state)
> :q
> (ACL2_*1*_ACL2::MATCH-CLAUSE 'DCL '(& . &)
> '(T))
>
> Here is gdb output. Please let me know if you need anything else. You
> mentioned "--enable-debug passed to gcl" -- Do I need to rebuild gcl, or is
> this a gcl command-line option? If I need to rebuild gcl, I'd appreciate any
> tips (I've rarely done this and never with mingw).
>
> I'd also be _very_ interested in knowing if the backtrace below rules out the
> possibility of an ACL2 bug.
>
Don't know yet.
> GNU gdb 5.2.1
> Copyright 2002 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "i686-pc-mingw32"...
> (gdb) r
> Starting program: c:\matt\acl2\v2-9\books\misc/../../nosgc-saved_acl2.gcl.exe
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x102dbbbc in ?? ()
> (gdb) bt
OK, so your fault address is 0x102dbbbc. Look into your build log,
and find the start address at which this function was loaded:
e.g.
=============================================================================
(defun foo (x) x)
FOO
>(compile 'foo)
Compiling gazonk0.lsp.
End of Pass 1.
End of Pass 2.
OPTIMIZE levels: Safety=0 (No runtime error checking), Space=0, Speed=3
Finished compiling gazonk0.lsp.
Loading gazonk0.o
start address -T 0x842bf60 Finished loading gazonk0.o
^^^^^^^^^
#<compiled-function FOO>
=============================================================================
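As a rough aid for the lookup described above, the following Python sketch scans a saved build log for GCL's "start address -T" lines and reports the load whose start address is the closest one at or below a given fault address. The function name find_module and the sample log are hypothetical, and the regular expression assumes the message format shown in the example above. The log gives no object sizes, so the result is only a candidate to check by hand.

```python
import re

# Assumed message format, taken from the example above:
#   "start address -T 0x842bf60 Finished loading gazonk0.o"
LOAD_RE = re.compile(r"start address -T (0x[0-9a-fA-F]+)\s+Finished loading (\S+)")


def find_module(log_text, fault_addr):
    """Return (start_address, object_name) for the load whose start address
    is the highest one not above fault_addr, or None if there is no such load."""
    best = None
    for match in LOAD_RE.finditer(log_text):
        start = int(match.group(1), 16)
        if start <= fault_addr and (best is None or start > best[0]):
            best = (start, match.group(2))
    return best


# Hypothetical usage: the log line is the one from the example above,
# and the address being looked up is made up for illustration.
sample_log = (
    "Loading gazonk0.o\n"
    "start address -T 0x842bf60 Finished loading gazonk0.o\n"
)
candidate = find_module(sample_log, 0x842bf70)
```

Here candidate names gazonk0.o because its start address is the nearest one below the address being looked up; an address below every logged start address yields None.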
Then at the gdb break above, do
(gdb) p/x *(char *)<start_address>@1024
(gdb) p/x *((char *)<start_address>+1024)@1024
etc. until you pass over 0x102dbbbc, and send me the output.
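The number of p/x commands needed follows from the distance between the start address and the fault address. The Python sketch below generates the command list in 1024-byte pieces, in the same form as the two commands above. The function name dump_commands and the start address used in the usage lines are hypothetical; only the fault address 0x102dbbbc comes from this thread.

```python
CHUNK = 1024  # bytes printed per p/x command, as in the commands above


def dump_commands(start, fault):
    """Return the gdb p/x commands that cover memory from start up to and
    including the byte at fault, CHUNK bytes per command."""
    if fault < start:
        raise ValueError("fault address lies below the start address")
    n_chunks = (fault - start) // CHUNK + 1
    commands = []
    for i in range(n_chunks):
        offset = i * CHUNK
        if offset == 0:
            commands.append(f"p/x *(char *){start:#x}@{CHUNK}")
        else:
            commands.append(f"p/x *((char *){start:#x}+{offset})@{CHUNK}")
    return commands


# Hypothetical start address; the real one has to come from the build log.
example_start = 0x102DB000
example_fault = 0x102DBBBC  # fault address reported by gdb in this thread
example_commands = dump_commands(example_start, example_fault)
```

With this made-up start address the fault lies 3004 bytes into the object, so three commands are generated and the last one covers the faulting byte.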
Then dump the source of your function into a file, compile-file with
:c-file t, and send me the .c and the objdump -d of the .o
I can do this, but note that if I (re-)compile the function after starting up
ACL2 then the problem goes away. Do you want me to compile in the same ACL2
session, after the p/x steps above, or should I re-start gdb and ACL2?
Is "objdump -d foo.o" something I type inside gdb? Will gdb then print some
output to the screen, and that's what you want?
Hope this is not too tedious, but if we can complete this, it should
let us know what is going on.
As for debugging, to get complete information, one must rebuild gcl
with --enable-debug given at gcl compile time. There should be a file
readme.mingw with instructions. However, in this case, we know the
failure to be in an acl2 function loaded in the heap. You can
probably save time therefore by rebuilding acl2 only with
compiler::*c-debug* set to t. While you are at it, also set
compiler::*default-c-file* to t. If the fault persists, we can then
pinpoint it exactly to a line in the C.
OK, I'll rebuild ACL2 as suggested (I already set compiler::*default-c-file*
but I'll set compiler::*c-debug* as well). My laptop is at home, so it won't
be till tonight.
Thanks --
-- Matt
Take care,
> #0 0x102dbbbc in ?? ()
> #1 0x0041acab in quick_call_sfun ()
> #2 0x00419953 in eval ()
> #3 0x0041a93d in fLeval ()
> #4 0x0042c4aa in c_apply_n ()
> #5 0x004421c1 in IapplyVector ()
> #6 0x0041897a in funcall ()
> #7 0x0051e95e in LI1 ()
> #8 0x0041ac32 in quick_call_sfun ()
> #9 0x00418912 in funcall ()
> #10 0x00442329 in IapplyVector ()
> #11 0x00419e3d in fLfuncall ()
> #12 0x0042c4aa in c_apply_n ()
> #13 0x004421c1 in IapplyVector ()
> #14 0x0041897a in funcall ()
> #15 0x00419953 in eval ()
> #16 0x004184c3 in funcall ()
> #17 0x00419953 in eval ()
> #18 0x004184c3 in funcall ()
> #19 0x004027a1 in main ()
> (gdb)
>
> Thanks --
> -- Matt
> Cc: address@hidden, address@hidden
> From: Camm Maguire <address@hidden>
> Date: 12 Oct 2004 17:26:51 -0400
> User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2
> Content-Type: text/plain; charset=us-ascii
> X-SpamAssassin-Status: No, hits=-2.6 required=5.0
> X-UTCS-Spam-Status: No, hits=-332 required=180
>
> Greetings! Bugs like this are often elusive. I suggest seeing if you
> can reproduce simply with sgc off. If so, rerun the error under gdb,
> and send me the backtrace. If not, leave sgc on in the gdb run, do
> the 'handle SIGSEGV' to avoid stopping at every fault, then break
> at sgbc.c:1626 and segmentation_catcher, run, reproduce the error, and
> send the backtrace. Try to find the address of the fault, and if
> possible, cross reference it with one of the addresses printed out on
> object module load.
>
> After this, to save time, try building acl2 again *somewhere else,
> i.e. save this build*, with --enable-debug passed to gcl. See if the
> bug persists. If so, more info will be available from the above when
> run here.
>
> Take care,
>
> Matt Kaufmann <address@hidden> writes:
>
> > Here is some additional information that I think will affect how you want to
> > proceed.
> >
> > First of all, the offending function, ACL2_*1*_ACL2::MATCH-CLAUSE, isn't
> > defined in basis.lisp (unlike ACL2::MATCH-CLAUSE, which _is_ defined in
> > basis.lisp). Rather, the definition, def, of ACL2_*1*_ACL2::MATCH-CLAUSE is
> > generated and compiled on the fly during the build, with (eval def) and
> > (eval `(compile ',(cadr def))).
> >
> > So, loading basis.o presumably won't help. Instead, I hacked the build
> > procedure so that for ACL2_*1*_ACL2::MATCH-CLAUSE, we write out a file
> > my-debug.lisp:
> >
> > ===============================================================================
> > (in-package "ACL2")
> >
> > (DEFUN ACL2_*1*_ACL2::MATCH-CLAUSE (X PAT FORMS)
> > (COND
> > ((F-GET-GLOBAL 'SAFE-MODE *THE-LIVE-STATE*)
> > (RETURN-FROM ACL2_*1*_ACL2::MATCH-CLAUSE
> > (MV-LET (TESTS BINDINGS)
> > (ACL2_*1*_ACL2::MATCH-TESTS-AND-BINDINGS X PAT NIL NIL)
> > (LIST (IF (ACL2_*1*_COMMON-LISP::NULL TESTS) T
> > (CONS 'AND
> > (ACL2_*1*_COMMON-LISP::REVERSE TESTS)))
> > (CONS 'LET
> > (CONS (ACL2_*1*_COMMON-LISP::REVERSE
> > BINDINGS)
> > FORMS)))))))
> > (MATCH-CLAUSE X PAT FORMS))
> > ===============================================================================
> >
> > Instead of (eval `(compile ',(cadr def))), the hacked build code does the
> > following during the build:
> >
> > (compile-file "my-debug.lisp" :c-file t :h-file t)
> > (load "my-debug")
> >
> > I figured that this is what you'd need in order to carry out your plan.
> >
> > Jared Davis graciously ran this experiment. Unfortunately, with that small
> > change in the build procedure the problem goes away. (I suppose we could
> > change ACL2 to do this for all functions, but it seems unfortunate to do all
> > that unnecessary file io, and I wonder if that would mask some other issue.)
> >
> > So, how would you like to proceed?
> >
> > Thanks --
> > -- Matt
> > Cc: address@hidden, address@hidden
> > From: Camm Maguire <address@hidden>
> > Date: 12 Oct 2004 11:59:34 -0400
> > User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2
> > Content-Type: text/plain; charset=us-ascii
> > X-SpamAssassin-Status: No, hits=-2.6 required=5.0
> > X-UTCS-Spam-Status: No, hits=-322 required=180
> >
> > Greetings!
> >
> > Matt Kaufmann <address@hidden> writes:
> >
> > > Hi --
> > >
> > > Thanks very much for the quick reply! I have some questions.
> > >
> > > When I tried gdb on ACL2/linux saved_acl2.gcl (built with GCL 2.6.5), and executed
> > >
> > > gdb linux-gcl-saved_acl2.gcl
> > >
> > > then I got the following unfortunate result (where I edited out the pathname):
> > >
> > > GNU gdb 5.3
> > > Copyright 2002 Free Software Foundation, Inc.
> > > GDB is free software, covered by the GNU General Public License, and you are
> > > welcome to change it and/or distribute copies of it under certain conditions.
> > > Type "show copying" to see the conditions.
> > > There is absolutely no warranty for GDB. Type "show warranty" for details.
> > > This GDB was configured as "i686-pc-linux-gnu"...(no debugging symbols found)...
> > > (gdb) b fasload
> > > Function "fasload" not defined.
> > > (gdb) r
> > > Starting program: .../linux-gcl-saved_acl2.gcl
> > > (no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...
> > > Program received signal SIGSEGV, Segmentation fault.
> > > 0x400c5d45 in memset () from /lib/libc.so.6
> > > (gdb)
> > >
> >
> > Forgot you have sgc on. Do 'handle SIGSEGV nostop noprint' here and
> > continue with 'c'
> >
> > > Is it necessary to build ACL2 with some special settings somehow in order to
> > > get debug info?
> > >
> >
> > In general, yes, but we can save time by checking quickly if we can
> > pinpoint the error in the mentioned function. In general, we build
> > gcl with --enable-debug, but you can also get part of the way by
> > setting compiler::*c-debug* to t.
> >
> > BTW, looking at the C source, I noticed native gcl support for set-mv
> > and mv-ref. Nice to see these two projects so closely linked.
> >
> > If/when you rebuild, please also do so with compiler::*default-c-file*
> > set to t so we can keep the generated C source just in case. I
> > strongly doubt it is any different than what I can generate under
> > Linux.
> >
> > > In 4), how do you submit a Lisp LOAD command inside gdb? Also, in 2), how do
> >
> > Once you type 'r', you will have a lisp prompt.
> >
> > > we arrange that r will run the particular commands that triggered the break?
> > >
> >
> > Starting in 3) issue commands to lisp as normal to trigger the
> > error. 'r' just starts acl2.
> >
> > > General issue: Recall that when we re-compile the match-clause function, the
> > > error goes away. Doesn't that suggest that your approach won't trigger the
> > > error in 8)?
> > >
> >
> > Missed this somehow. If it does not trigger the error, then trigger
> > it from gdb running acl2/lisp as you know how, send the backtrace,
> > send objdump -d basis.o, find the address (printed) where basis.o was
> > loaded, find out the address where the fault occurs, and 'p/x *(char
> > *)<basis.o load address>@1024', 'p/x *((char *)<basis.o load
> > address>+1024)@1024', etc. until you print out the address of the
> > fault. You may have difficulty locating the fault address with sgc
> > on. See if you can retain the error with sgc turned off. If not,
> > then 'b sgbc.c:1626' and 'cond 1 fault_count > 300' (assuming the
> > breakpoint just created was numbered 1), and 'b segmentation_catcher'.
> > You can then see the fault address in the backtrace (bt) printed under
> > gdb.
> >
> > Take care,
> >
> > > Thanks --
> > > -- Matt
> > > Cc: address@hidden, "Mike Thomas" <address@hidden>,
> > > address@hidden
> > > From: Camm Maguire <address@hidden>
> > > Date: 12 Oct 2004 10:40:23 -0400
> > > User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2
> > > Content-Type: text/plain; charset=us-ascii
> > > X-SpamAssassin-Status: No, hits=-2.6 required=5.0
> > > X-UTCS-Spam-Status: No, hits=-332 required=180
> > >
> > > Greetings!
> > >
> > > OK, my suspicion lies with a difficulty in the windows relocation
> > > code, but I'm not yet certain. Here is how to proceed:
> > >
> > > 0) run saved_acl2 under gdb
> > >
> > > 1) b fasload
> > >
> > > 2) r
> > >
> > > 3) Put your match-clause function into a separate file, compile with
> > > compiler::*c-debug* set to t
> > >
> > > 4) (load "your_file.o")
> > >
> > > 5) gdb will break -- type 'finish'
> > >
> > > 6) Still at gdb prompt, look at start address printed by gcl to
> > > screen. Type 'add-symbol-file <your_file.o> <address>'
> > >
> > > 7) c
> > >
> > > 8) trigger the error
> > >
> > > 9) gdb will stop, print a backtrace with bt
> > >
> > > 10) In gdb, type 'p/x *(char *)<address>@1024'
> > >
> > > 11) In gdb, type 'shell', and then 'objdump -d your_file.o'
> > >
> > > 12) Send me the results
> > >
> > > Take care,
> > >
> > >
> > > Matt Kaufmann <address@hidden> writes:
> > >
> > > > [Resending -- I had a typo in the CC field that may have prevented delivery.]
> > > >
> > > > Hi --
> > > >
> > > > Help!? Sorry to bother you with this email, but I've gone about as far as I
> > > > know how with the problem described below (and I'll spare you the dead ends),
> > > > which I kind of suspect is a problem with GCL/Windows, but might instead be a
> > > > problem with ACL2. This is a rather long email; please feel free to ask for
> > > > any clarification.
> > > >
> > > > There appears to be a problem either with GCL on Windows or with ACL2. I'll
> > > > describe the symptom below. This symptom doesn't occur on Linux or Sun/Solaris
> > > > for GCL, Allegro CL, CMUCL, or CLISP, and I also haven't seen it on Linux with
> > > > Lispworks or on a Macintosh with OpenMCL; I've only seen it on GCL/Windows.
> > > > But I realize that it still could be a subtle problem with ACL2, so I think we
> > > > need to wait on the ACL2 release until we determine whether or not it's a
> > > > GCL/Windows problem.
> > > >
> > > > Jared Davis was kind enough to submit the commands below (at the end of this
> > > > email) while standing in directory books/misc/ of the ACL2 distribution, after
> > > > building ACL2 on GCL/Windows 2.6.5. The result is a hard Lisp error if you do
> > > > *NOT* submit the compile form below: for a transcript, see
> > > > http://www.cs.utexas.edu/users/jared/test.log5 (or equivalently, on the UTCS
> > > > file system, /u/www/users/jared/test.log5). But *with* the compile form, the
> > > > problem goes away (see test.log4 in the same directory). We verified that the
> > > > definition being compiled is exactly the same as the one compiled when building
> > > > ACL2. (I'll explain how if you're interested -- we could insert a call of
> > > > disassemble during the build if you think that would be helpful.)
> > > >
> > > > You can see the result of :bt and :ihs on a failed run, where some source
> > > > functions are run interpreted (but this doesn't avoid the error since the
> > > > offending function ACL2_*1*_ACL2::MATCH-CLAUSE is still run compiled), in
> > > > test.log in that same directory. In a moment I'll forward you a related log
> > > > with that info (and also the result of :bl), in case you prefer to see it by
> > > > email.
> > > >
> > > > By the way, all of the failures Jared came across during the regression run
> > > > were during macroexpansion of ACL2 macro case-match, which calls
> > > > match-clause-list, which calls match-clause -- actually the ACL2 macroexpansion
> > > > mechanism causes a call of ACL2_*1*_ACL2::MATCH-CLAUSE-LIST, which calls
> > > > ACL2_*1*_ACL2::MATCH-CLAUSE.
> > > >
> > > > Also by the way, even if you leave off the compile form below but you add
> > > > (si::use-fast-links nil), the problem goes away. That seems odd to me so I
> > > > thought I should mention it.
> > > >
> > > > Here are the commands after starting up ACL2. Again, omit the compile form to
> > > > see the error -- even though the compile form should be a no-op!
> > > >
> > > > (rebuild "defpun.lisp" 'arbitrary-tail-recursive-encap)
> > > > :q
> > > > (compile
> > > > (DEFUN ACL2_*1*_ACL2::MATCH-CLAUSE (X PAT FORMS)
> > > > (COND
> > > > ((F-GET-GLOBAL 'SAFE-MODE *THE-LIVE-STATE*)
> > > > (RETURN-FROM
> > > > ACL2_*1*_ACL2::MATCH-CLAUSE
> > > > (MV-LET (TESTS BINDINGS)
> > > > (ACL2_*1*_ACL2::MATCH-TESTS-AND-BINDINGS X PAT NIL NIL)
> > > > (LIST (IF (ACL2_*1*_COMMON-LISP::NULL TESTS) T
> > > > (CONS 'AND
> > > > (ACL2_*1*_COMMON-LISP::REVERSE TESTS)))
> > > > (CONS 'LET
> > > > (CONS (ACL2_*1*_COMMON-LISP::REVERSE
> > > > BINDINGS)
> > > > FORMS)))))))
> > > > (MATCH-CLAUSE X PAT FORMS)))
> > > > (lp)
> > > > (defun remove-xargs-domain-and-measure (dcl)
> > > > (case-match dcl
> > > > (('declare ('xargs ':domain dom-expr
> > > > ':measure measure-expr
> > > > . rest))
> > > > (mv nil dom-expr measure-expr rest))
> > > > (('declare ('xargs ':gdomain dom-expr
> > > > ':measure measure-expr
> > > > . rest))
> > > > (mv t dom-expr measure-expr rest))
> > > > (& (mv nil nil 0 nil))))
> > > >
> > > > Thanks much --
> > > > -- Matt
> > > >
> > > >
> > > >
> > >
> > > --
> > > Camm Maguire address@hidden
> > > ==========================================================================
> > > "The earth is but one country, and mankind its citizens." -- Baha'u'llah
> > >
> > >
> > > _______________________________________________
> > > Gcl-devel mailing list
> > > address@hidden
> > > http://lists.gnu.org/mailman/listinfo/gcl-devel
--
Camm Maguire address@hidden
==========================================================================
"The earth is but one country, and mankind its citizens." -- Baha'u'llah