cvs 1.11.5 pserver sig11 crash on FreeBSD 4.8-R

From: scott+cvsbug
Subject: cvs 1.11.5 pserver sig11 crash on FreeBSD 4.8-R
Date: Sun, 27 Jul 2003 18:09:40 +0100 (BST)

>Submitter-Id:   net
>Originator:     Scott Mitchell
>Organization:  net
>Confidential:  no
>Synopsis:      cvs pserver sig11 crash on FreeBSD 4.8-R
>Severity:      serious
>Priority:      medium
>Category:      cvs
>Class:         sw-bug
>Release:       1.11.5-FreeBSD
FreeBSD pukeko 4.8-RELEASE FreeBSD 4.8-RELEASE #0: Thu Apr  3 10:53:38 GMT 2003 
    root@freebsd-stable.sentex.ca:/usr/obj/usr/src/sys/GENERIC  i386


We recently moved our CVS repository from a FreeBSD 4.6-STABLE machine to a
brand new FreeBSD 4.8 install on a second, identical machine.  The server runs
cvs in 'pserver' mode, for remote access by various Windows, Solaris, Linux
and FreeBSD clients.

We soon noticed that the cvs server process was occasionally crashing
on sig11 (i.e. a segfault).  The only evidence for this was in the message
log; the cvs operations always completed normally on the client side.  This
*never* happened on the old server, so I figured it had to be a hardware
problem on the new machine, or some issue with 4.8.  The crash occurred in
roughly 1 of every 100 runs of the cvs server.

I compiled a debug version of cvs from the 4.8 sources and was able to get a
few cores, once I figured out how to make it actually dump core.  I've
attached the log of a gdb session on one of these -- all the cores I have
show the process crashing in the same place, in buf_shutdown(), where it's
clearly trying to follow a NULL pointer (buf=0x0).

I've since copied the cvs binary from the 4.6 machine across to the new
server -- we've run with this for the past two weeks and had exactly zero
problems with it.

Unexpected sig11's are often a sign of bad RAM or other hardware trouble,
but I've run numerous buildworlds on this machine with no problems, so I'm
doubtful that this is a hardware issue.  Brian Behlendorf <brian@collab.net>
has reported the same problem, also with no obvious hardware-related cause.
When I can schedule some downtime on this machine, I'm going to run the vendor
memory tests and memtest86 to get a definitive answer on the RAM question.

Given that all the cores are the same, and that the only thing we've seen
fail on this machine is the 4.8 cvs code, this smells like a cvs bug to me.
We were running 1.11.1p1-FreeBSD on the old machine for ~2 years, with no
problems whatsoever, so it seems to be something that has crept in between
then and 1.11.5-FreeBSD.  I've no idea if it's in the FreeBSD extensions or
the base cvs code, so I am reporting the bug to the FreeBSD project as well.

I can provide any additional configuration details or more grovelling in the
core dumps on request...



FreeBSD bug report: http://www.freebsd.org/cgi/query-pr.cgi?pr=54854

----- Attachment #1: gdb.log -----

Script started on Wed Jul 23 11:14:55 2003
pukeko# gdb `which cvs.debug` cvs.debug.81697.core
GNU gdb 4.18 (FreeBSD)
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...Deprecated bfd_read called at /usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 2627 in elfstab_build_psymtabs
Deprecated bfd_read called at /usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 933 in fill_symbuf

Core was generated by `cvs.debug'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/libgnuregex.so.2...done.
Reading symbols from /usr/lib/libmd.so.2...done.
Reading symbols from /usr/lib/libcrypt.so.2...done.
Reading symbols from /usr/lib/libz.so.2...done.
Reading symbols from /usr/lib/libc.so.4...done.
Reading symbols from /usr/libexec/ld-elf.so.1...done.
#0  buf_shutdown (buf=0x0)
    at /usr/src/gnu/usr.bin/cvs/cvs/../../../../contrib/cvs/src/buffer.c:1208
1208        if (buf->shutdown)
(gdb) where
#0  buf_shutdown (buf=0x0)
    at /usr/src/gnu/usr.bin/cvs/cvs/../../../../contrib/cvs/src/buffer.c:1208
#1  0x8087e2b in server_cleanup (sig=0)
    at /usr/src/gnu/usr.bin/cvs/cvs/../../../../contrib/cvs/src/server.c:4892
#2  0x805ec67 in error_exit ()
    at /usr/src/gnu/usr.bin/cvs/cvs/../../../../contrib/cvs/src/error.c:71
#3  0x805ef27 in error (status=1, errnum=0,
    message=0x80ab4b9 "received %s signal")
    at /usr/src/gnu/usr.bin/cvs/cvs/../../../../contrib/cvs/src/error.c:212
#4  0x806daae in main_cleanup (sig=13)
    at /usr/src/gnu/usr.bin/cvs/cvs/../../../../contrib/cvs/src/main.c:395
#5  0x80926e4 in strip_trailing_slashes ()
#6  0xbfbfffac in ?? ()
#7  0x804d85a in buf_send_output (buf=0x80c1040)
    at /usr/src/gnu/usr.bin/cvs/cvs/../../../../contrib/cvs/src/buffer.c:287
#8  0x804d900 in buf_flush (buf=0x80c1040, block=1)
    at /usr/src/gnu/usr.bin/cvs/cvs/../../../../contrib/cvs/src/buffer.c:352
#9  0x8087eb7 in server_cleanup (sig=0)
    at /usr/src/gnu/usr.bin/cvs/cvs/../../../../contrib/cvs/src/server.c:5007
#10 0x80883e2 in server (argc=1, argv=0xbfbffc88)
    at /usr/src/gnu/usr.bin/cvs/cvs/../../../../contrib/cvs/src/server.c:5234
#11 0x806e636 in main (argc=1, argv=0xbfbffc88)
    at /usr/src/gnu/usr.bin/cvs/cvs/../../../../contrib/cvs/src/main.c:1028
#12 0x804a67a in _start ()
(gdb) list
1204    int
1205    buf_shutdown (buf)
1206         struct buffer *buf;
1207    {
1208        if (buf->shutdown)
1209            return (*buf->shutdown) (buf);
1210        return 0;
1211    }
(gdb) quit
pukeko# exit

Script done on Wed Jul 23 11:15:28 2003
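
One possible reading of the backtrace above: server_cleanup() is part-way
through its work (frame #9, flushing output) when a signal 13 arrives, and
the signal path (main_cleanup -> error -> error_exit) calls server_cleanup()
a second time, which then hands a NULL buffer to buf_shutdown().  The sketch
below is a simplified, self-contained model of that pattern, not the cvs
source and not a proposed patch.  The struct layout, the names
buf_shutdown_guarded, in_buf, flush_output, cleanup and
demo_reentrant_cleanup, and the way the signal is simulated with a plain
function call are all assumptions made for illustration.

```c
#include <stddef.h>

/* Simplified stand-in for struct buffer; only the member that the
   crashing line dereferences is modelled (assumption). */
struct buffer
{
    int (*shutdown) (struct buffer *);
    int shutdown_calls;
};

/* Hypothetical guarded variant of buf_shutdown: a NULL buffer is
   treated as "already shut down" instead of being dereferenced. */
int
buf_shutdown_guarded (struct buffer *buf)
{
    if (buf == NULL)
        return 0;
    if (buf->shutdown)
        return (*buf->shutdown) (buf);
    return 0;
}

static struct buffer *in_buf;    /* illustrative global, cleared on cleanup */
static int simulate_sigpipe;     /* when set, the flush re-enters cleanup */

static void cleanup (void);

/* Shutdown callback that only counts how often it is invoked. */
static int
count_shutdown (struct buffer *buf)
{
    buf->shutdown_calls++;
    return 0;
}

/* Stand-in for flushing output to a client that has gone away.  The
   real process would receive a signal here; the model just calls the
   cleanup routine again, once, from inside the flush. */
static void
flush_output (void)
{
    if (simulate_sigpipe)
    {
        simulate_sigpipe = 0;
        cleanup ();              /* nested call, as in frames #4 to #1 */
    }
}

static void
cleanup (void)
{
    /* Step 1: shut down the input side and forget the pointer. */
    buf_shutdown_guarded (in_buf);
    in_buf = NULL;

    /* Step 2: flush pending output; this may re-enter cleanup(). */
    flush_output ();
}

/* Runs one cleanup with a simulated mid-flush signal and returns how
   many times the shutdown callback actually ran. */
int
demo_reentrant_cleanup (void)
{
    static struct buffer in = { count_shutdown, 0 };

    in.shutdown_calls = 0;
    in_buf = &in;
    simulate_sigpipe = 1;
    cleanup ();
    return in.shutdown_calls;
}
```

In this model demo_reentrant_cleanup() returns 1: the nested cleanup() sees
in_buf == NULL and returns quietly.  Without the NULL check in
buf_shutdown_guarded() the nested call would dereference a NULL pointer,
which matches the symptom in frame #0.  Whether a guard like this, a
re-entrancy flag in server_cleanup(), or blocking signals during cleanup is
the right fix for cvs itself is a separate question.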

Using the cvs binary from the 4.6 machine is the only workaround I've found so far.
