bug-hurd

pfinet & ftp & hang, round two (much simpler case)


From: Marcus Brinkmann
Subject: pfinet & ftp & hang, round two (much simpler case)
Date: Thu, 7 Jun 2001 01:41:51 +0200
User-agent: Mutt/1.3.18i

Hi,

I managed to isolate the test case a bit.

What I did was comment out everything but the ftpd entry in inetd.conf.
Then I ran "inetd -d" in one screen and initiated an ftp connection in
another.  Before logging in there, I interrupted inetd with ^C, as it had
done its job.

Attaching gdb and getting the thread backtraces shows some interesting
things:

* All (or most?) io_select() calls actually came from inetd.  There were no
  io_select() calls in any thread's backtrace.

* It's now glaringly obvious that the hang is in io_read(), not io_select().
  I think we can drop io_select from the equation, as it doesn't seem that
  select is ever called (at this stage).  I can only see two io_select calls
  in the rpctrace, and both are identifiable as coming from inetd (one at
  startup, one after initiating the ftp connection, to make it ready for the
  next one).

* In the gdb output, we can see only 2 (two) tcp_recvmsg calls among the
  usual suspects, both waiting on their respective condition at
  the time of the hang.  Those would belong to ftpd and ftp, respectively,
  wouldn't they?
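
For anyone who wants to repeat the count from the third point on their own
transcript, here is a minimal sketch.  It assumes the transcript holds the
output of gdb's "thread apply all bt", with "Thread N ..." header lines
followed by "#K ..." frame lines.  The helper name threads_with_frame is
made up for illustration and is not part of gdb or the Hurd.

```python
import re

# Assumed layout of "thread apply all bt" output:
#   Thread 3 (...):
#   #0  0x0123abcd in some_function (args) at file.c:42
#   #1  other_function (args) at file.c:99
THREAD_RE = re.compile(r"^Thread (\d+)\b")
FRAME_RE = re.compile(r"^#\d+\s+(?:0x[0-9a-fA-F]+ in )?([^\s(]+)")


def threads_with_frame(transcript, function):
    """Return the gdb thread numbers whose backtrace contains `function`."""
    hits = []
    current = None  # thread number of the backtrace being read
    for line in transcript.splitlines():
        header = THREAD_RE.match(line)
        if header:
            current = int(header.group(1))
            continue
        frame = FRAME_RE.match(line)
        if frame and current is not None and frame.group(1) == function:
            if current not in hits:
                hits.append(current)
    return hits
```

Calling threads_with_frame(text, "tcp_recvmsg") on the text of the gdb
transcript gives the threads sitting in tcp_recvmsg, and the same call with
"io_select" checks the first point.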

I have repeated this many times, and it is entirely reproducible.  Sometimes
the io_read() returns after detaching gdb, sometimes it returns all by
itself, and most of the time it doesn't return within the time span I gave
it (not very long).
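
To put numbers on "sometimes" and "most of the time", something like the
following could be run from another machine.  It is only a sketch: it
assumes the hang is visible to the client as a server greeting that never
arrives, which the observations above do not establish, and the function
names, the port default and the timeout are my own choices.

```python
import socket


def probe_greeting(host, port=21, timeout=10.0):
    """Make one connection and classify what happens to the first read.

    Returns "greeting" if any bytes arrive, "closed" if the peer closes
    without sending, and "hang" if nothing arrives within `timeout` seconds.
    Connection failures are not caught and raise OSError.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        try:
            data = sock.recv(512)
        except socket.timeout:
            return "hang"
    return "greeting" if data else "closed"


def tally(host, port=21, attempts=20, timeout=10.0):
    """Repeat the probe and count each outcome."""
    counts = {"greeting": 0, "hang": 0, "closed": 0}
    for _ in range(attempts):
        counts[probe_greeting(host, port, timeout)] += 1
    return counts
```

A longer timeout separates a read that merely returns late from one that
does not return at all within the window.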

Side note:  We probably have another type of pfinet bug, where select dies
with EIEIO, but I'd like to tackle them one at a time.  If done carefully,
the above sequence is deterministic and results in a hang, not in an error.

There is also some more fun for kernel hackers.  When I ran pfinet in
rpctrace, attached gdb to it (continue), and then tried to telnet, rpctrace got
an assertion failure and pfinet died with EXC_BAD_ACCESS.  When I tried to
detach gdb from the crashing pfinet, I got a panic: thread_invoke.  How cute.
I didn't check if it is reproducible, because I was tracking the other bug,
and I didn't have any kernel debugging facilities enabled.

Attached are a somewhat messy rpctrace translog (search for [MB] for my
comments on how to read it) and a transcript of the gdb session.

Thanks,
Marcus

-- 
`Rhubarb is no Egyptian god.' Debian http://www.debian.org brinkmd@debian.org
Marcus Brinkmann              GNU    http://www.gnu.org    marcus@gnu.org
Marcus.Brinkmann@ruhr-uni-bochum.de
http://www.marcus-brinkmann.de

Attachment: translog
Description: Text document

Attachment: ftp.out
Description: Text document