bug-grub

Re: "grub" command works, but GRUB boot loader hangs


From: Ben Liblit
Subject: Re: "grub" command works, but GRUB boot loader hangs
Date: Wed, 08 Aug 2001 03:22:33 -0700

Okuji Yoshinori wrote:
> Debugging is, IMO, a general process

OK, OK, you've shamed me into it.  I have added a number of debugging
print statements to GRUB's stage2.  I have only one machine at this
location, and the problems only appear when using GRUB as a boot
loader, so my compile-test cycle is painfully long.  But hopefully
some of this new information will be useful.  As before, I would
appreciate any suggestions for more specific things to look at.

Please consider the following:


Eleven second delay on first (fd1) detection
--------------------------------------------

At the GRUB prompt I type "geometry (fd" and press <tab>.  This
initiates several calls to get_diskinfo().  When get_diskinfo() is
called with drive == 1, some interesting things happen.

get_diskinfo() calls get_diskinfo_standard(1, ...), which immediately
returns error code 96.  I find the 96 noteworthy, because this same
call returns error code 1 for drive values 2 - 7.  For drive 1 and
drive 1 only, it returns 96.

Since get_diskinfo_standard(1, ...) returned nonzero, get_diskinfo()
now tries calling get_diskinfo_floppy(1, ...).  This call takes eleven
seconds to return, and ultimately returns error code 1.  Recall that
this machine has only one floppy drive; the failure to detect an (fd1)
drive is correct, but the eleven second delay is at the very least
undesirable.
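
To make the fallback order described above concrete, here is a
minimal sketch in C.  It is not GRUB's actual code: probe_standard()
and probe_floppy() are hypothetical stand-ins for
get_diskinfo_standard() and get_diskinfo_floppy(), hard-wired to
return the error codes I saw on the first probe, and the
eleven-second delay is not modelled.

```c
#include <stdio.h>

/* Hypothetical stand-in for get_diskinfo_standard(): returns the error
   codes observed on the first probe (0 = success). */
static int probe_standard(int drive)
{
    if (drive == 0)
        return 0;
    return drive == 1 ? 96 : 1;
}

/* Hypothetical stand-in for get_diskinfo_floppy(): only drive 0 exists. */
static int probe_floppy(int drive)
{
    return drive == 0 ? 0 : 1;
}

/* Try the standard probe first; fall back to the floppy probe only if
   the standard probe reported an error.  Returns the final error code. */
int probe_drive(int drive)
{
    int err = probe_standard(drive);
    if (err != 0)
        err = probe_floppy(drive);
    return err;
}

/* Count the drives in [0, max_drives) whose probe succeeds, printing
   each result the way a debugging statement might. */
int count_floppies(int max_drives)
{
    int found = 0;
    for (int drive = 0; drive < max_drives; drive++) {
        int err = probe_drive(drive);
        printf("(fd%d): error code %d\n", drive, err);
        if (err == 0)
            found++;
    }
    return found;
}
```

Calling count_floppies(8) walks drives 0 - 7 in the same order as the
<tab> completion does.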

Calls to get_diskinfo() for drive numbers 2 - 7 execute quickly and
return the expected error codes without delay.  The <tab> completion
observes that only one floppy disk has been detected and completes my
command as "geometry (fd0)".


Detection of (fd1) on subsequent probes
---------------------------------------

I now erase my command line and do the same thing again: type
"geometry (fd" and press <tab>.  As before, this initiates several
calls to get_diskinfo().  And again, when get_diskinfo() is called
with drive == 1, some interesting things happen.

This time around, the call to get_diskinfo_standard(1, ...) returns
without delay and reports no error.  It claims to have detected a
drive with C/H/S geometry of 40/2/9.

Probes for drives 2 - 7 correctly fail as before.  The completion code
offers me a choice of (fd0) and (fd1).

All subsequent probes remain the same.  They continue to detect an
(fd1) with 40/2/9 C/H/S geometry.  Recall that this machine has only
one floppy drive; correct behavior would be to not detect an (fd1)
drive.


Questionable geometry detected for (hd0)
----------------------------------------

I'm not convinced that the following is actually incorrect behavior.
I report it here for the sake of completeness.

I type "geometry (hd0,0)" and press <enter>.  This eventually calls
get_diskinfo().  From the debugging statements I have there, I can
tell the following:

   - get_diskinfo_int13_extensions() returns no error, and claims
     C/H/S geometry of 0/0/0 with 17873039 total sectors

   - get_diskinfo_standard() returns no error, and claims C/H/S
     geometry of 1023/255/63

Please note that "fdisk -l" claims that this same drive has C/H/S
geometry of 1112/255/63, which does not agree with the information
returned by get_diskinfo_standard().  The Linux kernel claims that
this disk has 17873040 total sectors, exactly one more than the total
claimed by get_diskinfo_int13_extensions().
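
The arithmetic behind these numbers can be checked with a small
sketch.  The helper names below are mine, not GRUB's.  Dividing the
kernel's sector count by 255 * 63 and discarding the remainder gives
1112, matching fdisk.  The value 1023 happens to be the largest
number that fits in a 10-bit cylinder field such as the one used by
the legacy INT 13h interface; whether that limit is what caps the
value here is only an assumption on my part.

```c
#include <stdint.h>

/* Total sectors addressable with a given C/H/S geometry. */
uint32_t chs_total_sectors(uint32_t cylinders, uint32_t heads,
                           uint32_t sectors_per_track)
{
    return cylinders * heads * sectors_per_track;
}

/* Number of whole cylinders that fit in a given total sector count;
   any partial cylinder at the end is discarded. */
uint32_t whole_cylinders(uint32_t total_sectors, uint32_t heads,
                         uint32_t sectors_per_track)
{
    return total_sectors / (heads * sectors_per_track);
}
```

With these helpers, whole_cylinders(17873040, 255, 63) is 1112,
chs_total_sectors(1112, 255, 63) is 17864280, and
chs_total_sectors(1023, 255, 63) is 16434495.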


Hang on access of (hd0,0)
-------------------------

I type "geometry (hd0,0)" and press <enter>.  The last thing that
geometry_func() does is to call real_open_partition(1).  This function
eventually enters a loop guarded by calls to next(), which calls
next_partition(), which calls next_pc_slice(), which calls rawread()
with the following arguments:

        drive == 0x80
        sector == 0
        byte_offset == 0
        byte_len == 512
        buf == (some pointer value)

As near as I can tell, the first call to rawread() never returns.  I
have not yet had time to instrument rawread().  That seems like the
obvious next step: to determine exactly where we hang, either within
rawread() itself or in something that rawread() calls.
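
The kind of instrumentation I have in mind is sketched below, as a
hosted C program rather than as stage2 code.  TRACE, trace_at() and
demo_read() are all hypothetical names; demo_read() is only a
stand-in and does not reproduce rawread()'s real logic.  Inside
stage2 the output would presumably have to go through GRUB's own
console printing rather than stdio.  The point is to emit a numbered
checkpoint, flushed immediately, so that the last line on the screen
brackets the place where execution stops.

```c
#include <stdio.h>

/* Number of checkpoints reached so far. */
static unsigned trace_hits;

/* Record a checkpoint tagged with the calling function and line. */
#define TRACE(msg) trace_at(__func__, __LINE__, (msg))

/* Print one numbered checkpoint and flush at once, so the message is
   visible even if the very next statement hangs. */
void trace_at(const char *func, int line, const char *msg)
{
    trace_hits++;
    fprintf(stderr, "[%u] %s:%d: %s\n", trace_hits, func, line, msg);
    fflush(stderr);
}

/* Hypothetical stand-in for a sector read routine, with a checkpoint
   before and after each step that might hang. */
int demo_read(int drive, int sector)
{
    TRACE("entered");
    if (drive < 0x80 || sector < 0) {
        TRACE("rejecting arguments");
        return 1;
    }
    TRACE("about to issue the read");
    /* The real BIOS call would go here. */
    TRACE("read returned");
    return 0;
}
```

If "about to issue the read" appears but "read returned" never does,
the hang is inside whatever sits between those two checkpoints.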


