bug-guix
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#58320: Hurd VM fails to boot on AMD EPYC (kvm-amd)


From: Ludovic Courtès
Subject: bug#58320: Hurd VM fails to boot on AMD EPYC (kvm-amd)
Date: Mon, 17 Oct 2022 14:51:01 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.1 (gnu/linux)

Hi,

Ludovic Courtès <ludo@gnu.org> skribis:

> … so ‘exec_load’ is doing its job, it seems.

Turns out that may not be the case.

Here’s a *bad* mapping on the second ‘task_resume’ breakpoint (when
‘exec’ is about to start):

--8<---------------cut here---------------start------------->8---
  db> show all threads
      TASK        THREADS
    0 gnumach (f5f7cf00): 7 threads:
                0 (f5f7be18) .W..N. 0xc11dac04
                1 (f5f7bcd0) R..O..(idle_thread_continue)
                2 (f5f7bb88) .W.ON.(reaper_thread_continue) 0xc12015d4
                3 (f5f7ba40) .W.ON.(swapin_thread_continue) 0xc11f8e2c
                4 (f5f7b8f8) .W.ON.(sched_thread_continue) 0
                5 (f5f7b7b0) .W.ON.(io_done_thread_continue) 0xc1201f74
                6 (f5f7b668) .W.ON.(net_thread_continue) 0xc11db0a8
    1 ext2fs (f5f7ce40): 6 threads:
                0 (f5f7b520) R....F
                1 (f5f7b290) .W.O..(mach_msg_receive_continue) 0
                2 (f5f7b148) .W.O..(mach_msg_receive_continue) 0
                3 (f5f7b000) .W.O..(mach_msg_continue) 0
                4 (f67d3e20) .W.O..(mach_msg_receive_continue) 0
                5 (f67d3cd8) .W.O..(mach_msg_continue) 0
    2 exec (f5f7cd80): (f5f7b3d8) ..SO..(thread_bootstrap_return)
   db> trace
  task_resume(f593e010,fb7d9010,f5f73e80,c106972a)
  ipc_kobject_server(f593e000,3,18,0)+0x1eb
  mach_msg_trap(bffff4c0,3,18,20,8)+0x1703
  >>>>> user space <<<<<
  db> x/tbx 0xcbc 0xf5f7b3d8

  no memory is assigned to address 00000cbc
  0
  db> show map $map2
    Map 0xf5f6ff30: name="exec", pmap=0xf5f71fa8,ref=1,nentries=5
  size=290816,resident:225280,wired=0
  version=13
     map entry 0xf625ec08: start=0x0, end=0x1000
     prot=1/7/copy, object=0x0, offset=0x0
     map entry 0xf625ebb0: start=0x1000, end=0x26000
     prot=5/7/copy, object=0xf5f6ad70, offset=0x0
      Object 0xf5f6ad70: size=0x25000, 1 references
      37 resident pages, 0 absent pages, 0 paging ops
       memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82780
       uninitialized,temporary    internal,copy_strategy=0
       shadow=0x0 (offset=0x0),copy=0x0
     map entry 0xf625eb58: start=0x26000, end=0x34000
     prot=1/7/copy, object=0xf5f6ad20, offset=0x0
      Object 0xf5f6ad20: size=0xe000, 1 references
      14 resident pages, 0 absent pages, 0 paging ops
       memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82730
       uninitialized,temporary    internal,copy_strategy=0
       shadow=0x0 (offset=0x0),copy=0x0
     map entry 0xf625eb00: start=0x34000, end=0x37000
     prot=3/7/copy, object=0xf5f6acd0, offset=0x0
      Object 0xf5f6acd0: size=0x3000, 1 references
      3 resident pages,--db_more--
--8<---------------cut here---------------end--------------->8---

Compare with what a “good” mapping looks like at that same moment:

--8<---------------cut here---------------start------------->8---
  start ext2fs: Hurd server bootstrap: ext2fs[device:hd0s1]Kernel Breakpoint 
trap,
   eip 0xc1030d5b
  Breakpoint at  task_resume:     pushl   %ebp
  db> show all threads
      TASK        THREADS
    0 gnumach (f5f7cf00): 7 threads:
                0 (f5f7be18) .W..N. 0xc11dac04
                1 (f5f7bcd0) R..O..(idle_thread_continue)
                2 (f5f7bb88) .W.ON.(reaper_thread_continue) 0xc12015d4
                3 (f5f7ba40) .W.ON.(swapin_thread_continue) 0xc11f8e2c
                4 (f5f7b8f8) .W.ON.(sched_thread_continue) 0
                5 (f5f7b7b0) .W.ON.(io_done_thread_continue) 0xc1201f74
                6 (f5f7b668) .W.ON.(net_thread_continue) 0xc11db0a8
    1 ext2fs (f5f7ce40): 6 threads:
                0 (f5f7b520) R....F
                1 (f5f7b290) .W.O..(mach_msg_receive_continue) 0
                2 (f5f7b148) .W.O..(mach_msg_receive_continue) 0
                3 (f5f7b000) .W.O..(mach_msg_continue) 0
                4 (f67d2e20) .W.O..(mach_msg_receive_continue) 0
                5 (f67d2cd8) .W.O..(mach_msg_continue) 0
    2 exec (f5f7cd80): (f5f7b3d8) ..SO..(thread_bootstrap_return)
  db> x/tbx 0xcbc 0xf5f7b3d8
                  8
  db> show map $map2
  Map 0xf5f6ff30: name="exec", pmap=0xf5f71fa8,ref=1,nentries=5
  size=290816,resident:229376,wired=0
  version=14
   map entry 0xf625ec08: start=0x0, end=0x1000
   prot=1/7/copy, object=0xf5f6ad70, offset=0x0
    Object 0xf5f6ad70: size=0x1000, 1 references
    1 resident pages, 0 absent pages, 0 paging ops
     memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82780
     uninitialized,temporary      internal,copy_strategy=0
     shadow=0x0 (offset=0x0),copy=0x0
   map entry 0xf625ebb0: start=0x1000, end=0x26000
   prot=5/7/copy, object=0xf5f6ad20, offset=0x0
    Object 0xf5f6ad20: size=0x25000, 1 references
    37 resident pages, 0 absent pages, 0 paging ops
     memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82730
     uninitialized,temporary      internal,copy_strategy=0
     shadow=0x0 (offset=0x0),copy=0x0
   map entry 0xf625eb58: start=0x26000, end=0x34000
   prot=1/7/copy, object=0xf5f6acd0, offset=0x0
    Object 0xf5f6acd0: size=0xe000, 1 references
    14 resident pages, 0 absent pages, 0 paging ops
     memory object=0x0 (offset=0x0),control=0x0, name=0xf5f826e0
     uninitialized,temporary      internal,copy_strategy=0
     shadow=0x0 (offset=0x0),copy=0x0
   map entry 0xf625eb00: start=0x34000, end=0x37000
   prot=3/7/copy, object=0xf5f6ac80, offset=0x0
    Object 0xf5f6ac80: size=0x3000, 1 references
    3 resident pages, 0 absent pages, 0 paging ops
     memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82690
     uninitialized,temporary      internal,copy_strategy=0
     shadow=0x0 (offset=0x0),copy=0x0
   map entry 0xf625eaa8: start=0xbfff0000, end=0xc0000000
   prot=3/7/copy, object=0xf5f6ac30, offset=0x0
    Object 0xf5f6ac30: size=0x10000, 1 references
    1 resident pages, 0 absent pages, 0 paging ops
     memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82640
     uninitialized,temporary      internal,copy_strategy=0
     shadow=0x0 (offset=0x0),copy=0x0
--8<---------------cut here---------------end--------------->8---

Notice that 0xcbc reads a valid relocation, where 8 = R_386_RELATIVE.

In the “bad” case, the first map entry is empty, with no associated
memory object and zero resident pages.

My reading of ‘read_exec’ is that the page is supposed to be populated
eagerly by the ‘copyout’ call here:

--8<---------------cut here---------------start------------->8---
static int
read_exec(void *handle, vm_offset_t file_ofs, vm_size_t file_size,
                     vm_offset_t mem_addr, vm_size_t mem_size,
                     exec_sectype_t sec_type)
{
  struct multiboot_module *mod = handle;

[...]

        err = vm_allocate(user_map, &start_page, end_page - start_page, FALSE);
        assert(err == 0);
        assert(start_page == trunc_page(mem_addr));

        if (file_size > 0)
        {
                err = copyout((char *)phystokv (mod->mod_start) + file_ofs,
                              (void *)mem_addr, file_size);
                assert(err == 0);
        }

[...]

        return 0;
}
--8<---------------cut here---------------end--------------->8---

There are interesting tricks in ‘copyout_retry’ to fake a page fault so
the copy can actually be made, IIUC.

Could it be that this bit isn’t quite working?

Ideas?

Problem with debugging this is that setting a breakpoint on ‘exec_load’
causes the system to boot fine (breaking on ‘task_resume’ is fine tough,
go figure…).

Ludo’.





reply via email to

[Prev in Thread] Current Thread [Next in Thread]