From: i336_
Subject: [bug #58555] grub-probe not identifying ZFS root filesystem due to unsupported features
Date: Sat, 13 Jun 2020 04:33:15 -0400 (EDT)
User-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.162 Safari/537.36

URL:
  <https://savannah.gnu.org/bugs/?58555>

                 Summary: grub-probe not identifying ZFS root filesystem due to unsupported features
                 Project: GNU GRUB
            Submitted by: i336_
            Submitted on: Sat 13 Jun 2020 08:33:14 AM UTC
                Category: Filesystem
                Severity: Major
                Priority: 5 - Normal
              Item Group: None
                  Status: None
                 Privacy: Public
             Assigned to: None
         Originator Name: 
        Originator Email: 
             Open/Closed: Open
                 Release: 
                 Release: Git master
         Discussion Lock: Any
         Reproducibility: Every Time
         Planned Release: None

    _______________________________________________________

Details:

Hi all,

I've just set up a simple mirrored ZFS-on-root Debian configuration with /boot
on a FAT32 partition. I'm temporarily using BIOS booting with grub-pc and a
1MB grub_boot partition, and plan to move the pool/disks to an EFI system
later.

Attempting to configure GRUB for this scenario produced a /boot/grub/grub.cfg
with entries like

  linux /vmlinuz-5.5.0-0.bpo.2-amd64 root=ZFS=/debian ro

which completely failed to boot due to the missing pool name.

Investigation revealed this was because grub-probe, called in
/etc/grub.d/10_linux to identify the ZFS root pool name
(http://git.savannah.gnu.org/cgit/grub.git/tree/util/grub.d/10_linux.in?id=6a34fdb76a07305b95e31659bc27b1d190101cbf#n76),
bailed out with "grub-probe: error: unknown filesystem".
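To illustrate how the entry ends up without a pool name, here is a minimal sketch. build_zfs_root is a hypothetical helper, not a function from 10_linux; it only assumes that the root= value is formed by concatenating the probed pool label and the dataset path.

```shell
# Hypothetical helper (not part of 10_linux): join a pool label and a
# dataset path into the value passed as root=.
build_zfs_root() {
    rpool=$1    # pool label as reported by grub-probe; empty if the probe failed
    bootfs=$2   # dataset path relative to the pool, e.g. /debian
    printf 'ZFS=%s%s\n' "$rpool" "$bootfs"
}

build_zfs_root tank /debian   # label found:  ZFS=tank/debian
build_zfs_root "" /debian     # probe failed: ZFS=/debian
```

With an empty label the result is exactly the unbootable ZFS=/debian seen in the generated grub.cfg.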

I incidentally configured multiple zpools in this setup, and idle curiosity
revealed that grub-probe was able to detect one of the auxiliary pools:

  # grub-probe -t fs_label -d /dev/sda3
  grub-probe: error: unknown filesystem.
  # grub-probe -t fs_label -d /dev/sda4
  pool-1

Enabling verbose output revealed that zfs.c was bailing due to unsupported
pool features:

  grub-core/fs/zfs/zfs.c:2115: zap: name = org.illumos:lz4_compress, value = 1, cd = 0
  grub-core/fs/zfs/zfs.c:2115: zap: name = com.joyent:multi_vdev_crash_dump, value = 0, cd = 0
  grub-core/fs/zfs/zfs.c:2115: zap: name = com.delphix:hole_birth, value = 1, cd = 0
  grub-core/fs/zfs/zfs.c:2115: zap: name = com.delphix:extensible_dataset, value = 1, cd = 0
  grub-core/fs/zfs/zfs.c:2115: zap: name = com.delphix:embedded_data, value = 1, cd = 0
  grub-core/fs/zfs/zfs.c:2115: zap: name = org.open-zfs:large_blocks, value = 0, cd = 0
  grub-core/fs/zfs/zfs.c:2115: zap: name = org.zfsonlinux:large_dnode, value = 1, cd = 0
  grub-core/kern/fs.c:78: zfs detection failed.

versus:

  grub-core/fs/zfs/zfs.c:2115: zap: name = org.illumos:lz4_compress, value = 1, cd = 0
  grub-core/fs/zfs/zfs.c:2115: zap: name = com.joyent:multi_vdev_crash_dump, value = 0, cd = 0
  grub-core/fs/zfs/zfs.c:2115: zap: name = com.delphix:hole_birth, value = 1, cd = 0
  grub-core/fs/zfs/zfs.c:2115: zap: name = com.delphix:extensible_dataset, value = 0, cd = 0
  grub-core/fs/zfs/zfs.c:2115: zap: name = com.delphix:embedded_data, value = 1, cd = 0
  grub-core/fs/zfs/zfs.c:2115: zap: name = org.open-zfs:large_blocks, value = 0, cd = 0
  grub-core/fs/zfs/zfs.c:2115: zap: name = org.zfsonlinux:large_dnode, value = 0, cd = 0
  grub-core/fs/zfs/zfs.c:2115: zap: name = org.illumos:sha512, value = 0, cd = 0
  grub-core/fs/zfs/zfs.c:2115: zap: name = org.illumos:skein, value = 0, cd = 0
  grub-core/fs/zfs/zfs.c:2115: zap: name = org.illumos:edonr, value = 0, cd = 0
  grub-core/fs/zfs/zfs.c:2115: zap: name = com.datto:bookmark_v2, value = 0, cd = 0
  grub-core/fs/zfs/zfs.c:2115: zap: name = com.datto:encryption, value = 0, cd = 0
  grub-core/fs/zfs/zfs.c:2115: zap: name = com.delphix:device_removal, value = 0, cd = 0
  grub-core/fs/zfs/zfs.c:2115: zap: name = , value = 0, cd = 0
  grub-core/fs/zfs/zfs.c:2115: zap: name = , value = 0, cd = 0
  pool-1
  grub-core/kern/disk.c:295: Closing `hostdisk//dev/vda'.
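
To compare two such traces quickly, the active features (value other than 0) can be pulled out with a small filter. This is only a sketch: active_features is a hypothetical helper, and it assumes each zap entry sits on a single line, i.e. with the mail line wrapping undone.

```shell
# Hypothetical helper: read grub-probe's verbose output on stdin and print
# the name of every feature whose value is not 0. Entries with an empty
# name do not match the pattern and are skipped.
active_features() {
    sed -n 's/^.*zap: name = \([^,][^,]*\), value = \([0-9][0-9]*\),.*$/\1 \2/p' |
    while read -r name value; do
        [ "$value" = "0" ] || printf '%s\n' "$name"
    done
}
```

Feeding it the verbose output for each device and diffing the two lists shows which active features differ between the pools.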

I did a bit of investigation and traced the failure: check_mos_features() calls
mzap_iterate() with the check_feature() callback, which grub_strcmp()s each
feature against a whitelist of known names
(http://git.savannah.gnu.org/cgit/grub.git/tree/grub-core/fs/zfs/zfs.c?id=6a34fdb76a07305b95e31659bc27b1d190101cbf#n285),
and (back in mzap_iterate()) the scan bails if any feature with a value of 1 is
not on that whitelist. Hence my conclusion above.
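
The check described above can be modelled roughly as follows. This is a shell sketch of the logic, not GRUB's C code: check_features and SUPPORTED are hypothetical names, and the whitelist contents are an assumption for illustration (the authoritative list is in zfs.c at the link above).

```shell
# Assumed whitelist, for illustration only; see zfs.c for the real one.
SUPPORTED="org.illumos:lz4_compress com.delphix:hole_birth \
com.delphix:embedded_data com.delphix:extensible_dataset \
org.open-zfs:large_blocks"

# Hypothetical model of the check: read "name value" pairs on stdin.
# Print the first feature that has a non-zero value but is not in
# SUPPORTED and return 1; return 0 if every such feature is known.
check_features() {
    while read -r name value; do
        # Features with value 0 never cause a failure.
        [ "$value" != "0" ] || continue
        case " $(echo $SUPPORTED) " in
            *" $name "*) ;;                          # known feature
            *) printf '%s\n' "$name"; return 1 ;;    # unknown and in use
        esac
    done
    return 0
}
```

Under that assumed whitelist, a pool reporting org.zfsonlinux:large_dnode with value 1 is rejected, while the same pool with value 0 passes, which matches the two traces above.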

Incidentally, the pools were created with identical parameters:

  zpool create -o ashift=12 -O acltype=posixacl -O canmount=off \
    -O compression=lz4 -O xattr=sa -O relatime=on -O dnodesize=auto \
    -o cachefile=none tank mirror \
    /dev/disk/by-partlabel/... /dev/disk/by-partlabel/...

  zpool create -o ashift=12 -O acltype=posixacl -O canmount=off \
    -O compression=lz4 -O xattr=sa -O relatime=on -O dnodesize=auto \
    -o cachefile=none pool-1 /dev/disk/by-partlabel/...

  zpool create -o ashift=12 -O acltype=posixacl -O canmount=off \
    -O compression=lz4 -O xattr=sa -O relatime=on -O dnodesize=auto \
    -o cachefile=none pool-2 /dev/disk/by-partlabel/...

(The first is a mirror across two disks, while the other two are
non-redundant, discrete pools on each disk that will provide sha256
checksumming.)

As a newcomer to ZFS, it _appears_ to me that the root pool automatically
enabled (?) the large_dnode attribute during installation because I have
dnodesize=auto. I may be incorrect. (pool-1 has not been mounted or touched
yet; I'm still setting everything up, etc.)
 
I draw a couple of conclusions.

As a comparatively minor niggle, grub-probe's exit status of 1 is not caught in
10_linux.
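
One way such a failure could be caught is sketched below. probe_label_or_fail is a hypothetical wrapper, not something that exists in 10_linux, and the error text is made up; the point is only that a failed or empty probe stops the script instead of silently yielding an empty pool name.

```shell
# Hypothetical wrapper: run the given probe command and print its output.
# Return 1 with a message on stderr if the command fails or prints nothing.
probe_label_or_fail() {
    if label=$("$@" 2>/dev/null) && [ -n "$label" ]; then
        printf '%s\n' "$label"
    else
        echo "error: could not determine ZFS pool label" >&2
        return 1
    fi
}

# Intended use (invocation taken from the commands shown earlier):
#   rpool=$(probe_label_or_fail grub-probe -t fs_label -d /dev/sda3) || exit 1
```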

I outline my understanding of the bigger problem thusly:

- GRUB's ZFS implementation cannot handle all ZFS features, which is why there
is a root-pool / boot-pool split.
- GRUB is calling grub-probe to supply Linux's root=ZFS=... parameter.
- grub-probe is using GRUB's known-limited zfs.c implementation to examine
the root pool, which is likely to have features enabled that GRUB does not
support.

Perhaps I'm missing something here. I hope so...

Me: _looks at every working GRUB 2 ZFS-on-root setup everywhere_
Me: _looks at the above, which says that everything should be catastrophically
broken_
Me: _explodes_

Let me know what additional information I can provide. This is an effectively
blank system, so you can reply with SSH pubkeys if that would
be helpful (the only caveat being 250ms RTT to .com.au).

Thanks in advance,

David Lindsay

NB. The non-head refs in the links point to the current Git master, for
posterity.

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?58555>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/



