grub-devel
From: Vladimir 'phcoder' Serbinenko
Subject: Re: ZFS feature check too strict? Booting from a single pool alongside encrypted datasets works if we ignore check_mos_features
Date: Sun, 13 Aug 2023 17:42:39 +0200



On Sun, 13 Aug 2023 at 12:54, digitalsignalperson <andy@digitalsignalperson.com> wrote:

Hi, 

I've found that grub can boot from a single pool without explicitly disabling any ZFS features (as long as unimplemented features aren't active). Furthermore, disabling check_mos_features allows for booting from a pool where encrypted datasets may be present, but where grub only needs to read from unencrypted datasets. I'm wondering if a change can be made to relax these restrictions around ZFS feature checking. This would simplify many use cases and reduce the need to use multiple pools with grub.

Here's an example (zfs 2.1.12) where grub has no problem booting with kernels and initramfs from rpool/sys/BOOT/default as shown below:

zpool create -f \
    -o ashift=12 \
    -O acltype=posixacl \
    -O canmount=off \
    -O compression=lz4 \
    -O normalization=formD \
    -O relatime=on \
    -O atime=off \
    -O xattr=sa \
    -o autotrim=on \
    -O mountpoint=none \
    -o cachefile=none \
    -R ${ALTROOT} \
    ${RPOOLNAME} ${RPOOLDEV}

zfs create -o canmount=off -o mountpoint=none ${RPOOLNAME}/sys
zfs create -o canmount=off -o mountpoint=none ${RPOOLNAME}/sys/ROOT
zfs create -o canmount=noauto -o mountpoint=/ ${RPOOLNAME}/sys/ROOT/default
zfs create -o canmount=off -o mountpoint=none ${RPOOLNAME}/sys/BOOT
zfs create -o mountpoint=/boot ${RPOOLNAME}/sys/BOOT/default

I didn't think this was possible since all the ZFS-on-root guides follow the pattern of creating a bpool and rpool, where bpool will only enable specific features compatible with grub.

Let's change the dataset config so we encrypt everything except BOOT:

echo "12345678" | zfs create -o canmount=off -o mountpoint=none -o encryption=on -o keylocation=prompt -o keyformat=passphrase ${RPOOLNAME}/sys
zfs create -o canmount=off -o mountpoint=none ${RPOOLNAME}/sys/ROOT
zfs create -o canmount=noauto -o mountpoint=/ ${RPOOLNAME}/sys/ROOT/default
zfs create -o canmount=off -o mountpoint=none -o encryption=off ${RPOOLNAME}/sys/BOOT
zfs create -o canmount=noauto -o mountpoint=/boot ${RPOOLNAME}/sys/BOOT/default

Now grub will not be able to boot anything, even though grub.cfg only references files in sys/BOOT and not the encrypted datasets. If I remember right, it prints "unknown filesystem" when trying to load anything or run something like "ls (hd0,gpt4)" on the pool's filesystem. Also note that grub-mkconfig no longer emits the `insmod zfs` and `set root=` parts; adding them back manually, or using the F2 console with `set debug=zfs`, further confirms that grub is stuck.
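
To confirm which datasets in such a layout are actually unencrypted (and so are the only ones grub could read without a key), the `encryption` property can be listed per dataset. The following is a minimal sketch; the helper name `unencrypted_datasets` and the pool name `rpool` are assumptions for illustration, not part of the original setup.

```shell
# Print the names of datasets whose `encryption` property is "off".
# Expects tab-separated "name<TAB>value" lines on stdin, as produced by:
#   zfs get -r -H -o name,value encryption <pool>
# (helper name is hypothetical)
unencrypted_datasets() {
    awk -F '\t' '$2 == "off" { print $1 }'
}

# Example invocation (pool name "rpool" is an assumption):
#   zfs get -r -H -o name,value encryption rpool | unencrypted_datasets
```

With the layout above, one would expect the pool root, sys/BOOT, and sys/BOOT/default to be listed, and sys and sys/ROOT/default to be absent.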

But a small patch fixes this. By ignoring the result of check_mos_features() in zfs_mount(), grub-mkconfig creates a proper grub.cfg again, and I can successfully boot from the unencrypted rpool/sys/BOOT/default, while rpool/sys is the encryption root and rpool/sys/ROOT/default and everything else under it is encrypted.

diff --git a/grub-core/fs/zfs/zfs.c b/grub-core/fs/zfs/zfs.c
index 0e195db97..c08b367bd 100644
--- a/grub-core/fs/zfs/zfs.c
+++ b/grub-core/fs/zfs/zfs.c
@@ -3705,10 +3705,11 @@ zfs_mount (grub_device_t dev)
  if (ub->ub_version >= SPA_VERSION_FEATURES &&
      check_mos_features(&osp->os_meta_dnode, ub_endian, data) != 0)
    {
-      grub_error (GRUB_ERR_BAD_FS, "Unsupported features in pool");
-      grub_free (osp);
-      zfs_unmount (data);
-      return NULL;
+      grub_dprintf ("zfs", "Ignoring unsupported features\n");
+      // grub_error (GRUB_ERR_BAD_FS, "Unsupported features in pool");
+      // grub_free (osp);
+      // zfs_unmount (data);
+      // return NULL;
    }
 
  /* Got the MOS. Save it at the memory addr MOS. */


Could such a change be accepted? Removing the check entirely seems to carry no risk, since everything is read-only. As shown here, erroring out prematurely is wrong because grub could do all the reads it needed without running into any of the unimplemented features. If grub still fails later, other, more specific errors are raised anyway.

Trying to understand in more detail the ZFS features required for reading data, I looked at what features were *disabled* on a bpool created with recommended settings for grub:

 feature@multi_vdev_crash_dump
 feature@large_dnode
 feature@sha512
 feature@skein
 feature@edonr
 feature@encryption
 feature@device_removal
 feature@obsolete_counts
 feature@bookmark_v2
 feature@redaction_bookmarks
 feature@redacted_datasets
 feature@bookmark_written
 feature@log_spacemap
 feature@livelist
 feature@device_rebuild
 feature@zstd_compress
 feature@draid
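
A list like the one above can be reproduced from `zpool get` output, which reports each feature as `disabled`, `enabled`, or `active`. The following is a minimal sketch; the helper name `features_in_state` and the pool name `bpool` are assumptions for illustration.

```shell
# Print the feature@ properties that are in the given state.
# $1 is the state to match: active, enabled, or disabled.
# Expects tab-separated "property<TAB>value" lines on stdin, as produced by:
#   zpool get -H -o property,value all <pool>
# (helper name is hypothetical)
features_in_state() {
    awk -F '\t' -v state="$1" '$1 ~ /^feature@/ && $2 == state { print $1 }'
}

# Example invocations (pool name "bpool" is an assumption):
#   zpool get -H -o property,value all bpool | features_in_state disabled
#   zpool get -H -o property,value all bpool | features_in_state active
```

Filtering on `active` rather than `enabled` is the more relevant view for this discussion, since an enabled feature that has never been used has not changed the on-disk format.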

Narrowing these, I asked ChatGPT which of these are required for reading data:

Don't trust ChatGPT. At least zstd_compress is needed if enabled. Please do your own homework rather than trusting AI. 

- feature@large_dnode: Enables support for larger dnodes, which improves space efficiency and read performance. Required for reading data stored with large dnodes.
- feature@sha512: Provides support for SHA-512 hash algorithms, which are used for data integrity checks during reads and writes.
- feature@skein: Provides support for Skein hash algorithms, which are used for data integrity checks during reads and writes.
- feature@edonr: Provides support for Edon-R hash algorithms, which are used for data integrity checks during reads and writes.
- feature@encryption: Enables encryption features for data at rest. Required to read encrypted data.
- feature@log_spacemap: Optimizes the handling of log spacemaps, which can improve metadata read performance.

Out of these, the only read-related feature I think I'd ever run into is encryption, and as shown above we can just point grub to read from a non-encrypted dataset and it's happy.

Most likely encryption is actually supported, just not marked as such. See the zfskey command.


I was also curious about grub-core/fs/zfs/zfscrypt.c and the calls to it from various places in grub's zfs code, which seemingly make it possible to load a key and decrypt data. Its history goes back to 2011, but I couldn't figure out how it works or whether it's dead at this point. Hacking around a bit beyond ignoring check_mos_features, if I try to read from an encrypted dataset, the first error that comes up, I think, is "compression algorithm inherit not supported". I also tried using zfskey and insmod zfscrypt, but didn't get past "no key for txg " amongst other errors.

-andy