bug-xorriso

mtree(8) specification files


From: Ivan Shmakov
Subject: mtree(8) specification files
Date: Sat, 24 Sep 2022 12:39:32 +0000

        Speaking of recording filesystem metadata in text files.  For
        well over a decade, I’ve stored sha256sum(1) output alongside
        the payload files on my ECMA 119 filesystems, so that I could
        conveniently verify the integrity of the data after writing,
        as well as whenever else I might need it; like:

$ sh -Ceuc 'for f ; do
                (cd "${f%/.sha256/*}" && sha256sum -c) < "$f" \
                    || printf %s\\n "${f}: FAIL" >&2
            done ; ' dummy.sh  \
      /media/cdrom/public/download/sites/.sha256/2022-06-05 

        More recently, however, I’ve started using mtree(8)
        specifications for this purpose; e. g.:

$ sh -Ceuc 'for f ; do
                mtree -xp "${f%/.mtree/*}" < "$f" \
                    || printf %s\\n "${f}: FAIL" >&2
            done ; ' dummy.sh  \
      /media/cdrom/public/download/sites/.mtree/2022-06-05 

        The mtree(8) tool originates in BSD, but is also currently
        available in Debian (and, presumably, derivatives) [1, 2].
        (Note, however, that on BSD it’s installed under /usr/sbin,
        and thus outside of the default PATH for non-root users.)

[1] http://manpages.debian.org/sid/mtree.8
[2] http://man.netbsd.org/mtree.8

        The file is prepared with a command line similar to the
        following, subject to some manual editing afterwards:

$ mtree -R flags -K sha256 -cxp DIRECTORY | mtree -C -K sha256 > SPECFILE 

        As actually written, its contents are like:

/set type=file uid=0 gid=0 mode=0644 nlink=1
. type=dir mode=0755 nlink=5
./.mtree type=dir mode=0755 nlink=2
./.mtree/2022-05-07 size=10375 time=1651916124.457132333 sha256=e862a40225f34502ffe91d02089be912ac1dd1468387185b48b2846d913e0534
./.mtree/2022-06-05 size=20272
./.sha256 type=dir mode=0755 nlink=2
./.sha256/2022-05-07 size=6147 time=1651916002.849460427 sha256=39d7ffbbb71bd404c980af67bc84b3364026b2fbe092ec92cda705c13dabdc05
./.sha256/2022-06-05 size=11966 time=1654459249.595422865 sha256=8bf8e0b5a615bd1f1b89b15ac6c18f79a67db72d54c1e3aa01cdce0c258ef4a0
./nycdn.netbsd.org type=dir mode=0755 nlink=3 time=1653849030.983417693
./nycdn.netbsd.org/pub type=dir mode=0755 nlink=3 time=1653849030.983417693
…
./nycdn.netbsd.org/pub/NetBSD-daily/netbsd-9/202205280320Z/source/sets/syssrc.tgz type=file mode=0644 nlink=1 size=59918577 time=1653710376.0 sha256=00a6070c83b98bc3c5306f25ce5be30a6cd6dce66e7cdebfc8d5472044dcf8a7

        Some things to note:

         1. The /set line sets the defaults for the subsequent entries
            and is part of the ‘greater’ mtree(8) format.  It’s removed
            by the mtree -C step above and re-added manually.

         2. The specification file cannot feasibly contain its own
            SHA256, so the ./.mtree/2022-06-05 entry omits sha256=
            (as well as time=, for similar reasons).

         3. The number of hard links is recorded (nlink=) in the
            specification, but for non-directories it will /not/ be
            preserved on the resulting ECMA 119 filesystem unless
            ‘-hardlinks on’ is given in the xorriso(1) invocation.
            Without it, verification may fail spuriously.
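
        One way to sidestep the nlink= mismatch from note 3 when
        verifying is to filter the keyword out of the specification
        first.  The following is only a sketch: the function name
        strip_file_nlink is made up here, and it assumes that each
        entry occupies a single line (as mtree -C writes them) and
        that any /set line declares type=file, as in the sample.

```shell
# Sketch: drop nlink= from every entry that is not explicitly a
# directory.  Assumptions: one entry per line, and /set declares
# type=file, so entries without type=dir are non-directories.
strip_file_nlink () {
    awk '
    {
        is_dir = 0
        for (i = 2; i <= NF; i++)
            if ($i == "type=dir") is_dir = 1
        # Directories, blank lines and comments pass through unchanged.
        if (is_dir || NF == 0 || $1 ~ /^#/) { print; next }
        out = $1
        for (i = 2; i <= NF; i++)
            if ($i !~ /^nlink=/) out = out " " $i
        print out
    }'
}
```

        It could then be placed in front of the check, along the
        lines of: $ strip_file_nlink < SPECFILE | mtree -xp DIRECTORY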

        What might be of particular value is that the specification
        records timestamps at nanosecond resolution (although using
        an unconventional SECONDS.NS format that lacks zero-padding
        in the NS field, so that, e. g., 1664007604.1 actually means
        1664007604.000000001).  If the archive later gets extracted
        to a filesystem supporting sub-second resolution timestamps,
        those can be restored from this file.
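
        To illustrate the SECONDS.NS convention described above, a
        small shell function can rewrite such a value as an ordinary
        decimal fraction.  This is a sketch only; the function name
        mtree_time_to_decimal is an invention for this example.

```shell
# Sketch: convert an mtree SECONDS.NS timestamp, whose NS field is a
# plain integer count of nanoseconds, into SECONDS.FRACTION with the
# fraction zero-padded to nine digits.
mtree_time_to_decimal () {
    sec=${1%%.*}
    case $1 in
        *.*) ns=${1#*.} ;;
        *)   ns=0 ;;
    esac
    [ -n "$ns" ] || ns=0
    # Strip leading zeros so that printf does not read the value as octal.
    while [ "${ns#0}" != "$ns" ] && [ -n "${ns#0}" ]; do
        ns=${ns#0}
    done
    printf '%s.%09d\n' "$sec" "$ns"
}
```

        For instance, mtree_time_to_decimal 1664007604.1 prints
        1664007604.000000001.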

        Personally, I find such a specification to be a more ‘visible,’
        as well as versatile, way to store metadata (whether in place
        of, or in addition to, what is recorded on the filesystem
        proper) than, say, on-filesystem extended attributes.  For
        instance, such a file can readily be accessed via just about
        any file access protocol: Rsync, NFS, scp(1), etc.

        Nevertheless, the mtree(8) format has its shortcomings.
        In particular, it doesn’t record extended attributes, ACLs,
        or Ext2+ attributes, though it records BSD FS (UFS / FFS)
        flags.  Hence my interest in archiving filesystems at block
        level, as that’s about the only way to preserve the entirety
        of the metadata.

        With regard to xorriso(1), a feature to consider would be
        to read such a specification and apply it to the loaded
        filesystem, which would hopefully be more convenient than
        using -chown, -chgrp, -chmod, -alter_date, etc., potentially
        for every single file on the filesystem.  (Although it’s
        certainly possible to process an input mtree(8) specification
        into a sequence of such commands with a helper script; e. g.,
        using Bash process substitution: $ xorriso … -options-from-file
        <(mtree-to-xorriso < specfile.mtree) …)
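
        A minimal sketch of what the mtree-to-xorriso helper
        mentioned above might look like follows.  It is hypothetical
        and untested against a real image.  It assumes one entry per
        line with full ./ paths (as mtree -C writes them), handles
        only mode=, uid=, gid= and time=, does not decode escaped
        characters in file names, does not cope with single quotes
        in paths, and truncates timestamps to whole seconds.  The
        ‘=SECONDS’ time string form and the ‘--’ list terminator
        should be checked against xorriso(1) before relying on them.

```shell
# Hypothetical helper: turn an mtree(8) specification on stdin into
# xorriso commands, one per line, on stdout.
mtree_to_xorriso () {
    awk '
    # Split a KEY=VALUE word into the globals key and val.
    function kv(word,    p) {
        p = index(word, "=")
        if (p == 0) { key = word; val = ""; return }
        key = substr(word, 1, p - 1)
        val = substr(word, p + 1)
    }
    BEGIN { q = "\047" }                # a single quote character
    NF == 0 || $1 ~ /^#/ { next }       # skip blank and comment lines
    $1 == "/set" {                      # remember defaults
        for (i = 2; i <= NF; i++) { kv($i); def[key] = val }
        next
    }
    $1 == "/unset" {                    # forget defaults
        for (i = 2; i <= NF; i++) delete def[$i]
        next
    }
    {
        split("", cur)                  # clear, then apply /set defaults
        for (k in def) cur[k] = def[k]
        for (i = 2; i <= NF; i++) { kv($i); cur[key] = val }
        path = $1
        sub(/^\./, "", path)            # "./a/b" becomes "/a/b"
        if (path == "") path = "/"
        path = q path q
        if ("mode" in cur) print "-chmod", cur["mode"], path, "--"
        if ("uid" in cur)  print "-chown", cur["uid"], path, "--"
        if ("gid" in cur)  print "-chgrp", cur["gid"], path, "--"
        if ("time" in cur) {
            t = cur["time"]
            sub(/\..*$/, "", t)         # keep whole seconds only
            print "-alter_date", "m", "=" t, path, "--"
        }
    }'
}
```

        For the root entry of the sample specification, this would
        emit -chmod 0755, -chown 0 and -chgrp 0 commands for ‘/’.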

        Note that a similar feature is supported by the makefs(8)
        command, which originates in NetBSD and is likewise currently
        available in Debian [3, 4]:

  -F mtree-specfile

    Use mtree-specfile as an mtree(8) ‘specfile’ specification.

    If a specfile entry exists in the underlying file system, its
    permissions and modification time will be used unless specifically
    overridden by the specfile.  An error will be raised if the type
    of entry in the specfile conflicts with that of an existing entry.

    In the opposite case (where a specfile entry does not have an
    entry in the underlying file system) the following occurs: if
    the specfile entry is marked optional, the specfile entry is
    ignored.  Otherwise, the entry will be created in the image,
    and it is necessary to specify at least the following parameters
    in the specfile: type, mode, gname, or gid, and uname or uid,
    device (in the case of block or character devices), and link
    (in the case of symbolic links).

        (Though I cannot at present vouch for the correctness of the
        makefs(8) ‘cd9660’ filesystem support.)

[3] http://manpages.debian.org/sid/makefs.8
[4] http://man.netbsd.org/makefs.8

        Another option would be to /check/ the current (in-memory)
        state of the filesystem against the specification, reporting
        mismatches, and perhaps exiting with an error.

        Yet another is to have an option to dump a portion of the
        filesystem state (i. e., not unlike -find DIR -exec lsdl)
        in mtree(8) format.

        Thoughts?

-- 
FSF associate member #7257  http://am-1.org/~ivan/


