Re: tar is creating corrupt archives when soft links are present
From: Paul Eggert
Subject: Re: tar is creating corrupt archives when soft links are present
Date: Sat, 3 Dec 2022 11:33:53 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2

On 2022-12-01 15:53, Dominique Martinet wrote:
> The fstatat64 structs returned from bin/awk and bin/bash are truncated,
> could you provide the same strace with '-v' ?

Yes, I'd also like to see the output with strace -v. Assuming that looks
good, I'd then like to see what GDB says about tar, when tar calls fstatat.

> This depends on the libc but you need to build with large file support.
> -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 was it?

./configure should do that sort of thing automatically on a 32-bit build
(which this one apparently is). On GNU/Linux there's no need for
_LARGEFILE_SOURCE, but _FILE_OFFSET_BITS and _TIME_BITS should both be
64 in recent-enough 32-bit x86 GNU/Linux. On older versions of 32-bit
x86 GNU/Linux, _FILE_OFFSET_BITS will be 64 but _TIME_BITS will not be
defined (and GNU tar won't work on files with timestamps after the year
2038).
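
To see which of these settings a given build actually ended up with, something like the following sketch might help. The function names (off_t_bits, time_t_bits, time_t_is_y2038_safe, report_abi) are hypothetical and are not part of tar or gnulib; the code only reports what the compiler and headers provide for this one translation unit.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <sys/types.h>
#include <time.h>

/* Width in bits of off_t as compiled into this translation unit. */
int off_t_bits(void) { return (int) (sizeof(off_t) * CHAR_BIT); }

/* Width in bits of time_t as compiled into this translation unit. */
int time_t_bits(void) { return (int) (sizeof(time_t) * CHAR_BIT); }

/* Nonzero if time_t is wider than 32 bits, so it can hold
   timestamps after the year 2038. */
int time_t_is_y2038_safe(void) { return time_t_bits() > 32; }

/* Hypothetical helper: print the feature macros and resulting widths. */
void report_abi(FILE *out)
{
    assert(out != NULL);
#ifdef _FILE_OFFSET_BITS
    fprintf(out, "_FILE_OFFSET_BITS=%d\n", (int) _FILE_OFFSET_BITS);
#else
    fprintf(out, "_FILE_OFFSET_BITS is not defined\n");
#endif
#ifdef _TIME_BITS
    fprintf(out, "_TIME_BITS=%d\n", (int) _TIME_BITS);
#else
    fprintf(out, "_TIME_BITS is not defined\n");
#endif
    fprintf(out, "off_t: %d bits, time_t: %d bits\n",
            off_t_bits(), time_t_bits());
}
```

Calling report_abi(stdout) from a small main, compiled with the same CC and CPPFLAGS that ./configure chose for tar, should show whether the 32-bit build really got 64-bit off_t and time_t.
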

> (the kernel has fstatat64 so it should be recent enough, but libc might
> be too old, I didn't check since when these exist.)

One possibility is that tar was mis-built with its own _FILE_OFFSET_BITS
or _TIME_BITS value disagreeing with that of some library that it's
linked to. This would corrupt the data structure that 'tar' sees in its
call to fstatat. If so, even if 'strace -v' reports correct results from
fstatat64, 'tar' is seeing the wrong data.
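
One rough way to check whether a program sees sane data from fstatat, independent of what strace prints, is to compare the st_size it gets for a symlink with the length that readlinkat returns for the same link. The sketch below does that. The helper name symlink_stat_is_consistent is hypothetical, and the check assumes a filesystem where a symlink's st_size is the length of its target, which holds on the usual Linux filesystems but not on, e.g., /proc.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for fstatat and readlinkat */
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if fstatat reports PATH as a symlink
   whose st_size equals the target length that readlinkat returns,
   0 if the two disagree, and -1 if either call fails (for example
   because PATH does not exist or is not a symlink).
   Targets longer than the buffer are truncated and so report 0. */
int symlink_stat_is_consistent(const char *path)
{
    struct stat st;
    char target[4096];
    ssize_t len;

    assert(path != NULL);
    if (fstatat(AT_FDCWD, path, &st, AT_SYMLINK_NOFOLLOW) != 0)
        return -1;
    len = readlinkat(AT_FDCWD, path, target, sizeof target);
    if (len < 0)
        return -1;
    return S_ISLNK(st.st_mode) && st.st_size == (off_t) len;
}
```

This only helps if it is compiled and linked exactly the way the suspect tar was; a result of 0 on an ordinary symlink would suggest the struct stat layout the program expects differs from what the library fills in.
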
On an Ubuntu 22.10 platform when tar is built with "./configure CC='gcc
-m32'" so it's a 32-bit build, 'strace -v' works (and tar works), but
this tar is running a new-enough kernel and C library that the fstatat
in tar's source code turns into the new statx system call.
If I do the same build on a RHEL 7.9 system I see fstatat64 syscalls and
tar works fine.
This bug has the smell of perhaps running afoul of recent glibc changes
to support 64-bit file timestamps on 32-bit x86. See, for example, this
October 2020 thread:
https://sourceware.org/pipermail/libc-alpha/2020-October/118623.html
If the headers in /usr/include don't match what's actually in the C
library, or the C library was misconfigured when it was built (e.g.,
configured for a newer kernel than the actual one), then when tar calls
fstatat it will get garbage data and will make mistakes based on that
garbage. This seems the most likely hypothesis here.