bug-gnulib

Re: coreutils-8.2 misc/ls-time test failure


From: Eric Blake
Subject: Re: coreutils-8.2 misc/ls-time test failure
Date: Tue, 15 Dec 2009 21:08:13 -0700
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.23) Gecko/20090812 Thunderbird/2.0.0.23 Mnenhy/0.7.6.666

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

[adding bug-gnulib]

According to Eric Blake on 12/15/2009 7:48 PM:
> According to John Stanley on 12/15/2009 4:42 PM:
>> Basically, what's happening is that 'touch -a ..' updated ctime in
>> coreutils-7.6,
>> but does not update ctime in coreutils-8.2 (hence misc/ls-time fails).
> 
> Ouch.  That's a bug in the kernel; I can reproduce it:
> 
> $ uname -a
> Linux fencepost 2.6.26-2-xen-amd64 #1 SMP Thu Nov 5 04:27:12 UTC 2009
> x86_64 GNU/Linux
> $ touch q
> $ stat -c '%x %z' q
> 2009-12-15 21:46:33.186677568 -0500 2009-12-15 21:46:33.186677568 -0500
> $ touch -a q
> $ stat -c '%x %z' q
> 2009-12-15 21:47:15.157175384 -0500 2009-12-15 21:46:33.186677568 -0500
> $

According to strace, coreutils 6.10 used syscall_280 (which I'm assuming
is utimensat, and that strace is just behind the times compared to the
kernel); ltrace says it was via:
futimesat(0, 0, 0x7fff0568c900, 0, 3)            = 0

The newer coreutils likewise uses syscall_280, but via:

futimens(0, 0x7fff5b31a450, 0x60ebd0, 0x7fff5b31a450, 3) = 0

By comparing the results of 'touch f' and 'touch -a f', it appears that
the kernel ctime bug is only triggered when UTIME_OMIT is passed as one of
the two timestamps (which is only possible via futimens/utimensat, not
futimesat).  And that is consistent with the fact that coreutils didn't
use UTIME_OMIT until coreutils 8.1.
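
For reference, here is a minimal sketch of the kind of call that 'touch -a'
boils down to when UTIME_OMIT is in use.  The helper names
touch_atime_only() and report_ctime_change() are made up for illustration
and are not taken from coreutils or gnulib; the sketch assumes a
Linux/glibc system that exposes utimensat() and st_ctim.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>      /* AT_FDCWD */
#include <stdio.h>
#include <sys/stat.h>   /* utimensat, UTIME_NOW, UTIME_OMIT, struct stat */
#include <time.h>

/* Hypothetical helper: set atime of PATH to "now" and ask the kernel to
   leave mtime alone.  Returns 0 on success, -1 with errno set on failure.  */
int touch_atime_only(const char *path)
{
    struct timespec ts[2];

    assert(path != NULL);
    ts[0].tv_sec = 0;
    ts[0].tv_nsec = UTIME_NOW;    /* atime: current time */
    ts[1].tv_sec = 0;
    ts[1].tv_nsec = UTIME_OMIT;   /* mtime: leave unchanged */
    return utimensat(AT_FDCWD, path, ts, 0);
}

/* Hypothetical helper: report whether ctime moved across the call above.
   Returns 1 if ctime changed, 0 if it did not, -1 on error.  An unchanged
   ctime may also just reflect coarse timestamp granularity, so a single
   0 result is not proof of the bug.  */
int report_ctime_change(const char *path)
{
    struct stat before, after;
    int changed;

    if (stat(path, &before) != 0)
        return -1;
    if (touch_atime_only(path) != 0)
        return -1;
    if (stat(path, &after) != 0)
        return -1;
    changed = before.st_ctim.tv_sec != after.st_ctim.tv_sec
              || before.st_ctim.tv_nsec != after.st_ctim.tv_nsec;
    printf("ctime %s\n", changed ? "changed" : "unchanged");
    return changed;
}
```

Running report_ctime_change() on a file that was created a few seconds
earlier should be roughly equivalent to the 'touch -a q' / stat session
quoted above.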

Also, it means that I can probably devise a way to work around the bug in
gnulib while we wait for the kernel folks to fix their bug.  However,
there's a question of the minimal number of syscalls needed to fix the
problem.  It may be that UTIME_NOW also has an impact.  My current idea:

Keep a cache variable that shows whether UTIME_OMIT works (0=unknown,
1=yes, -1=no).  If the variable is -1, then treat UTIME_OMIT the same way
as we do for futimesat (that is, call stat()/gettime() to populate the
struct timespec prior to making the syscall).  If the variable is 1, then
the kernel has been fixed.
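
A rough sketch of the -1 case, for the 'touch -a' scenario only: instead of
passing UTIME_OMIT, read the current mtime with stat() and hand it back
explicitly.  The function name touch_atime_explicit() is hypothetical, and
clock_gettime(CLOCK_REALTIME) stands in here for gnulib's gettime(); this
is an illustration, not the actual gnulib code.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>      /* AT_FDCWD */
#include <stdio.h>      /* perror */
#include <sys/stat.h>   /* stat, utimensat */
#include <time.h>       /* clock_gettime */

/* Hypothetical fallback: update atime to the current time without using
   UTIME_OMIT or UTIME_NOW.  The existing mtime is read back with stat()
   and passed explicitly, so the kernel sees two ordinary timestamps.
   Returns 0 on success, -1 on failure.  */
int touch_atime_explicit(const char *path)
{
    struct stat st;
    struct timespec ts[2];

    assert(path != NULL);
    if (stat(path, &st) != 0) {
        perror("stat");
        return -1;
    }
    /* Assumption: clock_gettime() plays the role of gnulib's gettime().  */
    if (clock_gettime(CLOCK_REALTIME, &ts[0]) != 0) {
        perror("clock_gettime");
        return -1;
    }
    ts[1] = st.st_mtim;           /* mtime: write back the old value */
    return utimensat(AT_FDCWD, path, ts, 0);
}
```

The cost is one extra stat() plus one clock read per call, and there is a
small window in which another process could change mtime between the
stat() and the utimensat().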

If the variable is 0, then perform [f]stat both before and after the
utimensat call; if the two ctimes differ, set the cache variable to 1 and
we're done.  Otherwise, ctime didn't change, so also call gettime().  If gettime
is within 10 ms of the ctime from the second stat, the results are
inconclusive (given that we have proven that some filesystems have a
quantization boundary of 10 ms where multiple actions within that window
all end up with the same timestamp), so leave the cache at 0, but re-call
utimensat anyway with the times learned by stat/gettime().  Otherwise, the
current time and the
second ctime differ by more than 10 ms, so utimensat UTIME_OMIT is broken;
set cache to -1, and fix the problem by re-calling utimensat with the
times learned by stat/gettime().
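
Putting the three cache states together, the idea could look something like
the sketch below, again restricted to the 'touch -a' case.  All names
(utime_omit_cache, classify_omit, touch_atime_probed, QUANTUM_NS) are
invented for this illustration, clock_gettime() stands in for gettime(),
and the static cache is not thread-safe.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>      /* AT_FDCWD */
#include <stdio.h>      /* perror */
#include <sys/stat.h>   /* stat, utimensat, UTIME_NOW, UTIME_OMIT */
#include <time.h>       /* clock_gettime */

enum { OMIT_BROKEN = -1, OMIT_UNKNOWN = 0, OMIT_WORKS = 1 };
static int utime_omit_cache = OMIT_UNKNOWN;

#define QUANTUM_NS 10000000LL   /* 10 ms quantization window */

static long long ts_diff_ns(struct timespec a, struct timespec b)
{
    return (long long) (a.tv_sec - b.tv_sec) * 1000000000LL
           + (a.tv_nsec - b.tv_nsec);
}

/* Decide the new cache value from the ctime seen before and after the
   UTIME_OMIT call, and the clock reading taken after the second stat.  */
int classify_omit(struct timespec before, struct timespec after,
                  struct timespec now)
{
    long long gap;

    if (ts_diff_ns(after, before) != 0)
        return OMIT_WORKS;              /* ctime moved: kernel is fine */
    gap = ts_diff_ns(now, after);
    if (gap < 0)
        gap = -gap;
    /* ctime did not move; within the window we cannot tell why.  */
    return gap <= QUANTUM_NS ? OMIT_UNKNOWN : OMIT_BROKEN;
}

/* Set atime of PATH to now, leaving mtime alone, probing as needed.  */
int touch_atime_probed(const char *path)
{
    struct stat before, after;
    struct timespec now, ts[2];

    assert(path != NULL);
    if (utime_omit_cache != OMIT_BROKEN) {
        if (utime_omit_cache == OMIT_UNKNOWN && stat(path, &before) != 0)
            return -1;
        ts[0].tv_sec = 0; ts[0].tv_nsec = UTIME_NOW;
        ts[1].tv_sec = 0; ts[1].tv_nsec = UTIME_OMIT;
        if (utimensat(AT_FDCWD, path, ts, 0) != 0) {
            perror("utimensat");
            return -1;
        }
        if (utime_omit_cache == OMIT_WORKS)
            return 0;                   /* one syscall, nothing to check */
    }
    /* Cache is 0 (probe the result) or -1 (go straight to the fallback).  */
    if (stat(path, &after) != 0
        || clock_gettime(CLOCK_REALTIME, &now) != 0)
        return -1;
    if (utime_omit_cache == OMIT_UNKNOWN) {
        utime_omit_cache = classify_omit(before.st_ctim, after.st_ctim, now);
        if (utime_omit_cache == OMIT_WORKS)
            return 0;
    }
    /* Inconclusive or broken: redo the update with explicit times.  */
    ts[0] = now;
    ts[1] = after.st_mtim;
    return utimensat(AT_FDCWD, path, ts, 0);
}
```

In this sketch the steady-state cost is one syscall when the cache is 1,
three (stat, clock read, utimensat) when it is -1, and up to five while it
is still 0.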

Sounds quite hairy.  Any ideas for improvements?  And how best to report
this bug to the kernel folks?

- --
Don't work too hard, make some time for fun as well!

Eric Blake             address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAksoXS0ACgkQ84KuGfSFAYAQzACdGVTRw4Pt/CspbvpJkGUd2Fq1
vxEAnjUrLX3d2UkCi8q1Okq3H/gvGXml
=mmqQ
-----END PGP SIGNATURE-----



