[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: test-fdutimensat racy?
From: |
Pádraig Brady |
Subject: |
Re: test-fdutimensat racy? |
Date: |
Tue, 21 May 2013 15:02:59 +0100 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2 |
On 05/21/2013 01:52 PM, Bernhard Voelker wrote:
>> 05/03/2013 10:37 AM, Bernhard Voelker wrote:
>> I.e., the numbers did not went significantly down for that non-VM system. ;-/
>
> Playing with the code a bit and testing on various VMs (where the race
> showed up most probably), it turned out that the multiplier for the nap()
> delay is not sufficient. Changing it from 1.125 to 3 seemed to make the
> tests quite stable. However, avoiding the race completely would be much
> nicer.
>
> What do you think about the following approach?
>
> Have a nice day,
> Berny
>
>
>>From 39f5cb30bb70bbf43741bcad6d30890c34412d75 Mon Sep 17 00:00:00 2001
> From: Bernhard Voelker <address@hidden>
> Date: Tue, 21 May 2013 14:42:42 +0200
> Subject: [PATCH] tests/nap.h: avoid race
>
> The recent change in nap.h (5191133e) decreased the probability of lost
> races to about a third, however such problems could still be observed
> in virtual machines and openSUSE's OBS.
>
> Instead of calulating the nap() time once and using it (together with
> a small correction multiplier), avoid the race alltogether by verifying
> on a reference file whether a timestamp difference has happened.
>
> * tests/nap.h (nap_works): Change return value to bool. Change passing
> the old file's status by value instead of by reference as this function
> does no longer update that timestamp; rename the function argument from
> st to old_st. Remove the local variables cdiff and mdiff because that
> function now returns true/false instead of the precise delay.
> (guess_delay): Remove function.
> (nap): Instead of re-using the delay which has been calulated during the
> first call, avoid the race by actually verifying that a timestamp
> difference can be observed on the current file system. Use an adaptive
> approach for the delay to minimize execution time. Assert that the
> maximum delay is <= 2 seconds.
> ---
> tests/nap.h | 103
> +++++++++++++++++++++++-------------------------------------
> 1 file changed, 40 insertions(+), 63 deletions(-)
>
> diff --git a/tests/nap.h b/tests/nap.h
> index d8dbad2..09046a8 100644
> --- a/tests/nap.h
> +++ b/tests/nap.h
> @@ -20,6 +20,7 @@
> # define GLTEST_NAP_H
>
> # include <limits.h>
> +#include <stdbool.h>
>
> /* Return A - B, in ns.
> Return 0 if the true result would be negative.
> @@ -53,86 +54,62 @@ get_stat (int fd, struct stat *st, int do_write)
> }
>
> /* Given a file whose descriptor is FD, see whether delaying by DELAY
> - nanoseconds causes a change in a file's time stamp. *ST is the
> - file's status, recently gotten. Update *ST to reflect the latest
> - status gotten. If successful, return the needed delay, in
> - nanoseconds as determined by the observed time stamps; this may be
> - greater than DELAY if we crossed a quantization boundary. If
> - unsuccessful, return 0. */
> -static int
> -nap_works (int fd, int delay, struct stat *st)
> + nanoseconds causes a change in a file's time stamp. OLD_ST is the
> + file's status, recently gotten. */
> +static bool
> +nap_works (int fd, int delay, struct stat old_st)
> {
> - struct stat old_st = *st;
> + struct stat st;
> struct timespec delay_spec;
> - int cdiff, mdiff;
> delay_spec.tv_sec = delay / 1000000000;
> delay_spec.tv_nsec = delay % 1000000000;
> ASSERT (nanosleep (&delay_spec, 0) == 0);
> - get_stat (fd, st, 1);
> + get_stat (fd, &st, 1);
>
> /* Return the greater of the ctime and the mtime differences, or
> zero if it cannot be determined, or INT_MAX if either overflows. */
> - cdiff = diff_timespec (get_stat_ctime (st), get_stat_ctime (&old_st));
> - if (cdiff != 0)
> - {
> - mdiff = diff_timespec (get_stat_mtime (st), get_stat_mtime (&old_st));
> - if (mdiff != 0)
> - return cdiff < mdiff ? mdiff : cdiff;
> - }
> - return 0;
> -}
> + if ( diff_timespec (get_stat_ctime (&st), get_stat_ctime (&old_st))
> + && diff_timespec (get_stat_mtime (&st), get_stat_mtime (&old_st)))
> + return true;
>
> -static int
> -guess_delay (void)
> -{
> - /* Try a 1-ns sleep first, for speed. If that doesn't work, try 100
> - ns, 1 microsecond, 1 ms, etc. xfs has a quantization of about 10
> - milliseconds, even though it has a granularity of 1 nanosecond,
> - and NTFS has a default quantization of 15.25 milliseconds, even
> - though it has a granularity of 100 nanoseconds, so 15.25 ms is a
> - good quantization to try. If that doesn't work, try 1 second.
> - The worst case is 2 seconds, needed for FAT. */
> - static int const delaytab[] = {1, 1000, 1000000, 15250000, 1000000000 };
> - int fd = creat (BASE "tmp", 0600);
> - int i;
> - int delay = 2000000000;
> - struct stat st;
> - ASSERT (0 <= fd);
> - get_stat (fd, &st, 0);
> - for (i = 0; i < sizeof delaytab / sizeof delaytab[0]; i++)
> - {
> - int d = nap_works (fd, delaytab[i], &st);
> - if (d != 0)
> - {
> - delay = d;
> - break;
> - }
> - }
> - ASSERT (close (fd) == 0);
> - ASSERT (unlink (BASE "tmp") == 0);
> - return delay;
> + return false;
> }
>
> /* Sleep long enough to notice a timestamp difference on the file
> - system in the current directory. Assumes that BASE is defined,
> - and requires that the test module depends on nanosleep. */
> + system in the current directory. Use an adaptive approach, trying
> + to find the smallest delay which works on the current file system
> + to make the timestamp difference appear. Assert a maximum delay of
> + 2 seconds. Assumes that BASE is defined, and requires that the test
> + module depends on nanosleep. */
> static void
> nap (void)
> {
> - static struct timespec delay;
> - if (!delay.tv_sec && !delay.tv_nsec)
> + static int fd = -1;
> + struct stat old_st;
> + static int delay = 1;
> +
> + if (-1 == fd)
> + {
> + ASSERT ((fd = creat (BASE "naptmp", 0600)) != -1); /* Never closed. */
> + ASSERT (unlink (BASE "naptmp") == 0);
> + get_stat (fd, &old_st, 0);
> + }
> + else
> {
> - int d = guess_delay ();
> -
> - /* Multiply by 1.125 (rounding up), to avoid problems if the
> - file system's clock is a bit slower than nanosleep's.
> - Ceiling it at INT_MAX, though. */
> - int delta = (d >> 3) + ((d & 7) != 0);
> - d = delta < INT_MAX - d ? d + delta : INT_MAX;
> - delay.tv_sec = d / 1000000000;
> - delay.tv_nsec = d % 1000000000;
> + ASSERT (0 <= fd);
> + get_stat (fd, &old_st, 1);
> }
> - ASSERT (nanosleep (&delay, 0) == 0);
> +
> + if (1 < delay)
> + delay = delay / 2; /* Try half of the previous delay. */
> + ASSERT (0 < delay);
> +
> + for ( ; delay <= 2000000000; delay = delay * 2)
> + if (nap_works (fd, delay, old_st))
> + return;
Excellent. This is the truncated exponential backoff
method I was referring to previously (albeit a little less general).
It's probably worth a comment that this may introduce
a total delay of up to: sum(2^n) from 0 to 30 = 2^31 - 1 = 2.1s
> +
> + /* Bummer: even the highest nap delay didn't work. */
> + ASSERT (0);
> }
>
> #endif /* GLTEST_NAP_H */
>
thanks!
Pádraig.
- Re: test-fdutimensat racy?, Paul Eggert, 2013/05/01
- Re: test-fdutimensat racy?, Bernhard Voelker, 2013/05/02
- Re: test-fdutimensat racy?, Paul Eggert, 2013/05/02
- Re: test-fdutimensat racy?, Bernhard Voelker, 2013/05/03
- Re: test-fdutimensat racy?, Bernhard Voelker, 2013/05/21
- Re: test-fdutimensat racy?, Eric Blake, 2013/05/21
- Re: test-fdutimensat racy?, Bernhard Voelker, 2013/05/21
- Re: test-fdutimensat racy?, Eric Blake, 2013/05/21
- Re: test-fdutimensat racy?, Bernhard Voelker, 2013/05/21
- Re: test-fdutimensat racy?,
Pádraig Brady <=