Re: "wait" bug, now or ever?
From: Larry Clapp
Subject: Re: "wait" bug, now or ever?
Date: Mon, 13 Oct 2008 10:11:24 -0400
User-agent: Mutt/1.5.13 (2006-08-11)
On Sat, Oct 11, 2008 at 07:29:05PM -0600, Bob Proulx wrote:
> Larry Clapp wrote:
> > He asserts "The wait command waits until a program has *completely*
> > finished", or words to that effect.
>
> That is specious reasoning at best.
Agree.
> > Basically he says he's seen where a program does some redirection,
> > exits, and the file isn't done yet:
> >
> > some_program > some_file
> > # some_file isn't finished yet!
> > wait
> > # Now it's done!
>
> I rarely state things in absolutes and it makes me uncomfortable to
> do so but let me say that this is just plain wrong. That isn't what
> is happening. There is no basis for it. This might as well be fear
> of stepping on a sidewalk crack or skipping the number 13 or
> knocking on wood. This reads more of superstition and not of any
> actual causality.
Agree.
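For what it's worth, here is a minimal sketch of the difference between the foreground and background cases. It assumes a local filesystem (no NFS); the file names and the emit_lines helper are made up for illustration and are not from his scripts.

```shell
#!/bin/sh
# Hypothetical names throughout; a sketch, not anyone's real script.
tmpdir=$(mktemp -d)
fg_file="$tmpdir/foreground.out"
bg_file="$tmpdir/background.out"

# Stand-in for "some_program": prints 1000 lines.
emit_lines() {
    i=0
    while [ "$i" -lt 1000 ]; do
        echo "line $i"
        i=$((i + 1))
    done
}

# Foreground: the shell does not run the next command until this one has
# exited, so the redirected file is already complete. No "wait" is needed.
emit_lines > "$fg_file"
fg_lines=$(wc -l < "$fg_file" | tr -d ' ')

# Background: the shell moves on immediately, so the file can still be
# incomplete until "wait" returns. The sleep makes that window obvious.
: > "$bg_file"
{ sleep 1; emit_lines; } > "$bg_file" &
bg_lines_before=$(wc -l < "$bg_file" | tr -d ' ')
wait
bg_lines_after=$(wc -l < "$bg_file" | tr -d ' ')

echo "foreground, no wait:     $fg_lines lines"
echo "background, before wait: $bg_lines_before lines"
echo "background, after wait:  $bg_lines_after lines"
rm -rf "$tmpdir"
```

Only the background job gives "wait" anything to do; after a foreground command there are no children left for the shell to wait for.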
> > I have not cornered him on the exact circumstances in which he's
> > seen this behavior, but I certainly never have --
>
> I think I can shed some light on this behavior. I am sure that he
> is misinterpreting NFS filesystem buffering and its lack of cache
> coherency. This type of problem is common in NFS environments. But
> it has nothing to do with the above example. It occurs when
> accessing files from different hosts using different filesystem
> buffer caches. People accidentally trip into this problem very
> often when using ssh/rsh or job queue systems or anything that
> coordinates processes on different machines that access the same
> files. This is a very common problem. I feel confident that this
> is the root of the superstition and that workarounds for it are
> being applied inappropriately in other contexts.
>
> Search the web for nfs cache coherency and specifically
> close-to-open cache consistency and you should find much discussion
> of the problems. See specifically the Linux NFS FAQ.
I'll look into that, but ...
> > Here are my thoughts on this behavior, most likely first:
> >
> > - I tend to think he had some code at one point that he didn't
> > completely understand (or had forgotten some details of) that
> > was running stuff in background and he didn't realize it, and he
> > experienced this problem, and the "wait" fixed it, so he's
> > scattered "waits" around his scripts ever since.
>
> I am sure you are correct here. I am sure that this person used to
> work in an NFS environment across multiple machines and ran into NFS
> cache coherency issues. I am confident this guess is very likely.
... even assuming cache coherency problems, I don't see how the extra
"wait" would have fixed the problem.

On the other hand, as I said, I've not nailed down the exact
circumstances; I'd suspect he saw this problem at the command line,
not in a script. Just the extra time spent typing "wait" might have
been enough for the problem to clear up, leading to the misguided
belief that the "wait" itself was necessary.

In any case, the exact *cause* of the superstition doesn't matter
(though I definitely appreciate the pointer to NFS problems). I just
wanted to poll the bash community on whether anyone had seen a bug
like this anywhere.
> Good luck!
>
> Bob
Thanks, Bob!
-- Larry