help-bash
Re: [Help-bash] Creating an anonymous pipe for later use


From: Bob Proulx
Subject: Re: [Help-bash] Creating an anonymous pipe for later use
Date: Fri, 10 Oct 2014 12:41:58 -0600
User-agent: Mutt/1.5.23 (2014-03-12)

Russell Lewis wrote:
> They work according to spec - I'm not claiming that they are buggy.  But
> they have more pitfalls than anonymous pipes.

Well...  You used the word "robust".  That is a claim!  :-)

> Key ones I stumbled on:
> 
> 1) Need to clean up filesystem artifacts; hard to do
>    this promptly without races (removing the file
>    before it is used)

Any script that uses temporary files has the same issue.  As soon as a
program uses temporary files then it needs to set up trap handling.
As soon as a program needs trap handling then it becomes much more
complicated and is often done incorrectly. 

> 2) Can't open just one side of a pipe - open() will
>    block.  So kicking off two commands from the same
>    script, and connecting them with a named pipe, is
>    possible (but very hard to do without deadlock)

I wouldn't call it very hard to do.  I agree that it does require
careful handling.  Someone probably isn't going to stumble into the
right answer.  They will only get to it if they study it with some
care.  And even then it can be tricky.

> 3) A child process cannot open /dev/fd/0 (to get a dup
>    of the parent script's stdin) if stdin is a named
>    pipe; open() will block (on Linux).  On other *NIXes,
>    I have read that the open() will just dup() the file
>    descriptor (which is what I wanted) - Linux works
>    differently.

I haven't tried this particular thing on either Linux or a legacy Unix
kernel and so do not know.  However I have also never needed to use
either /dev/stdin or /dev/fd/0.  They were never available on the
legacy Unix systems and were therefore unavailable in the environments
I was concerned with running in.  And so I always used other methods.
I will table this part and try it later.

> For the record, the way to kick off two commands from the same parent
> script, connected through a named pipe, without either races or deadlock, is
> as follows:
>     tmp=$(mktemp -u)
>     mkfifo $tmp
>     cmd1 >$tmp &        # opens the pipe in the child process.
>     { cmd2 & } <$tmp    # opens the pipe in the parent process.
>                         # open() blocks until both sides have
>                         # started it.
>     rm $tmp
> 
> Ick.

Yes, ick!  Because that is NOT the way to do it.  The above is unsafe.
Anyone reading the archive please do not use the above method.  For
starters the mktemp -u is literally unsafe.  It doesn't create
anything but simply prints out a randomized name.  Another problem is
that it removes the pipe at the end and doesn't use any signal
handling.  If the sequence is interrupted then it leaves lint behind.
Which isn't a problem as a single case but is a problem in a
systematic case.  I once saw this consume all inodes in a file system.
And additionally nothing is waiting for the children to finish to
handle possible errors from the children.
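
To see the mktemp -u problem concretely, here is a minimal sketch.  The
variable names are only for illustration.  It assumes a mktemp that
supports -u and -d (GNU coreutils and the BSDs do).  The point is that
-u hands back a name with nothing behind it, so anything can claim
that name before the script gets around to using it, while -d creates
the directory before returning.

```sh
#!/bin/sh
# mktemp -u only prints a candidate name; nothing is created on disk.
name=$(mktemp -u) || exit 1
if [ -e "$name" ]; then name_exists=yes; else name_exists=no; fi

# mktemp -d creates the directory itself before it returns.
dir=$(mktemp -d) || exit 1
if [ -d "$dir" ]; then dir_exists=yes; else dir_exists=no; fi

echo "mktemp -u name exists: $name_exists"
echo "mktemp -d dir exists:  $dir_exists"

# Tidy up the one thing that was actually created.
rm -rf "$dir"
```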

The best way to handle things like this is to have mktemp create a
directory using the -d option.  The directory will be unique and private
to the creating process.  Then create any files in that directory.

Here is an example that I am going to type in off the top of my head
without testing.  Untested code always has bugs and I am sure the
below will too.  Test carefully.

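  #!/bin/bash
  unset tmpdir
  # Remove the private directory, but only if it was actually created.
  cleanup() {
    test -n "$tmpdir" && rm -rf "$tmpdir"
  }
  # The EXIT trap is the only place where cleanup happens.
  trap "cleanup" EXIT
  tmpdir=$(mktemp -d) || exit 1
  mkfifo "$tmpdir/pipe"
  # Each open() happens in a background child, so neither blocks the script.
  cmd1 > "$tmpdir/pipe" &
  cmd2 < "$tmpdir/pipe" &
  # Wait for both children before exiting and cleaning up.
  wait
  exit 0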

This is help-bash and the above is nice.  But the above relies on bash
running the EXIT trap even when the script is killed by a signal, which
is not something every shell does.  To work with dash:

  #!/bin/sh
  unset tmpdir
  cleanup() {
    test -n "$tmpdir" && rm -rf "$tmpdir"
  }
  trap "cleanup" EXIT

  # Begin dash specific trap handling.
  trap "cleanup; trap - HUP; kill -HUP $$" HUP
  trap "cleanup; trap - INT; kill -INT $$" INT
  trap "cleanup; trap - QUIT; kill -QUIT $$" QUIT
  trap "cleanup; trap - TERM; kill -TERM $$" TERM
  # End dash specific trap handling.

  tmpdir=$(mktemp -d) || exit 1
  mkfifo "$tmpdir/pipe"
  cmd1 > "$tmpdir/pipe" &
  cmd2 < "$tmpdir/pipe" &
  wait
  exit 0

A few important points.  First I didn't do any error handling in the
above.  I would actually do something more like this.  But I would
really want to test it before I suggested it. :-)

  cmd1 > "$tmpdir/pipe" &
  cmd1pid=$!
  if ! cmd2 < "$tmpdir/pipe"; then
    echo "Error: cmd2: failed" 1>&2
    kill $cmd1pid 2>/dev/null
    exit 1
  fi
  if ! wait $cmd1pid; then
    echo "Error: cmd1: failed" 1>&2
    exit 1
  fi
  exit 0

This is getting complicated just to type in and expect it to work.  I
expect the above to have a bug or two.  Test carefully.

$TMPDIR is set outside this process and may have been configured to
contain whitespace, in which case the path that mktemp returns will
too.  All uses of $tmpdir need to be quoted.
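
A small sketch of why the quoting matters.  The directory name "dir
with spaces" and the variable names are made up for the demonstration,
and it assumes mktemp honors $TMPDIR when no template is given (GNU and
BSD mktemp do).

```sh
#!/bin/sh
# Build a scratch area whose name contains whitespace and point
# TMPDIR at it for a single mktemp call.
base=$(mktemp -d) || exit 1
mkdir "$base/dir with spaces"
tmpdir=$(TMPDIR="$base/dir with spaces" mktemp -d) || exit 1

# Quoted, the path is one word and mkfifo creates the pipe as intended.
mkfifo "$tmpdir/pipe"
if [ -p "$tmpdir/pipe" ]; then fifo_ok=yes; else fifo_ok=no; fi

# Unquoted, the same value is split into several words.
set -- $tmpdir
unquoted_words=$#

echo "fifo created: $fifo_ok"
echo "words when unquoted: $unquoted_words"

rm -rf "$base"
```

An unquoted rm -rf $tmpdir in the cleanup function would be handed
those separate words as separate arguments, which is exactly the kind
of accident the quoting prevents.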

The cleanup happens in only one place.  It happens in the signal
handler code and not at the bottom of the script.  This code path is
always exercised and is the normal cleanup.  (Code that has a normal
path and a not-normal path almost always has bugs of being out of
sync.  Better to keep it DRY and have only one code path.  Have the
exception handler always handle the cleanup.)

The signal handling exits correctly by having the process propagate
the signal such that the exit code is correct.  Parents that check
will see the correct result.  You can read more about this here:

  http://www.cons.org/cracauer/sigint.html
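
As an illustration of what "the exit code is correct" means, here is a
minimal sketch with two throwaway child shells.  The variable names
are only for the demonstration.  One child re-raises TERM after
resetting the trap, the way the dash handlers above do.  The other
simply exits from its handler.  A parent that checks the status can
tell the difference.

```sh
#!/bin/sh
# Child A: the handler resets the trap and re-sends the signal to itself,
# so the child really does die from TERM.
sh -c '
  trap "trap - TERM; kill -TERM $$" TERM
  kill -TERM $$
  exit 0
'
reraise_status=$?

# Child B: the handler just exits, so the parent only sees an ordinary
# exit status and cannot tell that a signal was involved.
sh -c '
  trap "exit 1" TERM
  kill -TERM $$
  exit 0
'
plain_status=$?

# A shell reports death by signal as 128 plus the signal number;
# TERM is 15, so child A shows up as 143.
echo "re-raised: $reraise_status"
echo "plain exit: $plain_status"
```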

You can read more about temporary file techniques here:

  http://mywiki.wooledge.org/BashFAQ/062

Bob

P.S. If you want the children explicitly to start asynchronously then
you must create an additional reader or writer on the pipe so that
the children's opens won't block.  Just make sure that the additional
reader/writer never actually reads/writes anything and it will just be
a passive bystander ensuring that the reference count is non-zero and
allowing open()s to succeed without delay.
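
A minimal sketch of the bystander idea, with loud assumptions: it
relies on Linux, where opening a FIFO read-write succeeds without
blocking (POSIX leaves that case undefined), and the use of fd 3 and
the printf/read pair are just stand-ins for real commands.

```sh
#!/bin/sh
tmpdir=$(mktemp -d) || exit 1
trap 'rm -rf "$tmpdir"' EXIT
mkfifo "$tmpdir/pipe"

# Bystander: hold the FIFO open read-write on fd 3 and never use it.
# Assumption: Linux, where this open returns at once.
exec 3<>"$tmpdir/pipe"

# The writer's open() now succeeds immediately, with no real reader yet.
printf 'hello\n' > "$tmpdir/pipe" &
wait    # the writer has already finished at this point

# The reader's open() also succeeds immediately.  The data is still
# buffered in the pipe because the bystander kept it alive.
read -r line < "$tmpdir/pipe"

# Drop the bystander once it is no longer needed.
exec 3>&-
echo "$line"
```

One catch: while the bystander holds the pipe open for writing, a
reader that reads until end of file (cat, for example) will never see
end of file.  Close the extra descriptor once the real writers have
opened the pipe.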


