info-cvs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Possible race in pserver leading to broken pipe error?


From: Derek Robert Price
Subject: Re: Possible race in pserver leading to broken pipe error?
Date: Tue, 10 Feb 2004 16:39:34 -0500
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030624 Netscape/7.1

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Steve McIntyre wrote:

>On Mon, Feb 09, 2004 at 05:31:07PM -0500, Larry Jones wrote:
>
>>Steve McIntyre writes:
>>
>>>http://www.netbsd.org/cgi-bin/query-pr-single.pl?number=24253
>>>
>>>They had a similar problem with the Debian package, and the patch
>>>listed on that page seems to fix it for them.
>>>
>>>Thoughts?
>>
>>My first impression is that it seems reasonable, but I'll have to give
>>it some thought.  Note that the patch is defective, however: In the
>>definition of unset_nonblock_fd, the ``|'' should be ``&''.
>
>
>Hmm, yes. Of course it should. I'm a little curious as to how/why this
>patch actually helps people now...


It may simply be that they put the patch in without being able to test
it because they couldn't reproduce the problem.

Anyhow, I think the diagnosis is correct.  I added a fix to the buffer
close routines back in October 2002 due to a problem with server child
processes sending a SIGPIPE's instead of a SIGCHILD intermittantly.  I
suspected a race condition but we could never nail down the cause.  The
old log messages and discussions of the problem I just reviewed would
corroborate this theory exactly.

I've attempted to rewrite this patch, and attached it, still broken,
only because I may have to leave the office in a few minutes.
remotecheck is curerntly failing with this patch because STDOUT isn't
flushing, it looks like.

Derek

Index: src/ChangeLog
===================================================================
RCS file: /cvs/ccvs/src/ChangeLog,v
retrieving revision 1.2336.2.176
diff -u -r1.2336.2.176 ChangeLog
- --- src/ChangeLog    10 Feb 2004 19:54:59 -0000    1.2336.2.176
+++ src/ChangeLog    10 Feb 2004 21:37:42 -0000
@@ -1,5 +1,15 @@
 2004-02-10  Derek Price  <address@hidden>
 
+    * server.c (do_cvs_command): Have the server child close all the pipes
+    but the flow control pipe and wait on an EOF on the flow control pipe
+    from the parent when done to avoid a race condition that could
+    otherwise generate a SIGPIPE for the parent before the SIGCHILD when
+    the other pipes were so full after a child exited that the parent
+    attempted to write a stop byte to the flow control pipe.
+    (Original report from <address@hidden>.)
+
+2004-02-10  Derek Price  <address@hidden>
+
     * buffer.c (stdio_buffer_shutdown): Add a helpful comment.
 
 2004-02-09  Derek Price  <address@hidden>
Index: src/server.c
===================================================================
RCS file: /cvs/ccvs/src/server.c,v
retrieving revision 1.284.2.15
diff -u -r1.284.2.15 server.c
- --- src/server.c    2 Feb 2004 15:12:11 -0000    1.284.2.15
+++ src/server.c    10 Feb 2004 21:37:46 -0000
@@ -2293,13 +2293,18 @@
 #  define SERVER_LO_WATER (1 * 1024 * 1024)
 # endif /* SERVER_LO_WATER */
 
+
+
+/* Protos */
 static int set_nonblock_fd PROTO((int));
+static int set_block_fd PROTO((int));
+
+
 
 /*
- - * Set buffer BUF to non-blocking I/O.  Returns 0 for success or errno
+ * Set buffer FD to non-blocking I/O.  Returns 0 for success or errno
  * code.
  */
- -
 static int
 set_nonblock_fd (fd)
      int fd;
@@ -2314,8 +2319,29 @@
     return 0;
 }
 
+
+
+/*
+ * Set buffer FD to non-blocking I/O.  Returns 0 for success or errno
+ * code.
+ */
+static int
+set_block_fd (fd)
+     int fd;
+{
+    int flags;
+
+    flags = fcntl (fd, F_GETFL, 0);
+    if (flags < 0)
+    return errno;
+    if (fcntl (fd, F_SETFL, flags & ~O_NONBLOCK) < 0)
+    return errno;
+    return 0;
+}
 #endif /* SERVER_FLOWCONTROL */
- -
+
+
+
 static void serve_questionable PROTO((char *));
 
 static void
@@ -2792,11 +2818,33 @@
     /* For now we just discard partial lines on stderr.  I suspect
        that CVS can't write such lines unless there is a bug.  */
 
- -    /*
- -     * When we exit, that will close the pipes, giving an EOF to
- -     * the parent.
- -     */
     buf_free (protocol);
+
+    /* Close the pipes explicitly in order to send an EOF to the parent,
+     * then wait for the parent to close the flow control pipe.  This
+     * avoids a race condition where a child which dumped more than the
+     * high water mark into the pipes could complete its job and exit,
+     * leaving the parent process to attempt to write a stop byte to the
+     * closed flow control pipe, which earned the parent a SIGPIPE, which
+     * it normally only expects on the network pipe and that causes it to
+     * exit with an error message, rather than the SIGCHILD that it knows
+     * how to handle correctly.
+     */
+    /* Let exit() close STDIN - it's from /dev/null anyhow.  */
+    close (STDERR_FILENO);
+    close (STDOUT_FILENO);
+    close (protocol_pipe[1]);
+#ifdef SERVER_FLOWCONTROL
+    if (set_block_fd (flowcontrol_pipe[0]) == 0)
+    {
+      char junk;
+      while (read (flowcontrol_pipe[0], &junk, 1) != 0);
+    }
+    /* FIXCVS: No point in printing an error message with error(),
+     * as STDERR is already closed, but perhaps this could be syslogged?
+     */
+#endif
+
     exit (exitstatus);
     }

 


- --
                *8^)

Email: address@hidden

Get CVS support at <http://ximbiot.com>!
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)
Comment: Using GnuPG with Netscape - http://enigmail.mozdev.org

iD8DBQFAKU+VLD1OTBfyMaQRAtEiAKDbNV+2iKX9S3RTauc8tWPamdO6OACfWypb
saBnvEDyoGbu/7Omvx+kjys=
=c1ER
-----END PGP SIGNATURE-----






reply via email to

[Prev in Thread] Current Thread [Next in Thread]