bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] progname: don't segfault when argv is NULL


From: Bruno Haible
Subject: Re: [PATCH] progname: don't segfault when argv is NULL
Date: Sat, 5 Dec 2009 14:08:47 +0100
User-agent: KMail/1.9.9

Hi Jim,

> Just 3 days ago I received a private report of sleep segfaulting.
> Unfortunately, it was not reproducible.

If a user gets a segfault, there is a bug somewhere.

Question #1 is: Where is the bug?

You already answered that: it's in the parent process.

Question #2 is: Where should the bug be fixed/caught/avoided?
  1) in the parent process,
  2) in the kernel,
  3) in the child process?

The argument for fixing it in 1) is that the POSIX violation is
happening there.

The argument for adding a catch in 2), by Eric, is that some
programs are not POSIX compliant.

The argument against adding it in 3) is that
  - the user can determine its cause with strace, or with gdb,
  - it affects *all* programs, not just those that use gnulib,
  - you don't increase the overall quality of the a system if
    you put into program B a workaround for a bug in program A.


Ad 1):
> I suggest you declare those functions with the "nonnull" attribute.

Good idea. I reported this to glibc:
  <http://sourceware.org/bugzilla/show_bug.cgi?id=11056>


Ad 2):
This is indeed a kernel problem: OpenBSD 4.0 returns with error code
EFAULT if you pass NULL to execve or execv or execvp. Linux 2.6.25.20
does not. Here's the draft for a report to a linux-kernel mailing list.
Can someone please complete it for me? I am not used any more to rebuild
and install modified Linux kernels.

=========================== proposed linux-kernel report =======================
Subject: execve argv and argv[0] checking

Hi,

When a program invokes the execve system call, according to POSIX,
  1) the argv argument must be non-NULL,
  2) argv[0] must be non-NULL.

This follows from
  <http://www.opengroup.org/onlinepubs/9699919799/functions/exec.html>
which says
  "The value in argv[0] should point to a filename that is associated with
   the process being started"
and from
  <http://www.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html>
sections 3.170 and 3.367.

When argv == NULL (case 1), Linux passes a NULL argv[0] to the child process.
OpenBSD makes the execve system call fail with error EFAULT. The latter
behaviour is preferable because it shields the child process from bugs in the
parent process.

When argv[0] == NULL (case 2), Linux passes a NULL argv[0] to the child
process. It would be preferable to let execve fail with error EFAULT. Same
reason: it would shields the child process from bugs in the parent process.

Currently some child processes - which access[0] as a string, like POSIX allows
them - crash in this situation.

Here is a test program and a suggested fix.

---------------------- buggy-caller.c ----------------------
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main (int argc, char *argv[])
{
  if (argc != 3)
    return 3;
  int arg = atoi (argv[1]);
  char *prog = argv[2];
  switch (arg)
    {
    case 1:
      {
        execve (prog, (char**) 0, (char**) 0);
        perror ("execve failed");
        return 1;
      }
    case 2:
      {
        static char *zero_array[2] = { NULL, NULL };
        execve (prog, zero_array, (char**) 0);
        perror ("execve failed");
        return 1;
      }
    default:
      return 2;
    }
}
------------------------------------------------------------

diff --git a/fs/compat.c b/fs/compat.c
index 6c19040..1495cde 100644
--- a/fs/compat.c
+++ b/fs/compat.c
@@ -1503,9 +1503,15 @@ int compat_do_execve(char * filename,
        if (retval)
                goto out_file;
 
+       retval = -EFAULT;
+       if (!argv)
+               goto out;
        bprm->argc = compat_count(argv, MAX_ARG_STRINGS);
        if ((retval = bprm->argc) < 0)
                goto out;
+       retval = -EFAULT;
+       if (bprm->argc == 0)
+               goto out;
 
        bprm->envc = compat_count(envp, MAX_ARG_STRINGS);
        if ((retval = bprm->envc) < 0)
diff --git a/fs/exec.c b/fs/exec.c
index ba112bd..8932d55 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1327,9 +1327,15 @@ int do_execve(char * filename,
        if (retval)
                goto out_file;
 
+       retval = -EFAULT;
+       if (!argv)
+               goto out;
        bprm->argc = count(argv, MAX_ARG_STRINGS);
        if ((retval = bprm->argc) < 0)
                goto out;
+       retval = -EFAULT;
+       if (bprm->argc == 0)
+               goto out;
 
        bprm->envc = count(envp, MAX_ARG_STRINGS);
        if ((retval = bprm->envc) < 0)

================================================================================

Ad 3):
If you really want to do something that eases finding of such a bug in the
parent process, then - rather than sweeping the bug under the carpet - provide
an error message, like this.

*** lib/progname.c.orig 2009-12-05 14:07:04.000000000 +0100
--- lib/progname.c      2009-12-05 13:55:35.000000000 +0100
***************
*** 23,28 ****
--- 23,30 ----
  #include "progname.h"
  
  #include <errno.h> /* get program_invocation_name declaration */
+ #include <stdio.h>
+ #include <stdlib.h>
  #include <string.h>
  
  
***************
*** 44,49 ****
--- 46,60 ----
    const char *slash;
    const char *base;
  
+   /* Sanity check.  POSIX requires the invoking process to pass a non-NULL
+      argv[0].  */
+   if (argv0 == NULL)
+     {
+       fputs ("A NULL argv[0] was passed through the exec system call.\n",
+            stderr);
+       abort ();
+     }
+ 
    slash = strrchr (argv0, '/');
    base = (slash != NULL ? slash + 1 : argv0);
    if (base - argv0 >= 7 && strncmp (base - 7, "/.libs/", 7) == 0)


Bruno




reply via email to

[Prev in Thread] Current Thread [Next in Thread]