help-make

Re: make.err:make[1]: vfork: Resource temporarily unavailable


From: Michael Muratet
Subject: Re: make.err:make[1]: vfork: Resource temporarily unavailable
Date: Fri, 24 Jul 2009 08:55:27 -0500

Mike and Paul

Thanks for the help. I am using -j 8 to run on an eight-core node. ulimit -a shows basically no limits.

I may try the kernel route, and I will go back and look at the logs. I may try to repeat one of the failures to get fresh log entries. I also enabled core dumps in the hope that one of the C programs might leave behind some useful info.

Thanks again

Mike

On Jul 23, 2009, at 2:51 PM, Mike Shal wrote:

On 7/23/09, Michael Muratet <address@hidden> wrote:
Greetings

I am using a data processing application that uses make for its implementation. The application is a set of Python scripts that write out Makefiles, and the user launches the analysis by typing make -j n target. I suspect the authors were looking for a cheap way to get parallelization. The make takes many hours to run in most cases, executing a variety of C programs and scripts. My problem comes about when make tries to launch a new process:

make.err:make[1]: vfork: Resource temporarily unavailable

I suspect that the resource it wants is swap space; I can see that it
occasionally fills up and I am working on fixing that. But failing that, is
there a way to get make to tell me what it lacks?

I don't think there is a way to get make to give you this information,
since make doesn't get any more specific info from the kernel. The
basic gist is the kernel will return with -EAGAIN somewhere along the
way while executing sys_vfork(). This gets stuck into errno by libc (I
think), and so all make sees is a -1 return value from vfork, and
errno = EAGAIN (which corresponds to 'Resource temporarily
unavailable'). Unfortunately, because there are several spots in the kernel
where it can set EAGAIN, you don't know which one specifically was
triggered.
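
To illustrate the point above, here is a minimal Python sketch of what a caller of fork or vfork actually gets to see. It is not make's code; the function name explain_fork_failure is hypothetical, and the failure is simulated rather than provoked. The only thing that reaches user space is the errno value, so the most a program can do is translate it to text.

```python
import errno
import os


def explain_fork_failure(exc):
    """Describe a failed process-creation call using only what user space sees."""
    message = os.strerror(exc.errno)  # the same text perror() would print
    if exc.errno == errno.EAGAIN:
        # The errno says "try again"; it does not say which kernel check failed.
        return "EAGAIN: " + message + " (cause not reported by the kernel)"
    return "errno " + str(exc.errno) + ": " + message


if __name__ == "__main__":
    # Simulated failure: build the exception os.fork() would raise on EAGAIN.
    simulated = OSError(errno.EAGAIN, os.strerror(errno.EAGAIN))
    print(explain_fork_failure(simulated))
```

Any code path in the kernel that returns -EAGAIN ends up looking identical from here, which is why make cannot be more specific.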

If you don't mind building your own kernel and adding debug to it,
that might be one way you could figure out what's going on. Depending
on your specific version/arch, you can start by looking at
kernel/fork.c:do_fork(), which calls copy_process(), which has some
-EAGAIN returns in it. Maybe someone has a method of tracing the
existing kernel?
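
Short of rebuilding the kernel, one can at least look at two limits that, as far as I know, copy_process() checks before returning -EAGAIN: the per-user process limit (RLIMIT_NPROC, the "max user processes" line in ulimit -a) and the system-wide thread cap in /proc/sys/kernel/threads-max. The sketch below assumes Linux; the helper names read_int and fork_limits are hypothetical, and other -EAGAIN paths may exist depending on the kernel version.

```python
import resource
from pathlib import Path


def read_int(path):
    """Return the first integer in a /proc-style file, or None if unreadable."""
    try:
        return int(Path(path).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def fork_limits():
    """Collect limits that can make process creation fail with EAGAIN."""
    # A value equal to resource.RLIM_INFINITY means "unlimited".
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return {
        "rlimit_nproc_soft": soft,
        "rlimit_nproc_hard": hard,
        # System-wide cap on threads/processes; None if /proc is unavailable.
        "threads_max": read_int("/proc/sys/kernel/threads-max"),
    }


if __name__ == "__main__":
    for name, value in fork_limits().items():
        print(name, value)
```

Running this as the same user and from the same shell that launches make matters, since the limits are inherited per process.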

Of course, it's possible it will fail in a different spot every time if
it's just running low on memory. Are you sure the old processes are
properly being waited on? What size '-j' are you running anyway?
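
One rough way to check whether finished children are being waited on is to count processes by state while the build runs. A steadily growing number in state Z (zombie) would suggest they are not being reaped. This is only a sketch, assuming a Linux-style /proc; the function names parse_state and count_by_state are hypothetical.

```python
import os


def parse_state(stat_line):
    """Extract the state letter from the contents of /proc/<pid>/stat."""
    # Format is "pid (comm) state ..."; comm may itself contain spaces or
    # parentheses, so split on the LAST ')' rather than on whitespace.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def count_by_state(proc_root="/proc"):
    """Return a dict mapping state letter (R, S, D, Z, ...) to process count."""
    counts = {}
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return counts  # no /proc here; nothing to report
    for entry in entries:
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat"), errors="replace") as fh:
                state = parse_state(fh.read())
        except (OSError, IndexError):
            continue  # process exited while we were scanning
        counts[state] = counts.get(state, 0) + 1
    return counts


if __name__ == "__main__":
    print(count_by_state())
```

Sampling this every few minutes during a long run and watching the Z count and the overall total should show whether processes are piling up before the vfork failure.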

-Mike

Michael Muratet, Ph.D.
Senior Scientist
HudsonAlpha Institute for Biotechnology
address@hidden
(256) 327-0473 (p)
(256) 327-0966 (f)

Room 4005
601 Genome Way
Huntsville, Alabama 35806








