bug-make

Re: gmake and ccache conspiring together in creating gremlins


From: Sam Varshavchik
Subject: Re: gmake and ccache conspiring together in creating gremlins
Date: Mon, 8 Feb 2021 17:51:02 -0500

On Mon, Feb 8, 2021 at 2:38 PM Paul Smith <psmith@gnu.org> wrote:
>
> On Mon, 2021-02-08 at 10:43 +0000, Edward Welbourne wrote:
> > Sounds to me like that's a bug: when the descriptors are closed, the
> > part of MAKEFLAGS that claims they're make's jobserver file
> > descriptors should be removed, since that's when the claim stops
> > being true.
>
> I believe there have been other similar issues reported recently.
>
> Certainly fixing MAKEFLAGS when we run without jobserver available is
> something that could be done.
>
> There is a loss of debugging information if we make this change: today
> make can detect if it was invoked in a way that _should_ expect to
> receive a jobserver context, but _didn't_ receive that context.  That
> is, if make sees that jobserver-auth is set but it can't open the
> jobserver pipes it can warn the user that most likely there's a problem
> in their environment or with the setup of their makefiles.
>
> Without this warning there's no way to know when this situation occurs.
>  It's easy to create a situation where every sub-make will create its
> own completely unique jobserver domain.  So you start the top make with
> -j4 and run 4 sub-makes; if you do it wrong then each of 4 sub-makes
> could create a new jobserver domain, and now you're running 16 jobs in
> parallel instead of 4... there's no way for make to warn you about this
> situation.

One thought occurred to me. Specifically: when make executes what it
believes to be something other than a recursive invocation of $(MAKE),
and it closes the job server pipe file descriptors for that, it can
also:

1) Add an additional parameter to MAKEFLAGS, let's call it
"--no-jobserver", and perhaps remove the --jobserver-auth parameter
completely. It might be easier just to append something there, instead
of surgically removing this.

2) Make checks for a --no-jobserver in MAKEFLAGS when it starts. If
it's there it does NOT attempt to validate the file descriptors that
are given in --jobserver-auth (if this parameter is preserved). It's a
given that they're not there:

  if (!FD_OK (job_fds[0]) || !FD_OK (job_fds[1]) || make_job_rfd () < 0)

Don't even do that. What happens right now is that a warning message
gets printed and make runs without a job server. This change should
have the same result: print the warning but skip the FD_OK tests.

This will result in the same warning, but it should avoid triggering
the bug that I found.
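
To make steps 1) and 2) concrete, here is a rough sketch, not a patch
against make itself. The "--no-jobserver" flag is only the name proposed
above, and the helper names (has_no_jobserver, append_no_jobserver,
should_probe_jobserver_fds) are made up for illustration. It also
assumes MAKEFLAGS is a plain space-separated list of words, which is a
simplification of the real format.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical: true if MAKEFLAGS contains the exact word
   "--no-jobserver" (the flag name proposed in this thread).  */
static bool
has_no_jobserver (const char *makeflags)
{
  static const char flag[] = "--no-jobserver";
  const size_t len = sizeof flag - 1;
  const char *p = makeflags;

  if (p == NULL)
    return false;

  while ((p = strstr (p, flag)) != NULL)
    {
      bool starts_word = (p == makeflags || p[-1] == ' ');
      bool ends_word = (p[len] == '\0' || p[len] == ' ');
      if (starts_word && ends_word)
        return true;
      p += len;
    }
  return false;
}

/* Step 1 (hypothetical): build the MAKEFLAGS value handed to a
   non-recursive child.  The flag is appended rather than cutting
   --jobserver-auth out.  Returns false if BUF is too small.  */
static bool
append_no_jobserver (char *buf, size_t size, const char *makeflags)
{
  int n;

  if (has_no_jobserver (makeflags))
    n = snprintf (buf, size, "%s", makeflags);
  else
    n = snprintf (buf, size, "%s --no-jobserver", makeflags);

  return n >= 0 && (size_t) n < size;
}

/* Step 2 (hypothetical): only probe the descriptors named by
   --jobserver-auth when nothing has told us they are already gone.  */
static bool
should_probe_jobserver_fds (const char *makeflags)
{
  return makeflags != NULL
         && strstr (makeflags, "--jobserver-auth=") != NULL
         && !has_no_jobserver (makeflags);
}
```

When should_probe_jobserver_fds() returns false because the flag is
present, the caller would go straight to the existing warning path
instead of running the FD_OK tests.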

However that might cause a minor regression in LTO linking. I think
that this prevents the LTO linker's internal invocation of make from
finding that it can attach to the original make process's job server.

From sifting through strace dumps, I see that a linker-invoked make
gets its own -j flag. It appears that the linker is courteous enough
to count how many CPUs it has and use it to construct its own -j flag.

How about this safe approach: once --no-jobserver is there it stays
there, and gets propagated to all recursively invoked makes. If an
invoked make finds that it has both a --no-jobserver and a -j flag,
it'll warn and refuse to create its own job server, and then proceed
executing one command at a time.
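
The startup rule described above could look roughly like this. It is
only a sketch of the policy: effective_job_slots is a made-up name, the
warning text is invented, and real make tracks its job slots
differently.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical policy: how many jobs this make may run at once.
   NO_JOBSERVER: "--no-jobserver" was seen in MAKEFLAGS.
   HAVE_J:       a -j flag was given to this make.
   REQUESTED:    the value given with -j (ignored if !HAVE_J).  */
static unsigned int
effective_job_slots (bool no_jobserver, bool have_j, unsigned int requested)
{
  if (!have_j)
    return 1;                   /* no -j: one command at a time */

  if (no_jobserver)
    {
      /* Refuse to start a fresh job server; fall back to serial.  */
      fprintf (stderr,
               "warning: jobserver unavailable, ignoring -j%u\n",
               requested);
      return 1;
    }

  return requested;
}
```

So a sub-make that inherits --no-jobserver and is handed -j8 would end
up with one slot, while the same -j8 without the flag keeps all eight.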

This prevents a multiplicative proliferation of job worker processes if
the original job server's file descriptors get lost. Currently
recursively-invoked makes will find, and attach themselves to, an
existing job server. This is nice; but this is vulnerable to an edge
case that I think I'm hitting: a false positive involving a leaked
file descriptor. This change encourages fixing whatever's causing make
to fail to detect a recursive invocation.
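
For the numbers in Paul's example (-j4 at the top, 4 sub-makes), the
worst case can be spelled out as below. The enum and function names are
invented for this illustration, and it assumes the top make runs as
many sub-makes at once as it has job slots.

```c
/* Hypothetical: how a sub-make ends up scheduling its own jobs.  */
enum submake_mode
{
  SHARED_JOBSERVER,   /* attaches to the parent's job server */
  OWN_JOBSERVER,      /* lost the descriptors, starts its own -jN */
  SERIAL_FALLBACK     /* proposed: --no-jobserver forces one job */
};

/* Worst-case number of jobs running at once across all sub-makes.  */
static unsigned long
worst_case_jobs (unsigned int top_jobs, unsigned int sub_jobs,
                 enum submake_mode mode)
{
  switch (mode)
    {
    case SHARED_JOBSERVER:
      return top_jobs;                        /* one shared pool */
    case OWN_JOBSERVER:
      return (unsigned long) top_jobs * sub_jobs;
    case SERIAL_FALLBACK:
      return top_jobs;                        /* one job per sub-make */
    }
  return top_jobs;
}
```

With top_jobs = 4 and sub_jobs = 4, OWN_JOBSERVER gives the 16 jobs
mentioned above, while the other two modes stay at 4.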


