Re: [bug #33138] .PARLLELSYNC enhancement with patch

bug-make
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [bug #33138] .PARLLELSYNC enhancement with patch

From:	Frank Heckenbach
Subject:	Re: [bug #33138] .PARLLELSYNC enhancement with patch
Date:	Thu, 18 Apr 2013 19:09:06 +0200
Paul Smith wrote:

> I've applied the patch from Frank.

Thanks. I did some tests and so far everything works in my setup.

Since I was away for a day, I couldn't follow the discussion
earlier, so allow me to respond to several mails at once ...

> I changed the behavior so that the entire recipe
> is printed together, rather than each individual job being printed together,
> so that even without .ONESHELL you still have the whole recipe as a single
> output.  I can't justify to myself the need for individual job output but if
> someone really wants it please provide a reason.

I almost do. I use a mechanism to give me a progress indicator while
make is running. This mechanism does not require any special make
features and works outside of make. How it works is roughly, my
rules (at least the longer running ones) look like this (where
"[Start]" and "[End]" are in fact some different strings that do not
easily occur in normal messages):

foo.o:
        address@hidden "[Start] Compiling foo.c"
        @sleep 1 # cc -o foo.o foo.c
        address@hidden "[End] Compiling foo.c"

My program that interpretes make's output catches those special
messages and shows me a list of currently running tasks. (In
non-parallel builds, the non-silent make output serves this purpose,
you can just see the last command shown which is the one running
now. In parallel builds, this is not so; a long-running command may
have scrolled far away by subsequent shorter commands run in
parallel, so it's not easy to see what's actually running.
Furthermore, many typical command lines these days (including mine)
have so many compiler options etc. that they're hard to read by
themselves. The above messages make it much more comfortable for me
to see what's going on.)

This mechanism was unaffected by my output-sync patch, and I
expected your change broke it. In fact, it didn't, that's why I said
"almost". That's because of the "+" in the echo commands which
prevents make from considering the above a single recipe. (The
reason I use "+" is so I can run "make -n" and count those messages
before the actual run in order to get the total number that will be
produced which allows me to draw a progress bar. Not terribly
important, I admit, but in this case it incidentally saved my ass.)

So at the moment, it works for me, but it indicates there may be
legitimate reasons to want the output grouped per job rather than
per recipe.

So I'd plead to revert this bit (since one can still use .ONESHELL
if wanted). Or we could add another mode like "--output-sync=job".
Shouldn't be too hard now (if you like, I can implement it).

Paul Smith wrote:

> On Tue, 2013-04-16 at 11:30 +0300, Eli Zaretskii wrote:

> >  . why is "leaving directory FOO" displayed _before_ releasing the
> >    stderr output? it should be after, I think.
> 
> The enter/leave output needs investigation for sure.

The idea was since log_working_directory() writes its messages to
stdout, we need to wrap the stdout data with them. If stderr is the
same as stdout, they are combined, so the first pump_from_tmp_fd
handles all of them and they are properly wrapped. If they are
separate, it doesn't matter when we give the leave message, since
they go to different places anyway. So I think it's not necessary
to move the leave message after the stderr output, though it
wouldn't hurt either (just require another "if (outfd_not_empty)").

> I think we're
> doing too much here.  I would have left it alone but I think some tools
> that parse make results expect to see that output to help them figure
> out what's going on.

As I described in https://savannah.gnu.org/bugs/?33138#comment3, the
problem is how to interpret messages that contain file names when
different jobs run in different subdirectories. A typical message
looks like this:

foo.c:42: syntax error blah

In fact, also make itself gives such messages which depend on the
current directory:

Makefile:7: recipe for target 'foo-fail' failed

With no context, there's no straightforward way to find the
offending file. One could search all subdirectories for a file named
foo.c (but there may be several ones in different directories), or
leave it up to the user to find the right file. But that's not
necessary if we have proper context. As I understand, Emacs has a
"directory change tracking" mode which does just that (I don't use
it myself, but a utility which does the same).

In non-parallel builds, the normal enter/leave messages from make
provide enough context. Not so in parallel builds, especially with
target-level synchronization. Adding more of these messages helps to
establish the context. As I wrote in the abovementioned comment, we
now output rather too many messages, though trimming them better
might be nontrivial (I did the easy part by giving them only if
outfd_not_empty).

> >  . btw, why an error in acquire_semaphore and release_semaphore causes
> >    Make to bail out?  isn't it better to fall back on the normal
> >    operation mode instead?
> 
> I'll look at this.  I guess the problem is that if some instances of
> make can take the semaphore and some can't, for some reason, should we
> continue anyway?

Does it bail out now? AFAICS, it just outputs an error message (via
perror) and returns NULL which causes sync_output() to skip the
current temp files. The latter part is questionable, though. Perhaps
we should output the files anyway, unsynced. The previous message at
least tells the user something's wrong, but they don't lose any
messages. So something like:

  if (outfd_not_empty || errfd_not_empty)
    {
      void *sem = acquire_semaphore ();
      [...]
      if (sem)
        release_semaphore (sem);
    }

> Sure... but I don't see the problem.  Maybe I've lost the thread.  When
> the command starts both stdout and stdin are writing to the same
> destination.  If the command does nothing special then the output will
> be a combination of stdout and stderr in the order in which the command
> generated them, which is good.  If the command script changes one or
> both of stdin/stdout, then they won't be cojoined anymore, yes, but
> that's what the user asked for...?

I also don't see a problem here. The redirection we do applies to
the stdout/stderr fd's passed to the shell. What happens afterwards
is none of our business. If they were combined and the shell
redirects just one of them, this works as it should: The redirection
in the shell works as usual, and the other fd still points to our
temp file (now alone). It's not like stdout and stderr are tied
forever if we join them. Or the other way around, if they were
originally separate and the shell does "2>&1", then both stdout and
stderr will now point to the temp file originally set up for stdout
and mix their data properly, while the temp file created for stderr
remains empty and will be ignored by sync_output().

> > > Do we even need to lock a file?  If all that's needed is a semaphore
> > > (actually, a mutex, AFAICS), Windows has good support for that which
> > > doesn't involve files at all.
> > 
> > The descriptor-based mutex has the very slight advantage over a
> > system-wide mutex in that if a sub-make's output is redirected it now
> > has its own lock domain.

Indeed, this increases throughput a little.

And vice versa, if a user starts two separate makes on the same
stdout/stderr, they will properly group their output, even though
they don't know about each other. Though I consider this at best an
incidental feature, and probably not worth worrying about.

> > However I imagine this would happen very
> > rarely (how many makefiles run sub-makes with stdout/stderr redirected?)
> > and probably won't have much performance impact anyway.

It does happen with --output-sync=make, but indeed, it's probably a
minor impact.

> I guess this points out a potential issue with the current UNIX
> implementation as well: if you had some utility you were running as part
> of your build that was also trying to lock stdout, then it would
> interfere with make.

I don't think so, since its stdout is now our temp file which it may
lock as it pleases. (In fact, that's just what recursive makes do
with --output-sync=make.)

> That seems unlikely, but maybe to avoid this and
> to deal with the potential issue of locking not supported on non-files,
> we should just create a temporary file descriptor and pass it around,
> like we do with the jobserver FDs.

So it might still be necessary for systems that don't allow locking
non-files or don't have fcntl at all.

A global mutex still wouldn't seem too bad since it's only held for
short times to copy them temp files.

Eli Zaretskii wrote:

>  . calculation of combined_output in start_job_command will need to be
>    reimplemented for Windows, since the reliance on st_dev and st_ino
>    makes assumptions that are false on Windows.

What we need is basically "is_same_file (fd1, fd2)" (which we
probably could refactor into a separate function). Eli, can you try:

  test file1 -ef file2

where test is either a standalone utility or a built-in of bash or
any other shell. If any of these works, you can see how they do it.
Otherwise, a bit of googling turned up this page which has code (in
Python, but it seems like a thin layer around the system calls, so
it can probably easily be ported to C):

http://timgolden.me.uk/python/win32_how_do_i/see_if_two_files_are_the_same_file.html

> > From: Paul Smith <address@hidden>
> >
> > On Tue, 2013-04-16 at 11:30 +0300, Eli Zaretskii wrote:
> > >
> > >  . STREAM_OK uses fcntl and F_GETFD which both don't exist on Windows.
> > 
> > This is to test if somehow make's stdout or stderr has been closed (not
> > just redirected to /dev/null, but closed).
> 
> Right, got that; but Windows needs a different test for that, as
> there's no fcntl.

Basically we need just any call that succees for a valid file and
fails otherwise. fstat() might be a good candidate. At least on
POSIX it should succeed for any open file. If that isn't available,
maybe even lseek() (as used in FD_NOT_EMPTY) will do if we keep the
"|| errno != EBADF" test. (Seeking will fail on terminals etc., but
return a different errno.) A more far-fetched idea would be dup()
with closing the new fd on success. Or perhaps some other function ...

Eli, can you do some tests to see if you find something that works
reliably?

David Boyce wrote:

> That's not the way it is on Unix. My understanding, and I believe this
> applies to all Unix variants, is that because the original stdio FILE
> structure used an 8-bit int to hold the file descriptor, the number of
> available streams is <256 always and forever (in 32-bit variants - 64-bit
> builds fix lots of limits). I don't know of any Unix variants which don't
> allow at least 1024 descriptors, and some allow the limit to be raised
> dynamically without bound, but the limit on descriptors *associated with
> streams* is fixed at 256.

FILE is an opaque structure which should never be allocated by user
code, so its layout can be implementation specific. In particular,
on Linux the field is an int and the following test program reports
1021 open streams (open files limit 1024 minus the 3 standard fd's;
if I increase the ulimit, the programs reports correspondingly more
streams):

#include <stdio.h>
#include <errno.h>

#define N 10000

int main ()
{
  int i;
  FILE *f[N];
  for (i = 0; i < N; i++)
    {
      f[i] = fopen ("a.out", "r");
      if (!f[i])
        {
          printf ("%i: %s\n", i, strerror (errno));
          break;
        }
    }
  return 0;
}

PS: In assign_child_tempfiles(), the following declarations are no more
needed and can be removed:

  FILE *outstrm = NULL, *errstrm = NULL;
  const char *suppressed = "output-sync suppressed: ";
  char *failmode = NULL;
[Prev in Thread]
Current Thread
[Next in Thread]
Re: [bug #33138] .PARLLELSYNC enhancement with patch, (continued)
Prev by Date: Re: [bug #33138] .PARLLELSYNC enhancement with patch
Next by Date: Re: [bug #33138] .PARLLELSYNC enhancement with patch
Previous by thread: Re: [bug #33138] .PARLLELSYNC enhancement with patch
Next by thread: Re: [bug #33138] .PARLLELSYNC enhancement with patch
Index(es):
- Date
- Thread