2014/1/12 Paul Smith <address@hidden
> On Wed, 2013-12-18 at 13:28 +0200, Eddy Petrișor wrote:
>> Could you please confirm if the general direction of the the is OK in
>> the latest patch I sent?
> Conceptually it seems OK. I'm still not jazzed about having any more
> than one output format, and I'd prefer that format to be in a
> more-or-less readable form, more like the "long" form than the others.
If that will be the only format, then it would mean always imposing a lot more more work on the information processing stage due to necessary filtering and transformations to fit a format accepted or easily parsed in a tool such as Oocalc, Gnumeric or Excel.
Although human readable seems nice and understandable, it is the least machine parseable, hence my choice for the 'simple' format to be default.
> I think the output should go in the standard make output format, so
> something like:
> make[<LVL>]: <target>: <details...>
> Or, alternatively:
> <target>[<LVL>]: <details>
> Also I think it's enough to show the start offset and the elapsed time.
> End offset is not necessary IMO.
Unfortunately, depending on the used tools, when processing the information the end time is necessary. For instance, in Microsoft Excel the only way to display graphs for intervals is via a graph designed for visualising stocks' variation (and it even forces the insertion of an extra field).
Oocalc is fine with start and stop, but for the graphs look awful and are unusable with absolute values (it scales so the entire 0-timestamp interval is visible, so a difference of a few seconds is invisible on the graph), so relative values are better here.
OTOH, relative time stamps or just durations are useful for a human eye examination, since is easy to spot offenders that way.
These are three different scenarios which I myself encountered and had in mind when designing the code the way I did.
> I'm unsure about the PID. This is the pid of the make process so I'm
> not sure what the goal is. Is it to be able to collect all the times
> together maybe?
The goal is to be able to:
- spot targets evaluated redundantly in a recursive makefile in different processes but on the same recursion level (these targets could be candidates to be moved in a parent make invocation)
- spot wasteful/needless recursive calls (e.g. several targets called in different make processes when they could be grouped in a single call)
- be able to analyse a single make invocation or a call sub-tree
> Is it necessary to dump all the output times at the end? Doing so
> requires that we increase the size of the file structure to hold the
> information, and this is already large AND the most common structure in
> memory; there's one for every single target which for non-recursive
> builds can get really big. I'm trying to keep memory usage under
> If instead of that we print the information after each target is
> complete we can shift the storage of this information out of the file
> structure and into the commands structure or similar. To me it seems
> more useful to keep the elapsed time info right next to the command
> output rather than dumping it all at the end.
I'm afraid I am missing some details of the implementation so I can't answer that question in any meaningful way.
I will have to look into the code, but if a single target does NOT have multiple commands structures, it should work.
Any pointers to the appropriate code area or suggestions would be welcome.
> Some other comments:
> 1. In general remember that GNU make code must conform to ANSI
> C89 / ISO C90. We shouldn't be using newer features of the
> language or runtime library unless we need to, and most of those
> require some kind of autoconf test.
I'm sure you had some specific code in mind when you wrote this. I assumed the build system would have the appropriate compiler options for the desired compliance level. Should I compile with '-std=c99 -pedantic-errors' to check, or do you have other options in mind?
> 2. Let's avoid float and double (and struct timeval). There's no
> reason why we can't fit enough precision in a uint32 to count
> elapsed time in milliseconds for a build: that gives 50 days or
> so. GNU make still supports running on systems where there is
> no floating point support (see the NO_FLOAT #define). Although
> I haven't tested it in a while.
I was aware of this since your first email. I wanted to know of the general idea is OK.
I will change this, too.
> 3. The use of "$" tokens in printf() statements is likely
> problematic from a portability standpoint. It seems like this
> should be relatively easy to avoid.
I'll change them into simple format specifiers, in spite of the repetitions.
> 4. If the printed string contains text then it needs to be marked
> for translation (with the "_(...)" macro).
Since the profiling info should be machine parsable, I think the only translatable string would be the long format. I will change this, although the inconsistency rubs me the wrong way.
> 5. We don't want to be using fprintf() here. All output needs to
> go through the output.c module, so that it's properly managed
> via output sync.
I was aware of the output sync issue, I will look into it.
> 6. gettimeofday is not portable. Also, it's not really the best
> option for timing things because (due to NTP etc.) it can change
> (by that I mean that if it reports 10s elapsed it might not be
Hmm, didn't thought of that.
> that 10s really elapsed). Using a monotonic clock is better,
> although that's also not portable. But if we have to be
> non-portable maybe we should try to get an accurate accounting.
> On the gripping hand maybe it's not that important to be
> absolutely accurate.
Is there a portable timing API that has us resolution? Not sure if ms resolution is enough given the speed of current system and Moore's Law predicting even faster systems.
> You mentioned something about trying to send the start time through the
> environment but I don't see any code to that effect here; how were you
> doing that?
As said in my previous mail, I wanted to avoid confusion regarding what code to review.
In reply to your newest mail:
> Sorry, I was not clear: I wasn't suggesting that it would be better to
> display the absolute time for the start time. I was wondering why we
display the start time at all. Why not just show the elapsed time, and
nothing else? That would avoid all of these issues.
However Tim makes a reasonable point in his response so if it can be
done without too much difficulty it would be good to show a relative
Also, the start time is necessary to be able to see which target was waiting for which in a graph. This way one can see a target starting later than it should logically start, making it a good avenue for performance improvement investigations.