Re: [bug #40639] GNU Make with profiling information

From: Tim Murphy
Subject: Re: [bug #40639] GNU Make with profiling information
Date: Tue, 14 Jan 2014 15:55:11 +0000

I forgot to say that start times don't need to be absolute times -
only relative to the start of the top level gmake if possible.   That
creates a problem for submakes I suppose.

I would guess that one could put the absolute build start time in an
environment variable like MAKE_START_TIME and then use that in every
submake to get the relative start time.
I haven't looked at the patch - perhaps it's doing this?

In any case, fixed/floating point seconds since 1970 is the nicest
format to process from scripts in my experience.
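
A rough sketch of that idea, in Python rather than make's own C, and not a description of what the patch does: the top-level process records its start time once in the environment, and every child process reuses the inherited value. MAKE_START_TIME is the name suggested above; the helper names are made up for illustration.

```python
import os
import time

def build_start_time(env=os.environ):
    """Return the build's absolute start time, recording it if we are the top level."""
    if "MAKE_START_TIME" not in env:
        # Top-level invocation: store seconds since 1970 as a decimal string
        # so that child processes inherit it through the environment.
        env["MAKE_START_TIME"] = "%.3f" % time.time()
    return float(env["MAKE_START_TIME"])

def relative_start(now, env=os.environ):
    """Offset of `now` (seconds since 1970) from the top-level start time."""
    return now - build_start_time(env)
```

A submake that inherits MAKE_START_TIME=1000.500 and starts a job at 1002.750 would report a relative start of 2.25 seconds.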



On 14 January 2014 15:49, Tim Murphy <address@hidden> wrote:
> To some, using a spreadsheet might not seem like the most worthwhile
> way to visualise timing information.
> If it was me, I'd be far more concerned about whether I could write a
> script that could easily cope with all this information.  Builds with
> hundreds of thousands of targets were common for me at one point and
> nowadays I do android stuff - how much is that?  I think it's
> somewhere around 36,000.
> This scale makes spreadsheets relatively unimportant as an analysis
> tool and makes it necessary to pass information through a script to
> first extract or summarise the information to a level where humans can
> deal with it.
> Hence:
>  a)  an absolute start time and
>  b)  a duration
> ...are easy to process in scripts to reconstruct whatever form one
> needs - a spreadsheet for you and a different kind of special graph
> for me. Both examples of a profiling feature for make that I'm aware
> of already use this format to good effect.
> It's also worth trying to produce these figures as each job finishes
> and then throw them away because then the build doesn't have to finish
> before one is able to process the data. You might use it, for example,
> to provide progress information: if you keep information from a
> previous build and combine it with profiling information coming out of
> the new build you can guess how long is left.
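
As an illustration of the progress idea, here is a small Python sketch. It assumes the durations from the previous build have already been collected into a dictionary keyed by target name; the function name and the figures are invented for the example.

```python
def remaining_estimate(previous_durations, finished_targets):
    """Sum of the last build's durations for targets that have not finished yet.

    This is a serial estimate: it ignores -j parallelism and any target
    that did not exist in the previous build.
    """
    done = set(finished_targets)
    return sum(d for t, d in previous_durations.items() if t not in done)

previous = {"a.o": 2.0, "b.o": 3.5, "prog": 1.0}   # made-up figures, in seconds
print(remaining_estimate(previous, ["a.o"]))        # 4.5
```

Calling it again each time a job reports its duration gives a running guess at how much work is left.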
> Regards,
> Tim
> On 14 January 2014 13:48, Eddy Petrișor <address@hidden> wrote:
>> 2014/1/12 Paul Smith <address@hidden>:
>>> On Wed, 2013-12-18 at 13:28 +0200, Eddy Petrișor wrote:
>>>> Could you please confirm if the general direction is OK in
>>>> the latest patch I sent?
>>> Conceptually it seems OK.  I'm still not jazzed about having any more
>>> than one output format, and I'd prefer that format to be in a
>>> more-or-less readable form, more like the "long" form than the others.
>> If that will be the only format, then it would mean always imposing a lot
>> more work on the information processing stage due to necessary
>> filtering and transformations to fit a format accepted or easily parsed in a
>> tool such as Oocalc, Gnumeric or Excel.
>> Although human readable seems nice and understandable, it is the least
>> machine parseable, hence my choice for the 'simple' format to be default.
>>> I think the output should go in the standard make output format, so
>>> something like:
>>>   make[<LVL>]: <target>: <details...>
>>> Or, alternatively:
>>>   <target>[<LVL>]: <details>
>>> Also I think it's enough to show the start offset and the elapsed time.
>>> End offset is not necessary IMO.
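
For what it's worth, a line in the first of those shapes is straightforward to pick apart from a script. The Python sketch below assumes, purely for illustration, that the details are written as "start=<ms> elapsed=<ms>"; the thread does not settle the layout of the details field, so that part is hypothetical.

```python
import re

# make's usual prefix: "make: " at the top level, "make[N]: " in submakes.
# The "start=... elapsed=..." details layout is an assumption for this example.
PROFILE_LINE = re.compile(
    r"^make(?:\[(\d+)\])?: (.+?): start=(\d+) elapsed=(\d+)$"
)

def parse_line(line):
    """Return (level, target, start_ms, elapsed_ms), or None for other output."""
    m = PROFILE_LINE.match(line.rstrip("\n"))
    if m is None:
        return None
    level = int(m.group(1)) if m.group(1) is not None else 0
    return level, m.group(2), int(m.group(3)), int(m.group(4))
```

Lines that do not match, such as ordinary command output, come back as None and can simply be skipped.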
>> Unfortunately, depending on the used tools, when processing the information
>> the end time is necessary. For instance, in Microsoft Excel the only way to
>> display graphs for intervals is via a graph designed for visualising stocks'
>> variation (and it even forces the insertion of an extra field).
>> Oocalc is fine with start and stop, but the graphs look awful and are
>> unusable with absolute values (it scales so the entire 0-timestamp interval
>> is visible, so a difference of a few seconds is invisible on the graph), so
>> relative values are better here.
>> OTOH, relative time stamps or just durations are useful for a human eye
>> examination, since it is easy to spot offenders that way.
>> These are three different scenarios which I myself encountered and had in
>> mind when designing the code the way I did.
>>> I'm unsure about the PID.  This is the pid of the make process so I'm
>>> not sure what the goal is.  Is it to be able to collect all the times
>>> together maybe?
>> The goal is to be able to:
>>  - spot targets evaluated redundantly in a recursive makefile in different
>> processes but on the same recursion level (these targets could be candidates
>> to be moved in a parent make invocation)
>>  - spot wasteful/needless recursive calls (e.g. several targets called in
>> different make processes when they could be grouped in a single call)
>>  - be able to analyse a single make invocation or a call sub-tree
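
The first of those goals could be checked from a script roughly like this. The sketch assumes each profiling record has already been reduced to a (pid, level, target) tuple; that layout and the function name are assumptions for the example, not something the patch defines.

```python
from collections import defaultdict

def redundant_targets(records):
    """Map (level, target) to the sorted pids that evaluated it, keeping only
    targets seen in more than one make process at the same recursion level."""
    pids = defaultdict(set)
    for pid, level, target in records:
        pids[(level, target)].add(pid)
    return {key: sorted(p) for key, p in pids.items() if len(p) > 1}
```

Anything it returns is a candidate for moving into a parent make invocation.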
>>> Is it necessary to dump all the output times at the end?  Doing so
>>> requires that we increase the size of the file structure to hold the
>>> information, and this is already large AND the most common structure in
>>> memory; there's one for every single target which for non-recursive
>>> builds can get really big.  I'm trying to keep memory usage under
>>> control.
>>> If instead of that we print the information after each target is
>>> complete we can shift the storage of this information out of the file
>>> structure and into the commands structure or similar.  To me it seems
>>> more useful to keep the elapsed time info right next to the command
>>> output rather than dumping it all at the end.
>> I'm afraid I am missing some details of the implementation so I can't answer
>> that question in any meaningful way.
>> I will have to look into the code, but if a single target does NOT have
>> multiple commands structures, it should work.
>> Any pointers to the appropriate code area or  suggestions would be welcome.
>>> Some other comments:
>>>      1. In general remember that GNU make code must conform to ANSI
>>>         C89 / ISO C90.  We shouldn't be using newer features of the
>>>         language or runtime library unless we need to, and most of those
>>>         require some kind of autoconf test.
>> I'm sure you had some specific code in mind when you wrote this. I assumed
>> the build system would have the appropriate compiler options for the desired
>> compliance level. Should I compile with '-std=c99 -pedantic-errors' to
>> check, or do you have other options in mind?
>>>      2. Let's avoid float and double (and struct timeval).  There's no
>>>         reason why we can't fit enough precision in a uint32 to count
>>>         elapsed time in milliseconds for a build: that gives 50 days or
>>>         so.  GNU make still supports running on systems where there is
>>>         no floating point support (see the NO_FLOAT #define).  Although
>>>         I haven't tested it in a while.
>> I was aware of this since your first email. I wanted to know if the general
>> idea is OK.
>> I will change this, too.
>>>      3. The use of "$" tokens in printf() statements is likely
>>>         problematic from a portability standpoint.  It seems like this
>>>         should be relatively easy to avoid.
>> I'll change them into simple format specifiers, in spite of the repetitions.
>>>      4. If the printed string contains text then it needs to be marked
>>>         for translation (with the "_(...)" macro).
>> Since the profiling info should be machine parsable, I think the only
>> translatable string would be the long format. I will change this, although
>> the inconsistency rubs me the wrong way.
>>>      5. We don't want to be using fprintf() here.  All output needs to
>>>         go through the output.c module, so that it's properly managed
>>>         via output sync.
>> I was aware of the output sync issue, I will look into it.
>>>      6. gettimeofday is not portable.  Also, it's not really the best
>>>         option for timing things because (due to NTP etc.) it can change
>>>         (by that I mean that if it reports 10s elapsed it might not be
>> Hmm, didn't think of that.
>>>         that 10s really elapsed).  Using a monotonic clock is better,
>>>         although that's also not portable.  But if we have to be
>>>         non-portable maybe we should try to get an accurate accounting.
>>>         On the gripping hand maybe it's not that important to be
>>>         absolutely accurate.
>> Is there a portable timing API that has microsecond resolution? Not sure if ms
>> resolution is enough given the speed of current systems and Moore's Law
>> predicting even faster systems.
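
To make points 2 and 6 concrete, here is a Python sketch of integer-only timing: a monotonic clock read in whole milliseconds and kept in 32 bits, with the subtraction done modulo 2**32 so it survives one wrap. It illustrates the arithmetic only. make itself would need a C equivalent (on POSIX systems, clock_gettime with CLOCK_MONOTONIC), and the function names here are invented.

```python
import time

UINT32_MASK = 0xFFFFFFFF

def now_ms():
    """Monotonic clock reading in whole milliseconds, truncated to 32 bits."""
    return (time.monotonic_ns() // 1_000_000) & UINT32_MASK

def elapsed_ms(start, end):
    """Milliseconds between two 32-bit readings, correct across one wrap."""
    return (end - start) & UINT32_MASK

# Range of a 32-bit millisecond counter in days: about 49.7,
# which is the "50 days or so" mentioned in point 2.
RANGE_DAYS = 2**32 / (1000 * 60 * 60 * 24)
```

No floating point is needed for the timing itself; the division above is only there to show the range.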
>>> You mentioned something about trying to send the start time through the
>>> environment but I don't see any code to that effect here; how were you
>>> doing that?
>> As said in my previous mail, I wanted to avoid confusion regarding what code
>> to review.
>> ====
>> In reply to your newest mail:
>>> Sorry, I was not clear: I wasn't suggesting that it would be better to
>>> display the absolute time for the start time.  I was wondering why we
>>> display the start time at all.  Why not just show the elapsed time, and
>>> nothing else?  That would avoid all of these issues.
>>> However Tim makes a reasonable point in his response so if it can be
>>> done without too much difficulty it would be good to show a relative
>>> start time.
>> Also, the start time is necessary to be able to see which target was waiting
>> for which in a graph. This way one can see a target starting later than it
>> should logically start, making it a good avenue for performance improvement
>> investigations.
>> --
>> Eddy Petrișor
>> _______________________________________________
>> Bug-make mailing list
>> address@hidden
>> https://lists.gnu.org/mailman/listinfo/bug-make
> --
> You could help some brave and decent people to have access to
> uncensored news by making a donation at:
> http://www.thezimbabwean.co.uk/friends/
