autoconf-patches
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Fix m4_join


From: Eric Blake
Subject: Re: Fix m4_join
Date: Mon, 15 Oct 2007 19:45:10 +0000 (UTC)
User-agent: Loom/3.14 (http://gmane.org/)

Ralf Wildenhues <Ralf.Wildenhues <at> gmx.de> writes:

> Now it would be even better if m4_join were
> efficient in that it would not scale quadratically:
> 
> m4_join([:
> ], m4_for([i], 1, 1000, [], [i,]))

Yes, some improvements from cubic to quadratic were due to bug fixes in 
m4sugar - the rule of thumb here for optimal behavior is that when writing a 
recursive algorithm, you should limit the parsing of $@ per macro invocation, 
and you should completely output all text related to the first element of the 
list before resuming recursion on the rest of the list.  Although I don't see 
evidence of m4_join being cubic, I do know for certain that m4_foreach was 
cubic in autoconf 2.52, fixed 2002-03-04 for 2.53.  There, prior to the fix, 
each recursion of _m4_foreach resulted in more text around the previous 
invocation, saving the final expansion until the recursion completed, and with 
each iteration requiring one more m4_shift than the previous round.  After the 
fix, the first list element is completely processed before moving on to the 
recursion for the rest of the list, with a constant number of m4_shift's per 
round.

But to go from quadratic to linear, that's a different story.  Here's where 
m4sugar can no longer fix things, but where a fundamental improvement needs to 
be made in M4 itself.  It may be rather invasive, but the payoffs would be 
tremendous for any use of address@hidden

The problem lies in the current m4 implementation of $@ - namely, it spends 
time constructing a string consisting of every quoted argument, and pushes it 
on the obstack, only to then reparse the obstack and turn it back into a list 
of arguments for the next macro in the recursion, even though only the first 2 
or 3 elements of the list even factor into the current iteration.  In other 
words, since iterating through n items results in n obstack growth/parse 
cycles, you have quadratic behavior.  Because of the rules of m4, there are 
some cases where this MUST be done (for example, if the user does a changequote 
in between, or if some of the arguments contain unbalanced quotes).  But in the 
common case, it would be much nicer if m4 would recognize $@ in a macro 
expansion, and simply store a placeholder on the obstack pointing to the 
previous list of parsed arguments.  Most macros simply use accessor methods 
that would expand the placeholder as it is encountered.  But the ifelse, ifdef, 
and shift builtins should be taught how to recognize the placeholder directly, 
where the conditionals avoid the overhead of expanding the placeholder if it 
occurred in an untaken branch, and where the shift merely adjusts where the 
placeholder points to.

Unrelated, but also inherent in m4, is the fact that m4_append is quadratic - 
every invocation concatenates the old definition with the new text, and mallocs 
a new definition for the macro.  That would be another nice idiom to optimize, 
by making the defn builtin expand to a special placeholder.  Most macros simply 
use accessor methods that would expand the placeholder as it is encountered.  
But the define and popdef builtins should recognize a placeholder followed by 
new text as an append operation, then grow the malloc area in a geometric 
fashion (if it isn't large enough already) without wasting time revisiting all 
of the former definition (similar to the speedups possible when bash added the 
+= variable assignment operator).

-- 
Eric Blake







reply via email to

[Prev in Thread] Current Thread [Next in Thread]