bug-gnulib

Re: [PATCH 0/4] faster gnulib-tool


From: Paolo Bonzini
Subject: Re: [PATCH 0/4] faster gnulib-tool
Date: Fri, 02 Jan 2009 10:17:00 +0100
User-agent: Thunderbird 2.0.0.18 (Macintosh/20081105)

Bruno Haible wrote:
> Hello Ralf,
> 
> Thank you for your speedups to gnulib-tool. At first I was, of course,
> excited about the 2x speedup. But when looking at the maintainability
> of the code that you propose, I'm not fine with all of it any more.
> 
> My four objections are:
> 
> 1) You observe that forking programs in a shell script is slow, and
>    therefore propose to use more shell built-ins. The problem with it
>    is that I chose to implement gnulib-tool in shell (for the control
>    structure) and sed (for the text processing).
> 
>    If you want to achieve good speedups for scripts that use 'sed':
>    can you work towards making 'sed' a bash built-in? This is challenging,
>    but if you are after performance, that would be promising.

The only micro-optimization worth doing to speed up shell scripts is to
avoid forking.  This is *no exaggeration*.  (I am not talking about
algorithmic improvements, though in some cases, as Ralf showed, the cost
of forking can hide the benefits of algorithmic improvements.)

We (Ralf, Eric and I) saved over 30% of execution time of Autoconf
scripts on Cygwin, and a few percent on Linux too, just by removing one
or two forks here and there.  It's not about making sed a bash builtin,
it's about not forking for things such as

   (...)
   echo abc | ...

A while ago I timed a number of shell constructs; you can find the full
results on the Autoconf list.  Here are the relevant ones:

   $ time sh -c 'for i in `seq 1 1000`; do :; done'
   user    0m0.034s
   sys     0m0.024s

   $ time sh -c 'for i in `seq 1 1000`; do (:); done'
   user    0m0.486s
   sys     0m2.377s

   $ time sh -c 'for i in `seq 1 1000`; do echo abc | :; done'
   user    0m0.958s
   sys     0m4.657s

echo and : are shell builtins, but the subshell and the pipeline still
make the shell fork, so they're slow.  s/:/sed/ and you see my point.
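
As a rough sketch of what removing such forks can look like, the
functions below contrast a forking and a fork-free way of doing the same
job.  The helper names (strip_c_fork, strip_c_nofork, count_fork,
count_nofork) and the variables they set are hypothetical and are not
taken from gnulib-tool or Autoconf; exactly which constructs fork can
also vary between shells.

```shell
# Hypothetical helpers contrasting forking and fork-free shell idioms.

# Forking: the pipeline typically forks for each side, and sed is an
# external program that must be exec'd as well.
strip_c_fork() {
  echo "$1" | sed 's/\.c$//'
}

# Fork-free: POSIX parameter expansion.  The result is left in $stripped
# so the caller does not need a command substitution, which would fork.
strip_c_nofork() {
  stripped=${1%.c}
}

# Forking: ( ... ) runs the group in a subshell.
count_fork() {
  (n=0; for f in "$@"; do n=$((n + 1)); done; echo "$n")
}

# Fork-free: { ...; } groups the commands in the current shell and
# leaves the result in $count.
count_nofork() {
  { count=0; for f in "$@"; do count=$((count + 1)); done; }
}

strip_c_nofork foo.c
count_nofork a b c
echo "$stripped $count"
```

Passing results back in variables rather than on stdout is the design
choice that matters here: capturing output with $(...) would bring back
the fork that the rewrite was meant to avoid.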

If this 10x-30x improvement affected 20% of the shell execution time,
one could expect a decent speedup.
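
To put a number on that, here is a small sketch of the usual
back-of-the-envelope calculation (Amdahl's law): if a fraction p of the
run time becomes s times faster, the overall speedup is
1 / ((1 - p) + p / s).  The function name overall_speedup is made up for
this illustration, and the 20% figure is the hypothetical one from the
paragraph above, not a measurement.

```shell
# overall_speedup FRACTION FACTOR
# Prints the overall speedup when FRACTION of the run time becomes
# FACTOR times faster (Amdahl's law).  Illustrative helper only.
overall_speedup() {
  LC_ALL=C awk -v p="$1" -v s="$2" \
    'BEGIN { printf "%.2f\n", 1 / ((1 - p) + p / s) }'
}

overall_speedup 0.20 10
overall_speedup 0.20 30
```

With these inputs the two calls print 1.22 and 1.24, i.e. the script as a
whole would take a little under 20% less time, whichever end of the
10x-30x range applies.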

Getting rid of those forks would probably amount to a rewrite of bash,
dash, or whatever else.  You would have to center the main shell loop
around event-driven processing of file descriptors, so that a single
process could serve all the pipes.

A fun project, but probably not one that I or anyone else will attempt
without funding. :-)

Paolo



