bug-bash

Re: Severe memleak in sequence expressions?


From: Bob Proulx
Subject: Re: Severe memleak in sequence expressions?
Date: Wed, 30 Nov 2011 21:34:09 -0700
User-agent: Mutt/1.5.21 (2010-09-15)

Marc Schiffbauer wrote:
> Greg Wooledge schrieb:
> > Marc Schiffbauer wrote:
> > > echo {0..10000000}>/dev/null
> > > 
> > > This makes my system starting to swap as bash will use several GiB of
> > > memory.
> >
> > In my opinion, no.  You're asking bash to generate a list of words from 0
> > to 1000000000000000000 all at once.  It faithfully attempts to do so.
> 
> Yeah, ok but it will not free the mem it allocated later on (see
> other mail)

You are obviously not going to like the answers from most people about
what you are considering a bug.  But that is just the way that it is
going to be.  I think the shell is doing okay.  I think what you are
asking the shell to do is unreasonable.

Basically the feature starts out like this.  Once upon a time brace
expansion didn't exist at all.  Then csh comes along and thinks, it
would be really nice to have a way to produce a quick expansion of
strings in the order listed and the result is the "metanotation" of
"a{b,c,d}e" resulting in brace expansion.  More time goes by and bash
thinks, it would be really nice if that brace expansion feature also
worked for sequences and the result is the {n..m[..incr]} added to
brace expansion where it expands to be all terms between n and m
inclusively.  A lot of people use it routinely in scripts to generate
sequences.  It is a useful feature.  The shell is better to have it
than to not have it.
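
For anyone following along who has not used both forms, here is a
minimal sketch of the list form and the sequence form described above.
The variable names are only for illustration, and the increment form
assumes bash 4 or later.

```bash
#!/usr/bin/env bash
# List form: one word is produced per comma-separated item.
list_form=$(echo a{b,c,d}e)

# Sequence form with an increment: every third integer from 1 to 10.
seq_form=$(echo {1..10..3})

echo "$list_form"   # abe ace ade
echo "$seq_form"    # 1 4 7 10
```

In both cases the shell builds the complete word list before the
command runs, which is why the size of the range matters.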

Then, BAM, someone comes along and says, I tried putting a number so
large it might as well be infinity into the end condition.  Bash
consumed a lot of memory trying to generate the requested argument
list to pass to the program, and then afterward it didn't give the
memory it allocated back to the operating system.

To generate 0..9 uses 20 bytes.  10..99 uses 270 bytes.  Let's work
out a few:

  0..9 20
  10..99 270
  100..999 3,600
  1000..9999 45,000
  10000..99999 540,000
  100000..999999 6,300,000
  1000000..9999999 72,000,000

In total to generate all of the arguments for {0..10000000} consumes
at least 78,888,899 bytes or 75 megabytes of memory(!) if I did all of
the math right.  Each order of magnitude added grows the amount of
required memory by an *order of magnitude*.  This should not in any
way be surprising.  In order to generate 1000000000000000000 arguments
it would consume at least 7.8e7 * 1e11, roughly 7.8e18 bytes, ignoring
the second order effect of the longer numbers.  That is a lot of
petabytes of memory!  And it
is terribly inefficient.  You would never really want to do it this
way.  You wouldn't want to burn that much memory all at once.  Instead
you would want to make a for-loop to iterate over the sequence such as
the "for ((i=1; i<=1000000000000000000; i++)); do" construct that Greg
suggested.  That is a much more efficient way to do a loop over that
many items.  And it will execute much faster.  Although a loop that
large will still take a long time to complete.
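
The byte counts in the table can be reproduced without expanding
anything.  This is only a sketch of the same back-of-the-envelope
estimate: brace_seq_bytes is a hypothetical helper, and it assumes
each word costs its digits plus one byte for a separator.  It says
nothing about what bash really allocates internally, which is
certainly more.

```bash
#!/usr/bin/env bash
# Hypothetical helper: estimate the bytes needed to hold the words of
# {0..max}, counting each word's digits plus one separator byte.
# It walks the range one digit-length band at a time (0..9, 10..99, ...).
# Only meant for ranges whose totals fit in bash's 64-bit integers.
brace_seq_bytes() {
  local max=$1 total=0 digits=1 lo=0 hi=9 count
  while (( lo <= max )); do
    (( hi > max )) && hi=$max             # clip the last band at max
    count=$(( hi - lo + 1 ))              # how many words in this band
    total=$(( total + count * (digits + 1) ))
    lo=$(( hi + 1 ))
    hi=$(( lo * 10 - 1 ))
    digits=$(( digits + 1 ))
  done
  echo "$total"
}

brace_seq_bytes 9          # 20
brace_seq_bytes 10000000   # 78888899
```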

Put yourself in a shell author's position.  What would you think of
this situation?  Trying to generate an unreasonably large number of
program arguments is, well, unreasonable.  I think this is clearly an
abuse of the feature.  You can't expect any program to be able to
generate and use that much memory.

And as for whether a program should return unused memory back to the
operating system: for better or worse, very few programs actually do it.
It isn't simple.  It requires more accounting to keep track of memory
in order to know what can be returned.  It adds to the complexity of
the code and complexity tends to create bugs.  I would rather have a
simple and bug free program than one that is full of features but also
full of bugs.  Especially the shell where bugs are really bad.
Especially in a case like this where that large memory footprint was
only due to the unreasonably large argument list it was asked to
create.  Using a more efficient language construct avoids the memory
growth, which is undesirable no matter what, and once that memory
growth is avoided then there isn't a need to return the memory it
isn't using to the system either.
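
As a rough illustration of the more efficient construct, the
arithmetic for-loop below visits each value in turn and never builds
the full list.  The function name sum_to and the small limit are my
own choices for the example, not anything from the original report.

```bash
#!/usr/bin/env bash
# Hypothetical helper: add up the integers 1..limit one at a time.
# Only the counter and the running sum are held in memory, no matter
# how large the limit is.
sum_to() {
  local limit=$1 i sum=0
  for (( i = 1; i <= limit; i++ )); do
    (( sum += i ))
  done
  echo "$sum"
}

sum_to 100   # 5050
```

Compare that with "for i in {1..100}", which has to expand every
word before the first iteration can start.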

If you want a running bash to shrink back to a smaller size, try
having it exec itself.

  $ exec bash

That is my 2 cents worth plus a little more for free. :-)

Bob


