guile-user

Re: out-of-control GC


From: Linas Vepstas
Subject: Re: out-of-control GC
Date: Sun, 10 Sep 2017 16:47:09 -0500

On Sun, Sep 10, 2017 at 3:36 PM, Marko Rauhamaa <address@hidden> wrote:

> Linas Vepstas <address@hidden>:
> > which suggests that the vast majority of time is spent in GC, and not
> > in either scheme/guile, or my wrapped c++ code.
>
> That may well be normal.
>

It might be "normal", but it's objectionable.  If 90% of the cpu time is
spent in GC, and almost nothing at all was actually collected, then it's too
much.   On the Amazon cloud, you pay by the minute. On laptops with
batteries, people complain. Especially when they are Apple users, not Linux
users.

If the compute job takes a week, and your user has discovered this GC
thing, then the user will conclude that it should take less than 24 hours,
and will tell you that you are stupid, and that guile sucks.  I have been
told too many times that I'm stupid and that guile sucks.

CPU cycles aren't free.


> > 4 gc's/minute means one every 15 seconds, which sounds reasonable,
> > except that RAM usage is so huge, that each GC takes many seconds to
> > complete.
>
> That may or may not be normal. However, it should not be a function of
> how much RAM has been allocated outside Guile's heap.
>

The bigger the RAM usage, the slower it seems to be.  It might be due to
fragmentation.  It might be due to memory-bandwidth effects: if GC touches
every page or every other page, out of 100GB, it takes time to move those
pages from RAM through 3rd, 2nd and 1st-level cache.   DDR3 has a bandwidth
of maybe 25GB/sec, if you are lucky; less, depending on the northbridge and
cpu design.
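
As a back-of-envelope check of the bandwidth argument, here is a minimal C sketch. The helper name sweep_seconds is made up for illustration and is not part of guile or the collector; the 100GB and 25GB/sec inputs are the ballpark figures above, not measurements.

```c
/* Rough estimate: how long does it take just to stream a given amount
 * of RAM through the caches once, at a given sustained memory
 * bandwidth?  Hypothetical helper, for illustration only. */
double sweep_seconds(double bytes_touched, double bandwidth_bytes_per_sec)
{
    if (bandwidth_bytes_per_sec <= 0.0)
        return -1.0;    /* no meaningful answer for a non-positive bandwidth */
    return bytes_touched / bandwidth_bytes_per_sec;
}
```

Calling sweep_seconds(100e9, 25e9) gives 4 seconds, so simply touching 100GB once would already cost a pause of several seconds, before any actual marking work is counted.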

It's possible/likely that mixed c++ and guile code results in heavily
fragmented memory, where every 4K page has 100 bytes of guile, and 3900
bytes of non-collectible memory.  The TLB will get severely thrashed in
this scenario.  TLBs are infamously under-sized, and have been for decades.
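
To put rough numbers on that fragmentation scenario, a small C sketch follows. Both helper names are hypothetical, and the 1536-entry TLB used in the note after the code is an assumed figure chosen only for illustration.

```c
#include <stdint.h>

/* Number of distinct pages that must be visited to reach `live_bytes`
 * of collectable data when each page holds only `live_per_page` bytes
 * of it.  Rounds up.  Hypothetical helper, for illustration only. */
uint64_t pages_touched(uint64_t live_bytes, uint64_t live_per_page)
{
    if (live_per_page == 0)
        return 0;
    return (live_bytes + live_per_page - 1) / live_per_page;
}

/* Address range that a TLB with `entries` entries can map at once,
 * given pages of `page_size` bytes. */
uint64_t tlb_reach_bytes(uint64_t entries, uint64_t page_size)
{
    return entries * page_size;
}
```

With 1GB of guile data at 100 bytes per 4K page, pages_touched(1000000000, 100) is ten million pages, or roughly 41GB of address space. A TLB with an assumed 1536 entries covers only 6MB of 4K pages, so nearly every page visit would miss.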


> > Anyway, this demo gives a scenario where gc seems to run too often.
>
> How often is too often? I would imagine the objective is to minimize the
> stop-the-world periods. IOW, you usually want the program to run very
> steadily.
>

It's too often if GC collected less than 10% of the current in-use heap.

In the current case I am looking at, stop-the-world lasts about 5 seconds,
and the guile in-use heap changes by less than 1% before and after.
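
The 10% rule of thumb can be written down as a small predicate. This is a minimal sketch with made-up function names. How the before and after in-use sizes are obtained is left open; reading them from guile's (gc-stats) is one possibility, but which fields to use is an assumption not checked here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Fraction of the in-use heap that one collection actually reclaimed.
 * Returns 0.0 if the heap did not shrink.  Hypothetical helper. */
double reclaimed_fraction(size_t in_use_before, size_t in_use_after)
{
    if (in_use_before == 0 || in_use_after >= in_use_before)
        return 0.0;
    return (double)(in_use_before - in_use_after) / (double)in_use_before;
}

/* Rule of thumb from the text: a collection that freed less than 10%
 * of the in-use heap ran "too often". */
bool gc_ran_too_often(size_t in_use_before, size_t in_use_after)
{
    return reclaimed_fraction(in_use_before, in_use_after) < 0.10;
}
```

A collection that changes the in-use heap by less than 1%, as in the case described here, is flagged by gc_ran_too_often; one that frees 20% is not.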


>
> > I have other scenarios where it does not run often enough -- I have
> > one where the working set is about 200MBytes, but guile happily grows
> > to 10GB or 20GB RAM over a few hours.
>
> You mean the size of Guile's heap?

Yes.


> That seems pathological.

Yes.


> Maybe there
> are lots of values in your data that accidentally look like heap
> addresses.
>

Maybe. But probably not. For this particular scenario, there are vast
quantities of 1MB to 10MB sized C strings that are sent to
scm_eval_string(), and are then freed.  These C strings are approx 10%
scheme s-exps, and about 90% utf8 strings in various natural languages
(French, Chinese, etc.).

However, guile uses GC_malloc_atomic for these strings, and so GC should
never-ever peer into the contents of them. So I don't think it's "accidental
heap addresses".  It's probably fragmentation, and the very heavy churn of
large malloc chunks.


> > My solution for that other case is to manually GC whenever gc-stats
> > gets above 1GB, because I know a priori my working set is smaller than
> > that.
> >
> > The overall message seems to be that gc either runs too often, or not
> > often enough. Some sort of better planning is needed.
> >
> > I'm motivated to explore this, as it actually impacts my work.
>
> Absolutely, although as an application programmer you should be able to
> trust that GC just works.
>

Well, yes, but I also have to handle user feedback about where I should
stick it.

The c++ code that I manage uses smart pointers for memory management. I
have no idea what fraction of cpu time is spent in that code, as there is
no simple, practical way of instrumenting that code and measuring it.  It
might be a sloppy disaster for all I know -- but I don't know.

By contrast, the guile gc can be directly observed, and some of my users
observe it. Myself, I'm willing to partly ignore it, but it keeps coming up
as a possible bottleneck.

So let me redirect: is there some central spot within guile, where the
decision to run GC is made? If so, what file would that be in? If not, is
there some string I should grep for?

--linas




-- 
*"The problem is not that artificial intelligence will get too smart and
take over the world," computer scientist Pedro Domingos writes, "the
problem is that it's too stupid and already has." *

