guile-user

Re: How to make GNU Guile more successful


From: Freja Nordsiek
Subject: Re: How to make GNU Guile more successful
Date: Sun, 16 Jul 2017 10:30:25 +0200

If I were to hazard a guess as to why Guile gets very slow when loading
20 GB or more (which may or may not be related to it being buggy and
crashy), it would be that a lot of the data, when loaded into Guile,
was allocated such that the GC scans it for pointers (using
scm_gc_malloc instead of scm_gc_malloc_pointerless), which would vastly
increase the amount of memory the GC needs to scan every time it runs.

Depending on the data types and what is in them, it may be needless
for the GC to run through the bulk of the data looking for pointers
and this might be a fixable problem. For example, it generally isn't
necessary to scan inside strings for pointers so if that is being
done, there is something in Guile to fix.

If the data really does contain pointers (say it consists of many
and/or large lists, vectors, hash tables, etc.) then the GC really does
need to scan it, which suggests that a different kind of data structure
would work around the problem. This is not always doable, and even if doable
could take a lot of programmer time. It seems that Go programmers have
run into this with very large maps already (see
https://github.com/golang/go/issues/9477 and
https://groups.google.com/forum/#!topic/golang-nuts/pHYverdFcLc ).
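
To make the "different kind of data structure" idea concrete, here is a
minimal sketch in Go of the workaround discussed in those threads, as I
understand it: keep the bulk data in pointer-free containers so the
collector has little to trace. Record, PointerIndex and FlatIndex are
hypothetical names made up for illustration; nothing here comes from
Guile or from the linked issue, and I have not measured it.

```go
package main

import "fmt"

// Record is a hypothetical fixed-size payload with no pointer fields.
type Record struct {
	ID    uint64
	Score float64
}

// PointerIndex is the GC-heavy layout: every value is a pointer that
// the collector has to follow on each cycle.
type PointerIndex map[uint64]*Record

// FlatIndex keeps records by value in one slice and maps each key to
// an integer offset, so neither container holds pointers to records.
type FlatIndex struct {
	offsets map[uint64]int
	records []Record
}

// NewFlatIndex preallocates room for roughly capacity records.
func NewFlatIndex(capacity int) *FlatIndex {
	return &FlatIndex{
		offsets: make(map[uint64]int, capacity),
		records: make([]Record, 0, capacity),
	}
}

// Put inserts r, or overwrites the stored record with the same ID.
func (f *FlatIndex) Put(r Record) {
	if i, ok := f.offsets[r.ID]; ok {
		f.records[i] = r
		return
	}
	f.offsets[r.ID] = len(f.records)
	f.records = append(f.records, r)
}

// Get returns a copy of the record stored under id, if any.
func (f *FlatIndex) Get(id uint64) (Record, bool) {
	i, ok := f.offsets[id]
	if !ok {
		return Record{}, false
	}
	return f.records[i], true
}

func main() {
	idx := NewFlatIndex(4)
	idx.Put(Record{ID: 7, Score: 0.5})
	idx.Put(Record{ID: 9, Score: 1.5})
	idx.Put(Record{ID: 7, Score: 2.5}) // overwrites the first entry

	r, ok := idx.Get(7)
	fmt.Println(r.Score, ok, len(idx.records))
}
```

The cost is the one mentioned above: lookups return copies, deletion
needs extra bookkeeping, and anything variable-sized (strings, nested
lists) has to be flattened by hand, which is where the programmer time
goes.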

No idea how this relates to being buggy or crashy.


Freja Nordsiek

On Fri, Jul 14, 2017 at 11:54 PM, Linas Vepstas <address@hidden> wrote:
> On Mon, Feb 13, 2017 at 2:28 PM, Panicz Maciej Godek <address@hidden> wrote:
>
>>
>> someone responded critically: "are there out of the box libraries to
>> estimate a zero inflated negative binomial regression model in guile".
>> Of course, if I knew what a zero-inflated negative binomial regression
>> model was, I could deliver an implementation by just explaining the
>> notions used in that phrase.
>
>
> Caution: the message below sounds negative.  Sorry, I use guile daily and
> almost exclusively now. So there ...
>
> Lack of decent science libraries for scheme is a major stumbling block, for
> me. Simply having sine and cosine is not enough.   I got excited (a decade
> ago) when I realized that guile supported GnuMP, and then rapidly deflated
> when I realized it only supported integers and rationals in GnuMP .. I work
> with arbitrary-precision floats.  Or, I did back then.
>
> Maybe more important is making guile work well with large-RAM setups.
> Currently, I do data analysis, every day, in guile, on datasets that take
> 20GB or 40GB -- my current one is 110GB when loaded in RAM, and guile
> starts getting buggy, crashy and slow when working at that size.
> Sometimes, it starts calling GC half-a-dozen times per second, for no
> apparent reason, eating up 6 cores (or more!) doing nothing but GC. Why?
> Who knows? Who can tell?
>
> Yes, I have a machine with 256 GB RAM and a few dozen cores, and SSDs that
> hold the data, but every time guile crashes, I have to wait an hour for the
> data to reload.  I can live with it, but it's a dirty secret I would not
> share with guile wannabe users.
>
> String handling in guile is a disaster area: If I give it a
> 10-megabyte-long string in utf8, it promptly tries to convert all of that
> string to utf32, for utterly pointless reasons. This just makes it slow.
>
> There are still bugs between GC and the compiler: if I call (eval "(some
> stuff) (other stuff)")  the compiler will try to compile that string (after
> it was converted to utf32!) and if GC happens to run at just that moment,
> guile crashes or hangs.  These bugs need to be fixed.
>
> So although it's a good start, there's a lot of work left until it can get
> to "the next level". And that work can't happen until guile is more
> popular. So it's very much a chicken-and-egg scenario.
>
> --linas


