guile-user
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Limiting parallelism using futures, parallel forms and fibers


From: Zelphir Kaltstahl
Subject: Re: Limiting parallelism using futures, parallel forms and fibers
Date: Fri, 10 Jan 2020 01:43:18 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.2.2

Hi Chris!

On 1/8/20 12:44 PM, Chris Vine wrote:
> On Wed, 8 Jan 2020 08:56:11 +0100
> Zelphir Kaltstahl <address@hidden> wrote:
> [snip]
>> So my questions are:
>>
>> - Is there a default / recommended way to limit parallelism for
>> recursive calls to parallel forms?
>>
>> - Is there a better way than a global counter with locking, to limit the
>> number of futures created during recursive calls? I would dislike very
>> much to have to do something like global state + mutex.
>>
>> - What do you recommend in general to solve this?
> I think you have it wrong, and that futures use a global queue and a
> global set of worker threads.  I don't see how futures could work
> without at least a global set of worker threads.  Have a look at the
> futures source code.

So far I did not write any parallel code in my project. I am only
exploring possibilities, finding out what I should be using. I also
could not think of a good way to limit the number of threads to anything
less than the (current-processor-count) value, but thought maybe someone
knows something I did not see or read about Guile's futures yet : )

> If you want more control over the thread pool than guile's futures
> provide, you could consider something like this:
> https://github.com/ChrisVine/guile-a-sync2/blob/master/a-sync/thread-pool.scm
> But then you would have to make your own futures if you want a graph
> of futures rather than a graph of coroutines (which is what you would
> get if you use the associated event loop).

This seems like a lot of code for the purpose of being able to limit the
maximum number of threads running at the same time. (Perhaps it is
necessary?)

I would not like to create some kind of futures alternative and whatever
else I would have to do. I am not even sure what "making my own futures"
would entail.

Perhaps I can create something like a thread pool using a reused fibers
scheduler, which I give the chosen limit in the #:parallelism keyword
argument. It spawns only as many fibers at the same time, as are
allowed, if that is possible (some kind of loop inside (run-fibers …)?)
Whenever one spawned fiber finishes, it checks, if there is more work
available (on some channel?) and put that work in a new spawned fiber.
If I squint at the fibers library, it almost looks like a thread pool
should be not too difficult to implement using the library, but maybe I
am wrong about this and there is some fundamental difficulty.

> The main thing is to get the parallel algorithm right, which can be
> tricky.

In my decision tree the fitting of the tree is purely functional code so
far (as far as I know). If I can only find a way to limit the
parallelism as I want to, I should be able to simply spawn some kind of
unit of evaluation for each subtree recursively. Usually decision trees
should not be too deep, as they tend to easily overfit, so I think the
overhead would not be too much for spawning a unit of parallel evaluation.

In the end, if it turns out to be too difficult to limit the maximum
number of threads running in parallel, I might choose to discard the
idea and instead use all available cores, depending on the tree depth
and number of cores.


Initially I had 2 reasons for porting the whole program to GNU Guile:

1. I had difficulties using places in Racket, as I could not send normal
lambdas to other places. This means I would have to predefine the
procedures, instead of sending them on channels or I would have to use
serializable-lambdas, which someone told me they did not manage to use
for channels and then later someone on the Racket user list told me they
can be used for sending over channels. I had already started creating a
thread pool, but this whole serializable-lambda thing stopped me from
finishing that project and it is a half-finished thing, waiting to be
completed some day by anyone. Guile in comparison seems to have easier
to use ways of achieving multicore usage.

2. I am using Guile more than Racket now and wanted my own code to be
available there. Possibly develop more machine learning algorithms in
Guile, getting the basics available.

Then, during porting the code, I found more reasons for going on with
porting:

* I used many Racketisms, which are not available in Scheme dialects. My
code would only ever work on Racket.
* I separated out many modules, which in the Racket code were all in one
big file.
* I added tests for some new procedures and some separated modules.
* I refactored some parts, made some display procedures testable with
optional output port arguments.

Regards,
Zelphir




reply via email to

[Prev in Thread] Current Thread [Next in Thread]