Re: [gnugo-devel] Parallelizing GNU Go

From: Evan Berggren Daniel
Subject: Re: [gnugo-devel] Parallelizing GNU Go
Date: Wed, 2 Apr 2003 22:59:33 -0500 (EST)

On Wed, 2 Apr 2003, John Paul Rodman wrote:

> On Tue, 1 Apr 2003, Dave Denholm wrote:
> > It has been mentioned in the past...
> >
> > First question is whether to address
> >
> > a) multi cpu machines
> I have access to some 64 CPU machines at the university where I would be
> running the software.
> I will probably be using MPI or OpenMP.
> Does anyone know if there is an open-source version of either of these,
> so that I could distribute the code?

I know such exist -- MPICH and LAM/MPI are free MPI implementations, for
instance -- but I can't vouch for any of them myself.

> > b) network of single-cpu machines
> I also have access, through the university, to the distributed (Seti at
> Home style) software developed by United Devices.  Unfortunately, there
> are some problems with using this with GNU Go, that make me lean towards
> the multi cpu machine idea.
> > c) Intel's new scheme (hyper threading..?) for scaling up single cpu.
> I don't know anything about this.

Basically, in an attempt to increase single-CPU performance, modern P4
chips look like two processors logically.  This is implemented by
duplicating things like the instruction pointer and register file, but not
caches or execution resources.  This lets the CPU run two programs that
make use of different execution units, or have different caching behavior,
or experience a lot of pipeline stalls, both at > 50% of the speed on a
non-hyperthreading system.  Intel claims benefits from 0% to 30%, with
typical programs in the 15% to 25% range.

To write for a single-CPU hyperthreading machine, you would just treat it
as a very tightly coupled dual-CPU machine.  For larger systems, you could
treat it as a system with double the CPU count, or you could exploit the
fact that each pair is more tightly coupled and code for what is
essentially a NUMA architecture, depending on OS support.

> > a) multi-cpu machines are sufficiently rare that the benefit
> >  is limited. A factor-of-2 speedup doesn't seem big enough
> >  to go after.
> There is a version of MPI that runs distributed over a set of Linux
> machines.
> >  The main process could, say, build a list of moves to consider, and
> >  spawn a child process to look at each one.
> This is just what I was thinking of doing.
> Here are a couple of methods I was considering for the master process to
> pick which moves to send to the child processes to evaluate.
> 1) Breadth search
> Evaluate *every* possible move from the current position.

You don't want to evaluate every move.  Evaluating a lot of moves is
certainly an option, but realistically you need to throw away a large
fraction of the candidates before searching deeply, or the branching
factor will swamp however many CPUs you have.

> 2) Depth search
> Have the master process evaluate the position, and pick the 10 best
> moves, then pick the 10 best responses to each of those.
> Unfortunately, I'm not sure how to evaluate positions that are several
> moves deep.  Can the valuations of moves in two different game variations
> be legitimately compared?  Could a score estimation be used to compare the
> moves?

I think you would want to use standard alpha-beta techniques, and compare
score estimates.  The only tricky part would be comparing at odd depths
with even depths; you would need a way to value a tenuki play to
compensate.  Move valuations cannot be compared.  For example, in endgame
play, in one branch black takes a large gote point; white now has a
handful of moderately small moves to choose among.  In another branch,
black makes a sente move; white now has a single very large move (to save
a group, for example).  This doesn't tell us much about the relative
merits of the two branches; only score estimates of the resulting
positions can be meaningfully compared.

> I doubt I'll be able to do some of the things suggested like having the
> Owl code run in parallel, since there is some analysis of the algorithm I
> have to do that could be very difficult in such a case. I may still look
> into it.
> > b) is an opportunity to scale up to a large extent, provided the
> >  comms overhead is not too great. ie when asking a remote machine to
> >  answer a question, the compute effort needs to justify the
> >  round-trip cost.
> Is there a way to significantly increase the amount of time GNU Go takes
> to decide on a move, such as increasing the level above 10?  Does doing so
> have much effect on how well GNU Go plays?

There were some experiments done running GNU Go 3.2 at level 15.  It takes
significantly longer per move, and I believe improved strength by about a
stone.  I assume the improvement would be comparable for current versions.
If you do this, you will definitely want to increase the hashtable size
beyond 8 MB (e.g., 100 MB or more, depending on memory availability), and
you might want to increase the persistent cache size and the depths at
which it is used.  Changes to the persistent cache would require changing
the code; the hashtable size can be varied with the -M option.
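For reference, those options would be combined on the command line roughly
like this (flag names as in GNU Go 3.x; check gnugo --help on your
version):

```shell
# Run at level 15 with a 100 MB hash table
# (instead of the default level 10 / 8 MB)
gnugo --mode gtp --level 15 -M 100
```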

Good luck

Evan Daniel
