
[gnugo-devel] Improvement of the hash table in GNU go


From: Inge Wallin
Subject: [gnugo-devel] Improvement of the hash table in GNU go
Date: Wed, 5 Feb 2003 03:29:43 +0100

I have come up with an improvement of the hash table in GNU go.

In the current implementation, the transposition table is split into
two parts:
 1. the positions part, where we store positions.
 2. the part with read results where we store results from 
    reading of various kinds.

We don't actually store the whole position (we used to do that, but it
was removed), but instead let the position be represented by a hash
value.  This value should have a low enough probability of colliding
with the hash value of another position that we can safely say that a
position with the same hash value as one in the table really is the
same position.  The reason that we don't
store the whole position in the hash table is of course that it takes
a lot of memory to do so.  That memory could be better used for
storing more positions instead.
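
To get a feeling for the numbers: with a 64-bit hash value (just an
illustration here; the actual width is a compile-time choice), the
birthday bound says that after storing n positions the probability of
any two colliding is roughly n^2 / 2^65.  Even for n = 10 million
positions that is about 10^14 / 3.7*10^19, i.e. around 3 in a million,
which is why we can treat equal hash values as equal positions.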

For every position in the table, there can be one or more so-called
Read Results.  Such a read result stores the result of one particular
reading.  It can be tactical reading (attack(), find_defense()), owl
reading or semeai reading.  The thing is that often we need to have
more than one read result for the same position in the table.  There
are many groups on the board, and we might want to store the result of
running attack() on each of them.  In practice we have found that
there need to be, on average, 1.4 read results for every position in
the table.

The read results for one position are chained together in a linked
list. 
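
To make the discussion concrete, this is roughly what the current
layout amounts to.  The struct and field names below are made up for
illustration and do not match the actual source:

    struct read_result {
        int routine;                   /* attack, find_defense, owl, ... */
        int target;                    /* the string or dragon read on */
        int result;                    /* outcome of the reading */
        struct read_result *next;      /* next result, same position */
    };

    struct position_entry {
        unsigned long hash;            /* Zobrist hash of the position */
        struct read_result *results;   /* linked list of read results */
    };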

This implementation has served us well, but there are several problems
with it:

1. We have to split the memory available to us into an area with
   positions and another one with read results.  One of these will run
   out before the other one does, which will waste memory.  We can't
   know in advance which will actually run out first.

2. The pointers that we use to link the read results together waste
   memory.

3. Because of the split between the two areas of the table, we can't
   use a standard replacement scheme for the transposition table.  By
   this I mean that we can't easily decide if we want to replace an
   old entry in the table with a new one, e.g. if the newer one
   represents more work and would therefore be more expensive to
   recalculate.  (A sketch of such a scheme follows after this list.)

4. To remedy the lack of a replacement scheme we currently have to
   call hashtable_partially_clear() when the table gets full.  This
   function clears the hash table of "inexpensive" entries (tactical
   read results) and keeps the expensive ones - owl reading and (I
   think) semeai reading.  However, in the process we throw away a lot
   of valuable information.  There is also a problem with "open
   nodes", i.e. nodes that represent reading that is in progress at
   the time the cleaning is run.  These open nodes complicate things.
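
To illustrate the replacement scheme mentioned in problem 3: with a
single uniform table it can be as simple as the following sketch.  The
names and the cost measure are made up; this is not code from GNU Go:

    #define TABLE_SIZE 65537    /* illustrative size */

    struct tt_entry {
        unsigned long key;      /* hash of the cached reading */
        int result;             /* cached result */
        int cost;               /* e.g. nodes spent computing it */
    } tt[TABLE_SIZE];

    /* Replace-if-more-expensive: overwrite an occupied slot only when
       the new result cost at least as much work to compute. */
    void
    tt_store(unsigned long key, int result, int cost)
    {
        struct tt_entry *e = &tt[key % TABLE_SIZE];

        if (e->key == 0 || cost >= e->cost) {
            e->key = key;
            e->result = result;
            e->cost = cost;
        }
    }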

So instead I propose the following new scheme that I think is better.

Currently we use Zobrist hashing to hash positions into a binary
number.  What we do is that we precompute 3 tables of random numbers,
where every number in a table corresponds to a certain intersection on
the board.   To calculate the hash value of a position, we start out
with zero, and then we go through every intersection on the board.  

If the intersection has a black stone on it, we xor the hash value
with the random number for that intersection from the first table.  If
the intersection has a white stone on it, we xor the hash value with
the random number for that intersection from the second table.  If
there is a ko at an intersection, we also xor the hash value with the
random number for that intersection from the third table.

This hash function is very fast to calculate, and it can also be
performed incrementally when we place stones on the board.
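
For concreteness, here is a minimal sketch of the scheme in C.  The
names, the 19x19 board size, and the color encoding are assumptions
for illustration, not the actual GNU Go definitions (the real tables
also have to be filled with random numbers at startup):

    #define BOARDMAX (19 * 19)
    enum { EMPTY, WHITE, BLACK };          /* assumed encoding */

    unsigned long black_table[BOARDMAX];   /* one random number per point */
    unsigned long white_table[BOARDMAX];
    unsigned long ko_table[BOARDMAX];

    /* Full recomputation: xor together the numbers for every occupied
       intersection, plus the ko point if there is one. */
    unsigned long
    compute_hash(const int board[BOARDMAX], int ko_pos)
    {
        unsigned long hash = 0;
        int pos;

        for (pos = 0; pos < BOARDMAX; pos++) {
            if (board[pos] == BLACK)
                hash ^= black_table[pos];
            else if (board[pos] == WHITE)
                hash ^= white_table[pos];
        }
        if (ko_pos >= 0)
            hash ^= ko_table[ko_pos];
        return hash;
    }

    /* Incremental update: xor is its own inverse, so placing and later
       removing a black stone at pos use the same single operation. */
    unsigned long
    toggle_black_stone(unsigned long hash, int pos)
    {
        return hash ^ black_table[pos];
    }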

What I propose is a very simple but powerful change: We introduce a
new table of random numbers for each reading function that we want to
cache the result of.  Then we use these tables to create a unique hash
value for a combination of position and read function on a certain
target.  That way we get a single type of entry and can get rid of a
lot of the current complications.

Take for example attack().  If we call attack(A1), and want to find
out if the result of this reading is in the table we do the following:

Take the current hash value for the position and xor it with
attack_table[A1] (pardon the pseudocode here).  The value that we get
can be immediately used to look up the entry in the hash table and see
if it is there.  We no longer have to split things into positions and
read results.
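
In C this lookup might look roughly as follows.  Again, all names are
hypothetical, and a real entry would also need to store things like
the result code and the data used by the replacement scheme:

    #define BOARDMAX (19 * 19)
    #define TABLE_SIZE 65537               /* illustrative size */

    unsigned long attack_table[BOARDMAX];  /* one random number per target */

    struct cache_entry {
        unsigned long key;                 /* position + reading hash */
        int result;                        /* cached result of the reading */
    } cache[TABLE_SIZE];

    /* Returns 1 and fills in *result on a cache hit, 0 on a miss. */
    int
    lookup_attack(unsigned long position_hash, int target, int *result)
    {
        unsigned long key = position_hash ^ attack_table[target];
        struct cache_entry *e = &cache[key % TABLE_SIZE];

        if (e->key == key) {
            *result = e->result;
            return 1;
        }
        return 0;
    }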

The good thing about this is that we eliminate all of the weaknesses
above.  We get better memory usage, we can use some nice replacement
scheme, and we don't have to split the table into two parts.

I haven't implemented this yet, but I described it to Dan, and he suggested
that I write a description here and start a discussion.

     -Inge
