Re: [Bug-gnubg] TRAINING
From: Philippe Michel
Subject: Re: [Bug-gnubg] TRAINING
Date: Mon, 7 Aug 2017 23:39:15 +0200 (CEST)
User-agent: Alpine 2.21 (BSF 202 2017-01-01)
On Sat, 29 Jul 2017, greg etem wrote:
> I want to know how I can train gnubackgammon in my PC. Thanks in advance!
You cannot train GNU Backgammon from within gnubg itself. What was used to train the
current networks is available by CVS at
cvs.savannah.gnu.org:/cvsroot/gnubg/gnubg-nn
(instructions to obtain the gnubg source by CVS are at
http://www.gnubg.org/index.php?itemid=26 ; for gnubg-nn the only difference
is the last argument, since you check out gnubg-nn instead of gnubg).
Since the above software performs supervised training you need a training
database. The one used to train the current networks is available at
http://files.gnubg.org/media/nn-training/pmichel/training_data/
There are 3 files since gnubg uses 3 different networks for various stages
of the game.
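
To make "supervised training" concrete: the trainer repeatedly adjusts the network weights so that its outputs for each stored position move toward the values recorded in the database. The Python sketch below shows that idea with a generic one-hidden-layer sigmoid network and plain stochastic gradient descent. It is not gnubg-nn's code; the function names, the tiny network size, the learning rate and the toy data are all assumptions made for illustration.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_net(n_in, n_hidden, n_out, rng):
    # Small random weights; the last weight of each row is the bias.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]
    return w1, w2

def forward(net, x):
    # x is a list of floats; returns (hidden activations, outputs).
    w1, w2 = net
    h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w1]
    y = [sigmoid(sum(w * v for w, v in zip(row, h + [1.0]))) for row in w2]
    return h, y

def train_step(net, x, target, lr):
    # One gradient-descent step on the squared error for a single example.
    w1, w2 = net
    h, y = forward(net, x)
    d_out = [(yi - ti) * yi * (1.0 - yi) for yi, ti in zip(y, target)]
    # Hidden deltas are computed before the output weights are changed.
    d_hid = [hj * (1.0 - hj) * sum(d_out[k] * w2[k][j] for k in range(len(w2)))
             for j, hj in enumerate(h)]
    for k, row in enumerate(w2):
        for j, v in enumerate(h + [1.0]):
            row[j] -= lr * d_out[k] * v
    for j, row in enumerate(w1):
        for i, v in enumerate(x + [1.0]):
            row[i] -= lr * d_hid[j] * v

def mean_squared_error(net, data):
    total = 0.0
    for x, target in data:
        _, y = forward(net, x)
        total += sum((yi - ti) ** 2 for yi, ti in zip(y, target))
    return total / len(data)

# Toy "training database": (inputs, target outputs) pairs, purely made up.
toy_data = [([0.0, 0.0], [0.1]), ([0.0, 1.0], [0.1]),
            ([1.0, 0.0], [0.9]), ([1.0, 1.0], [0.9])]
toy_net = make_net(2, 4, 1, random.Random(0))
error_before = mean_squared_error(toy_net, toy_data)
for _ in range(2000):
    for x, target in toy_data:
        train_step(toy_net, x, target, 0.5)
error_after = mean_squared_error(toy_net, toy_data)
print(error_before, error_after)
```

The real trainer works on far larger inputs and databases, but the loop has the same shape: evaluate, compare with the stored target, nudge the weights.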
In sibling directories you'll find the current nets (format is slightly
different from the one used in gnubg itself but the conversion from one to
the other is trivial) in .../nn-training/pmichel/nets and benchmark
databases used to evaluate the results of training in
.../nn-training/pmichel/benchmarks/
You don't say what you aim to do exactly, but you must realize that if you
just start with the above training database and networks you are unlikely
to obtain something meaningfully different from the current level of play.
The kind of changes you may try could be:
- start with a more accurate training database. The current one contains
positions rolled out at 0-ply. Most of them are ok but complex positions
like containment play or backgames could have been badly misevaluated.
Re-rolling out the databases should improve them (the current networks are
better than what was used to create the current databases) but this is a
lot of work. Re-rolling the misevaluated ones only would be faster but you
would have to identify them first.
- add positions for the classes of positions that are misevaluated.
Starting from one of these you would need some way to generate hundreds of
similar positions (adding one or two or ten won't be enough).
- remove "toxic" positions from the training database. At some stage (10 or
so years ago) misevaluated positions were added automatically to the
training database but since the program didn't play too well back then
some of them are bizarre positions that don't happen with sensible play and
may in addition be misevaluated in the rollouts.
- use a different network structure. Current networks have one hidden
layer of 128 neurons. Frank Berger, the creator of BgBlitz, reported that
he tried different sizes (up to 160 if I remember correctly) and the
larger ones didn't help. Of course this was for a different program and
may work out differently for gnubg. Moreover, trying a smaller hidden
layer may lead to a program that would be weaker but in a more human-like
way than the current randomly-weakened levels of gnubg.
Trying something more complex like a second hidden layer or some kind of
not fully connected setup could be interesting but would need some
programming work.
- add new inputs or modify current ones. This is probably the most
promising way to improve the level of play, but this seems to be
pretty hard. The inputs currently used are mostly (or even exclusively)
from old research papers by Hans Berliner 40 years ago. I don't think there is
a lot more recent literature on the subject.
As far as I understand (I wasn't interested in gnubg at the time), Joseph
Heled tried some things in this area with gnubg in the early 2000s but was
frustrated by the difficulty of getting meaningful improvements.
Of course this one involves some coding in both gnubg and gnubg-nn.
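
One possible way to approach the "identify them first" step from the first item in the list above: compare the values stored in the training database with what a stronger evaluation says today, and re-roll only the positions where the two disagree most. The Python sketch below assumes a hypothetical record layout of (position id, list of stored outputs) and a hypothetical `evaluate` callable standing in for the current networks; neither reflects the actual file format or any gnubg API. A large gap only marks a candidate; it does not say which of the two values is wrong.

```python
def disagreement(stored, current):
    # Largest absolute difference over the outputs of one position.
    return max(abs(s - c) for s, c in zip(stored, current))

def positions_to_reroll(database, evaluate, threshold):
    """Return position ids whose stored outputs differ from evaluate()
    by at least `threshold`, largest disagreement first.

    database: iterable of (position_id, stored_outputs)  [hypothetical layout]
    evaluate: callable position_id -> outputs            [stand-in for the nets]
    """
    flagged = []
    for position_id, stored in database:
        gap = disagreement(stored, evaluate(position_id))
        if gap >= threshold:
            flagged.append((gap, position_id))
    flagged.sort(reverse=True)
    return [position_id for _, position_id in flagged]

# Made-up example data, only to show the call.
toy_database = [("a", [0.50, 0.10]), ("b", [0.90, 0.30]), ("c", [0.20, 0.00])]
toy_current = {"a": [0.52, 0.10], "b": [0.60, 0.25], "c": [0.35, 0.05]}
print(positions_to_reroll(toy_database, toy_current.get, 0.1))
```

The same ranking could also serve as a first pass for spotting "toxic" positions, though deciding whether a position is bizarre rather than merely hard still needs a closer look.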