Re: [Bug-gnubg] TRAINING

bug-gnubg


From:	Philippe Michel
Subject:	Re: [Bug-gnubg] TRAINING
Date:	Mon, 7 Aug 2017 23:39:15 +0200 (CEST)
User-agent:	Alpine 2.21 (BSF 202 2017-01-01)

On Sat, 29 Jul 2017, greg etem wrote:

I want to know how I can train gnubackgammon on my PC. Thanks in advance!

You cannot train GNU Backgammon from within gnubg itself. The software that was used to train the current networks is available by CVS at

cvs.savannah.gnu.org:/cvsroot/gnubg/gnubg-nn

(Instructions to obtain the gnubg source by CVS are at http://www.gnubg.org/index.php?itemid=26; for gnubg-nn the only difference is the last argument, since you check out gnubg-nn instead of gnubg.)

Since the above software performs supervised training, you need a training database. The one used to train the current networks is available at

http://files.gnubg.org/media/nn-training/pmichel/training_data/

There are 3 files since gnubg uses 3 different networks for various stages of the game.
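To make "supervised training" concrete: each database entry pairs a position (encoded as network inputs) with target values from a rollout, and training nudges the weights so the network output moves toward the target. The following is only a minimal sketch of that idea, not the gnubg-nn code. The class name `TinyNet`, the layer sizes, the learning rate, the absence of bias terms and the example data are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyNet:
    """Hypothetical one-hidden-layer network with sigmoid units (no biases)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # w1[j][i]: weight from input i to hidden unit j
        self.w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        # w2[k][j]: weight from hidden unit j to output k
        self.w2 = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
                   for _ in range(n_out)]

    def forward(self, x):
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in self.w1]
        y = [sigmoid(sum(w * hj for w, hj in zip(row, h))) for row in self.w2]
        return h, y

    def train_step(self, x, target, lr=0.1):
        """One gradient step on squared error; returns the error before the update."""
        h, y = self.forward(x)
        # Error signal at each output unit (derivative of sigmoid is y * (1 - y)).
        d_out = [(yk - tk) * yk * (1.0 - yk) for yk, tk in zip(y, target)]
        # Back-propagate to the hidden units, using w2 before it is modified.
        d_hid = [hj * (1.0 - hj) *
                 sum(d_out[k] * self.w2[k][j] for k in range(len(d_out)))
                 for j, hj in enumerate(h)]
        for k, row in enumerate(self.w2):
            for j in range(len(row)):
                row[j] -= lr * d_out[k] * h[j]
        for j, row in enumerate(self.w1):
            for i in range(len(row)):
                row[i] -= lr * d_hid[j] * x[i]
        return sum((yk - tk) ** 2 for yk, tk in zip(y, target))


# Made-up training pair: 4 inputs, 2 target values (not real gnubg data).
net = TinyNet(n_in=4, n_hidden=3, n_out=2)
x, target = [1.0, 0.0, 0.5, 0.0], [0.8, 0.1]
errors = [net.train_step(x, target) for _ in range(200)]
```

Repeating the step on the same pair drives the error down; real training would of course cycle over the whole database rather than a single position.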

In sibling directories you'll find the current nets (the format is slightly different from the one used in gnubg itself, but the conversion from one to the other is trivial) in .../nn-training/pmichel/nets and the benchmark databases used to evaluate the results of training in .../nn-training/pmichel/benchmarks/

You don't say what you aim to do exactly, but you must realize that if you just start with the above training database and networks, you are unlikely to obtain something meaningfully different from the current level of play.


The kinds of changes you could try are:

- start with a more accurate training database. The current one contains positions rolled out at 0-ply. Most of them are OK, but complex positions like containment play or backgames could have been badly misevaluated. Re-rolling out the databases should improve them (the current networks are better than what was used to create the current databases), but this is a lot of work. Re-rolling only the misevaluated ones would be faster, but you would have to identify them first.

- add positions for the classes of positions that are misevaluated. Starting from one of these, you would need some way to generate hundreds of similar positions (adding one or two or ten won't be enough).

- remove "toxic" positions from the training database. At some stage (10 or so years ago) misevaluated positions were added automatically to the training database, but since the program didn't play too well back then, some of them are bizarre positions that don't happen with sensible play and may in addition be misevaluated in the rollouts.

- use a different network structure. Current networks have one hidden layer of 128 neurons. Frank Berger, the creator of BgBlitz, reported that he tried different sizes (up to 160 if I remember correctly) and the larger ones didn't help. Of course this was for a different program and may work out differently for gnubg. Moreover, trying a smaller hidden layer may lead to a program that would be weaker, but in a more human-like way than the current randomly-weakened levels of gnubg. Trying something more complex, like a second hidden layer or some kind of not fully connected setup, could be interesting but would need some programming work.

- add new inputs or modify current ones. This is probably the most promising way to improve the level of play, but it seems to be pretty hard. The inputs currently used are mostly (or even exclusively) from old research papers by Hans Berliner 40 years ago. I don't think there is a lot of more recent literature on the subject. As far as I understand (I wasn't interested in gnubg at the time), Joseph Heled tried some things in this area with gnubg in the early 2000s but was frustrated by the difficulty of getting meaningful improvements.

Of course this one involves some coding in both gnubg and gnubg-nn.
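
On the first item in the list, one conceivable way to shortlist positions for re-rolling is to compare the value stored in the training database with what the current network says, and re-roll only those that disagree strongly. This is just a sketch of that heuristic, not something gnubg-nn is known to provide: the function name, the record layout (position id, stored equity, current net equity), the threshold and the sample values are all hypothetical. A large disagreement does not say which side is wrong, only that the position deserves a second look.

```python
def rollout_candidates(records, threshold=0.1):
    """Return position ids whose stored and current evaluations differ
    by at least `threshold`, largest disagreement first.

    `records` is an iterable of (position_id, stored_equity, net_equity)
    tuples; this layout is an assumption made for the example.
    """
    flagged = []
    for position_id, stored_equity, net_equity in records:
        gap = abs(stored_equity - net_equity)
        if gap >= threshold:
            flagged.append((gap, position_id))
    flagged.sort(reverse=True)
    return [position_id for _, position_id in flagged]


# Made-up records, for illustration only.
records = [
    ("pos-a", 0.25, 0.27),   # close agreement, left alone
    ("pos-b", -0.40, 0.05),  # strong disagreement
    ("pos-c", 0.60, 0.45),   # moderate disagreement
]
to_reroll = rollout_candidates(records)
```

The threshold trades rollout time against coverage; a lower value flags more positions at the cost of more work.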
