bug-gnubg

Re: current development


From: Nikos Papachristou
Subject: Re: current development
Date: Thu, 5 Dec 2019 20:09:17 +0200

This info is very interesting! Let me share some preliminary attempts at "modernizing" the training of Palamedes; they may be of use to GNUBG and others trying something similar.
Note that all these attempts used an RL self-play style of training similar to the one described in my research, not a supervised learning setting.

Things that gave (slightly) better results:
* AMSGrad optimization
* Swish activation function (= x * sigmoid(x)) in the hidden layer
* newer weight initialization algorithms (e.g. He init, LeCun init, etc.; a small sketch of all three follows below)
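Here is a minimal NumPy sketch of those three pieces, just for illustration. This is not Palamedes' actual code; the layer sizes and hyperparameters are placeholders:

import numpy as np

def swish(x):
    # Swish activation: x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

n_inputs, n_hidden = 250, 128   # placeholder sizes, not the real net

rng = np.random.default_rng(0)
# He-style initialization: zero-mean normal with std = sqrt(2 / fan_in)
W1 = rng.normal(0.0, np.sqrt(2.0 / n_inputs), (n_hidden, n_inputs))

def amsgrad_step(w, grad, m, v, vhat, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    # AMSGrad: an Adam-style update where the denominator uses the running
    # maximum of the second-moment estimate, so the effective step never grows.
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad * grad
    vhat = np.maximum(vhat, v)
    w = w - lr * m / (np.sqrt(vhat) + eps)
    return w, m, v, vhat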

Things that neither hurt nor improved:
* weight decay
* RMSProp, SGD with momentum, Nesterov momentum

Things that hurt performance:
* Adam optimizer
* ReLU and similar activation functions

Things that I haven't tried yet:
* deeper nets
* dropout

With this setup, so far I was able to slightly increase the performance of the Plakoto and Fevga nets but not the Portes/Backgammon net.
While this may not be encouraging for backgammon, I did reach roughly the same performance with *only* 1M self-play games compared to 20M using the previous approach.

Nikos

On Thu, Dec 5, 2019 at 2:28 PM Øystein Schønning-Johansen <address@hidden> wrote:
I have tried some experiments, and it looks like the training dataset (for contact positions) with the current input features does indeed benefit from some of the more modern methods. Briefly summarized:

Things that improve supervised learning on the dataset:
* Deeper nets, 5-6 hidden layers combined with ReLU activation functions.
* Adam (and AdamW) optimizer.
* A tiny bit of weight decay.
* Mini-batch training (a small Keras sketch of this combination follows below).
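For illustration only, a minimal Keras sketch of that combination (not my exact setup; the layer sizes and hyperparameters are placeholders, and AdamW assumes a reasonably recent TensorFlow/Keras release):

import tensorflow as tf

n_inputs, n_outputs = 250, 5   # placeholder sizes, not GNUBG's actual counts

model = tf.keras.Sequential(
    [tf.keras.Input(shape=(n_inputs,))]
    + [tf.keras.layers.Dense(128, activation="relu") for _ in range(5)]
    + [tf.keras.layers.Dense(n_outputs, activation="sigmoid")]
)

model.compile(
    optimizer=tf.keras.optimizers.AdamW(learning_rate=1e-3, weight_decay=1e-4),
    loss="mse",
)

# x_train / y_train would be the rolled-out positions and their targets:
# model.fit(x_train, y_train, batch_size=128, epochs=10)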

Things that do not work:
* Dropout.
* PCA of inputs.
* RMSProp optimizer (about the same performance as SGD).

I've tried training with Keras and on GPUs, and the training is really fast. However, a plain CPU implementation of modern neural network training algorithms is actually not much slower for me. Also, porting GPU code into the GNU Backgammon application might not be faster, as a lot of cycles would be spent shuffling data back and forth between main memory and GPU memory.

So the process I ended up using was:
1. Test out what works with Keras + GPU.
2. Implement the working method in C code for the CPU.
3. Train the NN with that code.

I've only worked with the contact neural network, as I see some strange issues with the race dataset, and I think it requires a re-rollout.

-Øystein

On Thu, Dec 5, 2019 at 12:38 PM Nikos Papachristou <address@hidden> wrote:
Hi everybody!

You can view my research publications on backgammon variants at my website: https://nikpapa.com , or alternatively you can download my PhD thesis from:

My personal view on improving GNUBG: why not try to "upgrade" your existing supervised learning approach? There have been lots of advances in optimization/regularization algorithms for neural networks in the past few years, and it might be less demanding than trying a new RL self-play approach from scratch.

Regarding expected results, I also believe that backgammon bots are very close to perfection, and any improvements (from any approach) will be marginal.



On Thu, Dec 5, 2019 at 12:14 AM Joseph Heled <address@hidden> wrote:
A link to something? An article? Software? Did they use Alpha-like strategies?

-Joseph

On Thu, 5 Dec 2019 at 11:04, Philippe Michel <address@hidden> wrote:
On Wed, Dec 04, 2019 at 02:07:18PM -0500, Timothy Y. Chow wrote:

> Also, it's my impression that many people *don't* think this is even a
> worthwhile idea to pursue.  Backgammon is already "solved," is what they
> will say.  It's true that "AlphaGammon" will surely not crush existing
> bots in a series of (say) 11-point matches.  At most I would expect a
> slight advantage.  But to me, that is the wrong way to look at the issue.
> I would like to understand superbackgames for their own sake, even though
> they arise rarely in practice.  Furthermore, if we know that bots don't
> understand superbackgames, then the closer a position gets to being a
> superbackgame, the less we can trust the bot verdict.

I'm not sure how related it may be, but there is a group of Greek
academics who have published some articles on their work on a bot,
Palamedes, that plays backgammon but also variants that have different
rules and starting positions and lead to positions that would be very
uncommon in backgammon.



