Re: current development

On Thu, Dec 5, 2019 at 6:34 PM Timothy Y. Chow <address@hidden> wrote:

> > Regarding expected results, I also believe that backgammon bots are very
> > close to perfection and whatever improvements (from any approach) will
> > be marginal.
>
> In order to determine whether a new network is doing better than the old
> network, it helps to have examples of positions where the old network is
> clearly playing poorly. Here's one example of a game that I played
> against eXtreme Gammon where the bot made a lot of obvious blunders:
>
> http://timothychow.net/cg/Games/7pt2015-05-24e%20Game%202.htm
>
> For example, search for "10/8 6/4(3)". The bot's ridiculous play here
> would not be among the top 50 plays of any halfway decent human player.
> Admittedly this was XG but I would expect GNU to behave similarly, if not
> in these specific positions then in similar ones.

I analyzed the position you mention using GNU Backgammon 1.06.002-mingw 32-Bit 20180802.

Since I am not an experienced GNUBG user, any GNUBG dev who spots something wrong with the following should feel free to correct me.

I started with a 4-ply evaluation, and the "correct" move is No. 49 in the list of best moves.

ID: AABCgDsTg4MAAA:AQHpACAAAAAE

1. Cubeful 4-ply 18/16 13/7* Eq.: +0.138
0.574 0.000 0.000 - 0.426 0.104 0.060
4-ply cubeful prune [4ply]
2. Cubeful 4-ply 10/4 6/4 Eq.: +0.136 (-0.003)
0.564 0.000 0.000 - 0.436 0.096 0.047
4-ply cubeful prune [4ply]
3. Cubeful 4-ply 23/21 13/7* Eq.: +0.117 (-0.022)
0.569 0.000 0.000 - 0.431 0.113 0.060
4-ply cubeful prune [4ply]
4. Cubeful 4-ply 18/16 10/6 5/3* Eq.: +0.109 (-0.029)
0.566 0.000 0.000 - 0.434 0.111 0.063
4-ply cubeful prune [4ply]
...
49. Cubeful 0-ply 13/7* 5/3* Eq.: +0.115 (-0.023)
0.552 0.000 0.000 - 0.448 0.087 0.050
0-ply cubeful prune [expert]

What happened is that a good move got pruned at 0-ply: the default move filter for a 4-ply evaluation keeps only 16 moves at 0-ply. The best move, ranked 49th at 0-ply, therefore never reached the deeper plies for a better evaluation. I suspect something similar happened in your game with XG.

Then I changed this setting to 50 and, after waiting a minute or two, the move rose to the number 1 spot:

1. Cubeful 4-ply 13/7* 5/3* Eq.: +0.164
0.582 0.000 0.000 - 0.418 0.099 0.060
4-ply cubeful prune [4ply]
2. Cubeful 4-ply 18/16 13/7* Eq.: +0.138 (-0.026)
0.574 0.000 0.000 - 0.426 0.104 0.060
4-ply cubeful prune [4ply]
3. Cubeful 4-ply 10/4 6/4 Eq.: +0.136 (-0.028)
0.564 0.000 0.000 - 0.436 0.096 0.047
4-ply cubeful prune [4ply]
4. Cubeful 4-ply 13/7* 10/8 Eq.: +0.125 (-0.039)
0.571 0.000 0.000 - 0.429 0.110 0.060
4-ply cubeful prune [4ply]
5. Cubeful 2-ply 23/21 13/7* Eq.: +0.153 (-0.011)
0.582 0.000 0.000 - 0.418 0.109 0.061
2-ply cubeful prune [world class]

The moral: to experience the full power of the backgammon bots, one needs to change the default settings, which are configured for the average user. Whatever errors the bots occasionally make in their evaluations, they make up for by searching deeper.
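
To make the pruning effect concrete, here is a minimal toy sketch in Python. It is not GNUBG's actual move filter code: the function name filtered_search, the keep parameter and all the scores are made up for illustration. It only shows how a move that ranks 49th under a shallow evaluation can never be chosen when only the top 16 candidates are passed on, and how it surfaces once the cut is widened to 50.

```python
from typing import Callable, List


def filtered_search(
    moves: List[str],
    shallow_eval: Callable[[str], float],
    deep_eval: Callable[[str], float],
    keep: int,
) -> str:
    """Rank moves with a cheap evaluation, keep the top `keep`, and
    return the survivor that the expensive evaluation likes best."""
    ranked = sorted(moves, key=shallow_eval, reverse=True)
    survivors = ranked[:keep]  # moves below this cut are never searched deeper
    return max(survivors, key=deep_eval)


# Hypothetical data: 60 candidate moves, "m1" is best at shallow depth.
moves = [f"m{i}" for i in range(1, 61)]
shallow = {m: 0.200 - 0.002 * i for i, m in enumerate(moves)}

# Assume the deep evaluation agrees with the shallow one everywhere,
# except for one move that the shallow evaluation badly underrates.
deep = dict(shallow)
deep["m49"] = 0.500  # ranked 49th by the shallow scores, best by the deep ones

best_with_16 = filtered_search(moves, shallow.__getitem__, deep.__getitem__, keep=16)
best_with_50 = filtered_search(moves, shallow.__getitem__, deep.__getitem__, keep=50)
print(best_with_16, best_with_50)
```

With keep=16 the underrated move is cut before the deep evaluation ever sees it, so the search settles for the best of the survivors. With keep=50 it survives the cut and wins, at the price of many more deep evaluations, which matches the minute or two of waiting mentioned above.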

Nikos

> Playing around with positions like this will quickly disabuse anyone of
> the illusion that "backgammon bots are very close to perfection."
>
> As I recall, in the past, people have tried specifically training neural
> nets on positions like these, as well as "snake" positions where you have
> to roll a prime for a long distance, and the problem was that it seemed to
> degrade performance on other types of positions. It's possible that, as
> Papachristou suggests, recent incremental improvements in regularization
> algorithms might be good enough to overcome these difficulties. Anecdotal
> evidence from Robert Wachtel's revised version of "In the Game Until the
> End" suggests that Xavier was able to improve eXtreme Gammon's post-coup
> classique play significantly, without a wholesale switch to modern deep
> learning methods.
>
> Tim

From:	Nikos Papachristou
Subject:	Re: current development
Date:	Sat, 7 Dec 2019 20:50:43 +0200