Re: current development
From: Timothy Y. Chow
Subject: Re: current development
Date: Thu, 5 Dec 2019 11:32:00 -0500 (EST)
User-agent: Alpine 2.21 (LRH 202 2017-01-01)
On Thu, 5 Dec 2019, Nikos Papachristou wrote:
> My personal view on improving GNUBG: Why not try to "upgrade" your
> existing supervised learning approach? There have been lots of advances
> in optimization/regularization algorithms for neural networks in the
> past years and it might be less demanding than trying a new RL self-play
> approach from scratch.
>
> Regarding expected results, I also believe that backgammon bots are very
> close to perfection and any improvements (from any approach) will be
> marginal.
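
(For concreteness, the sort of "upgrade" Nikos describes might look roughly
like the sketch below; the network shape and the PyTorch-style training loop
are purely illustrative assumptions on my part, and have nothing to do with
GNUBG's actual evaluator, which is plain C.)

    import torch
    import torch.nn as nn

    # Illustrative evaluator: a few hundred hand-crafted inputs mapped to
    # five outcome probabilities, roughly the shape of the classic
    # TD-Gammon-style nets.  The sizes here are made up.
    net = nn.Sequential(
        nn.Linear(250, 128),
        nn.ReLU(),
        nn.Dropout(p=0.1),   # modern regularization
        nn.Linear(128, 5),
        nn.Sigmoid(),
    )

    # AdamW (decoupled weight decay) in place of plain gradient descent.
    opt = torch.optim.AdamW(net.parameters(), lr=1e-3, weight_decay=1e-4)
    loss_fn = nn.MSELoss()

    def train_step(inputs, targets):
        # inputs: (batch, 250) encoded positions
        # targets: (batch, 5) rollout-derived outcome probabilities
        opt.zero_grad()
        loss = loss_fn(net(inputs), targets)
        loss.backward()
        opt.step()
        return loss.item()
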
In order to determine whether a new network is doing better than the old
network, it helps to have examples of positions where the old network is
clearly playing poorly. Here's one example of a game that I played
against eXtreme Gammon where the bot made a lot of obvious blunders:
http://timothychow.net/cg/Games/7pt2015-05-24e%20Game%202.htm
For example, search for "10/8 6/4(3)". The bot's ridiculous play here
would not be among the top 50 plays of any halfway decent human player.
Admittedly this was XG but I would expect GNU to behave similarly, if not
in these specific positions then in similar ones.
Playing around with positions like this will quickly disabuse anyone of
the illusion that "backgammon bots are very close to perfection."
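
(Here is the kind of check I have in mind, as a rough sketch: assume a
hypothetical evaluate(pos) function for each net and a benchmark list of such
positions with reference equities from long rollouts.  None of these names
correspond to actual GNUBG code.)

    # Compare two hypothetical evaluators on a suite of "known blunder"
    # positions.  Reference equities are assumed to come from long rollouts.
    def total_error(evaluate, benchmark):
        # benchmark: list of (position, rollout_equity) pairs
        return sum(abs(evaluate(pos) - ref) for pos, ref in benchmark)

    def new_net_is_better(evaluate_old, evaluate_new, benchmark):
        old_err = total_error(evaluate_old, benchmark)
        new_err = total_error(evaluate_new, benchmark)
        print(f"old net error: {old_err:.4f}   new net error: {new_err:.4f}")
        return new_err < old_err
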
As I recall, in the past, people have tried specifically training neural
nets on positions like these, as well as "snake" positions where you have
to roll a prime for a long distance, and the problem was that it seemed to
degrade performance on other types of positions. It's possible that, as
Papachristou suggests, recent incremental improvements in regularization
algorithms might be good enough to overcome these difficulties. Anecdotal
evidence from Robert Wachtel's revised version of "In the Game Until the
End" suggests that Xavier was able to improve eXtreme Gammon's post-coup
classique play significantly, without a wholesale switch to modern deep
learning methods.
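
(One plausible way to try to get the benefit of the special positions without
the degradation is to cap their share of each training batch, so the net
keeps seeing mostly ordinary positions.  The sketch below is only my own
illustration of that idea; it is not a description of anything Xavier or the
GNUBG developers actually did.)

    import random

    def make_batch(ordinary, special, batch_size=128, special_frac=0.1):
        # Rehearsal-style sampling: only a small, fixed fraction of each
        # batch comes from the "problem" positions (containment, snake /
        # long prime-rolling, etc.); the rest is ordinary material, so
        # performance on normal positions is less likely to erode.
        n_special = int(batch_size * special_frac)
        batch = random.sample(special, n_special)
        batch += random.sample(ordinary, batch_size - n_special)
        random.shuffle(batch)
        return batch
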
Tim