bug-gnubg

"upgrade" your existing supervised learning approach


From: Wayne Joseph
Subject: "upgrade" your existing supervised learning approach
Date: Thu, 5 Dec 2019 18:00:23 +0000

>> Why not try to "upgrade" your existing supervised learning approach? 

Yes! This is something I really need to do!

Is this the right place to make a feature-request?

** I think it would be really useful if GNU could auto-magically save screenshots of all of my errors above, say, -0.04 in a folder **

That's it! (..for now - it could be more sophisticated, e.g. auto-categorisation of positions into sub-folders: racing cube errors, PNPL, make 5 point or not, break prime, trap plays etc. etc., but this is a good place to start ;-)

Is this stuff of dreams even possible in 2019 A.D.?

Might other bg students find such a personalised blunder folder useful?

Thanks if you can help!

Wayne

-- Sent from my Android phone

On Thu, 5 Dec 2019, 5:00 pm , <address@hidden> wrote:

Today's Topics:

   1. Re: current development (Øystein Schønning-Johansen)
   2. Re: current development (Joseph Heled)
   3. Re: current development (Myshkin LeVine)
   4. Re: current development (Joseph Heled)
   5. Re: current development (Joseph Heled)
   6. Re: current development (Philippe Michel)
   7. Re: current development (Philippe Michel)
   8. Re: current development (Joseph Heled)
   9. Re: current development (Russ Allbery)
  10. Re: current development (Joseph Heled)
  11. Re: current development (Philippe Michel)
  12. Re: current development (Joseph Heled)
  13. Re: current development (Joseph Heled)
  14. Re: Alphazero / Deepmind backgammon project (Wayne Joseph)
  15. Re: Alphazero / Deepmind backgammon project (Joseph Heled)
  16. Re: current development (Nikos Papachristou)
  17. Re: current development (Øystein Schønning-Johansen)
  18. Re: current development (Timothy Y. Chow)


----------------------------------------------------------------------

Message: 1
Date: Wed, 4 Dec 2019 21:23:09 +0100
From: Øystein Schønning-Johansen <address@hidden>
To: Joseph Heled <address@hidden>
Cc: Ralph Corderoy <address@hidden>, "address@hidden"
        <address@hidden>
Subject: Re: current development
Message-ID:
        <address@hidden>
Content-Type: text/plain; charset="utf-8"

But let's chat about the idea instead. What would it actually mean to 'apply
"AlphaZero methods" to backgammon'?

AlphaZero (and AlphaGo and Lc0 and SugaR NN) is more or less the same
thing as reinforcement learning in backgammon. So, from my understanding,
it is rather AlphaZero that has applied the backgammon methods. Both the
chess and Go variants are trained with reinforcement learning, pretty
much like the original GNU Backgammon, Jellyfish and Snowie. In Go they had
to make a move selection subroutine based on human play and then add MCTS
to train. Also, the neural networks are deeper and more complex. The nn
input features are also more complex and to some extent resemble the
convolutions known from convolutional neural networks (and the inputs
are not properly described in the high-level articles).

Apart from that, it is actually the same thing: Reinforcement learning.

But how can we improve? We believe (at least I do) that current
backgammon bots are so strong that they play close to perfectly in standard
positions. It is in uncommon positions that need long-term plans (like deep
backgames and snake rolling prime positions) that bots can still improve. Let me
throw some ideas up in the air for discussion:

Can we make an RL algorithm that is so fast that it can learn on the fly?
Say that during play we find a position where some indicator (that may be
another challenge) signals that this is a position that requires long-term
planning. If we then have the ability to RL-train a neural net for
that specific position, that could be a huge improvement in my opinion.
(Lots of details missing.)

And then, could the evaluations be improved if we specialize neural
networks into specific position types, and then make a kind of nn
selection system based on k-means of the input features? I tried that many
years ago with only four classes. Those experiments showed that it's not
a hopeless approach, and with faster computers one could easily create many more
than just four classes (four was only the first number that popped into my
head in those days).
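
The selection idea above can be sketched roughly as follows. This is only an illustration, not GNU Backgammon code: the centroids, the toy two-feature inputs and the per-class "nets" (plain functions here) are all made up for the example. In practice the centroids would come from running k-means over the real input features, and each entry in NETS would be a separately trained network.

```python
import math

def nearest_class(features, centroids):
    """Return the index of the centroid closest to the feature vector."""
    best, best_dist = 0, math.inf
    for k, c in enumerate(centroids):
        d = sum((f - ci) ** 2 for f, ci in zip(features, c))
        if d < best_dist:
            best, best_dist = k, d
    return best

def evaluate(features, centroids, nets):
    """Route the position to the net specialised for its class."""
    return nets[nearest_class(features, centroids)](features)

# Hypothetical centroids for four position classes (toy 2-feature inputs).
CENTROIDS = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
# Stand-ins for four trained networks; each just reports its class number.
NETS = [lambda x, k=k: ("net", k) for k in range(4)]

print(evaluate((0.9, 0.1), CENTROIDS, NETS))
```

With more classes only the centroid list and the list of nets grow; the routing step stays the same.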

Then the next idea: What about huge-scale distributed rollouts? Maybe we could
have a system like BOINC to do rollouts on the fly? I'm not sure how this
should be used in a practical sense, and I'm not sure how hard it would be
to implement (with or without the BOINC framework), but I'm just kind of
brainstorming here.

-Øystein


On Wed, Dec 4, 2019 at 6:47 PM Joseph Heled <address@hidden> wrote:

> I was intentionally rude because I thought his original post was
> inappropriate.
>
> -Joseph
>
> On Thu, 5 Dec 2019 at 06:42, Ralph Corderoy <address@hidden> wrote:
> >
> > Hi Joseph,
> >
> > > I thought so.
> > >
> > > I had the same idea the day I heard they cracked go, but just saying
> > > something is a good idea is not helpful at all in my book.
> >
> > I think you're wrong.  And also a bit rude to boot.
> >
> > It's fine for Tim to suggest or ponder an idea to the list.  It may
> > encourage another subscriber, or draw out news of what a lurker has been
> > working on that's related.
> >
> > --
> > Cheers, Ralph.
> >
>
>

------------------------------

Message: 2
Date: Thu, 5 Dec 2019 09:34:45 +1300
From: Joseph Heled <address@hidden>
To: Øystein Schønning-Johansen <address@hidden>
Cc: Ralph Corderoy <address@hidden>, "address@hidden"
        <address@hidden>
Subject: Re: current development
Message-ID:
        <CAG8x8-0mzJFO=_address@hidden>
Content-Type: text/plain; charset="utf-8"

The main difference, if I understand correctly (and I know very little
here), is bootstrapping from the ground up. That is, no pre-computed inputs;
let the network figure it out by self-play.

We have a great test case in that we can start with just racing.

That said, I think we will need a net for each match score, since cubeless
-> cubeful is where things get messy.

Also, given that 0-ply rollouts are relatively fast, when playing against a
human - if you can wait a second or two, you can play using cubeful 0-ply.
Testing how good this is will be problematic.

-Joseph



------------------------------

Message: 3
Date: Wed, 4 Dec 2019 20:40:56 +0000
From: Myshkin LeVine <address@hidden>
To: Superfly Jon <address@hidden>
Cc: "address@hidden" <address@hidden>
Subject: Re: current development
Message-ID:
        <address@hidden>

Content-Type: text/plain; charset="us-ascii"

Hi Jon,
Perhaps you could also include build instructions for the Mac in addition to Linux and Windows. I figured out how to satisfy the cglm dependency, but I think many Mac users could not. I do not see cglm in MacPorts, but it does have a formula for those Mac users who like Homebrew. The 3D board seems to be working fine for me, except that clicking on the dice to end my turn has no effect, and I must instead use Control-F.

Myshkin LeVine


On Dec 4, 2019, at 1:56 PM, Superfly Jon wrote:

> My changes are still under development, the dependency with cglm will be resolved in due course.  I'll update the "INSTALL" file with more detailed how-to build instructions for linux and windows shortly.
>
> Jon
>




------------------------------

Message: 4
Date: Thu, 5 Dec 2019 09:40:48 +1300
From: Joseph Heled <address@hidden>
To: Ingo Macherius <address@hidden>
Cc: "address@hidden" <address@hidden>
Subject: Re: current development
Message-ID:
        <CAG8x8-1ULLS2164rwH8pSGBKFF=address@hidden>
Content-Type: text/plain; charset="utf-8"

What is the matter with you people? bug-gnubg is synonymous with dev-gnubg.

Perhaps it is time for me to stop being involved with GNUBG. It does not seem
that people here are interested in doing anything constructive. If this is not
the case, prove me wrong.

-Joseph

On Thu, 5 Dec 2019 at 08:57, Ingo Macherius <address@hidden> wrote:

> This is not a bug. While individual members of the gnubg team always had
> setups for dependency management under various IDEs and OSes, there
> never was a satisfying solution in the repository. There was a Linux
> package management HOWTO, but it went away with the rest of the web
> pages. And no, automake and its cryptic error messages do not really
> qualify as a dependency management system. I'm a Java guy and used to
> tool-based solutions such as Maven or Gradle. Picking and adding
> something similar suitable for C, ideally something which works on all
> major OSes, would greatly reduce the confusion you and everybody else
> starting to work with the code have to go through.
>
> Ingo
>
> Am 04.12.19 um 19:00 schrieb Joseph Heled:
> > And good riddance. This list is called bug-gnubg.
> >
> > -Joseph
> >
> > On Thu, 5 Dec 2019 at 06:59, Ralph Corderoy <address@hidden>
> wrote:
> >> Hi Joseph,
> >>
> >>> I was intentionally rude because I thought his original post was
> >>> inappropriate.
> >> How childish.  We put up with your many posts detailing your failure to
> >> compile from source when a quick Google would have led to the method of
> >> using the package manager to install the build dependencies.  No one was
> >> rude.  You got civil help.
> >>
> >> I've no time for such antics.  This list has always been polite in my
> >> experience.  I'm unsubscribing.
> >>
> >> --
> >> Cheers, Ralph.
> >>
>
>

------------------------------

Message: 5
Date: Thu, 5 Dec 2019 09:48:19 +1300
From: Joseph Heled <address@hidden>
Cc: "address@hidden" <address@hidden>
Subject: Re: current development
Message-ID:
        <address@hidden>
Content-Type: text/plain; charset="utf-8"

Ralph was not cc'ed

Subject: Re: current development
From: Joseph Heled <address@hidden>
To: Ingo Macherius <address@hidden>
Cc: "address@hidden" <address@hidden>
Content-Type: multipart/alternative; boundary="000000000000e96d5e0598e6d365"

On Thu, 5 Dec 2019 at 09:44, Ralph Corderoy <address@hidden> wrote:

> Joseph, please desist from CC-ing me.
>
> --
> Cheers, Ralph.
>

------------------------------

Message: 6
Date: Wed, 4 Dec 2019 22:14:48 +0100
From: Philippe Michel <address@hidden>
To: Russ Allbery <address@hidden>
Cc: Joseph Heled <address@hidden>,    "address@hidden"
        <address@hidden>
Subject: Re: current development
Message-ID: <20191204211448.GA8359@genesis>
Content-Type: text/plain; charset=us-ascii

On Tue, Dec 03, 2019 at 03:51:12PM -0800, Russ Allbery wrote:
>
> Philippe Michel <address@hidden> writes:
>
> > Reasonably recent versions of gcc and clang have a feature (ifuncs) that
> > should allow doing this in one single binary. I don't know how onerous
> > it would be at package building stage, but I think a few parts of Linux,
> > for instance glibc, use that feature, so at least it wouldn't be unknown
> > territory.
>
> Oh, interesting.  Is this something that I can just enable with a compiler
> flag, or does it need code support in gnubg?

This would be mostly in gnubg's code. Maybe something would be needed at
configure stage as well.

The post below shows a minimal example of how this is used:
https://gcc.gnu.org/ml/gcc-help/2012-03/msg00209.html
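
ifunc itself is a compiler and linker feature for C, so the linked post is the place to see the real thing. Purely to illustrate the idea (choose one implementation once, based on what the CPU supports, and bind it to a single name that callers use), here is a sketch in Python. The function names and the hard-coded flag set are assumptions for the example and have nothing to do with gnubg's actual code.

```python
def dot_scalar(a, b):
    """Portable fallback."""
    return sum(x * y for x, y in zip(a, b))

def dot_sse2(a, b):
    """Stand-in for an SSE2-optimised routine (same result here)."""
    return sum(x * y for x, y in zip(a, b))

def dot_avx(a, b):
    """Stand-in for an AVX-optimised routine (same result here)."""
    return sum(x * y for x, y in zip(a, b))

def resolve_dot(cpu_flags):
    """Play the role of an ifunc resolver: choose the best variant once."""
    if "avx" in cpu_flags:
        return dot_avx
    if "sse2" in cpu_flags:
        return dot_sse2
    return dot_scalar

# The resolver runs once; callers then just use `dot`.
# The flag set is hard-coded for the example; a real resolver would query the CPU.
dot = resolve_dot({"sse2", "avx"})
print(dot.__name__, dot([1, 2, 3], [4, 5, 6]))
```

The point of doing this inside one binary is that a distribution package would not need separate sse2 and avx builds.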



------------------------------

Message: 7
Date: Wed, 4 Dec 2019 22:28:03 +0100
From: Philippe Michel <address@hidden>
To: Joseph Heled <address@hidden>
Cc: Russ Allbery <address@hidden>, "address@hidden"
        <address@hidden>
Subject: Re: current development
Message-ID: <20191204212803.GB8359@genesis>
Content-Type: text/plain; charset=us-ascii

On Wed, Dec 04, 2019 at 01:21:06PM +1300, Joseph Heled wrote:

> Is that the right way to specify both?
>
> ./configure --enable-simd=avx --enable-simd=sse2

It wasn't expected to specify both :-).

I just checked what it does: the second option overrides the first one,
so your example doesn't do what you hoped.

Just use --enable-simd=yes, it will use avx if your computer supports it
(plus some sse in places where there is no avx implementation), else
sse2.
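
To spell out the two behaviours described above (a later --enable-simd replaces an earlier one, and "yes" picks avx when the CPU has it, otherwise sse2), here is a small model in Python. It only mirrors the description in this message; it is not how configure is implemented, and the function name is made up for the example.

```python
def effective_simd(configure_args, cpu_has_avx):
    """Return the SIMD choice that results from the given configure arguments."""
    choice = None
    for arg in configure_args:
        if arg.startswith("--enable-simd="):
            # A repeated option replaces the earlier value: the last one wins.
            choice = arg.split("=", 1)[1]
    if choice == "yes":
        # "yes" means: avx if the CPU supports it, otherwise sse2.
        return "avx" if cpu_has_avx else "sse2"
    return choice

print(effective_simd(["--enable-simd=avx", "--enable-simd=sse2"], cpu_has_avx=True))
print(effective_simd(["--enable-simd=yes"], cpu_has_avx=True))
```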



------------------------------

Message: 8
Date: Thu, 5 Dec 2019 10:31:44 +1300
From: Joseph Heled <address@hidden>
To: Philippe Michel <address@hidden>
Cc: Russ Allbery <address@hidden>, "address@hidden"
        <address@hidden>
Subject: Re: current development
Message-ID:
        <address@hidden>
Content-Type: text/plain; charset="utf-8"

Thanks. will recompile. Also the other non-sse options should go somewhere
else, but I don't know where they should go in the debian build process.


------------------------------

Message: 9
Date: Wed, 04 Dec 2019 13:36:15 -0800
From: Russ Allbery <address@hidden>
To: address@hidden
Subject: Re: current development
Message-ID: <address@hidden>
Content-Type: text/plain

Joseph Heled <address@hidden> writes:

> Thanks. will recompile. Also the other non-sse options should go
> somewhere else, but I don't know where they should go in the debian
> build process.

The override_dh_auto_configure target includes the invocation of
./configure.

--
Russ Allbery (address@hidden)             <https://www.eyrie.org/~eagle/>



------------------------------

Message: 10
Date: Thu, 5 Dec 2019 10:42:35 +1300
From: Joseph Heled <address@hidden>
To: "address@hidden" <address@hidden>
Subject: Re: current development
Message-ID:
        <address@hidden>
Content-Type: text/plain; charset="utf-8"

On Wed, 4 Dec 2019 at 20:59, Joseph Heled <address@hidden> wrote:

> An 8 "core" machine, i.e. fake intel count number
>
> $ grep -m1 '^model name' /proc/cpuinfo
> model name      : Intel(R) Core(TM) i7-4810MQ CPU @ 2.80GHz
>
> in debian rules file:
> SSE = --enable-simd=avx --enable-simd=sse2 --enable-threads -with-gtk
> --with-board3d --with-python
> compiled with gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0  -O3
>
> position id: 4NvBGECYr8ELAA:MBngAAAAIAAE
>
> rollout cube action:
>
>
So make this sse2.


> 8 threads: 93 seconds
> 4 threads: 100 seconds    On stock debian: 111 seconds
> 2 threads: 172 seconds
> 1 thread:   312 seconds
>

with avx,  8 threads: 81 seconds and 99 seconds with 4 threads, so maybe a
small improvement, maybe not.
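
For readers who want the scaling in these timings spelled out, a short calculation follows. It uses only the sse2 numbers quoted in this message; speedup is taken relative to the single-thread time, and efficiency is speedup divided by thread count. The function names are just for the example.

```python
# Rollout times in seconds from the message (sse2 build), keyed by thread count.
TIMES = {1: 312, 2: 172, 4: 100, 8: 93}

def speedup(times, threads):
    """How many times faster than the single-thread run."""
    return times[1] / times[threads]

def efficiency(times, threads):
    """Speedup per thread; 1.0 would be perfect scaling."""
    return speedup(times, threads) / threads

for n in sorted(TIMES):
    print(f"{n} threads: speedup {speedup(TIMES, n):.2f}, "
          f"efficiency {efficiency(TIMES, n):.2f}")
```

Going from 4 to 8 threads only moves the time from 100 to 93 seconds, which fits the remark that the 8 "cores" are not all real.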


> -Joseph
>
>
> On Wed, 4 Dec 2019 at 20:29, Ralph Corderoy <address@hidden> wrote:
> >
> > Hi Joseph,
> >
> > > What we really need is someone with access to some computing power
> > > (aka grid) to run a set of reference positions - 0-ply cube decisions
> > > vs 2-ply, and see what the difference is. That would give a hint as to
> > > what to do.
> >
> > How about reporting your
> >
> >     grep -m1 '^model name' /proc/cpuinfo
> >
> > along with the stock Ubuntu package version's time on a reference
> > position when given 1, 2, ... threads.  And then your self-compiled
> > version for comparison, noting what you changed in debian/rules.
> >
> > It would be a start, and also offer some precision so if something is
> > awry then others on the list may have data to judge by.
> >
> > --
> > Cheers, Ralph.
> >
>

------------------------------

Message: 11
Date: Wed, 4 Dec 2019 23:04:00 +0100
From: Philippe Michel <address@hidden>
To: address@hidden
Cc: "address@hidden" <address@hidden>
Subject: Re: current development
Message-ID: <20191204220400.GA44635@genesis>
Content-Type: text/plain; charset=us-ascii

On Wed, Dec 04, 2019 at 02:07:18PM -0500, Timothy Y. Chow wrote:

> Also, it's my impression that many people *don't* think this is even a
> worthwhile idea to pursue.  Backgammon is already "solved," is what they
> will say.  It's true that "AlphaGammon" will surely not crush existing
> bots in a series of (say) 11-point matches.  At most I would expect a
> slight advantage.  But to me, that is the wrong way to look at the issue.
> I would like to understand superbackgames for their own sake, even though
> they arise rarely in practice.  Furthermore, if we know that bots don't
> understand superbackgames, then the closer a position gets to being a
> superbackgame, the less we can trust the bot verdict.

I'm not sure how related it may be, but there is a group of Greek
academics that have published some articles on their work on a bot,
Palamedes, that plays backgammon but also variants that have different
rules and starting positions and lead to positions that would be very
uncommon in backgammon.





------------------------------

Message: 12
Date: Thu, 5 Dec 2019 11:12:39 +1300
From: Joseph Heled <address@hidden>
To: Philippe Michel <address@hidden>
Cc: address@hidden, "address@hidden" <address@hidden>
Subject: Re: current development
Message-ID:
        <address@hidden>
Content-Type: text/plain; charset="utf-8"

Do you have a link to something? An article? Software? Did they use Alpha-like strategies?

-Joseph


------------------------------

Message: 13
Date: Thu, 5 Dec 2019 11:23:04 +1300
From: Joseph Heled <address@hidden>
To: Philippe Michel <address@hidden>
Cc: address@hidden, "address@hidden" <address@hidden>
Subject: Re: current development
Message-ID:
        <address@hidden>
Content-Type: text/plain; charset="utf-8"

I googled and found this:

   https://hal.inria.fr/hal-01521393/document

Seems very much like GNUBG, only a smaller net. No way to tell how it
compares to (say) GNUBG.

-Joseph


------------------------------

Message: 14
Date: Thu, 5 Dec 2019 06:35:18 +0000
From: Wayne Joseph <address@hidden>
To: address@hidden
Subject: Re: Alphazero / Deepmind backgammon project
Message-ID:
        <CAH=address@hidden>
Content-Type: text/plain; charset="utf-8"

Hi Tim / Hi all,

It might be worth reaching out to Jens Averkamp who I believe was in
contact with a dev team working on this avenue.

I also tried to get in touch with Demis, CEO of Deepmind (who almost
certainly can play bg) a while ago, but I don't think my message completed
its intended journey to him (via his P.A.).

After seeing what Deepmind has done to publicize Go and StarCraft, I was
hoping the same might be possible for backgammon. Does anybody else fancy
seeing Mochy beat Deepmind? ;)

Good luck!

-- Sent from my Android phone

------------------------------

Message: 15
Date: Thu, 5 Dec 2019 19:44:51 +1300
From: Joseph Heled <address@hidden>
To: Wayne Joseph <address@hidden>
Cc: "address@hidden" <address@hidden>
Subject: Re: Alphazero / Deepmind backgammon project
Message-ID:
        <CAG8x8-3GQqDSe=address@hidden>
Content-Type: text/plain; charset="utf-8"

Sounds good, Wayne!!

Personal opinion: Mochy will be lucky to win one match, à la Lee Sedol, if
matches are long enough :)

-Joseph



------------------------------

Message: 16
Date: Thu, 5 Dec 2019 13:30:17 +0200
From: Nikos Papachristou <address@hidden>
To: Joseph Heled <address@hidden>
Cc: Philippe Michel <address@hidden>, address@hidden,
        "address@hidden" <address@hidden>
Subject: Re: current development
Message-ID:
        <CAPF31MRT=address@hidden>
Content-Type: text/plain; charset="utf-8"

Hi everybody!

You can view my research publications on backgammon variants at my website:
https://nikpapa.com , or alternatively you can download my PhD thesis from:
https://www.didaktorika.gr/eadd/handle/10442/43622?locale=en

My personal view on improving GNUBG: Why not try to "upgrade" your existing
supervised learning approach? There have been lots of advances in
optimization/regularization algorithms for neural networks in the past
years and it might be less demanding than trying a new RL self-play
approach from scratch.

Regarding expected results, I also believe that backgammon bots are very
close to perfection and whatever improvements (from any approach) will be
marginal.




------------------------------

Message: 17
Date: Thu, 5 Dec 2019 13:28:04 +0100
From: Øystein Schønning-Johansen <address@hidden>
To: Nikos Papachristou <address@hidden>
Cc: Joseph Heled <address@hidden>, address@hidden,
        "address@hidden" <address@hidden>
Subject: Re: current development
Message-ID:
        <CAOzpFnRQj7f7qpomYD9dNM3QPkjhHqLAKiQvtwQVocXn=address@hidden>
Content-Type: text/plain; charset="utf-8"

I have tried some experiments, and it looks like the training dataset (for
contact positions) with the current input features does indeed respond well
to some of the more modern methods. Briefly summarized:

Things that improve supervised learning on the dataset:
* Deeper nets, 5-6 hidden layers combined with ReLU activation functions.
* Adam (and AdamW) optimizer.
* A tiny bit of weight decay.
* Mini-batch training.

Things that do not work:
* Dropout.
* PCA of inputs.
* RMSProp optimizer (About the same performance as SGD).
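
Of the items that helped, the optimizer side (Adam with a little decoupled weight decay, i.e. AdamW, fed with mini-batches) can be sketched in a few lines. This is a minimal illustration in plain Python on a toy straight-line fit, not the Keras setup or the C training code mentioned in this message; the hyperparameters, the toy data and the function names are assumptions for the example, and the deeper ReLU network is left out to keep it short.

```python
import math
import random

def adamw_step(w, g, m, v, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8, wd=1e-4):
    """One AdamW update of parameter list w given gradient list g (t starts at 1)."""
    for i in range(len(w)):
        m[i] = b1 * m[i] + (1 - b1) * g[i]          # running mean of gradients
        v[i] = b2 * v[i] + (1 - b2) * g[i] * g[i]   # running mean of squared gradients
        m_hat = m[i] / (1 - b1 ** t)                # bias correction
        v_hat = v[i] / (1 - b2 ** t)
        # Decoupled weight decay: applied to the weight directly, not via the gradient.
        w[i] -= lr * (m_hat / (math.sqrt(v_hat) + eps) + wd * w[i])

def train(data, epochs=200, batch_size=8, seed=0):
    """Fit y = w[0]*x + w[1] by mini-batch AdamW on squared error."""
    rng = random.Random(seed)
    w, m, v, t = [0.0, 0.0], [0.0, 0.0], [0.0, 0.0], 0
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            g = [0.0, 0.0]
            for x, y in batch:
                err = w[0] * x + w[1] - y
                g[0] += 2 * err * x / len(batch)
                g[1] += 2 * err / len(batch)
            t += 1
            adamw_step(w, g, m, v, t)
    return w

# Toy data from y = 2x + 1; a stand-in for the real training set.
toy = [(x / 10, 2 * x / 10 + 1) for x in range(-20, 21)]
print(train(toy))
```

The fitted parameters should end up close to 2 and 1. Swapping the straight line for a multi-layer net changes how the gradient is computed, not the update step.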

I've tried training with Keras and on GPUs, and the training is really
fast. However, a plain CPU implementation of modern neural network training
algorithms is actually not much slower for me. Also porting GPU code over
into the GNU Backgammon application might not be faster as a lot of cycles
will be used shuffling data back and forth between main memory and GPU
memory.

So the process I ended up using was:
1. Test out what works with Keras+GPU
2. Implement that working method in C code for CPU.
3. Train NN with that code.

I've only worked with the contact neural network, as I see some strange
issues with the race dataset, and I think it requires a re-rollout.

-Øystein

On Thu, Dec 5, 2019 at 12:38 PM Nikos Papachristou <address@hidden>
wrote:

> Hi everybody!
>
> You can view my research publications on backgammon variants at my
> website: https://nikpapa.com , or alternatively you can download my PhD
> thesis from:
> https://www.didaktorika.gr/eadd/handle/10442/43622?locale=en
>
> My personal view on improving GNUBG: Why not try to "upgrade" your
> existing supervised learning approach? There have been lots of advances in
> optimization/regularization algorithms for neural networks in the past
> years and it might be less demanding than trying a new RL self-play
> approach from scratch.
>
> Regarding expected results, I also believe that backgammon bots are very
> close to perfection and whatever improvements (from any approach) will be
> marginal.
>
>
>
> On Thu, Dec 5, 2019 at 12:14 AM Joseph Heled <address@hidden> wrote:
>
>> A link to something? article? software? did they use alpha-like
>> strategies?
>>
>> -Joseph
>>
>> On Thu, 5 Dec 2019 at 11:04, Philippe Michel <address@hidden>
>> wrote:
>>
>>> On Wed, Dec 04, 2019 at 02:07:18PM -0500, Timothy Y. Chow wrote:
>>>
>>> > Also, it's my impression that many people *don't* think this is even
>>> > a worthwhile idea to pursue.  Backgammon is already "solved," is what
>>> > they will say.  It's true that "AlphaGammon" will surely not crush
>>> > existing bots in a series of (say) 11-point matches.  At most I would
>>> > expect a slight advantage.  But to me, that is the wrong way to look
>>> > at the issue.  I would like to understand superbackgames for their own
>>> > sake, even though they arise rarely in practice.  Furthermore, if we
>>> > know that bots don't understand superbackgames, then the closer a
>>> > position gets to being a superbackgame, the less we can trust the bot
>>> > verdict.
>>>
>>> I'm not sure how related it may be, but there is a group of Greek
>>> academics that have published some articles on their work on a bot,
>>> Palamedes, that plays backgammon but also variants that have different
>>> rules and starting positions and lead to positions that would be very
>>> uncommon in backgammon.

------------------------------

Message: 18
Date: Thu, 5 Dec 2019 11:32:00 -0500 (EST)
From: "Timothy Y. Chow" <address@hidden>
To: "address@hidden" <address@hidden>
Subject: Re: current development
Message-ID: <address@hidden>
Content-Type: text/plain; charset=US-ASCII; format=flowed

On Thu, 5 Dec 2019, Nikos Papachristou wrote:
> My personal view on improving GNUBG: Why not try to "upgrade" your
> existing supervised learning approach? There have been lots of advances
> in optimization/regularization algorithms for neural networks in the
> past years and it might be less demanding than trying a new RL self-play
> approach from scratch.
>
> Regarding expected results, I also believe that backgammon bots are very
> close to perfection and whatever improvements (from any approach) will
> be marginal.

In order to determine whether a new network is doing better than the old
network, it helps to have examples of positions where the old network is
clearly playing poorly.  Here's one example of a game that I played
against eXtreme Gammon where the bot made a lot of obvious blunders:

http://timothychow.net/cg/Games/7pt2015-05-24e%20Game%202.htm

For example, search for "10/8 6/4(3)".  The bot's ridiculous play here
would not be among the top 50 plays of any halfway decent human player.
Admittedly this was XG but I would expect GNU to behave similarly, if not
in these specific positions then in similar ones.

Playing around with positions like this will quickly disabuse anyone of
the illusion that "backgammon bots are very close to perfection."

As I recall, in the past, people have tried specifically training neural
nets on positions like these, as well as "snake" positions where you have
to roll a prime for a long distance, and the problem was that it seemed to
degrade performance on other types of positions.  It's possible that, as
Papachristou suggests, recent incremental improvements in regularization
algorithms might be good enough to overcome these difficulties.  Anecdotal
evidence from Robert Wachtel's revised version of "In the Game Until the
End" suggests that Xavier was able to improve eXtreme Gammon's post-coup
classique play significantly, without a wholesale switch to modern deep
learning methods.
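
One way to make this kind of old-versus-new comparison concrete is to score
each network by the equity it gives up against a rollout reference, broken
down by position type so that a gain on backgame-style positions cannot hide
a loss elsewhere. The following is only a sketch: the data layout, the
function names (pick_by, loss_by_kind), the move labels and all the numbers
are made up for illustration and do not come from GNU Backgammon or eXtreme
Gammon.

```python
from collections import defaultdict

def pick_by(key):
    """Build a move chooser that trusts the equity estimates stored under key."""
    return lambda pos: max(pos[key], key=pos[key].get)

def loss_by_kind(choose_move, positions):
    """Average equity given up versus the rollout-best move, per position kind."""
    sums, counts = defaultdict(float), defaultdict(int)
    for pos in positions:
        ref = pos["rollout"]
        sums[pos["kind"]] += max(ref.values()) - ref[choose_move(pos)]
        counts[pos["kind"]] += 1
    return {kind: sums[kind] / counts[kind] for kind in sums}

# Hypothetical benchmark: two candidate moves per position, with invented
# rollout equities and invented estimates from an "old" and a "new" network.
POSITIONS = [
    {"kind": "backgame",
     "rollout": {"A": -0.20, "B": -0.45},
     "old": {"A": -0.50, "B": -0.30},
     "new": {"A": -0.25, "B": -0.40}},
    {"kind": "ordinary",
     "rollout": {"A": 0.10, "B": 0.05},
     "old": {"A": 0.12, "B": 0.04},
     "new": {"A": 0.06, "B": 0.08}},
]

old_loss = loss_by_kind(pick_by("old"), POSITIONS)
new_loss = loss_by_kind(pick_by("new"), POSITIONS)
for kind in sorted(old_loss):
    print(f"{kind:10s} old {old_loss[kind]:.3f}  new {new_loss[kind]:.3f}")
```

In this toy data the new network fixes the backgame error but picks the
slightly worse move in the ordinary position, which is the kind of trade-off
a per-type report is meant to expose.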

Tim



------------------------------


End of Bug-gnubg Digest, Vol 201, Issue 5
*****************************************
