
Re: [Bug-gnubg] Re: Bug-gnubg Digest, Vol 89, Issue 11


From: Frank Berger
Subject: Re: [Bug-gnubg] Re: Bug-gnubg Digest, Vol 89, Issue 11
Date: Fri, 30 Apr 2010 00:01:13 +0200

Hi Ian,

tnx for the info. 

Yes, BGBlitz uses TD-Lambda training only.  I assume that your TD-Lambda code 
might have a small quirk somewhere. At least that was my experience when I 
started some years ago, based on the net of Eric Groleau. I was sure my encoding 
was better, but I couldn't get it to play stronger, and the net was stuck at 
intermediate strength.
So I decided to start from scratch. Two years later I found that I had a small 
bug in the encoding: for one input, the sign for gammons was wrong...
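
For readers who have not seen it written down, a minimal sketch of a TD-Lambda 
update with eligibility traces follows. It is not BGBlitz or gnubg code: a 
linear evaluator stands in for the neural net, and the names 
td_lambda_episode, positions, outcome, alpha and lam are all made up for the 
illustration.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def td_lambda_episode(weights, positions, outcome, alpha=0.1, lam=0.7):
    """Update `weights` in place from one game and return them.

    positions: feature vectors of the successive positions of one game
               (hypothetical encoding, all seen from the same side)
    outcome:   final result from that side's point of view, e.g. 1.0 = win
    """
    traces = [0.0] * len(weights)  # one eligibility trace per weight
    for t, x in enumerate(positions):
        value = dot(weights, x)
        # The target is the next position's estimate, or the real result
        # once the game is over.
        if t + 1 < len(positions):
            target = dot(weights, positions[t + 1])
        else:
            target = outcome
        delta = target - value  # temporal-difference error
        # For a linear evaluator the gradient of the value is just x.
        for i, xi in enumerate(x):
            traces[i] = lam * traces[i] + xi
        for i in range(len(weights)):
            weights[i] += alpha * delta * traces[i]
    return weights
```

With lam = 0 only the last position before the result is corrected; with 
lam = 1 every earlier position shares the correction equally. A real net 
would replace x in the trace update by the gradient of its output.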

I made some experiments with training a 1-ply net with a 2-ply net (0- and 
1-ply in GNU speak :)) but it was terribly slow, and after around 3 weeks there 
was a slight decrease in playing strength. It might have been a transition to 
better play, but it was so slow that I wasn't patient enough :))
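
To make the idea concrete, here is a rough sketch of how a deeper-lookahead 
training target could be computed from a shallow evaluator. It only 
illustrates the general technique, not the setup used in the experiment above; 
replies and evaluate are hypothetical callables that a real program would have 
to supply.

```python
from itertools import combinations_with_replacement


def dice_rolls():
    """Yield ((d1, d2), probability) for the 21 distinct rolls of two dice."""
    for d1, d2 in combinations_with_replacement(range(1, 7), 2):
        # Doubles occur once in 36, every other roll twice.
        yield (d1, d2), (1 if d1 == d2 else 2) / 36


def lookahead_target(position, replies, evaluate):
    """Average over the opponent's rolls of the reply that is worst for us.

    replies(position, roll) -> positions the opponent can reach (hypothetical)
    evaluate(position)      -> shallow estimate of OUR winning chance
                               (hypothetical)
    """
    total = 0.0
    for roll, prob in dice_rolls():
        # If the opponent has no legal move, the position stays as it is.
        options = replies(position, roll) or [position]
        total += prob * min(evaluate(p) for p in options)
    return total
```

The shallow net would then be trained towards lookahead_target(position, ...) 
instead of its own direct estimate. Every target costs 21 rolls times all 
legal replies in evaluations, which goes some way to explaining why this kind 
of training is slow.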

ciao
Frank



> Frank, I mis-remembered our test results. We have seen steadily
> increasing performance as we have gone through 40, 80, 128 (gnubg's
> current size) and 180 hidden nodes. 200 nodes is roughly the same as 180
> so far, and I'm not sure what really happened to the 512 node test - I
> can't find the results right now. 
> 
> One of our problems is to get the best training out of the network. So
> far, our best efforts have been when we started with temporal difference
> training, then switched to supervised training when that stalls.
> 
> So far, this has worked better than temporal difference or supervised
> training alone. We can't explain why, and it is a problem because it
> requires manual intervention. We would like a single training method
> that converges to optimal performance, because this will make it much
> easier to try new things. 
> 
> You seem to get by on purely TD training, don't you?
> 
> -- Ian




