Arimaa Forum (http://arimaa.com/arimaa/forum/cgi/YaBB.cgi)
Arimaa >> Bot Development >> Building a MLP evaluation function
(Message started by: Ikki on Jun 21st, 2013, 10:40pm)

Title: Building a MLP evaluation function
Post by Ikki on Jun 21st, 2013, 10:40pm
Hello all,

I'm trying to build my own evaluation function for an Arimaa bot.
I'm having trouble coming up with a good way to make my NN learn.

For example, I'd like to be able to provide it with thousands of positions, each with an accurate evaluation, so that it learns to tell strong positions from bad ones.

The problem, of course, is that I'm not able to provide these evaluations: it would require me to be an Arimaa GM, and even then it would take decades to label enough positions for my NN.

One of my ideas was to feed my NN positions labeled with something that I would call a "linear evaluation". It would work this way:

Imagine a game between A and B. A plays Gold and B plays Silver. B wins after 50 turns.

After turn 0 (setup), the position is even (50/50).
After turn 1 (1 move each side), the position is 1/50 in favor of B (the winner).
After turn 2, it is 2/50 in favor of B.
Etc.
Until turn 50, when B has won, which means that B's position is worth 100% and A's position is worth 0.

So it assumes that the winning player's advantage grows linearly. It assumes that after 1 turn the winning player already has an advantage, which can be true sometimes but will be wrong most of the time. But maybe on average this can be a good method, given enough games and positions.
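
To make the labeling concrete, here is a minimal Python sketch of it. The function name linear_labels and the exact scaling are my own assumptions: I read "t/50 in favor of B" as the winner's score moving linearly from 0.5 at setup to 1.0 at the final turn, with the loser getting the complement. Other readings of the scheme are possible.

```python
def linear_labels(num_turns, winner):
    """Return one (gold_score, silver_score) pair per turn, for turns 0..num_turns.

    Assumed scaling: the winner's score rises linearly from 0.5 (setup)
    to 1.0 (final turn); the loser's score is 1 minus the winner's score.
    """
    if num_turns <= 0:
        raise ValueError("num_turns must be positive")
    if winner not in ("gold", "silver"):
        raise ValueError("winner must be 'gold' or 'silver'")

    labels = []
    for turn in range(num_turns + 1):
        advantage = turn / num_turns          # 0.0 at setup, 1.0 at the end
        winner_score = 0.5 + 0.5 * advantage  # 0.5 -> 1.0
        loser_score = 1.0 - winner_score      # 0.5 -> 0.0
        if winner == "gold":
            labels.append((winner_score, loser_score))
        else:
            labels.append((loser_score, winner_score))
    return labels


# The example game: B plays Silver and wins after 50 turns.
labels = linear_labels(50, "silver")
```

Each recorded position of the game would then be paired with the label for its turn and used as one training example for the NN.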

Any thoughts on how good or bad this could be?

Title: Re: Building a MLP evaluation function
Post by Fritzlein on Jun 22nd, 2013, 12:14am
Have you read up on temporal difference learning?

http://en.wikipedia.org/wiki/Temporal_difference_learning

It was the idea behind TD-Gammon, which was, for a time, the best backgammon bot in the world.

http://en.wikipedia.org/wiki/TD-Gammon
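
To give a rough idea of the difference from labeling every position by hand, here is a minimal sketch of the simplest variant, TD(0), with a linear evaluator. This is not TD-Gammon's actual setup (that used a neural network and TD(lambda)); the function names, the feature vectors, and the learning rate alpha are all hypothetical. The point is only that each position is nudged toward the evaluation of the next position, and the last position is nudged toward the real game result.

```python
def dot(weights, features):
    """Linear evaluation: weighted sum of the position's features."""
    return sum(w * x for w, x in zip(weights, features))


def td0_update(weights, features, next_features, outcome, alpha=0.1):
    """One TD(0) step for a linear evaluator.

    The target is the evaluation of the next position, or the game
    outcome (e.g. 1.0 for a win, 0.0 for a loss) if the game is over.
    """
    value = dot(weights, features)
    if next_features is None:
        target = outcome                      # terminal: learn from the result
    else:
        target = dot(weights, next_features)  # bootstrap from the next position
    error = target - value
    return [w + alpha * error * x for w, x in zip(weights, features)]


def train_on_game(weights, positions, outcome, alpha=0.1):
    """Apply TD(0) updates along one game, given as a list of feature vectors."""
    for i, features in enumerate(positions):
        is_last = i + 1 == len(positions)
        next_features = None if is_last else positions[i + 1]
        weights = td0_update(weights, features, next_features, outcome, alpha)
    return weights
```

With this kind of update the only label you need per game is who won; the intermediate evaluations are learned rather than supplied.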

On a cautionary note, make sure to read haizhi's thesis.  His learning bot became superstitious, i.e. the weightings on the learned evaluation were unrelated to good play.

http://arimaa.com/arimaa/papers/HaizhiThesis/haizhiThesis.doc

Title: Re: Building a MLP evaluation function
Post by mattj256 on Jun 25th, 2013, 2:13am
A different thread (http://arimaa.com/arimaa/forum/cgi/YaBB.cgi?board=devTalk;action=post;num=1206411478;quote=0;title=Post+reply;start=0) you might find interesting:


on 03/24/08 at 21:17:58, IdahoEv wrote:
I have a stored database of all boards that have existed in the game (at least through the last time I ran the updater), meaning every step of every turn of every game saved independently as a board description.

I am mining this database to work on the eval function of my bot.   What would be nice to have (but is of course impossible) is some a priori measure of which player is winning  -- and ideally by how much -- that could be used to train an eval function to model that variable.


