Arimaa Forum - Print Page


    
      
        Arimaa Forum
        (http://arimaa.com/arimaa/forum/cgi/YaBB.cgi)
      

        Arimaa >> Bot Development >> bot_akimot update
        
(Message started by: sleepywind on Feb 2^nd, 2009, 2:13am)

Title: bot_akimot update
Post by sleepywind on Feb 2^nd, 2009, 2:13am

Hi all,
I am the author of bot_akimot which recently showed up on the bot rating list ...
I have started working on akimot in 09/2008 from scratch as a subject of my master thesis. This thread is meant as an update on what akimot is capable of.

Akimot

*is developed in C++ under Unix environment

* is a UCT-based bot with static evaluation.
Classical UCT tree is build up in the progress of the search, while to-the-end-of-the-game Monte Carlo simulations are replaced by fixed-length simulations followed by static evaluation. Search is step-based with simple transposition table for the UCT tree. Program runs in multiple threads - speedup is +- 1.6 in 2 threads so far. There is a simple 1-ply goal check extension done by narrowed exhaustive search.

* uses bitboard representation
Originally integer-board representation was used, but like a month ago I refactored it to use the bitboards - not like it is much faster, but the code is more elegant and in the time of 64-bit processors it is difficult to resist temptation for almost free speedup.

* uses aei interface
Getmove is implemented as well (mostly for testing against other source open bots), but for connection to the gameroom only aei is used.

* is capable of (hopefully) correct play
I haven't performed much testing so far ( basically just series of one-to-one bot matches with sample_c bot and very little bot-human games), but all rules of arimaa should be handled correctly.

Highest on todo list right now:

* static evaluation function
I don't aim for powerfull, fancy evaluation functions strong bots around are having, but something better than material sum up used now would be good. Originally I intended to let the constants in evaluation function get automatically tuned by something like SA (I have already downloaded and parsed games archive to particular positions for that purpose), but after reading about negative experiences on this forum I guess I will stick with hand tuning for now.

* knowledge to guide playouts/uct tree
Hooks for these are already done, it's the knowledge that needs to be filled. Actually in previous version of board representation I had the knowledge here as well and it showed positively in the program performance.

* simple trap search extension
I plan to apply search extensions in the uct tree right now, thus they don't have to be superb speedy.

*find a reasonable server (64-bit 2 cores) to run the bot in the gameroom for longer time ( my laptop is just too old for that :))

Performance

Initial testing was done with previous version of akimot using the integer board representation (I used adapted Fairy evaluation function + simple knowledge in playouts/tree)

* performance scales well
Not a surprise - more time makes program stronger ( 25s/move akimot defeated 5s/move akimot in more than 75% cases ).

* akimot is still very weak
Biggest success so far is 20% winning rate against sample_c bot with 2-ply search (akimot was in 5s/move) , which is rather pathetic I know :).

Small overview:
So far I have worked on the bot "parttime" (1. work, 2. school, 3. akimot ) - with like end of march this will change (1. akimot 2. akimot 3. akimot).
I expect most of the coding work to be finished ( board implementation, rules + testing, search startup, protocol implementation, ... ). Ahead is knowledge input, debugging, tweaking, testing, etc. Right now there is about 10000 loc given by wc -l (around 6000 effective I guess).

Goal
I am supposed to deliver my master thesis till end of July. I hope to achieve > 1800 with akimot till then.

Title: Re: bot_akimot update
Post by arimaa_master on Feb 2^nd, 2009, 5:07am

Nice,
good luck with your bot (you will definitely need some - 1800 rating to achieve in less then 6 months will be quite an achievement I guess).

Title: Re: bot_akimot update
Post by tize on Feb 2^nd, 2009, 8:55am

This sounds like what I'm planing for my bot_carlos, good luck with akimot.

Hopefully it will prove to be, at least, as good as alpha-beta pruned minimax.

Title: Re: bot_akimot update
Post by omar on Feb 5^th, 2009, 9:56pm

Hi sleepywind,

Thanks for giving us some information about your bot. It is always interesting to see what new ideas are being tried out. Best of luck with Akimot. Hope you will enter it into the bot tournament next year. BTW what university are you in?

Title: Re: bot_akimot update
Post by sleepywind on Feb 6^th, 2009, 6:51am

Thank you. Yes, next year I would like to enter the bot tournament. I study at Charles University at Prague/Czech Republic, my master programme is Artificial Intelligence.

Title: Re: bot_akimot update
Post by omar on Feb 14^th, 2009, 10:09pm

It will be interesting to see if an Arimaa bot using UCT will be competitive with some of the other top bots. UCT has been somewhat of a break through for Go programs, but I think jdb's attempts to use it with bot_clueless did not go well. Karl mentioned to me that in Arimaa random play would tend to encourage advancing rabbits without support; generally a bad thing. Some enhanced to UCT would probably be required to apply it to Arimaa.

Title: Re: bot_akimot update
Post by sleepywind on Feb 16^th, 2009, 2:46am

Actually, main motivation for me to start a work on a UCT-based arimaa program was to try out sucessful approaches in computer go (UCT) in a fundamentally different board game. I can only confirm what was previously written couple of times in this forum, that to-the-end-of-the-game playouts just don't work in arimaa becouse of overrated rabbit values. But in my opinion combination of (a) UCT tree + (b) short playout + (c) static evaluation might provide a good base for reasonably playing program. This way, in theory, bot could benefit from : more human-like approach of move selection (=a) than in alpha-beta, ability of potential further lookahead (=b playouts right now are set to length of 3 moves) + reasonable evaluation of positions based on piece values, trap evaluations, etc. (=c). Tactical weakness (compared to alpha beta bots) will be an issue though ...

Title: Re: bot_akimot update
Post by BlackKnight on May 2^nd, 2009, 9:30am

May I kindly ask whether you have any new information about Akimot for us? Thanks.

Title: Re: bot_akimot update
Post by sleepywind on May 20^th, 2009, 1:00am

Sorry for such a late reply. I was away for a while ...
So here is what is new:

I made some progress in akimot's performance. Now he is able to win over 95% of games against 2-ply sample C bot and 50% of games against 3s/move fairy. However (!) in both cases akimot gets a time handicap of 10s/move.
I guess I have to reevaluate my goals. To achieve 1800 till the end of the summer is probably not realistic :( I have stated a new goal - to match the fairy bot without the time handicap :).
I tried out the UCT-RAVE algorithm which ended up as a failure (somehow I wasn't surprised). The reason is I believe the little information gained from the RAVE playouts ... while the playouts are not that long (3-4 half moves) and are pressured to be quite local there are quite little moves adding RAVE infomation.
Quite a (relative) success was adding a history heuristic. The implementation is trivial - I update statistics on all the steps throughout the tree and use these as a part of exploration formula in UCT descend.
Dropping FPU and using virtual games instead (node is initialised with d games and 0 value) also made a nice improvement.
I believe the biggest potential for further improvement is in better (yet somehow random) playouts. So far I have just couple of hardcoded weights (step killing a piece is good, step with an elephant is good, step with rabbit in the end of the game is good, push/pulls are good) and when a step is supposed to be played in the playout I randomly choose a few steps and play the one with the highest weight.
Also I noted that the weakness of the program is not that it cannot find the good moves (or rather reasonable moves), but it blunders often. In a situation where there is no clear candidate it might play a blunder because the simulations get spread too wide in the tree and the final move doesn't get that much attention to reveal potential problems deeper in the tree. So it might easily happen that in relatively calm situation the program plays a move and loses a piece the very next move.
I fixed the threaded version so the playouts speedup is approximately linear, but it's strength is not up to the expectation (maybe still some bug in there ... ). I am using the "root" model - all threads run separately in their UCT trees and in the end the trees are merged and final move is selected - this was reported to work well up to 16 threads in some paper on computer go I read.

Title: Re: bot_akimot update
Post by jdb on May 20^th, 2009, 1:34am

on 05/20/09 at 01:00:06, sleepywind wrote:

I fixed the threaded version so the playouts speedup is approximately linear, but it's strength is not up to the expectation (maybe still some bug in there ... ). I am using the "root" model - all threads run separately in their UCT trees and in the end the trees are merged and final move is selected - this was reported to work well up to 16 threads in some paper on computer go I read.

This doesn't feel right to me. All threads should work together to build a single tree. This allows much more information sharing between the threads.

Title: Re: bot_akimot update
Post by sleepywind on May 20^th, 2009, 3:30am

That's my first feeling as well. The reason why I decided to implement the mentioned model is it's very positive evaluation in this paper http://www.cs.unimaas.nl/g.chaslot/papers/parallelMCTS.pdf and also it's simplicity compared to other models.

Title: Re: bot_akimot update
Post by jdb on May 20^th, 2009, 5:42am

I just read the paper you indicated. Their results do look good for the root parallel method. It still does not make sense to me, so I'll think about it some more.

Title: Re: bot_akimot update
Post by Fritzlein on May 27^th, 2009, 6:42pm

on 05/20/09 at 01:00:06, sleepywind wrote:

Also I noted that the weakness of the program is not that it cannot find the good moves (or rather reasonable moves), but it blunders often. In a situation where there is no clear candidate it might play a blunder because the simulations get spread too wide in the tree and the final move doesn't get that much attention to reveal potential problems deeper in the tree. So it might easily happen that in relatively calm situation the program plays a move and loses a piece the very next move.

I am curious about this point. If a bot can find a good move for itself, then why can't it find a good move for the opponent too? A blunder which loses a piece on the very next move should get refuted quickly and fall out of favor, right? Unless the tree doesn't get deep enough to see the refutation move, but I thought that UCT actually would build a deeper tree than vanilla alpha-beta by being more selective.

Hannoskaj has expressed great optimism about what Monte Carlo methods will be able to do, which makes me eager to understand why randomized methods haven't been able to catch up to full-width searching so far.

Title: Re: bot_akimot update
Post by sleepywind on May 28^th, 2009, 1:26am

Quote:

If the position is "calm" then a lot of moves (the tree is step-based so I mean sequences of steps here) would get a similar evaluation by UCT (the granularity is not as fine as in alpha beta). This I believe prevents selectivity and results in rather shallow wide tree. Of course that a stupid blunder gets refuted quite quickly, however if there are a lot of possible blunders it might very well happen that the time runs out during exploration of one of them. The problem then is the low confidence (number of visits) of the node representing the final move. With approx. 70000pps and 5s/move, final move with less than 5000 visits (which happens in a "calm" position) is not explored enough.

Another (maybe critical) factor is that I was doing most of the tests with fairly short time settings (3s/move or 5s/move). Well, I put up the bot online for some time now with 15s/move so we shall see :)

Title: Re: bot_akimot update
Post by Fritzlein on Jul 31^st, 2009, 5:11pm

Sleepywind, thank you for putting up akimot for public play. It looks like you are only a little short of your goal of 1800 strength (at blitz speed) by the end of July.

I must say that akimot plays more strangely than any bot except Rat. It is fun to have some extra variety in addition to the range of styles that already exists in the bot ladder.

I am curious why akimot is so prone to single-step moves. To say that there are no clear objectives in the position is not enough explanation. A random mover with no objective whatsoever will almost always play a four-step move simply because the four-step moves are most numerous. Therefore akimot must have something biasing it towards single-step moves.

Again, thank you for the public service of offering your bot in the game room.

P.S. If you are supposed to give your thesis at the end of July, does that mean development on akimot stops tomorrow?

Title: Re: bot_akimot update
Post by sleepywind on Aug 1^st, 2009, 12:39am

Hello,
first of alll, thank you for encouragement :)

Quote:

Sleepywind, thank you for putting up akimot for public play. It looks like you are only a little short of your goal of 1800 strength (at blitz speed) by the end of July.

It's difficult for me to estimate the strenght of the engine, as I myself am not very experienced in the game, but I think it's real strength is far below the 1800 (maybe around 1600?).

Quote:

I must say that akimot plays more strangely than any bot except Rat. It is fun to have some extra variety in addition to the range of styles that already exists in the bot ladder.

:) Well I am afraid the engine is quite weak in tactics (not that it's any stronger in strategy) and quite often it just "doesn't know" what is it doing.

Quote:

I am curious why akimot is so prone to single-step moves. To say that there are no clear objectives in the position is not enough explanation. A random mover with no objective whatsoever will almost always play a four-step move simply because the four-step moves are most numerous. Therefore akimot must have something biasing it towards single-step moves.

The reason for this could be following: the UCT tree is step-based and I am treating pass as any other step (except for demotivation to play it in node's initialization). Now comes the catch - in the end of the search when a move to play is to be outputted, I take all the legal moves and select the one with most visits. If the position is rather "calm" then the move with pass on the second level (step1-pass) can get more visits then all the moves on the fourth level (step1-step2-step3-step4) even though their 'shorter version' (step1-step2) outnumbers the passing move by a lot. Solution to this might be different approach to selecting a final move (something like start in the root, take step with most visits and descend until it's not opponent's turn).

For past month and a half I've been writing the thesis and doing some offline performance tests against anchor bots (mostly bot fairy). There was very little progress in the code in this period... I am handing my theses this week and here is what I would like to do (hopefully sometimes in the very near future):
1) provide the thesis for public download (no big discoveries there - summary of MCTS methods, how we applied them in the Arimaa environment, some of our improvements and performance analysis)
2) release the akimot code (GPL?) - if someone is toying around with idea of writing a MCTS arimaa bot this could be a good starting point for him. There is not much documentation though (except for short comments on methods in header files), tragically little unit tests and probably a lot of bugs :). On the other hand it should be a use-out-of-the-box bot with reasonable configuration options and hopefully managable code.
3) i have written couple of small apps during the course of work on the project, for instance: simple game replaying GUI, simple positions testing framework. These are written in python and are using excellent AEI module (thanks Janzert). I will "polish" these a bit and provide them for Arimaa developers as well, if there will be interest ...

Quote:

Again, thank you for the public service of offering your bot in the game room.

P.S. If you are supposed to give your thesis at the end of July, does that mean development on akimot stops tomorrow?

No, the development of akimot doesn't stop here. Even though the overall performance of akimot (coming from my programming inaptitude or impropriety of the MCTS approach or both) was a bit disappointing to me. I learned a lot during its development and I still have some ideas ...

Title: Re: bot_akimot update
Post by sleepywind on Dec 6^th, 2009, 3:25am

Hi everyone,
unfortunately I have been quite busy with work lately and it seems to hold for (near) future as well. Therefore I am not spending much time (hardly any :() on akimot's development. On the other hand I have finally forced myself to fulfill my promise and provide the resources behind akimot to the community. So here we go:

My masters thesis about akimot can be downloaded at http://kozelek.cz/akimot/mt.pdf, short presentation is at http://kozelek.cz/akimot/mtp.pdf. I have also created an online source code repository for the project at GitHub:http://github.com/tomik/akimot. This repository is a full copy of my development repository for akimot. User manual (http://kozelek.cz/akimot/doc/usr/usr_doc.pdf) gives basic information about the project, it's layout and it's support applications (i.e. the testing suite, the gui, etc.). Programming manual(http://kozelek.cz/akimot/doc/prg), on the other hand, gives brief overview over main classes and their interaction (this manual can be generated with doxygen from the source code). Majority of code is my own except from excellent AEI library by Janzert, perl match scripts by Omar and pieces of C/C++ code from other open source engines. The project is released under GNU GPL, but if the license is too restrictive for someone, let me know and we can talk about that :)

I guess the project could be useful mainly for beginning Arimaa programmers, but maybe the "experienced guys" can find some ideas as well. I welcome any questions/comments regarding the project. I hope to keep the development alive, but probably in a very slow pace. I think that this way the project is way more useful.

Title: Re: bot_akimot update
Post by Fritzlein on Dec 6^th, 2009, 12:05pm

It is very good of you to donate to the community, sleepywind, especially since your bot is a completely different approach. Developers who are interested in getting started with UCT rather than alpha-beta search now have a nice resource.

Title: Re: bot_akimot update
Post by Fritzlein on Dec 6^th, 2009, 1:25pm

I have now read your thesis. I think it must be an even more valuable contribution than your code, in part because the code would be very difficult to understand without the thesis, but also because many people with no use for the code will be able to benefit from your clear exposition of your motivations and results. It was a good read for me even though I have never developed a bot and probably never will.

I had done some minimal reading about Monte Carlo Tree Search, but now I better understand how unsettled the field is, and how much room remains for optimizing the basic premise. If I had been tuned in to the research all along, I would not have been perpetually expecting David Fotland to return to Arimaa programming. MCTS was not just a one-time surprise that Fotland had to catch up to, it has been an ongoing field for exploration with rich payoffs for optimizers. I will not hope for Fotland to revisit Arimaa until progress in MCTS Go has plateaued.

One nugget of information in your results that quite surprises me is the extreme benefit bot_akimot derived from time doubling. From your graph I infer that it gained about 130 Elo points of playing strength per doubling. This is twice what clueless gained in my one experiment. Also it is apparently more than bot_fairy was gaining per doubling. If the trend continues, then one would expect developers to eventually be forced away from alpha-beta to MCTS when the hardware gets fast enough.

One question which I am extremely curious to have answered was unfortunately not one of your research goals. You seemed focused on comparing MCTS in Go to MCTS in Arimaa. I would be very interested also in comparing MCTS in Arimaa to alpha-beta in Arimaa.

In particular, the strength of bot_akimot is apparently very dependent on the strength of its hand-coded evaluation. Was bot_akimot unable to equal bot_fairy due to fairy's superior evaluation, or due to the the superiority of full-width search over random search? You could have pitted bot_akimot against an alpha-beta searcher with akimot's evaluation and 4-step goal and capture detection. If the alpha-beta searcher did better, then it would seem that MCTS is a handicap rather than a benefit for Arimaa.

To put it another way, my question is whether random playouts must give reasonable evaluations for MCTS to be a valuable technique. When you limit the playouts and introduce static evaluation, have you inherently destroyed the reason to use MCTS? Perhaps you know the answer to this already from reading about MCTS for Lines of Action, Amazons, and Hex.

Thank you again for your interesting research and lucid presentation.

Title: Re: bot_akimot update
Post by Fritzlein on Dec 6^th, 2009, 1:30pm

on 02/06/09 at 06:51:30, sleepywind wrote:

Yes, next year I would like to enter the bot tournament.

Are you still planning to enter the 2010 Computer Championship? I hope so, although the new qualifying procedures increase the work that is required to do so.

Title: Re: bot_akimot update
Post by omar on Dec 7^th, 2009, 12:30am

Wow, great Thesis Tomas; very interesting. Thanks for making your bot available for others. It will be a great example for others wanting to try the MCTS approach.

I have added it to the 'Academic Papers' page:
http://arimaa.com/arimaa/papers/

and also to the 'Downloads' page:
http://arimaa.com/arimaa/download/

Title: Re: bot_akimot update
Post by sleepywind on Dec 10^th, 2009, 6:24am

Quote:

One question which I am extremely curious to have answered was unfortunately not one of your research goals. You seemed focused on comparing MCTS in Go to MCTS in Arimaa. I would be very interested also in comparing MCTS in Arimaa to alpha-beta in Arimaa.

In particular, the strength of bot_akimot is apparently very dependent on the strength of its hand-coded evaluation. Was bot_akimot unable to equal bot_fairy due to fairy's superior evaluation, or due to the the superiority of full-width search over random search? You could have pitted bot_akimot against an alpha-beta searcher with akimot's evaluation and 4-step goal and capture detection. If the alpha-beta searcher did better, then it would seem that MCTS is a handicap rather than a benefit for Arimaa.

This is a good point and I probably should have invested more effort in comparing MCTS and alpha-beta approaches. On the other hand experiments against bot_fairy reveal some information. If you check the "Scalability test" you can see that on reasonable time conditions (16s) akimot is almost equal to bot_fairy (over 45% winning rate). Moreover the graph suggests that akimot's winning ratio might have rising tendency for longer times. This variant of bot_fairy has "full" evaluation which, as you said, is superior to the akimot's one (I haven't invested much effort into a good evaluation function). The fact that akimot can match alpha-beta engine with superior evaluation is encouraging for MCTS approach.
Moreover I suspect a lot of potential improvements in the akimot's approach - most of these are described in "future work" section. Just a small example: after I handed my thesis I made a small experiment and I have implemented a "relative evaluation of the position" (position value is taken with respect to value in the root node) - this took around 5 lines of code to be changed and the performance of bot improved significantly (over 90% winning against bot_fairy_3s on 16s/move instead of previous 75%).
But one think is quite clear to me now. MCTS doesn't feel that "right" (or natural if you want) for Arimaa as it does for Go. A lot of factors contribute to this: longer playouts are inaccurate, 4 steps in a move make complications, repetitions are way more difficult to handle, etc.

Title: Re: bot_akimot update
Post by Fritzlein on Dec 10^th, 2009, 10:13am

Thanks for the additional observations. Clearly the last word on MCTS versus alpha-beta tree search has not been written. It is fun to track the experiments in such an interesting discipline.

From a couple of Google searches, it appears that the level of the best bots in Hex was raised by MCTS, but not in Lines of Action or Amazons, at least not yet. The (champion) MCTS Hex program uses playouts to game end for evaluation, though, which does nothing to help answer the question of how useful MCTS can be when evaluations must be imposed before game end.

Title: Re: bot_akimot update
Post by Hippo on Feb 1^st, 2010, 1:50pm

It would be interesting experiment to let play two bots with the same evaluation function, but one of them alpha-beta and the other MCTS (of course it is not so easy as the move generators algorithms must differ a lot).

I still think the MCTS will dominate in such game.

The important thing would be in the time planning strategy.

Actually it may be valuable even for MCTS to build time reserve in "easy" situations and go to reserve in the more difficult ones, but stopping in given time is much easier in MCTS than in alpha-beta where it's must to decide at which level of search to stop.

Title: Re: bot_akimot update
Post by Fritzlein on Feb 1^st, 2010, 3:41pm

on 02/01/10 at 13:50:42, Hippo wrote:

So, can you use your academic authority to mandate this experiment? :) BlackKnight is forcing his entire AI class to write alpha-beta searchers, so there is a precedent...

Title: Re: bot_akimot update
Post by Hippo on Feb 3^rd, 2010, 12:54am

I will certainly try, but we should wait for such thesis to be written.
We can wait 1 or 2 years for results :(.

Title: Re: bot_akimot update
Post by Janzert on Feb 3^rd, 2010, 5:26pm

on 02/01/10 at 13:50:42, Hippo wrote:

Hmm, I expect that an eval developed with one search type and then used in another search type will play worse than an eval developed under the search type it is used with. Whichever direction that development then use transfer was to take place in.

My belief on this arises from the poor experiences of people trying to take chess evals developed with PVS search and use them with MTD-f. Even this relatively minor change in search behavior seems to really require somewhat different evaluation behavior (for instance MTD-f seems to like a larger score granularity). I would expect the change from MCTS to alpha-beta to be even greater.

Janzert