Title: Monte-Carlo simulation balancing
Post by Hannoskaj on Jun 1st, 2009, 1:41am

If anyone here is trying to write an MC(TS) bot and does not read the computer go mailing list, here is an article that should prove useful. They have formalised the intuition that MC playouts should be as unbiased as possible: mc-6x6.pdf (http://computer-go.org/pipermail/computer-go/2009-April/018159.html)
Title: Re: Monte-Carlo simulation balancing
Post by Fritzlein on Jun 8th, 2009, 7:31pm

Thanks for the interesting link, Hannoskaj. Let me see if I can apply it to the case of Arimaa:

Monte Carlo search works badly for Arimaa because advanced rabbits are grossly over-valued. With random playouts, there is too great a chance that an advanced rabbit will goal by accident, when it could easily be stopped. Therefore Monte Carlo Arimaa players will fling forward their rabbits recklessly, always thinking it just might produce a goal.

An obvious approach to correcting the problem would be to make playouts less random by making them stronger. The thought is that if the two players are not quite so moronic, then advanced rabbits are less likely to score by accident. Indeed, taking the "strong playout" principle to an extreme, if we played out every position perfectly, there would be no accidental wins whatsoever. With perfect playouts, Monte Carlo evaluations would become perfect.

(On the other hand, if we could play out positions perfectly, we wouldn't need Monte Carlo methods, would we? I have always felt some sort of paradox was present when people have suggested that reducing the randomness of random playouts would fix the brokenness of Monte Carlo searches in Arimaa.)

This is where Silver and Tesauro come in. They point out that playouts can be strong, but as long as they are not perfect, they can still be biased. This seems perfectly applicable to Arimaa, wherein an element of randomness is still going to bias evaluations in favor of advanced rabbits even if the playouts are relatively less stupid. That bias is still going to mess up playout-produced evaluation. The problem seems insurmountable to a schloob like me, but these geniuses notice that the bias can be corrected by weakening the play of the side that is getting too many accidental wins.
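To put rough numbers on the advanced-rabbit argument, here is a minimal Python sketch of a toy model. Everything in it is an assumption made for illustration, not something taken from Silver and Tesauro's paper or from any Arimaa bot: the function `playout_value`, the per-round probabilities `defend` and `attack`, and the convention that a game with the threat removed is scored as even (0.5). With correct defence the rabbit is always stopped, so the true value in this toy is 0.5, and anything above that is playout bias.

```python
def playout_value(defend: float, attack: float, rounds: int = 10) -> float:
    """Expected playout result for the side with the advanced rabbit.

    Toy model (hypothetical): each round the defender first finds the
    blocking move with probability `defend`; if so, the threat is over and
    the game is scored as even (0.5). Otherwise the attacker finds the goal
    with probability `attack` and wins (1.0). If neither happens within
    `rounds` rounds, the game is also scored as even.
    """
    value = 0.0
    still_open = 1.0  # probability the threat is neither blocked nor converted
    for _ in range(rounds):
        value += still_open * defend * 0.5   # blocked: even game
        still_open *= 1.0 - defend
        value += still_open * attack * 1.0   # accidental goal
        still_open *= 1.0 - attack
    return value + still_open * 0.5          # threat fizzles out


TRUE_VALUE = 0.5  # assumed: with correct defence the rabbit never goals

for label, d, a in [
    ("near-random playouts", 0.05, 0.05),
    ("strong but imperfect playouts", 0.80, 0.80),
    ("weak defender, weaker attacker", 0.05, 0.00),
]:
    bias = playout_value(d, a) - TRUE_VALUE
    print(f"{label:32s} bias = {bias:+.3f}")
```

In this toy, making both sides stronger shrinks the overvaluation but does not remove it unless the defender is perfect, whereas handicapping the attacker removes it even though the defender stays very weak. Setting `attack` to zero is of course a crude stand-in for "weakening the side that gets too many accidental wins".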
Monte Carlo evaluations are not wrecked by playouts being idiotic per se, as long as the playouts are equally idiotic by both players. Therefore, we need only take care that the defender isn't playing more idiotically than the attacker. A very weak playout function can produce accurate evaluations when the playout function isn't biased towards either side.

This is brilliant, and I tip my hat to Silver and Tesauro. Unfortunately, my mathematics isn't quite adequate to understand how they propose to eliminate the playout bias. They apparently need some strong evaluator to let them know in which direction their playouts are being biased. I don't immediately see why the bias-correction service that is performed by the strong evaluator (in conjunction with Monte Carlo playouts) should create a better player than the strong evaluator itself (in conjunction with alpha-beta search).

Nevertheless, although I can't see the whole theoretical path to a strong Monte Carlo Arimaa player, my mind is more open to the possibility than it has been in a while. Thanks again, Hannoskaj.
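As a rough picture of how a strong evaluator could tell the playouts which way they are biased, here is a one-parameter Python sketch. It is not the algorithm from the paper, only plain gradient descent on the squared gap between a playout average and a reference value. The race model, the names `playout_mean` and `balance`, and the reference value of 0.5 (standing in for "the strong evaluator says this position is even") are all assumptions made for illustration.

```python
def playout_mean(gold: float, silver: float, rounds: int = 20) -> float:
    """Expected result for Gold in a toy race (hypothetical model).

    Each round Gold converts a winning chance with probability `gold`,
    then Silver with probability `silver`. The first conversion wins;
    if nobody converts within `rounds` rounds the game is scored 0.5.
    """
    mean, alive = 0.0, 1.0
    for _ in range(rounds):
        mean += alive * gold      # Gold wins: result 1.0
        alive *= 1.0 - gold
        alive *= 1.0 - silver     # Silver wins: result 0.0, adds nothing
    return mean + alive * 0.5


def balance(reference: float, gold: float, silver: float,
            steps: int = 200, rate: float = 0.1) -> float:
    """Tune Gold's parameter so the playout mean matches `reference`.

    Minimises (playout_mean - reference)^2 by gradient descent, using a
    central-difference estimate of the slope with respect to `gold`.
    """
    eps = 1e-6
    for _ in range(steps):
        error = playout_mean(gold, silver) - reference
        slope = (playout_mean(gold + eps, silver)
                 - playout_mean(gold - eps, silver)) / (2 * eps)
        gold -= rate * 2.0 * error * slope
        gold = min(max(gold, eps), 1.0 - eps)  # keep it a valid probability
    return gold


silver = 0.3
print(f"equally skilled playouts: mean = {playout_mean(silver, silver):.3f}")
balanced = balance(reference=0.5, gold=silver, silver=silver)
print(f"balanced gold parameter: {balanced:.3f}, "
      f"mean = {playout_mean(balanced, silver):.3f}")
```

With equal per-move skill, Gold is overvalued here simply because it gets the first chance each round, and the reference value shows which side to weaken. In this toy the balanced setting works out to `silver / (1 + silver)`, so the playouts become accurate by making Gold slightly worse than Silver, not by making either side stronger.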
Arimaa Forum