        Arimaa Forum
        (http://arimaa.com/arimaa/forum/cgi/YaBB.cgi)
      

        Arimaa >> General Discussion >> Gathering statistics on B/W imbalance
        
(Message started by: gatsby on Jun 30th, 2008, 2:22am)

Title: Gathering statistics on B/W imbalance
Post by gatsby on Jun 30th, 2008, 2:22am

Some days ago, I read at http://www.rybkachess.com/index.php?auswahl=Rybka+2.3+readme that Rybka, the world's best chess engine (it is said to play at a level of around 3100 Elo points on a fast machine) incorporates a feature which is called the "randomizer". Rybka's creator, Vasik Rajlich, explains it as follows:

"There is one interesting new feature, which we will call the "randomizer". The idea is to allow a user to play many games from a single starting position in order to collect statistics about that position. A randomized Rybka will keep track of the previous games and not repeat previous variations, so that a match between two randomized Rybkas will systematically explore the space of variations from the starting position. [When you set up this kind of match,] make sure that you ask for many games. I suggest somewhere around 500. [Also,] I can suggest very fast games, for example use a fixed-depth of 6, 7 or 8. Once the match is finished, you will have a set of games. This is useful [because] the statistics on this set of games are in my experience a fairly reliable indicator of the evaluation of the position."

Note that a "randomized Rybka" is not entirely random: it chooses its moves among those that fall inside some centipawn threshold margin, which Rajlich recommends setting to 10.
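
To make the idea concrete, here is a minimal sketch of how such a selection could work. It is not Rybka's actual code: the function name, the input format, and the assumption that the margin is measured from the best move's score are all mine, for illustration only.

```python
import random

def pick_randomized_move(scored_moves, margin_cp=10, played=frozenset(), rng=random):
    """Pick a random move among those within margin_cp of the best score.

    scored_moves: list of (move, score_in_centipawns) pairs (hypothetical format).
    played: moves already tried from this position in earlier games.
    """
    if not scored_moves:
        raise ValueError("no legal moves")
    best = max(score for _, score in scored_moves)
    # Assumption: the margin is measured relative to the best move's score.
    candidates = [m for m, s in scored_moves if best - s <= margin_cp]
    # Prefer candidates not yet explored, so repeated games branch out
    # instead of replaying the same variation.
    fresh = [m for m in candidates if m not in played]
    # Fall back to all candidates once every one of them has been tried.
    return rng.choice(fresh or candidates)
```

With a margin of 10, a move scored 5 centipawns below the best one is eligible, while a move 25 centipawns below is never chosen.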

This said, and taking into account that bots aren't providing valuable statistics on whether Gold or Silver has an initial advantage, due to their lack of appropriate use of the set-up phase:

Couldn't we create a database with the positions immediately after the set-up phase of all the human versus human games, and then set up a match between two "randomized Bombs" in a way that they play (say) 50 fast games with every position present in it? The statistical analysis of this match would give us valuable information on whether Gold or Silver has an initial advantage, and by how much. The results wouldn't be definitive, especially given that Bomb is not nearly as good at Arimaa as Rybka is at chess, but, anyway, it would be interesting to get this information before Arimaa grows (more) in popularity. If we need to change some rules (I hope not) to keep the game balanced, the moment is now.
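
As a rough sketch of what the statistical analysis could look like, the snippet below pools the results over all starting positions and puts a Wilson score interval around Gold's win rate. The function names and the input format are hypothetical, and it assumes every game ends with a winner. Pooling also treats all games as independent, which is a simplification when many games share one setup.

```python
import math

def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a win rate (z=1.96 gives roughly 95%)."""
    if games <= 0:
        raise ValueError("need at least one game")
    p = wins / games
    denom = 1 + z * z / games
    center = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return center - half, center + half

def gold_win_summary(results_by_setup):
    """Pool results over all setups.

    results_by_setup: dict mapping a setup id to (gold_wins, games_played);
    this format is an assumption made for the sketch.
    Returns (gold_win_rate, interval_low, interval_high).
    """
    wins = sum(w for w, _ in results_by_setup.values())
    games = sum(g for _, g in results_by_setup.values())
    low, high = wilson_interval(wins, games)
    return wins / games, low, high
```

If the resulting interval contains 0.5, the match gives no evidence of an advantage for either side, at least as far as the bots' play can measure it.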

Title: Re: Gathering statistics on B/W imbalance
Post by aaaa on Jun 30th, 2008, 3:03am

If even highly skilled players have barely a clue as to what the advantage is of playing one color over the other, I sincerely doubt that much weaker bots will be able to find this out by semi-randomly playing out games.

Title: Re: Gathering statistics on B/W imbalance
Post by Fritzlein on Jun 30th, 2008, 6:19am

on 06/30/08 at 02:22:11, gatsby wrote:

This said, and taking into account that bots aren't providing valuable statistics on whether Gold or Silver has an initial advantage, due to their lack of appropriate use of the set-up phase:

Couldn't we create a database with the positions immediately after the set-up phase of all the human versus human games, and then set up a match between two "randomized Bombs" in a way that they play (say) 50 fast games with every position present in it?

I concur with aaaa's comment that Bomb isn't strong enough for the results to be meaningful. For example, one strategic tendency of Bomb is that it prefers to group its camel and both horses all around the same trap. This is sometimes a good attacking idea, but more often is a liability because it leaves a weak side. (Arimaabuff exploited this weakness of Bomb's, among many others, in his record-setting handicap game.) In many games Bomb never achieves this clumping of strong pieces, because they are split in the setup, and the opportunity to rearrange them never arises, but the tendency is nevertheless there.

If Silver has an inherent advantage, it must be due to having the second setup. Let's say, for example, that Silver can gain tempo by placing a camel on one wing and both horses on the other, but Gold cannot. In order to gain an advantage from unbalanced forces like this, Silver must be willing to dart forward with his camel when Gold's elephant is preoccupied on the other side, and jump back to safety when the gold elephant crosses or threatens to cross. Bomb does not do this: it keeps its camel safe except in extreme situations. If you forced Bomb to start with M on one side and HH on the other, it would only make it easier than usual for Bomb to achieve the MHH clumping all on one side.

Similar objections apply to rabbit placement in the setup. Perhaps swarming is good in some situations and bad in others. However, Bomb does not swarm. You can force it to set up its rabbits however you like, but it still won't swarm. It won't play differently even when a different situation demands it.

In short, forcing Bomb to play from different Silver setups will not be a good indicator of the value of the second setup. Ultimately, if such a test measures anything at all, it will measure the value of the initiative of the first move, and even that will be questionable.