Arimaa Forum (http://arimaa.com/arimaa/forum/cgi/YaBB.cgi)

Arimaa >> General Discussion >> Measure stereotyped openings
(Message started by: Fritzlein on Apr 8^th, 2008, 2:24pm)

Title: Measure stereotyped openings
Post by Fritzlein on Apr 8^th, 2008, 2:24pm

In another thread, gatsby suggested a way to avoid stereotyped openings. At present, standardized play seems to be a total non-issue. Can we quantify this somehow?

The metric I would like is to look at all rated games between humans in the entire database, and see how quickly they diverge into distinct positions. There might be 5000 such games, and the statistic might look like this:

After 1w, there are 50 distinct positions, 4950 duplicates
After 1b, there are 1000 distinct positions, 4000 duplicates
After 2w, there are 4000 distinct positions, 1000 duplicates
After 2b, there are 4850 distinct positions, 150 duplicates
After 3w, there are 4980 distinct positions, 20 duplicates
After 3b, there are 4998 distinct positions, 2 duplicates
After 4w, there are 5000 distinct positions, 0 duplicates

I'm curious what the exact shape of this curve will be, and what is the longest that two games have coincided. It would be great marketing to say that no two games have ever been identical after four moves.

Generating these numbers is beyond me, but would it be super-much trouble for someone who already has a game database and a hashing function? Janzert? 99of9? Actually I have a vague recollection that 99of9 already did something like this to generate an opening book for Gnobot...
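
For readers who want to reproduce the proposed metric, here is a minimal sketch in Python. It assumes each game has already been reduced to a list of hashable position keys, one per ply (the position after 1w, after 1b, and so on). The names `ply_label`, `divergence_table`, and `last_shared_ply` are illustrative only and do not come from any existing Arimaa tool.

```python
from collections import Counter


def ply_label(ply):
    """Turn a 0-based ply index into a label such as '1w', '1b', '2w'."""
    return f"{ply // 2 + 1}{'w' if ply % 2 == 0 else 'b'}"


def divergence_table(games):
    """Return (label, total, distinct, duplicates) for every ply.

    `games` is a list of games; each game is a list of hashable position
    keys, one per ply. Games that ended earlier are skipped at later plies.
    """
    longest = max((len(g) for g in games), default=0)
    rows = []
    for ply in range(longest):
        positions = [g[ply] for g in games if len(g) > ply]
        counts = Counter(positions)
        total = len(positions)
        distinct = len(counts)
        rows.append((ply_label(ply), total, distinct, total - distinct))
    return rows


def last_shared_ply(games):
    """Label of the last ply at which two or more games coincide, or None."""
    shared = [label for label, _, _, dup in divergence_table(games) if dup > 0]
    return shared[-1] if shared else None


# Toy data: three made-up games, not taken from the Arimaa database.
toy_games = [["A", "A1", "A1x"], ["A", "A1", "A1y"], ["B", "B1"]]
for row in divergence_table(toy_games):
    print(row)
print("last shared ply:", last_shared_ply(toy_games))
```

Because the keys are positions rather than move lists, two games that reach the same board by different step orders count as duplicates.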

Title: Re: Measure stereotyped openings
Post by Fritzlein on Apr 8^th, 2008, 2:48pm

Actually, a much cooler measure of how stereotyped openings are would be Claude Shannon's measure of entropy. If there are only ten setups that have ever been tried for Gold, there is less diversity (entropy) if the distribution is

4991, 1, 1, 1, 1, 1, 1, 1, 1, 1

and more diversity if the distribution is

500, 500, 500, 500, 500, 500, 500, 500, 500, 500

even though the two distributions each have 10 distinct positions and 4990 duplicates. By Shannon's measure the first distribution has entropy 0.025 and the second has entropy 3.32. If we get up to five thousand distinct positions having occurred once each, the entropy is 12.29.

The second nice feature of using Shannon entropy is that one could compare it to a selection of 5000 random chess games from some database, and see the relative speed at which the two games break away from the grip of the opening and into something unique.
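
The three entropy figures quoted in this post can be checked with a short Python sketch. It assumes entropy is measured in bits (base-2 logarithm), which is consistent with the quoted values; the function name `shannon_entropy` is illustrative.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy in bits of a distribution given as occurrence counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


skewed = [4991] + [1] * 9      # one dominant setup, nine one-offs
even = [500] * 10              # ten equally popular setups
all_unique = [1] * 5000        # every game reaches a different position

print(round(shannon_entropy(skewed), 3))      # about 0.025
print(round(shannon_entropy(even), 2))        # about 3.32
print(round(shannon_entropy(all_unique), 2))  # about 12.29
```

When every position is unique, the entropy equals lg(N) for N games, which is why it is capped by the size of the database.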

Title: Re: Measure stereotyped openings
Post by aaaa on Apr 8^th, 2008, 2:49pm

You're getting ahead of yourself there. So far, only 3772 rated games have been played between humans. I frankly think that's too few to get any meaningful statistics about the diverging nature of Arimaa.

Title: Re: Measure stereotyped openings
Post by Fritzlein on Apr 8^th, 2008, 6:22pm

Yes, it is far too early to know if Arimaa will be prone to stereotyped openings once we know how to play it well. Even if we fall into repeating openings some day, it may be positions that aren't even in the database yet. I guess I just wanted to say how divergent openings are so far...

Title: Re: Measure stereotyped openings
Post by woh on Apr 10^th, 2008, 6:15am

As of March 31, 3745 rated HvH games have been played. Only 3719 lasted until turn 4b. 3701 board positions occurred in only one of those games, and 9 board positions occurred in two games each. So after 4b, there are 3710 distinct positions and 9 duplicates.

More results:

      total  distinct  duplicate
1w    3745   605       3140
1b    3745   1977      1768
2w    3745   2547      1198
2b    3745   2993      752
3w    3745   3459      286
3b    3738   3641      97
4w    3726   3693      33
4b    3719   3710      9

Title: Re: Measure stereotyped openings
Post by woh on Apr 10^th, 2008, 6:22am

The distribution of the positions after each move is:

1w: 784 x1, 778 x1, 273 x1, 140 x1, ..., 4 x10, 3 x31, 2 x78, 1 x420
1b: 195 x1, 188 x1, 178 x1, 143 x1, ..., 4 x27, 3 x47, 2 x135, 1 x1680
2w: 103 x1, 94 x1, 57 x1, 56 x1, ..., 4 x26, 3 x61, 2 x148, 1 x2155
2b: 34 x1, 29 x1, 26 x1, 24 x1, ..., 4 x26, 3 x47, 2 x148, 1 x2723
3w: 9 x1, 8 x3, 7 x4, 6 x1, 5 x6, 4 x9, 3 x28, 2 x121, 1 x3283
3b: 5 x2, 3 x11, 2 x67, 1 x3561
4w: 3 x3, 2 x27, 1 x3663
4b: 2 x9, 1 x3701

Title: Re: Measure stereotyped openings
Post by woh on Apr 10^th, 2008, 6:30am

To clarify the data in the previous post:

After 3b, 2 board positions occurred in 5 games each, 11 positions in 3 games each, 67 positions in 2 games each, and 3561 positions in only one game, for a total of 3738 games.

The entropy after each turn is:
1w 5.6121 (max 11.87)
1b 9.4795 (max 11.87)
2w 10.4058 (max 11.87)
2b 11.2211 (max 11.87)
3w 11.6808 (max 11.87)
3b 11.8120 (max 11.868)
4w 11.8451 (max 11.8634)
4b 11.8559 (max 11.8607)
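
The last three entropy values can be recomputed directly from the "games x positions" summaries given earlier in the thread. The sketch below assumes that each pair reads as (games sharing a position, number of such positions) and that the quoted maximum is lg(number of games); the function name `entropy_from_summary` is illustrative.

```python
import math


def entropy_from_summary(summary):
    """Return (entropy, max_entropy) in bits.

    `summary` is a list of (games_per_position, number_of_positions) pairs,
    e.g. (2, 67) means 67 positions that each occurred in 2 games.
    """
    total = sum(games * n for games, n in summary)
    entropy = -sum(
        n * (games / total) * math.log2(games / total) for games, n in summary
    )
    return entropy, math.log2(total)


# Distributions as posted for the last three plies.
after_3b = [(5, 2), (3, 11), (2, 67), (1, 3561)]
after_4w = [(3, 3), (2, 27), (1, 3663)]
after_4b = [(2, 9), (1, 3701)]

for label, summary in [("3b", after_3b), ("4w", after_4w), ("4b", after_4b)]:
    s, s_max = entropy_from_summary(summary)
    print(f"{label} {s:.4f} (max {s_max:.4f})")
```

The earlier plies cannot be checked this way because their distributions are abbreviated with "..." in the post.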

Title: Re: Measure stereotyped openings
Post by woh on Apr 10^th, 2008, 11:49am

After Gold setup the most popular positions are:

1. 99of9 setup with cats behind the traps (http://home.scarlet.be/~woh/1w_784.jpg): 784 times
2. 99of9 setup with dogs behind the traps (http://home.scarlet.be/~woh/1w_778.jpg): 778 times
3. default setup (http://home.scarlet.be/~woh/1w_273.jpg): 273 times
4. omar setup (http://home.scarlet.be/~woh/1w_140.jpg): 140 times

Title: Re: Measure stereotyped openings
Post by Fritzlein on Apr 10^th, 2008, 12:03pm

Woh, thank you so much for running these numbers! I didn't know you had implemented your own position hash function. When can we expect your bot in the game room?

on 04/10/08 at 06:15:11, woh wrote:

      total  distinct  duplicate
1w    3745   605       3140
1b    3745   1977      1768

I'm astonished that there have been 605 unique setups for Gold in HvH games. Even with symmetry knocking out half of them (and one would expect that not all setups have been tried in reflection) that's hundreds of Gold setups we have collectively tried. I almost can't believe it. Are you sure that you counted it as a duplicate if the same setup was reached in a different order of steps?

on 04/10/08 at 06:30:41, woh wrote:

The entropy after each turn is:
1w 5.6121
1b 9.4795

I would like to measure the corresponding entropy for chess. I suspect it is less than 2.0 after 1w and about 4.0 after 1b, but one would have to measure to be sure. This is something it is probably within my skill to measure from a chess database, since one doesn't have to worry about transposed steps in the first two ply of chess.

But it is interesting to note that the theoretical most entropy possible on the first two ply of chess is lg(20^2), since each side has only twenty possible moves. That works out to an entropy of 8.64. That is to say, the players in Arimaa voluntarily provide more freshness in the opening than would be provided by the opening move of each player in chess being chosen entirely at random.

Similarly the entropy provided by Fischerandom chess setup is only lg(960) = 9.9. Players in Arimaa who are trying to win, trying to select the single best move, apparently provide as much variety in the setups as Fischer introduced with dice.

The Arimaa entropy approaches maximum too quickly to say much meaningful beyond the setup, but I see no reason that the huge branching factor won't continue to induce the players to provide their own variety throughout the game. Even as we get better and better at Arimaa, there are just so many possible choices on each move that one can expect differences in taste and judgment to keep the game fresh from very early on.
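
The two upper bounds quoted in this post are simply the entropy of a uniform choice, lg(n) for n equally likely outcomes. A small Python check follows; the variable names are illustrative, and the Arimaa figure is the measured value after 1b quoted from woh's post.

```python
import math

# Uniform random choice among n outcomes has entropy lg(n) bits.
chess_two_ply_max = math.log2(20 ** 2)  # 20 first moves for each side
chess960_max = math.log2(960)           # 960 possible starting arrays
arimaa_after_1b = 9.4795                # measured, from woh's table

print(round(chess_two_ply_max, 2))  # 8.64
print(round(chess960_max, 2))       # 9.91
print(arimaa_after_1b > chess_two_ply_max)
```

The comparison is between a measured entropy on one side and theoretical ceilings on the other; measured chess entropy would sit below its ceiling.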

Title: Re: Measure stereotyped openings
Post by Fritzlein on Apr 10^th, 2008, 12:15pm

on 04/10/08 at 11:49:56, woh wrote:

1. 99of9 setup with cats behind the traps (http://home.scarlet.be/~woh/1w_784.jpg): 784 times
2. 99of9 setup with dogs behind the traps (http://home.scarlet.be/~woh/1w_778.jpg): 778 times

I wonder if these two setups will eventually be like e4 and d4 in chess: dominant, equally popular, and entirely a matter of taste rather than quality.

Title: Re: Measure stereotyped openings
Post by aaaa on Apr 10^th, 2008, 12:52pm

on 04/10/08 at 12:03:47, Fritzlein wrote:

But it is interesting to note that the theoretical most entropy possible on the first two ply of chess is lg(20^2), since each side has only twenty possible moves. That works out to an entropy of 8.64. That is to say, the players in Arimaa voluntarily provide more freshness in the opening than would be provided by the opening move of each player in chess being chosen entirely at random.

Which is exactly what makes me wonder whether it's fair to compare chess and Arimaa on a per half-move basis in the first place. I think what we should look for in a game is how little relative compression is possible of a typical database of games.

Title: Re: Measure stereotyped openings
Post by mistre on Apr 10^th, 2008, 12:54pm

on 04/10/08 at 11:49:56, woh wrote:

Can you continue this list down to at least the top 10? I am curious. Also, I wonder how many people would actually choose the default set-up if it wasn't the default. Any player that has any experience at all will begin to see that Horses on b2 and g2 are superior to dogs. Which is yet another reason to change the default set-up to the omar set-up.

Title: Re: Measure stereotyped openings
Post by aaaa on Apr 10^th, 2008, 1:00pm

on 04/10/08 at 12:54:36, mistre wrote:

Why not make it dynamically take the then currently most common opening (possibly only of rated human-vs-human games)?

Anyway, it shouldn't matter with the new client.

Title: Re: Measure stereotyped openings
Post by mistre on Apr 10^th, 2008, 1:07pm

on 04/10/08 at 13:00:03, aaaa wrote:

Anyway, it shouldn't matter with the new client.

What is this about a new client?

Title: Re: Measure stereotyped openings
Post by aaaa on Apr 10^th, 2008, 1:08pm

Go to Settings, Game Client and change the client to version 2.

Title: Re: Measure stereotyped openings
Post by Fritzlein on Apr 10^th, 2008, 1:53pm

on 04/10/08 at 12:52:28, aaaa wrote:

It certainly isn't fair to chess to compare it to Arimaa, because Arimaa is a much better game. :D If a high branching factor in Arimaa means that it does have more variety in practice than chess can have in theory, then it supports my hypothesis that we don't have to worry about stereotyped openings in Arimaa just because it is a problem that has bedeviled chess. Right now people may expect that the variety of openings exists in Arimaa just because it is a new game and we don't yet know how to play well, but I rather expect that the variety is intrinsic, and we can list it among Arimaa's inherent advantages over chess.

Title: Re: Measure stereotyped openings
Post by aaaa on Apr 10^th, 2008, 2:05pm

As a thought experiment, imagine a chess game where one plays a move and then also gives a list of conditional moves based on every possible reply of the opponent. If we could call this a "move" instead, then the entropy per move would obviously be higher, but one would be hard-pressed to call this a fundamentally different game.

Title: Re: Measure stereotyped openings
Post by mistre on Apr 10^th, 2008, 2:42pm

on 04/10/08 at 13:08:59, aaaa wrote:

Go to Settings, Game Client and change the client to version 2.

I guess I am one of the hold-outs that still use Version 1. I have no problems with it like I used to have. While I like some of the new features of version 2, the graphics aren't on par with version 1.

So enlighten me. Why does the default set-up not matter in Version 2?

Title: Re: Measure stereotyped openings
Post by Fritzlein on Apr 10^th, 2008, 2:47pm

Since a game of Arimaa takes about as many moves as a game of chess, and we allow about the same thinking time per move, and the experience of playing a move in each game is similar, I think it is in fact illuminating to compare the two on the basis of standard moves in each game. Sure, as a thought experiment one could redefine what a chess move is. But that wouldn't change the percentage of the game that the players are in book. If it takes 26 moves of a 40-move game before the game reaches a position that is not familiar to both players, then the openings are sadly stereotyped. Redefine "move" so that it takes 13 moves of a 20-move game to see something new, and the magnitude of the problem has not changed.

Title: Re: Measure stereotyped openings
Post by Fritzlein on Apr 10^th, 2008, 2:51pm

on 04/10/08 at 14:05:05, aaaa wrote:

The game would be so fundamentally different from chess, you would be hard pressed to get two chess players to play under these rules at all. You would have succeeded in minimizing one difference between Arimaa and chess (branching factor) by introducing a much bigger difference (committing to conditional choices).

Title: Re: Measure stereotyped openings
Post by aaaa on Apr 10^th, 2008, 3:04pm

on 04/10/08 at 14:42:56, mistre wrote:

So enlighten me. Why does the default set-up not matter in Version 2?

You have to choose the square for each major piece one by one.

Title: Re: Measure stereotyped openings
Post by aaaa on Apr 10^th, 2008, 3:55pm

on 04/10/08 at 14:47:46, Fritzlein wrote:

But now you're no longer following your criterion of entropy per move. You had to take into account another factor, namely that the two games tend to have the same number of moves per game. Conversely, by your criterion, Arimaa would have to be categorized as being more stereotyped in nature if we go for Omar's per-piece setup, which I doubt is your intention.

It's your "lg(20^2)" argument I take issue with, not the claim that Arimaa allows for more varied play than chess; I'm sure this will be borne out if we use the aforementioned criterion of comparing the relative compression of a typical database of games. Additionally, it would also allow us to better compare Arimaa with other games that have different typical numbers of moves, like Go.

This is in fact all just a rephrasing of the discussion of where a game falls in the tactical-strategic spectrum. No game can be strategic without also being tactical, which means it must have its share of obvious and nonsensical moves, and Arimaa is no different in that respect.

on 04/10/08 at 14:51:58, Fritzlein wrote:

I agree that there are additional complexities arising here due to the intricacies of human thought processes, but it's telling that engines would be able to perform fundamentally the same without changing any of the domain knowledge.

Title: Re: Measure stereotyped openings
Post by Fritzlein on Apr 10^th, 2008, 4:24pm

on 04/10/08 at 15:55:15, aaaa wrote:

Conversely, by your criterion, Arimaa would have to be categorized as being more stereotyped in nature if we go for Omar's per-piece setup, which I doubt is your intention.

On the contrary, I would agree that Omar's per-piece setup would make the opening more stereotyped and by extension more boring. My objection to his proposed setup procedure was that it would add playing time and rules complexity for no benefit. Breaking up one move into four moves with the same total "action" just makes the opening of the game dull; if the entropy per move turned out to be 1/4 as much as the current setup (because people don't actually use the decision opportunities to deviate, but rather end up with the 99of9 setup four moves later) that would serve to confirm that a good metric of the inherent interest of the opening is the number of moves (relative to the length of the whole game) that it takes to max out the entropy.

Yes, if chess and Arimaa didn't take about the same number of moves per game, they wouldn't be as directly comparable. I am more interested in what proportion of the game is spent "in book" than I am interested in the absolute number of moves spent in book. I take your theoretical point against entropy-per-move. I was merely confused by your two thought experiments about rule changes, both of which seemed to confirm rather than refute that Arimaa openings are not as stereotyped as chess openings (and therefore more interesting, in my opinion).

The general thrust of your comments appeared to me to be that the measurements woh has made are arbitrary and meaningless. Your implication was that if you change the way you look at it, chess openings are no more stereotyped than Arimaa openings. I still disagree with that claim, if in fact it is what you were claiming. I apologize if I misinterpreted the point of your remarks.

Title: Re: Measure stereotyped openings
Post by woh on Apr 10^th, 2008, 4:57pm

on 04/10/08 at 12:03:47, Fritzlein wrote:

I didn't know you had implemented your own position hash function.

Well, I haven't. I used a standard hash function on the board position.

on 04/10/08 at 12:03:47, Fritzlein wrote:

Are you sure that you counted it as a duplicate if the same setup was reached in a different order of steps?

I am pretty sure, since board positions were used and not moves. I will double check it. But I will only have time for this after the weekend.

When the same board position is reached after some move, it is still counted as duplicate even if the board position after one of the previous moves was different.

Title: Re: Measure stereotyped openings
Post by Fritzlein on Apr 10^th, 2008, 4:59pm

on 04/10/08 at 12:52:28, aaaa wrote:

I think what we should look for in a game is how little relative compression is possible of a typical database of games.

Shannon's original definition of entropy was very much motivated by studying data compression. In fact, he proved precisely that the greater the entropy of data, the less it can be compressed. It would be ironic to object to entropy as a metric merely to embrace possible compression, since the two are exactly inversely related in theory. Fundamentally it's the same metric.

Of course, in practice data compression would be influenced by how wasteful the game notation is in the first place. Arimaa games could be much compressed by omitting the first of the four characters in every step as entirely redundant. I assume you would want to measure possible compression at a more abstract level that doesn't depend on notational efficiency.

Title: Re: Measure stereotyped openings
Post by Fritzlein on Apr 10^th, 2008, 5:01pm

on 04/10/08 at 16:57:59, woh wrote:

When the same board position is reached after some move, it is still counted as duplicate even if the board position after one of the previous moves was different.

Excellent! Sorry that I suspected the numbers were too good to be true. I'm willing to accept that there has been far more experimentation with the Gold setup than I had guesstimated. Indeed, I'm thrilled that I underestimated the actual variation.

Title: Re: Measure stereotyped openings
Post by woh on Apr 10^th, 2008, 5:18pm

on 04/10/08 at 12:54:36, mistre wrote:

Can you continue this list down to at least the top 10? I am curious.

Sure.

5. 62 times
RHRMERHR
RDRCCRDR

6. 60 times
RHCMECHR
RDRRDRRR

7. 51 times
RMCRREHR
RHDRRCDR

8. 49 times
RHDEMDHR
RRRCCRRR

9= 48 times
RMDHEDHR
RRRCCRRR

9= 48 times
DHCCEDMH
RRRRRRRR

Title: Re: Measure stereotyped openings
Post by aaaa on Apr 10^th, 2008, 5:27pm

I guess what I'm trying to say here is that to me a move is just a means of branching out in the game tree and that it isn't so much the strategic freedom per move which interests me, but the overall one. So I would want to take a look at the relative entropy at each move, not the absolute one, and see how it changes with every move. What is such an indictment of chess is that the number of acceptable openings at a given ply is so small compared to all possible moves.

If you took an arbitrary game and started adding different moves that directly lose the game under the rules, that would degrade it by the relative measurement, but not by the absolute one. In fact, the absolute measurement would actually increase if you started including random players.

Title: Re: Measure stereotyped openings
Post by woh on Apr 10^th, 2008, 5:29pm

After Silver setup the most popular positions are:

1. Gold 99of9 with dogs behind the traps vs Silver 99of9 with cats behind the traps: 195 times
2. Gold 99of9 with cats behind the traps vs Silver 99of9 with dogs behind the traps: 188 times
3. Gold 99of9 with cats behind the traps vs Silver 99of9 with cats behind the traps: 178 times
4. Gold 99of9 with dogs behind the traps vs Silver 99of9 with dogs behind the traps: 143 times
5. Gold standard setup vs Silver 99of9 setup with cats behind the traps: 45 times

In all 5 of these board positions, the silver elephant is on the same file as the gold camel.

More results later.

Title: Re: Measure stereotyped openings
Post by Fritzlein on Apr 12^th, 2008, 10:58am

on 04/10/08 at 17:18:37, woh wrote:

5. 62 times
RHRMERHR
RDRCCRDR

Mostly played by me?

Quote:

6. 60 times
RHCMECHR
RDRRDRRR

Mostly played by chessandgo

Quote:

7. 51 times
RMCRREHR
RHDRRCDR

Exclusively played by blue22

It goes to show that some of the entropy comes not from players experimenting, but from players disagreeing and stubbornly sticking with pet openings that others never adopt.

Quote:

8. 49 times
RHDEMDHR
RRRCCRRR

Aha, so mirror reflections are counted as distinct! I wonder how much entropy comes from that. The Gold setup is the only point at which symmetry comes into play, since the Gold elephant itself breaks symmetry thereafter. One could eliminate this one symmetry by reflecting the entire game if Gold sets up with the elephant west of the midline. But that's probably more trouble than it's worth.

Is it true that Go players, by convention rather than by rule, always play the first stone in the same corner? Maybe Arimaa players will eventually adopt the convention of always setting up the Gold elephant in the eastern half of the board.
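
The reflection idea in this post can be sketched in a few lines of Python. The sketch assumes a Gold setup is written as two 8-character rows, as in woh's lists, with files running a to h (west to east) from left to right; that reading of the notation is an assumption, and the names `mirror` and `canonical` are illustrative.

```python
from collections import Counter


def mirror(setup):
    """Reflect a (front_row, back_row) setup left-to-right."""
    front, back = setup
    return front[::-1], back[::-1]


def canonical(setup):
    """Return the reflection that puts the elephant in the eastern half.

    Assumes index 0 is the a-file and index 7 the h-file, so indices 4..7
    are the eastern half of the board.
    """
    for row in setup:
        col = row.find("E")
        if col != -1:
            return setup if col >= 4 else mirror(setup)
    raise ValueError("setup has no elephant")


# Setup number 8 in woh's list is the mirror image of the 99of9 setup
# with dogs in the front row and cats behind the traps.
reflected = ("RHDEMDHR", "RRRCCRRR")
standard = ("RHDMEDHR", "RRRCCRRR")

merged = Counter(canonical(s) for s in [reflected, standard])
print(merged)
```

To remove the symmetry from later plies as well, the whole game would have to be reflected whenever the Gold setup is, not just the setup itself.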

Title: Re: Measure stereotyped openings
Post by chessandgo on Apr 12^th, 2008, 12:06pm

on 04/12/08 at 10:58:44, Fritzlein wrote:

Is it true that Go players, by convention rather than by rule, always play the first stone in the same corner?

yeah, even 1/8th of the board, thanks to diagonal symmetry :)

Title: Re: Measure stereotyped openings
Post by lightvector on Apr 12^th, 2008, 12:13pm

According to Go etiquette, when playing black against another opponent, you place the first stone in the upper right corner (relative to you). I'm not exactly sure on the why/how of this custom, but supposedly you are offering the upper left corner to your opponent, since that is the corner closest to his/her right hand. The least polite spot is to place a stone in the upper left corner.

However, with online Go, both players typically view the board in the same orientation, which twists things around in an interesting way. Still, the typical play is in the upper right corner, even online, although most won't care if you choose the first move differently.

Title: Re: Measure stereotyped openings
Post by Fritzlein on Jun 27^th, 2010, 6:17pm

on 04/10/08 at 06:30:41, woh wrote:

The entropy after each turn is:
1w 5.6121 (max 11.87)
1b 9.4795 (max 11.87)
2w 10.4058 (max 11.87)
2b 11.2211 (max 11.87)
3w 11.6808 (max 11.87)
3b 11.8120 (max 11.868)
4w 11.8451 (max 11.8634)
4b 11.8559 (max 11.8607)

Woh, if there is some rainy day when you have nothing better to do, I would be curious to know whether the recent EHH revolution has increased the entropy of openings in the last couple of years. My expectation is that if you were to measure again, the entropy after both setups would now be higher than 9.5.

Of course the entropy for later moves will have increased, because it rapidly approaches 100% unique positions, such that we don't have enough games to measure how random it is. For the setup moves themselves, though, entropy could already be decreasing if we were all standardizing on the 99of9 setup. So measuring entropy for the setup moves would give us a solid measurement of what is otherwise just a qualitative impression that openings have gone haywire.

Title: Re: Measure stereotyped openings
Post by JoeHead on Jun 28^th, 2010, 9:21pm

I love these Arimaa analysis threads by Fritzlein. He is so eloquent in highlighting the pluses of the game.
Oh my, what would it be like if there were 1,000,000 rated players worldwide? With ChessBase-like software with games and analysis. With Rybka-like engines. With hundreds of books describing the ancient origin of the game, strategy, tactics and opening analysis... Oh my, let those times come.

Title: Re: Measure stereotyped openings
Post by woh on Jul 1^st, 2010, 3:15am

on 06/27/10 at 18:17:53, Fritzlein wrote:

Fritz, you're quite right. It is higher than 9.5 after 1s.

mov	gam	dis	dup	max	sin	S	Smax
1g	6175	937	5238	1249	676	5.8995	12.5922
1s	6175	3313	2862	256	2819	10.2214	12.5922
2g	6175	4103	2072	121	3613	11.1567	12.5922
2s	6175	4979	1196	43	4545	11.9500	12.5922
3g	6173	5728	445	15	5469	12.4074	12.5918
3s	6155	6000	155	5	5881	12.5317	12.5875
4g	6125	6073	52	3	6026	12.5629	12.5805
4s	6107	6088	19	2	6069	12.5700	12.5762

mov: move
gam: number of games that lasted that long
dis: number of distinct positions after that move
dup: number of duplicate positions
max: number of games with the most frequent position
sin: number of games with a position occurring only once
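
The legend does not define the last two columns. The sketch below assumes that S is the Shannon entropy in bits and that Smax is lg(gam), which agrees with the posted numbers (lg 6175 is about 12.5922). It computes one table row from the list of positions reached after a given move; the function name `summary_row` is illustrative and the input data is made up.

```python
import math
from collections import Counter


def summary_row(positions):
    """Compute the table columns for one move.

    `positions` holds one hashable position key per game that lasted
    that long; it must not be empty.
    """
    counts = Counter(positions)
    gam = len(positions)
    dis = len(counts)
    return {
        "gam": gam,
        "dis": dis,
        "dup": gam - dis,
        "max": max(counts.values()),
        "sin": sum(1 for c in counts.values() if c == 1),
        "S": -sum((c / gam) * math.log2(c / gam) for c in counts.values()),
        "Smax": math.log2(gam),
    }


# Toy input: seven games, four distinct positions.
toy = ["a", "a", "a", "b", "b", "c", "d"]
print(summary_row(toy))
```

Note that dup is gam minus dis in every posted row, so it counts surplus games rather than positions that occurred more than once.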

Title: Re: Measure stereotyped openings
Post by Fritzlein on Jul 1^st, 2010, 3:29am

Thanks, woh. Your research is excellent.

We can now officially say that Arimaa has more variety of openings than Chess960. What's more, our variety is voluntarily chosen by players trying to win, not arbitrarily imposed by dice. Therefore we have the best of both worlds: the lack of randomizers (like chess) and the lack of memorized openings (like Chess960).

Title: Re: Measure stereotyped openings
Post by woh on Jul 1^st, 2010, 3:33am

The top 10 gold setups are now:

1. (2) 1249
RHDMEDHR
RRRCCRRR

2. (1) 1172
RHCMECHR
RRRDDRRR

3. (3) 405
HDCMECDH
RRRRRRRR

4. (4) 250
DHCMECHD
RRRRRRRR

5. (-) 160
RHRMERHR
RCRDDRCR

6. (-) 157
RHDMECHR
RRRCDRRR

7. (5) 111
RHRMERHR
RDRCCRDR

8. (9) 82
RMDHEDHR
RRRCCRRR

9. (6) 73
RHCMECHR
RDRRDRRR

10= (-) 65
RHDMEDHR
RCRRRRCR

10= (9) 65
DHCCEDMH
RRRRRRRR

The setup with 4 rabbits up front and the dogs in the center makes an impressive entry in the top 10. It is now more popular than the same setup with the cats in the center.

Title: Re: Measure stereotyped openings
Post by Fritzlein on Jul 1^st, 2010, 5:49am

I wonder how long until some variation of the double flank horse breaks into the top ten, or whether it ever will. One thing holding it back is that when the symmetry is broken, it becomes less obvious what to do with the remaining pieces, so there are more sub-variations.

I notice that the top four from two years ago remained the top four, but all of them lost mind share. Still, it would take some doing for any other setup to compete.

Title: Re: Measure stereotyped openings
Post by Sconibulus on Jul 1^st, 2010, 2:02pm

Woh, does that list of top openings include mirrored positions? Also, what's the time frame? The top positions seem very high, while those around 10 seem really low...

Oh, also, would it be possible to ignore bots playing as gold? Since most of those are probably from the ladder, they don't change, and therefore will keep human variance from showing up as readily.

Title: Re: Measure stereotyped openings
Post by Fritzlein on Jul 2^nd, 2010, 1:38am

Only woh can answer authoritatively, but based on what he said last time:
1) Mirror setups are counted separately.
2) The time frame is Arimaa's entire history.
3) The disparity between the top positions is real, because if one doesn't do the 99of9 or interface default setup, there is little standardization.
4) Bots are ignored. These stats are from rated human vs. human games only.

Title: Re: Measure stereotyped openings
Post by Hippo on Jul 2^nd, 2010, 2:46am

Yes, "symmetrical setups" tend to be more frequent. This is not only for historical reasons, but also because there are far fewer ways to make a setup symmetric.
I am playing mostly the Ef2 Hg2 Hh2 setup now, but I am not sure how often the full setup repeats.

For the entire analysis there is nothing to change, but I would be interested in seeing the top ten openings with both horses on the same half of the board.

And maybe separate counts of symmetrical/asymmetrical gold setups.

Title: Re: Measure stereotyped openings
Post by woh on Jul 2^nd, 2010, 12:40pm

on 07/01/10 at 14:02:07, Sconibulus wrote:

Sconibulus, like Fritzlein already pointed out, those results are based on all HvH games ever played. Games with bots are ignored.

In my original post mirrored positions were counted separately, but I have changed that since. They are now considered the same.

For comparison here are the results with mirrored positions counted separately:

mov	gam	dis	dup	max	sin	S	Smax
1w	6175	1058	5117	1121	754	6.3538	12.5922
1b	6175	3492	2683	234	2998	10.4251	12.5922
2w	6175	4252	1923	120	3770	11.2903	12.5922
2b	6175	5090	1085	43	4684	12.0180	12.5922
3w	6173	5768	405	14	5524	12.4258	12.5918
3b	6155	6017	138	5	5909	12.5381	12.5875
4w	6125	6081	44	3	6040	12.5658	12.5805
4b	6107	6093	14	2	6079	12.5717	12.5762

Obviously the entropy is higher.
After Gold's setup the difference is about 0.45, and after both players' setups about 0.2.

Title: Re: Measure stereotyped openings
Post by woh on Jul 2^nd, 2010, 12:48pm

on 07/02/10 at 02:46:48, Hippo wrote:

... but I would be interested on seeing top ten openings with both horses on the same half of board.

Hippo, there are 492 games in which Gold's horses are set up in the same half of the board. This is 8% of all HvH games.

There are 227 distinct such setups.
The most frequent ones are:

21. 34
RHCHECMR
RRRDDRRR

29. 23
HHCEDRMR
RRRDRRCR

31. 19
DHCHECMD
RRRRRRRR

32. 17
RMCDDEHH
RRRRRCRR

35. 16
RHDHEDMR
RRRCCRRR

37. 15
RMREHRHR
RCRDDRCR

38. 15
RHHEMDDR
RRRCCRRR

52. 9
HHECCDMR
RRDRRRRR

60. 8
HECDDCMR
RHRRRRRR

Title: Re: Measure stereotyped openings
Post by Fritzlein on Jul 4^th, 2010, 2:37am

on 07/02/10 at 12:48:26, woh wrote:

Hippo, there are 492 games in which Gold's horses are setup in the same half of the board. This is 8% of all HvH games.

Nice. It will be interesting to see whether that 8% share goes up in the next couple of years.

Quote:

37. 15
RMREHRHR
RCRDDRCR

I wonder how many of those fifteen were me. I know I tried it for a while, but I forget how long exactly.

Thanks again, woh, for compiling such interesting and illuminating statistics.

Title: Re: Measure stereotyped openings
Post by FireBorn on Jul 4^th, 2010, 9:40am

I think I've tried that one a few times too

Title: Re: Measure stereotyped openings
Post by rozencrantz on Jul 4^th, 2010, 3:41pm

Are there any other stereotyped positions to be alert for? I know that Go's joseki can show up any time during the early game, and Hex's templates can show up almost any time.

I'm working on little more than hunches here, but some early-game behavior seems stereotyped to me, even if the specifics vary quite a bit. It seems like Gold very often starts by harassing with the elephant.

Title: Re: Measure stereotyped openings
Post by woh on Jul 5^th, 2010, 3:21am

on 07/04/10 at 02:37:34, Fritzlein wrote:

37. 15
RMREHRHR
RCRDDRCR

I wonder how many of those fifteen were me. I know I tried it for a while, but I forget how long exactly.

Fritzlein, you have used this setup 8 times. The first time was in game 88582 played on November 15th 2008 and the last time in game 121447 on April 12th 2009.
It has also been used by Simon (4 times) and Nombril, robinson, and Harren (1 time each).

on 07/04/10 at 09:40:46, FireBorn wrote:

I think I've tried that one a few times too

FireBorn, there are no games where you have used this setup, neither against a human nor against a bot.

Title: Re: Measure stereotyped openings
Post by Fritzlein on Jul 5^th, 2010, 4:18am

on 07/05/10 at 03:21:21, woh wrote:

Interesting, thanks. To some extent I am sure that variation in opening setups comes from every individual having his own pet setup that he sticks to religiously, i.e. the variation is more from player to player than from game to game for each player. This setup, however, shows that individuals are experimenting. It was more than just a phase I went through on my own.

Title: Re: Measure stereotyped openings
Post by FireBorn on Jul 5^th, 2010, 2:14pm

You're right. I think I switched a dog and a cat when I used it, but I've used it a few times.

Title: Re: Measure stereotyped openings
Post by Fritzlein on Apr 7^th, 2012, 7:16pm

Woh, it has been four years since you first quantified Arimaa openings, and almost two years since your last update. If you have the time and the old code lying around, I would love to see the up-to-date statistics. I'm going to bet that entropy is still increasing, since no consensus on openings has emerged in the past two years, or if it has emerged then nobody told me. Thanks in advance!

Title: Re: Measure stereotyped openings
Post by woh on Apr 9^th, 2012, 11:01am

mov	gam	dis	dup	max	sin	S	Smax	S%
1w	8794	1258	7536	1815	893	6.0942	13.1023	46.51
1b	8794	4724	4070	342	4047	10.6823	13.1023	81.53
2w	8794	5927	2867	153	5259	11.6852	13.1023	89.18
2b	8794	7155	1639	59	6588	12.4650	13.1023	95.14
3w	8792	8190	602	17	7851	12.9221	13.1020	98.63
3b	8765	8568	197	6	8421	13.0469	13.0975	99.61
4w	8724	8663	61	3	8608	13.0763	13.0908	99.89
4b	8694	8669	25	3	8645	13.0800	13.0858	99.96

The entropy is still increasing but not that fast (after 1w).
The increase was only 0.1947 while the maximum increased by 0.5101
The top opening is almost 50% up, from 1249 to 1815 now.
On the other hand, 321 new setups have been used.
It looks to me like some players are experimenting with new setups but nothing is really catching on while most players stick with what they used to.

After 4b (or even sooner) there still are hardly any duplicate positions.
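
For anyone who wants to reproduce columns like these, here is a minimal Python sketch. It is not woh's actual code. It assumes each game has already been reduced to one hashable position key per move, and it reads the column names as: gam = games, dis = distinct positions, dup = duplicates, max = count of the most common position, sin = positions seen exactly once, S = Shannon entropy in bits, Smax = log2(gam). The Smax reading agrees with the table (log2(8794) is about 13.1023); the function name `diversity_stats` and the toy keys are hypothetical.

```python
from collections import Counter
from math import log2

def diversity_stats(position_keys):
    """Summarize how spread out a list of positions is.

    position_keys: one hashable key per game (hypothetical input format).
    """
    if not position_keys:
        raise ValueError("need at least one game")
    counts = Counter(position_keys)
    games = len(position_keys)
    distinct = len(counts)
    # Shannon entropy in bits: S = -sum(p * log2(p))
    entropy = -sum((n / games) * log2(n / games) for n in counts.values())
    max_entropy = log2(games)  # reached only if every game is unique
    return {
        "gam": games,
        "dis": distinct,
        "dup": games - distinct,
        "max": max(counts.values()),
        "sin": sum(1 for n in counts.values() if n == 1),
        "S": entropy,
        "Smax": max_entropy,
        "S%": 100 * entropy / max_entropy if max_entropy > 0 else 0.0,
    }

# Toy example: four games with made-up setup labels
stats = diversity_stats(["A", "A", "B", "C"])
print(stats)
```

In the toy example, two of the four games share a setup, so the entropy is 1.5 bits out of a possible 2 bits, i.e. S% = 75.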

Title: Re: Measure stereotyped openings
Post by woh on Apr 9^th, 2012, 11:59am

The top gold setups now are:

1. (2) 1815 (+643)
RHCMECHR
RRRDDRRR

2. (1) 1523 (+274)
RHDMEDHR
RRRCCRRR

3. (4) 582 (+332)
DHCMECHD
RRRRRRRR

4. (3) 413 (+8)
HDCMECDH
RRRRRRRR

5. (6) 215 (+58)
RHDMECHR
RRRCDRRR

6. (5) 202 (+42)
RHRMERHR
RCRDDRCR

7. (7) 129 (+18)
RHRMERHR
RDRCCRDR

8. (-) 110
RHDMEDHR
RRCRRCRR

9. (10) 108 (+43)
RHDMEDHR
RCRRRRCR

10. (8) 106 (+24)
RMDHEDHR
RRRCCRRR

The top 2 switched position again with the 99of9 setup with cats behind the traps taking the lead over the 99of9 setup with the dogs behind the traps.

The setups with 4 rabbits in front seem to be losing popularity, with the first one dropping from 5th to 6th place.

Title: Re: Measure stereotyped openings
Post by woh on Apr 9^th, 2012, 12:14pm

Now 831 games started with both horses on the same half of the board. This accounts for 9.4% of the games, a small increase.

20. (21) 49 (+15)
RHCHECMR
RRRDDRRR

30. (29) 27 (+4)
RMRDECHH
RCRRDRRR

32. (-) 26
RMRDERHH
RCRCDRRR

34. (31) 24 (+5)
DHCHECMD
RRRRRRRR

35. (35) 23 (+7)
RHDHEDMR
RRRCCRRR

37. (-) 19
DMRCEDHH
RRRCRRRR

38. (32) 19 (+2)
RMCDDEHH
RRRRRCRR

45. (-) 16
DMCDECHH
RRRRRRRR

46. (-) 16
RHDHECMR
RRRCDRRR

Title: Re: Measure stereotyped openings
Post by ginrunner on Apr 9^th, 2012, 9:11pm

Mine isn't there ::) either it is a bad set or too weird for people to be comfortable with.

Title: Re: Measure stereotyped openings
Post by Fritzlein on Apr 10^th, 2012, 10:47am

Thanks woh. I agree with your impression that a lot of people are just sticking with what they know. I also thought maybe 99of9 was re-asserting its dominance for Gold, and so it is. We have 42% more games than last time you ran the numbers, and the one true 99of9 setup (cats behind the traps) was up 55%. But if we take the three 99of9 setups that are in the top ten (#1, #2, and #5 together), it has only been a 38% increase. So as a whole, the 99of9 setup lost mindshare.
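
The two setup percentages above can be checked from woh's top-ten list, since each entry gives a current count and the gain since the previous run. A small Python sketch follows; the variable and function names are mine, and the figures are copied from the list. The 42% growth in total games depends on the previous game count, which is not repeated in woh's post, so it is not recomputed here.

```python
# (current count, gain since the previous run), copied from woh's top-ten list
setups_99of9 = {
    "#1": (1815, 643),
    "#2": (1523, 274),
    "#5": (215, 58),
}

def growth_percent(current, gain):
    """Percentage increase relative to the previous count."""
    previous = current - gain
    return 100 * gain / previous

# Growth of the single most popular setup
top = growth_percent(*setups_99of9["#1"])

# Growth of the three 99of9 variants taken together
total_now = sum(cur for cur, _ in setups_99of9.values())
total_gain = sum(gain for _, gain in setups_99of9.values())
family = growth_percent(total_now, total_gain)

print(f"one true 99of9: +{top:.1f}%")
print(f"three 99of9 variants together: +{family:.1f}%")
```

These come out at roughly 55% and 38%, matching the figures quoted.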

Some of that mindshare went to different placements of rabbits on the back rank. Apparently pulling a back central rabbit is no longer enough of a threat to deter starting rabbits there, so people are looking for different advantages. Also the fact that the percentage of Gold setups with both horses on the same side actually increased means the conventional wisdom that Gold must have a balanced setup is still far from universal.

It is tremendously encouraging that Gold setup entropy is still increasing. Already by the Silver setup, I am quite sure that we haven't played enough games yet to measure it properly. I expect the entropy after 1s to keep rising for some time, until the unique positions aren't such a large proportion of the total. But the entropy after 1g could have fallen even this time without surprising me. It is much easier to standardize when you have nothing to react to.

So the fact that even the entropy of 1g is still increasing means that we still have no generally accepted opening theory at all, not even for the setups. If the repetition in setups comes mostly from the fact that everyone has his own pet setup, then simply the increased number of people playing is increasing entropy.

As always, the numbers are fascinating. Thanks for running it again.

Title: Re: Measure stereotyped openings
Post by hyperpape on Apr 10^th, 2012, 3:58pm

Entropy can also be increasing because as players get more experience, they may feel more of an urge to experiment, regardless of what they think is the best setup. That seems like an untestable, but also unfalsifiable suggestion, unfortunately.

Title: Re: Measure stereotyped openings
Post by Fritzlein on Mar 21^st, 2014, 9:12pm

Hervé, it has been almost two years since your last update of these statistics, and a lot has changed in the opening, particularly due to the influence of browni's setup as Silver (and sometimes Gold). Would you be willing to give us another update?

In the past the entropy of the Gold setup has increased
5.6121 > 5.8995 > 6.0942,
while the entropy of the combined setups has increased
9.4795 > 10.2214 > 10.6823

I would be shocked if the second number hasn't increased still further. Indeed, I expect that when opening theory finally stabilizes, the second number will be more than double the first, because Silver will have more viable options in response to Gold than Gold has in a vacuum.

On the other hand, it wouldn't be too surprising if the first number has declined, given the increasing ascendancy of the 99of9 setup. That seems to be the emerging consensus, anyway, although I haven't done even the briefest scan to see if it holds numerically. Will the one, true 99of9 setup have more than 20.6%? Will the combination of that with dogs behind the traps (which 99of9 himself played in the 2014 World Championship) have increased beyond 38.0%?

I am curious, if you have the time to run the numbers.

Title: Re: Measure stereotyped openings
Post by woh on Mar 24^th, 2014, 7:21am

on 03/21/14 at 21:12:28, Fritzlein wrote:

Sure. I didn't realize it had been that long.

Title: Re: Measure stereotyped openings
Post by woh on Mar 25^th, 2014, 4:17pm

mov	gam	dis	dup	max	sin	S	Smax	S%
1g	11434	1503	9931	2818	1071	6.0417	13.4810	44.82
1s	11434	5861	5573	629	5003	10.8053	13.4810	80.15
2g	11434	7515	3919	276	6647	11.9185	13.4810	88.41
2s	11434	9131	2303	144	8405	12.7262	13.4810	94.40
3g	11431	10519	912	81	10080	13.2372	13.4807	98.19
3s	11387	11065	322	26	10853	13.4036	13.4751	99.47
4g	11336	11247	89	8	11176	13.4512	13.4686	99.87
4s	11294	11257	37	5	11225	13.4563	13.4633	99.95

You're correct Fritz!
The entropy after 1g has slightly decreased. 6.0942 > 6.0417
And after 1s the entropy has still increased.

Title: Re: Measure stereotyped openings
Post by woh on Mar 25^th, 2014, 4:38pm

1. (1) 2818 (+1003)
RHCMECHR
RRRDDRRR

2. (2) 1683 (+160)
RHDMEDHR
RRRCCRRR

3. (3) 841 (+259)
DHCMECHD
RRRRRRRR

4. (4) 424 (+11)
HDCMECDH
RRRRRRRR

5. (5) 266 (+51)
RHDMECHR
RRRCDRRR

6. (6) 240 (+38)
RHRMERHR
RCRDDRCR

7. (8) 168 (+58)
RHDMEDHR
RRCRRCRR

8. (-) 134
CHDMEDHC
RRRRRRRR

9. (7) 132 (+3)
RHRMERHR
RDRCCRDR

10. (9) 119 (+11)
RHDMEDHR
RCRRRRCR

No changes at the top. But the 99of9 setup with cats behind the traps has increased its lead.
Its share went up to 24.6%.
The combination of both 99of9 setups has increased to 39.4%.

Title: Re: Measure stereotyped openings
Post by Fritzlein on Mar 25^th, 2014, 4:51pm

Super, thanks for doing this again! It is telling that the entropy in the Gold setup has decreased for the first time. If anyone asks whether Arimaa has opening theory yet we can now say, "Yes it does: use the 99of9 setup as Gold." After that, however, we're still pretty much at sea, as demonstrated by how quickly we approach maximum entropy.

One of the more dramatic jumps in the numbers from last time to this time was the most common position after 3g (17 -> 81) and after 3s (6 -> 26). The absolute numbers are small, but the more than 300% increase is huge given that the total number of games increased only 30%. This most-popular line is nowhere near a standard opening, but it is perhaps the first inkling of one.

Title: Re: Measure stereotyped openings
Post by woh on Mar 25^th, 2014, 4:57pm

The share of setups with both horses on the same half of the board stopped increasing. It is now 9.2% or 1053 games.

16. (20) 85 (+36)
RHCHECMR
RRRDDRRR

27. (35) 40 (+17)
RHDHEDMR
RRRCCRRR

29. (32) 36 (+10)
RMRDERHH
RCRCDRRR

37. (30) 27 (+0)
RMRDECHH
RCRRDRRR

39. (34) 27 (+3)
DHCHECMD
RRRRRRRR

40. (45) 23 (+7)
DMCDECHH
RRRRRRRR

45. (38) 20 (+1)
RMCDDEHH
RRRRRCRR

48. (37) 19 (+0)
DMRCEDHH
RRRCRRRR

54. (46) 17 (+1)
RHDHECMR
RRRCDRRR

Title: Re: Measure stereotyped openings
Post by Fritzlein on Mar 25^th, 2014, 5:40pm

on 03/25/14 at 16:57:02, woh wrote:

The share of setups with both horses on the same half of the board stopped increasing. It is now 9.2% or 1053 games.

But surely the share of openings with a decentralized camel has increased? It seems the new trend in imbalance isn't HH on one side and M on the other, but rather MH on one side and H on the other, with a rather more aggressive intent.

Title: Re: Measure stereotyped openings
Post by Fritzlein on Apr 9^th, 2014, 12:13pm

By the way, I find that a number of people who want to fix chess think that Chess960, by shuffling the setup of the pieces behind the pawns, does away with opening theory for chess without creating any problems. This isn't quite true, but as a fan of Arimaa it isn't my task to convince Chess960 fans that their variant fails to fix chess. Rather, I can piggyback on their preference to extol the virtues of Arimaa, because Arimaa has even more opening variety. Converting the opening entropy into a number, 2^10.8053 = 1789, so we can say Arimaa is like "chess one thousand seven hundred and eighty nine". (Say it that way unless you want it to sound like the year of the French Revolution.) Our number is bigger than your number, in fact we are almost double Chess960. :)
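
The conversion above treats 2^S as an "effective number" of equally likely openings. As a quick check in Python, using the entropy after 1s from woh's March 2014 table (the variable names are mine):

```python
# Entropy after 1s in bits, copied from woh's March 2014 table
entropy_after_1s = 10.8053

# 2**S is the number of equally likely positions that would give the same entropy
effective_openings = 2 ** entropy_after_1s
ratio_to_chess960 = effective_openings / 960

print(int(effective_openings))
print(round(ratio_to_chess960, 2))
```

The result lands between 1789 and 1790, a bit under twice 960.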

Title: Re: Measure stereotyped openings
Post by chessandgo on Apr 10^th, 2014, 7:28am

Just like Chess960 is an improvement over chess, I believe some EEE-style-ish variant would be an improvement over arimaa (randomly choosing material to be used at setup, or even randomly choosing a random starting setup position with that random material).

Title: Re: Measure stereotyped openings
Post by Fritzlein on Apr 10^th, 2014, 10:58am

on 04/10/14 at 07:28:14, chessandgo wrote:

Just like Chess960 is an improvement over chess [...]

Ah, I am surprised. Apart from the inelegance of taking control away from players who otherwise determine everything between them, isn't it fairly well accepted that some of the 960 setups give white a big opening advantage, even bigger than the traditional chess setup? Or is the first-player advantage not a big flaw of chess in your estimation, such that potentially increasing it is no big deal? By the way, there was at least one EEE setup that was overwhelmingly in favor of one player.

Quote:

[...] I believe some EEE-style-ish variant would be an improvement over arimaa (randomly choosing material to be used at setup, or even randomly choosing a random starting setup position with that random material).

For my part, I would not add external randomness to an otherwise pure strategy game unless there were too little variety and novelty produced by the players themselves trying to win. I don't think the repetition in Arimaa is anywhere close to being a big enough problem to warrant adding randomizers. As just a taste of the downside, imagine the loser of a championship match being able to say, with justice, "My opponent got lucky because the randomized setup kept favoring his playing strengths over mine."

To me it is important to have the variations in who wins come from inside the players themselves. Also I value the fact that removing external noise makes it easier (albeit not trivial) to know which of two players is better. But I guess I'm not going to get far with my bias against randomization when talking to a professional poker player. The fact that it is extraordinarily difficult to tell which of two poker players is better is an essential feature of the game. If the suckers weren't able to think, "I only lost because I got unlucky," then they would quit playing and poker would die off.

Even supposing we do start randomizing setups, I strongly feel we should continue to use the full set of pieces. Admittedly, the Endless Endgame Event is great for training in an area where we don't have enough opportunities to train, and also it leads to positions which are immediately sharp and therefore exciting. If, however, EEE were the entirety of Arimaa it would probably considerably reduce the strategic depth. Using the full set of pieces brings a broader array of strategic ideas into play. The full piece set allows for many types of elephant blockades (full or partial) and frames that can't happen in an endgame. Also it makes it so that the question of whether to pull rabbits or advance them yourself is the most subtle point of strategy, and the most enduring advantage humans have over computers. A reduced piece set like the EEE uses would mostly eliminate what is currently the strategy of the opening, and leave us only with the strategy of the endgame, a smaller set and more in favor of the computers.

I can see the benefit of a reduced piece set as a teaching tool, and I have proposed removing HDC from each side as a possible solution to the hypothetical problem of drawn games, but in the absence of draws, I would strongly advocate for using all sixteen pieces on each side for "real" Arimaa, whether randomized or no.

I know I always come out as critical or at least skeptical of any change to the rules of Arimaa, but that is because I feel that Arimaa is something very rare. It is easy to identify some imperfection in any game and propose new rules to fix it, but the cost of any such changes is hidden. We don't know what we will lose. If awesome abstract strategy games were more abundant, I would probably jump on the bandwagon of "improving" Arimaa myself. The reality, however, is that almost every abstract strategy game eventually breaks in some serious way, and as long as Arimaa shows no signs of breaking in any of these ways, I'm going to be unenthusiastic about trying to fix anything cosmetic.

Title: Re: Measure stereotyped openings
Post by Fritzlein on Apr 10^th, 2014, 11:14am

Furthermore, I don't even think the variation in Gold setups is necessarily destined to decline further from here. The dominance of the 99of9 setup may just be a passing fad. For example, I don't think we have seen the last of setups with the gold elephant behind a trap. Browni's opening attacking idea would be most powerful with the horse on the a-file, camel on the b-file, and elephant on the c-file. Silver surely isn't going to come up with a faster attack in the east, so what is the Silver defense of the west that would deter this Gold setup? My meager opening knowledge would say that the silver elephant wants to be on b4 in this case, but does it then start on d7 and move there with a loss of time? I'm not saying there is no defense, just that it isn't obvious what Silver should do, and that we are far from having "settled" that the 99of9 setup is best. The theory of even just the setup, never mind of the whole opening, has a long way to evolve before we need dice to keep it interesting.

Title: Re: Measure stereotyped openings
Post by chessandgo on Apr 10^th, 2014, 3:26pm

I was not aware that some 960 setups give a significantly different white advantage. That would indeed be a problem.

When I say EEE style, I mean essentially playing with one or two (or three?) fewer pieces than all 16.

The current opening trend seems to be for each player to go all-in with a horse on opposite wings, which means the opening proceeds at a good pace. A few years ago, horse dances (or heaven forbid facing camels) were a lot more common, and I was worried that 20 moves dancing-for-nothing situations might become the norm. Hopefully that won't happen, but if it does, and long openings start boring players, I think taking one or two pieces off would re-make a game of arimaa exciting from start to finish.

Title: Re: Measure stereotyped openings
Post by browni3141 on Apr 10^th, 2014, 5:31pm

on 04/10/14 at 15:26:01, chessandgo wrote:

I agree about this current trend. It seems players have started to think it is an error to move a camel to face the opponent's :P. What I don't agree with is your opinion that it leads to a boring opening. The opening may be longer on average, but why does it have to be boring? Often I see games with symmetrical camels where both sides can think of nothing but trying to pull pieces out, and neither side makes any progress until the opponent makes a mistake. I find such games very boring, but they happen because the involved players have limited understanding of the opening, and perhaps not enough creativity to make something happen. I don't think opposing camels are intrinsically boring in any way.

I have no idea what you mean by horse dances. Perhaps you could provide an example game.

Title: Re: Measure stereotyped openings
Post by chessandgo on Apr 12^th, 2014, 5:24am

I can't really find a relevant game. I mentioned the The Jeh game in my book to supersamu in the chatroom, but it is not exactly what I'm looking for.

For example, I believe the following opening would be reasonable:

1g Ra1 Rb1 Rc1 Dd1 De1 Rf1 Rg1 Rh1 Ra2 Hb2 Cc2 Ed2 Me2 Cf2 Hg2 Rh2
1s ra7 hb7 cc7 md7 ee7 cf7 hg7 rh7 ra8 rb8 rc8 dd8 de8 rf8 rg8 rh8
2g Ed2n Ed3n Ed4n Hb2n
2s ee7s ee6s hg7s hb7s
3g Me2n Hg2n De1n Dd1n
3s md7s dd8s de8s ee5s
4g Hg3e Hh3n Me3e Hh4n
4s md6e hg6e me6e hb6s
5g Hh5s Hh4s Mf3w Me3w

By horse dance I mean a Horse advances on one wing, the enemy camel gets closer, the Horse retreats, a horse advances on the other wing, the caMel gets closer, the horse withdraws, rinse and repeat.

Title: Re: Measure stereotyped openings
Post by Fritzlein on Apr 12^th, 2014, 8:47am

In my 2013 Postal Mixer game with mistre (http://arimaa.com/arimaa/gameroom/comments.cgi?gid=278545), nothing happened for the first 30 moves of the opening, except one square of rabbit pulling by me and one square of voluntary rabbit advancing by mistre. I didn't mind because I had some more involved games eating up my thinking time and because I was the one who made all two squares of progress; I'm not sure why mistre didn't mind.

If it were commonplace for Arimaa openings to just shuffle and reshuffle pieces without any commitment by either player, it would be a problem. On the other hand, it isn't necessarily right to blame the game when both players are being tentative. Just because the players aren't willing to commit to an away game and they aren't interested in pulling rabbits doesn't mean the inherent risk/reward balance prevents the players from committing to anything.

When the dual lone-elephant attack was the dominant opening, I was afraid of a "shuffling stalemate" becoming the dominant mode of play for Arimaa, because it might be too risky for an elephant to decentralize itself enough to pull a 99of9 flank rabbit. Now that we know that horses can profitably advance in the opening, I am no longer afraid. The horse-dance shuffling stalemate seems unlikely because there are too many types of threats; someone will be able to at least pull a rabbit even if all swarming opportunities are blocked.

To repeat, the players can always conspire not to make progress for either a home game or an away game (and then they can complain that they want to agree to a draw), but as long as that kind of shuffling doesn't represent best play, I won't be worried.

Title: Re: Measure stereotyped openings
Post by Boo on May 19^th, 2014, 9:30am

Quote:

isn't it fairly well accepted that some of the 960 setups give white a big opening advantage, even bigger than the traditional chess setup?

I don't think there are such setups. All setups give the same advantage - the 1st move.

Title: Re: Measure stereotyped openings
Post by Fritzlein on Sep 26^th, 2014, 4:07pm

on 05/19/14 at 09:30:26, Boo wrote:

I don't think there are such setups. All setups give the same advantage - the 1st move.

Really? If that were true, then I would think Chess960 World Cup would randomize the pieces before every game, rather than playing with the same setup twice and colors reversed between each pair of opponents. Why be concerned with giving each player the same first-move advantage if the first-move advantage is the same regardless of setup?

"All sections will be with double round robin (players will have the same position with Black and White but all positions will defer from one opponent to another)."

[EDIT]

Oops, after further reading I want to backpedal. Here are some statistics from someone who is worried about the problem: http://chess960frc.blogspot.be/2012/11/waving-yellow-flag.html

He has collected about 120 game results for each of the 960 starting positions, and it appears that some favor white more than others. But he isn't careful about his statistics. I did a little experiment where I simulated the same data set, but under the assumption that every starting position would have a 45% chance of white winning, a 35% chance of black winning, and a 20% chance of a draw, similar to the results this fellow is quoting, and consistent with an expected score of 0.55 for white in every game. Just by random variation, the five best and five worst positions for white in my simulation had scores of

0.691
0.662
0.658
0.654
0.650
...
0.450
0.425
0.421
0.417
0.412

In other words, under the assumption that every Chess960 position has exactly the same first-move advantage, by natural variation I get results just as extreme as the ones our blogger has compiled. So perhaps some positions have just been lucky for white so far, and others unlucky, with no inherent bias. At a minimum, if these are the most conclusive stats available, we have to say there is so far no statistical evidence that some positions favor white more than others.
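
A simulation along these lines can be sketched in a few lines of Python. This is not the code behind the numbers above. The 960 positions, 120 games per position, and the 45/35/20 split come from the description; the function name `simulate_scores`, the seed, and the use of `random.choices` are assumptions made for illustration.

```python
import random

def simulate_scores(positions=960, games_per_position=120,
                    p_white=0.45, p_black=0.35, p_draw=0.20, seed=0):
    """Average score for White in each position when every position
    shares the same true win/loss/draw probabilities."""
    rng = random.Random(seed)
    outcomes = [1.0, 0.0, 0.5]            # White win, Black win, draw
    weights = [p_white, p_black, p_draw]
    scores = []
    for _ in range(positions):
        results = rng.choices(outcomes, weights=weights, k=games_per_position)
        scores.append(sum(results) / games_per_position)
    # Sort from best to worst for White
    return sorted(scores, reverse=True)

scores = simulate_scores()
print("best five for White: ", [round(s, 3) for s in scores[:5]])
print("worst five for White:", [round(s, 3) for s in scores[-5:]])
```

The exact values depend on the seed. Each game's score has variance 0.45 + 0.20 * 0.25 - 0.55^2 = 0.1975, so the average over 120 games has a standard deviation of about 0.04. Among 960 positions, a few averages roughly three standard deviations away from 0.55 are to be expected, which is the range shown in the list above.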

I therefore withdraw my statement that Chess960 is flawed in this way.