Author |
Topic: Gold advantage impossible to measure? (Read 4714 times) |
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Gold advantage impossible to measure?
« on: Nov 30th, 2005, 9:21pm » |
|
I just did another little query on the games database. Over the last 400 rated games between humans both rated over 1600, Gold was expected from the ratings to win 180 games, but actually won 173 games. This indicates that playing Silver is an advantage! In order to make the expected number of wins also be 173, we need to add 19 rating points to each Silver player. However, in those 400 games, the standard deviation is plus or minus 9.9 wins, which overwhelms the difference between actual and expected score. That is to say, the figure of a 19 point rating advantage is statistically worthless.

Thus the results are invalidated even before we consider a possible source of bias: Stronger players may be more likely to invite weaker players to a game than vice versa, and the inviting player may be more likely to give himself Silver. This would explain why Gold was expected to win less than half the games. If in addition it happens that stronger players are likely to be underrated relative to weaker players, then the inaccuracies of the ratings rather than the color advantage would explain any discrepancies.

Does anyone have an idea how we are ever going to know whether Gold has an advantage, and if so, how much? I don't think it helps at all to include human versus bot games, given all the bot-bashing that goes on. For example, RyanCable bashed Bomb2005Blitz almost entirely from the Gold side when building up his pre-tournament rating. If we included that data, it would look like playing Gold is an enormous advantage. Should we therefore rely entirely on bot vs. bot games? We could do another self-play experiment like we did with Clueless over 144 games, but as JDB pointed out, self-play would tend to exaggerate any advantage, because the bots are using the same evaluation function. I can't think of any good methodology.

Perhaps we just won't be able to tell for some time into the future, and must keep randomizing color assignments on the theory that it might make a difference, even if we can't tell what that difference is.
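For readers who want to reproduce this kind of check, here is a minimal sketch of the arithmetic. It assumes the ratings follow the standard Elo logistic curve on a 400-point scale, which may not match the gameroom's actual rating formula exactly; the function names and the `(gold_rating, silver_rating, gold_won)` record layout are made up for illustration. The standard deviation is the square root of the sum of p(1-p) over the games; if every game were a 45% shot for Gold, that would be sqrt(400 x 0.45 x 0.55), about 9.95, in line with the 9.9 quoted above.

```python
import math


def expected_score(rating_diff):
    """Elo-style expected score for a player rated rating_diff points above the opponent.

    Assumption: standard logistic curve with a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


def summarize(games, silver_bonus=0.0):
    """Summarize Gold's results over a list of games.

    games: list of (gold_rating, silver_rating, gold_won) tuples (hypothetical layout).
    silver_bonus: rating points added to every Silver player before computing expectations.
    Returns (expected Gold wins, actual Gold wins, standard deviation of Gold wins).
    """
    probs = [expected_score(gold - (silver + silver_bonus)) for gold, silver, _ in games]
    expected = sum(probs)
    actual = sum(1 for _, _, gold_won in games if gold_won)
    # Each game is an independent Bernoulli trial, so the variances p * (1 - p) add up.
    std_dev = math.sqrt(sum(p * (1.0 - p) for p in probs))
    return expected, actual, std_dev


# Toy data, invented purely to show the call; not taken from the games database.
toy_games = [(1700, 1750, True), (1820, 1650, True), (1610, 1900, False), (1680, 1680, False)]
expected, actual, std_dev = summarize(toy_games)
print(f"expected {expected:.2f}, actual {actual}, std dev {std_dev:.2f}")
```

Raising `silver_bonus` until the expected count matches the actual count is one way to arrive at a figure like the 19 points mentioned in the post.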
|
|
|
|
|
99of9
Forum Guru
Gnobby's creator (player #314)
Posts: 1413
|
|
Re: Gold advantage impossible to measure?
« Reply #1 on: Dec 2nd, 2005, 12:40am » |
|
Quote:a possible source of bias: Stronger players may be more likely to invite weaker players to a game than vice versa, and the inviting player may be more likely to give himself Silver |
| To eliminate this, between any pair of players, you should only include the same number of G-S games as you do S-G games. (Or weight them so that the effective number of games with each colour is equal.) I don't think human-bot games are at all useful at answering this question. Bot-bot games are ok, but since they are weak players, they are not the best group to sample to determine intrinsic bias in a game. For that you need to look at players as close to perfect as you have got. At the moment, that is human-human games.
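A rough sketch of the balancing idea, for illustration only. The `(gold_player, silver_player, gold_won)` record layout and the function name are assumptions, and keeping the most recent games when one colour assignment has a surplus is an arbitrary choice of this sketch; weighting the games instead, as also suggested above, would be an equally valid alternative.

```python
from collections import defaultdict


def balance_colors(games):
    """Keep, for every pair of players, equally many games with each colour assignment.

    games: chronological list of (gold_player, silver_player, gold_won) tuples
           (hypothetical layout).
    Surplus games are dropped from the start of the list, so the most recent ones stay.
    """
    # by_pair[(a, b)][x] holds the games between a and b in which x played Gold.
    by_pair = defaultdict(lambda: defaultdict(list))
    for game in games:
        gold, silver, _ = game
        pair = tuple(sorted((gold, silver)))
        by_pair[pair][gold].append(game)

    kept = []
    for (a, b), by_gold in by_pair.items():
        if a == b:
            continue  # self-play has no colour reversal to balance
        n = min(len(by_gold[a]), len(by_gold[b]))
        if n == 0:
            continue  # this pair never played with colours reversed
        kept.extend(by_gold[a][-n:])
        kept.extend(by_gold[b][-n:])
    return kept


# Toy data: A had Gold twice against B, B had Gold once against A, C and D met once.
toy = [("A", "B", True), ("A", "B", False), ("B", "A", True), ("C", "D", True)]
print(balance_colors(toy))
```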
|
|
|
|
|
99of9
Forum Guru
Gnobby's creator (player #314)
Posts: 1413
|
|
Re: Gold advantage impossible to measure?
« Reply #2 on: Dec 2nd, 2005, 12:42am » |
|
Oh, and of course 400 is quite a small sample when trying to estimate a bias that could easily be less than 5%.
|
|
|
|
|
Adanac
Forum Guru
Arimaa player #892
Posts: 635
|
|
Re: Gold advantage impossible to measure?
« Reply #3 on: Dec 2nd, 2005, 10:15am » |
|
Suppose that 300 years ago there were a couple of dozen chess players that wanted to determine whether white had any advantage in chess. Suppose also that those players considered 1. g4 to be white's strongest opening move. They could play one another several hundred times to try to determine whether white has an advantage and...that might be a good parallel to the current state of arimaa opening theory. I'm very interested to learn the size of gold's advantage in the opening (if any) but we're going to need a lot more players, a far larger database of games, and more importantly, a much better knowledge of the game. I don't know the answer to the following question, but do chess grandmasters have a larger advantage with the white pieces against other grandmasters than, say, a 2000 player would have against another 2000 player? If so, that might suggest that gold's advantage should increase as our best arimaa players increase in strength.
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: Gold advantage impossible to measure?
« Reply #4 on: Dec 2nd, 2005, 11:11am » |
|
on Dec 2nd, 2005, 10:15am, Adanac wrote:[D]o chess grandmasters have a larger advantage with the white pieces against other grandmasters than, say, a 2000 player would have against another 2000 player? If so, that might suggest that gold's advantage should increase as our best arimaa players increase in strength. |
| I have read that (astonishingly) the advantage for white seems to be constant all the way from beginner to grandmaster, with the difference being only in the number of draws. I'm not sure of the source, but it may have been Elo's old book, and if so, it's a little shaky. On the other hand, if it is true for chess, it might also be true for Arimaa, and therefore present data might be valuable even though we aren't very good yet.
|
« Last Edit: Dec 2nd, 2005, 11:13am by Fritzlein » |
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: Gold advantage impossible to measure?
« Reply #5 on: Dec 2nd, 2005, 1:14pm » |
|
on Dec 2nd, 2005, 12:40am, 99of9 wrote:To eliminate this, between any pair of players, you should only include the same number of G-S games as you do S-G games. |
| Good methodology. We can eliminate lots of biases by looking only at pairs of games with reversed colors between the same two players. When I get around to it, I'll find as many such game pairs as possible among rated human vs. human games in the database going back to the very start. And I guess as long as I'm doing it, it isn't much extra work to compile the numbers for bot vs. bot games and bot vs. human games as well, for whatever they're worth.
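One way such reversed-color pairs might be extracted is sketched below. This is only an illustration: the `(game_id, gold_player, silver_player)` record layout is assumed, and the rule of pairing only consecutive games between the same opponents (so that the two games in a pair are close together in time) is a design choice of this sketch, not a description of any actual database query.

```python
from collections import defaultdict


def reversed_color_pairs(games):
    """Pair consecutive games between the same two players that have colours reversed.

    games: chronological list of (game_id, gold_player, silver_player) tuples
           (hypothetical layout).
    Returns a list of (first_game_id, second_game_id) tuples; each game is used at most once.
    """
    # Group each head-to-head history, remembering who held Gold in every game.
    history = defaultdict(list)
    for game_id, gold, silver in games:
        history[frozenset((gold, silver))].append((game_id, gold))

    pairs = []
    for sequence in history.values():
        i = 0
        while i + 1 < len(sequence):
            (first_id, first_gold), (second_id, second_gold) = sequence[i], sequence[i + 1]
            if first_gold != second_gold:
                pairs.append((first_id, second_id))
                i += 2  # both games are consumed by this pair
            else:
                i += 1  # same colours twice in a row: skip the older game
    return pairs


# Toy history: A holds Gold in games 1-4, then B holds Gold in game 5.
toy = [(1, "A", "B"), (2, "A", "B"), (3, "A", "B"), (4, "A", "B"), (5, "B", "A")]
print(reversed_color_pairs(toy))
```

With this rule a long run of same-colour games contributes at most one pair at each point where the colours switch, which trades sample size for protection against skill drifting over time.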
|
« Last Edit: Dec 2nd, 2005, 1:15pm by Fritzlein » |
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: Gold advantage impossible to measure?
« Reply #6 on: Dec 4th, 2005, 1:41pm » |
|
OK, I stretched my meagre programming abilities to pair up games based on reversed colors between the same opponents. I tried to pair up the games that were closest in time, e.g. if two players had colors G-S G-S G-S G-S S-G then I counted the last two games as a pair while disregarding the first three. Indeed, I was so worried about changing skill over time that if the games were G-S G-S G-S G-S S-G S-G S-G S-G then I only counted the middle two games while disregarding the other six. Oh, and I only counted rated games, and games ending in "b" or "w" (no draws or aborts).

The end result is that Gold won 4725 games out of 4692 pairs (9384 games), i.e. 50.35% of the games. This suggests a Gold advantage of 2.44 rating points. That is to say, someone who is rated 2.44 points higher than his opponent should win 50.35% of the games.

However, this doesn't capture the effect of mismatches. The more the players in a pair are mismatched, the more it masks the advantage of playing gold, because the stronger player is probably just going to win both games anyway. The average rating difference between the players in those games was 189 points. If two players are mismatched by 189 points, and they play games of alternating color, then Gold must have an advantage of 3.23 rating points to account for winning 50.35% of the games.

A guesstimate of the error would be to suppose all 4692 pairs were played at a mismatch of 189 points, i.e. were about 3:1 for the favorite, so the standard deviation would be 41.9 games. For Gold to win 33 games more than expected represents 0.79 standard deviations, i.e. the Gold advantage is clearly statistically insignificant.

This was for all types of games combined. If we repeat the calculation based on the types of opponents we have

Game Type   Pairs   Gold Wins   Mismatch   Gold Adv.   # Std. Dev.
---------   -----   ---------   --------   ---------   -----------
ALL          4692        4725        189        3.23          0.79
H v B        3839        3851        192        1.45          0.32
B v B         608         630        152        15.1          1.38
H v H         245         244        237       -2.19          0.12

Yes, that's right, over the human games, Silver actually has the advantage. This doesn't matter, of course, because all the results are statistically insignificant. We have essentially zero evidence that either side has an advantage.
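The conversions in this post can be retraced with a short script. This is a sketch under stated assumptions rather than the original calculation: it assumes the standard Elo logistic curve on a 400-point scale, treats every pair as if it were played at exactly the average mismatch, and finds the Gold advantage by bisection. All function names are invented for illustration. The results should land close to the 2.44, 3.23 and 0.79 quoted above, with small differences from rounding.

```python
import math


def expected_score(rating_diff):
    """Elo-style expected score (assumed: logistic curve, 400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


def gold_win_rate(advantage, mismatch):
    """Average Gold score when two players `mismatch` points apart alternate colours
    and whoever holds Gold is effectively `advantage` points stronger."""
    favourite_has_gold = expected_score(mismatch + advantage)
    underdog_has_gold = expected_score(-mismatch + advantage)
    return 0.5 * (favourite_has_gold + underdog_has_gold)


def solve_advantage(target_rate, mismatch, lo=-400.0, hi=400.0):
    """Bisection for the advantage that yields target_rate (the rate rises with advantage)."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if gold_win_rate(mid, mismatch) < target_rate:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def std_dev_games(pairs, mismatch):
    """Standard deviation of Gold's win count if all 2 * pairs games had the same mismatch."""
    p = expected_score(mismatch)
    return math.sqrt(2 * pairs * p * (1.0 - p))


# Figures from the ALL row: 4692 pairs, 4725 Gold wins, average mismatch 189 points.
pairs, gold_wins, mismatch = 4692, 4725, 189
rate = gold_wins / (2 * pairs)
advantage_even = solve_advantage(rate, 0)
advantage_mismatched = solve_advantage(rate, mismatch)
sigmas = (gold_wins - pairs) / std_dev_games(pairs, mismatch)
print(f"rate {rate:.4f}, even-match advantage {advantage_even:.2f}, "
      f"mismatch-adjusted advantage {advantage_mismatched:.2f}, sigmas {sigmas:.2f}")
```

Swapping in the pairs, wins and mismatch from another row of the table gives the corresponding figures for that game type under the same assumptions.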
|
« Last Edit: Dec 4th, 2005, 1:46pm by Fritzlein » |
|
|
|
Ryan_Cable
Forum Guru
Arimaa player #951
Posts: 138
|
|
Re: Gold advantage impossible to measure?
« Reply #7 on: Dec 6th, 2005, 5:27pm » |
|
Well, at least we can now say with 95% certainty that the color advantage is less than 10 points for games with one or more humans. I think this is small enough to justify our current method of assigning gold, which would be a joke in a chess tournament.
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: Gold advantage impossible to measure?
« Reply #8 on: Dec 7th, 2005, 9:50am » |
|
on Dec 6th, 2005, 5:27pm, Ryan_Cable wrote:Well, at least we can now say with 95% certainty that the color advantage is less than 10 points for games with one or more humans. |
| That narrow range holds if you are lumping hvh games in with hvb games. For the purists who only think hvh games are relevant, the 95% confidence interval is a bit wider. I'm wavering in my own mind as to how convinced I am that the colors are equal. The statistics are getting fairly strong, but maybe the data we have so far isn't as relevant as data that is yet to come. Some day, if the evidence for color equality gets strong enough, one could justify assigning colors completely at random, rather than merely with-some-potential-for-imbalances as in the 2006 WC. But I suppose that there will always be players who prefer a particular color regardless of what the statistics say, so there will always be an argument for attempting to equalize color assignments.
|
|
|
|
|
acheron
Forum Full Member
Arimaa player #1613
Posts: 11
|
|
Re: Gold advantage impossible to measure?
« Reply #9 on: Dec 9th, 2005, 8:14pm » |
|
Another reason gold is less intrinsically advantaged than white is that, unlike in chess, the opponent has the ability to respond to your setup. So while the gold player must arrange his initial layout blind, the silver player can examine this arrangement and respond accordingly. Against the bots for example, this can be a sizable advantage, positioning your camel away from the opposing elephant, and ensuring each board subsection is arranged to your advantage.
|
|
|
|
|
robinson
Forum Senior Member
Arimaa player #719
Posts: 30
|
|
Re: Gold advantage impossible to measure?
« Reply #10 on: Dec 12th, 2005, 4:56pm » |
|
wow... i just looked at my stats vs paulMertens... maybe that's the only way we can find out where the advantage is.. i have 8 to 12 with gold and 12 to 2 with silver, knowing that not all of them can count because of some experiments....
|
|
|
|
|
omar
Forum Guru
Arimaa player #2
Posts: 1003
|
|
Re: Gold advantage impossible to measure?
« Reply #11 on: Dec 13th, 2005, 1:21am » |
|
on Dec 9th, 2005, 8:14pm, acheron wrote:Another reason gold is less intrinsically advantaged than white, is that unlike chess, the opponent has the ability to respond to your setup. So while the gold player must arrange his initial layout blind, the silver player can examine this arrangement and respond accordingly. Against the bots for example, this can be a sizable advantage, positioning your camel away from the opposing elephant, and ensuring each board subsection is arranged to your advantage. |
| Indeed, sometimes I wonder if this may actually give silver more of an advantage once we learn more about Arimaa openings.
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: Gold advantage impossible to measure?
« Reply #12 on: Dec 13th, 2005, 3:01pm » |
|
Here's one way opening theory could eventually favor Silver: What if it turns out that opening with the camel on one flank is an attacking advantage, but it becomes a disadvantage if the other player lines up his elephant opposite to it? Then Gold wouldn't be able to start with a flank camel, because Silver would put an elephant on the same flank, whereas Silver would still have the ability to start with a flank camel on whichever side is away from the gold elephant.
|
« Last Edit: Dec 13th, 2005, 3:02pm by Fritzlein » |
|
|
|
Adanac
Forum Guru
Arimaa player #892
Posts: 635
|
|
Re: Gold advantage impossible to measure?
« Reply #13 on: Dec 13th, 2005, 6:06pm » |
|
on Dec 13th, 2005, 3:01pm, Fritzlein wrote:Here's one way opening theory could eventually favor Silver: What if it turns out that opening with the camel on one flank is an attacking advantage, but it becomes a disadvantage if the other player lines up his elephant opposite to it? Then Gold wouldn't be able to start with a flank camel, because Silver would put an elephant on the same flank, whereas Silver would still have the ability to start with a flank camel on whichever side is away from the gold elephant. |
| I tried that idea with silver once against Robinson, including a rabbit on f7 to minimize the impact of a direct elephant charge up the middle (and it worked!), but I'm still not convinced it's a good idea due to the decentralization of the elephant. http://arimaa.com/arimaa/gameroom/replayFlash.cgi?gid=21919&s=w&client=1 Had I known that I would meet Robinson in round 4 with the silver pieces, I would have waited a few weeks to use this idea. I've thought of a new idea, but it all depends upon how Robinson sets up.
|
|
|
|
|
99of9
Forum Guru
Gnobby's creator (player #314)
Posts: 1413
|
|
Re: Gold advantage impossible to measure?
« Reply #14 on: Mar 3rd, 2007, 9:04pm » |
|
Nic's question in the bot forum prompted me to think about this again. Fritz, do you have an easy way to tell if the two opening setups are symmetrical to each other? If so, you could run these queries again and split them into games where silver responded passively (either a mirror image, or a rotation), and games where silver responded actively (asymmetry w.r.t. the opponent is sometimes indicative of an attempt to gain an advantage by a method similar to that outlined by Fritz and Adanac).

If symmetrical setups still give silver an advantage relative to gold, then I can only see 3 options (in order of likelihood as I see it):
1) This was an errant statistical fluctuation.
2) Our play is so suboptimal that we're actually using our gold initiative to our disadvantage!!
3) Gold has somehow been forced to set up in a zugzwang position!!!

N.B. When I say "symmetrical setup", it's not quite the same as Fritz's previous definitions of symmetry, which were related to the symmetry of each player's own pieces with respect to each other. What I'm talking about is when you can apply either a reflection or a rotation of the gold pieces, and get the silver pieces.
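A possible check for this kind of symmetry is sketched below. It assumes the setups are available in standard Arimaa notation (piece letter followed by square, Gold in upper case on ranks 1-2, Silver in lower case on ranks 7-8). Reading "reflection" as a mirror across the middle of the board (a1 to a8) and "rotation" as a 180-degree turn (a1 to h8) is an interpretation made for this sketch, and the function names are invented.

```python
FILES = "abcdefgh"


def parse_setup(move):
    """Parse a setup such as 'Ra1 Rb1 ... Hh2' into a {square: piece} dict."""
    return {token[1:]: token[0] for token in move.split()}


def reflect(square):
    """Mirror a square across the line between ranks 4 and 5 (a1 -> a8)."""
    return square[0] + str(9 - int(square[1]))


def rotate(square):
    """Rotate a square 180 degrees about the centre of the board (a1 -> h8)."""
    return FILES[7 - FILES.index(square[0])] + str(9 - int(square[1]))


def is_passive_response(gold, silver):
    """True if Silver's setup is a reflection or a rotation of Gold's setup."""
    for transform in (reflect, rotate):
        # Move every Gold piece to its image square and recolour it as Silver.
        image = {transform(square): piece.lower() for square, piece in gold.items()}
        if image == silver:
            return True
    return False


# Toy setups, invented for illustration.
gold = parse_setup("Ra1 Rb1 Rc1 Rd1 Re1 Rf1 Rg1 Rh1 Ha2 Db2 Cc2 Md2 Ee2 Cf2 Dg2 Hh2")
mirrored = parse_setup("ra8 rb8 rc8 rd8 re8 rf8 rg8 rh8 ha7 db7 cc7 md7 ee7 cf7 dg7 hh7")
print(is_passive_response(gold, mirrored))
```

Games could then be split on the result of `is_passive_response` before rerunning the paired-game statistics for each group.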
|
|
|
|
|
|