Arimaa Forum « First Move Advantage vs. Second Setup Advantage »


   Author  Topic: First Move Advantage vs. Second Setup Advantage  (Read 16573 times)
chessandgo
Forum Guru
*****



Arimaa player #1889

   


Gender: male
Posts: 1244
Re: First Move Advantage vs. Second Setup Advantag
« Reply #15 on: Nov 18th, 2006, 5:22am »

Interesting statistics. I have no idea whether gold or silver should have an intrinsic advantage, but either way it can only be a VERY small one, at least with the way we currently play. Since many tactical blunders and large strategic errors occur during games, I don't see how the starting advantage could be relevant to the final result.
 
We might have to wait to get a lot stronger before getting some precise statistical or metaphysical answer, don't you think? Or does a large enough dataset show it in spite of the errors occurring in any game?

Fritzlein
Forum Guru
*****



Arimaa player #706

   

Gender: male
Posts: 5928
Re: First Move Advantage vs. Second Setup Advantag
« Reply #16 on: Nov 18th, 2006, 9:25am »

on Nov 18th, 2006, 5:22am, chessandgo wrote:
We might have to wait to get a lot stronger before getting some precise statistical or metaphysical answer, don't you think?

To get a precise metaphysical answer, we need to be sure we are playing well, which we aren't.
 
Quote:
Or does a large enough dataset show it in spite of the errors occurring in any game?

It already is a large enough dataset to overcome the randomness from game to game, or at least that's what 2.12 standard deviations says.  A 30-point rating advantage is just about the smallest that can be detected at this point.  We would need four times as many games to detect half that advantage, one hundred times as many games to detect one-tenth that advantage, etc.
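
The scaling in that last sentence comes from the standard error shrinking as 1/sqrt(n). A minimal sketch in Python, assuming evenly matched players and the usual logistic rating curve (the function name `games_needed` and the z = 2 threshold are illustrative choices, not taken from the actual database query):

```python
import math

def games_needed(rating_adv, z=2.0):
    """Rough number of games needed to detect `rating_adv` points at z std. dev.

    Assumption: evenly matched players, so each game is a near-fair coin flip.
    """
    p = 1.0 / (1.0 + 10.0 ** (-rating_adv / 400.0))  # expected score for Gold
    excess = p - 0.5                                  # edge over a fair coin
    # Solve z * sqrt(n / 4) = n * excess for n.
    return (z / (2.0 * excess)) ** 2

base = games_needed(30)
half_ratio = games_needed(15) / base    # close to 4
tenth_ratio = games_needed(3) / base    # close to 100
print(f"{half_ratio:.2f}x games for half the advantage")
print(f"{tenth_ratio:.1f}x games for one-tenth the advantage")
```

The ratios come out slightly under 4 and 100 because the rating curve is not perfectly linear near 50%.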
« Last Edit: Nov 21st, 2006, 1:45pm by Fritzlein »

chessandgo
Forum Guru
*****



Arimaa player #1889

   


Gender: male
Posts: 1244
Re: First Move Advantage vs. Second Setup Advantag
« Reply #17 on: Nov 21st, 2006, 12:44pm »

Here is another attempt to explain this silver advantage. The player who sends the invitation usually takes silver, and if he/she invites, it means he/she is ready to play, even hungry to do so, not too tired, not immersed in doing something else, while the other player might not be in such good playing condition, even though he/she accepts.  
When someone challenges me I almost never decline, and I think that might explain the percentage difference between my gold and silver results (I don't know who has the advantage, but I'm really convinced that it's too small to be a real bias on a game result, or even on the results of ten thousand games).  
 
[Casting a quick glance at some top-rated players' statistics, it appears that they indeed have better results with silver than gold, with the exceptions of yourself, Karl, and Adanac. Are you too never tired ? Smiley ]
 
Do you think it makes sense?

chessandgo
Forum Guru
*****



Arimaa player #1889

   


Gender: male
Posts: 1244
Re: First Move Advantage vs. Second Setup Advantag
« Reply #18 on: Nov 21st, 2006, 12:44pm »

*Are you two ...  Lips Sealed

99of9
Forum Guru
*****




Gnobby's creator (player #314)

  toby_hudson  


Gender: male
Posts: 1413
Re: First Move Advantage vs. Second Setup Advantag
« Reply #19 on: Nov 21st, 2006, 3:06pm »

on Nov 21st, 2006, 12:44pm, chessandgo wrote:
Casting a quick glance at some top-rated players' statistics, it appears that they indeed have better results with silver than gold, with the exceptions of yourself, Karl, and Adanac.

If you limit my games to HvH, or if you only consider rated games, I have a big advantage with gold.
chessandgo
Forum Guru
*****



Arimaa player #1889

   


Gender: male
Posts: 1244
Re: First Move Advantage vs. Second Setup Advantag
« Reply #20 on: Nov 21st, 2006, 3:35pm »

Indeed 99, you score a good 3 percent better with gold than with silver ... I just saw that Omar has one extra % as well ... and I guess many other players are in the same situation ...
 
My theory crumbles ... Smiley

Fritzlein
Forum Guru
*****



Arimaa player #706

   

Gender: male
Posts: 5928
Re: First Move Advantage vs. Second Setup Advantag
« Reply #21 on: Mar 26th, 2008, 10:02am »

All the talk about evening up the colors in tournament pairing and possibly changing the setup rules to equalize for Gold's advantage made me revisit this thread.  I remember now how most queries on the database to determine rating advantage are invalidated by the inaccuracy of the gameroom ratings and/or the fact that the stronger players tend to play Silver more often.  But there is one methodology I still haven't found any holes in, and that is using only pairs of games where the two players have reversed colors.  Rated games only, of course, to exclude timeouts that got unrated and exclude experiments.  When I ran that 16 months ago, the results were
 
on Nov 16th, 2006, 1:02am, Fritzlein wrote:

Game Type  Pairs  Gold Wins  Mismatch  Gold Adv.  # Std. Dev.
---------  -----  ---------  --------  ---------  -----------
ALL        11222    11231         212        0.4         0.14
B v B       1676     1684         190        2.2         0.32
H v B       8972     9001.5       213        1.6         0.53
H v H        574      545.5       258      -30.3         2.2

Note that I express the advantage of being Gold in rating points.  I decided to try this method again, this time only on game results after 11/15/06, in an attempt to detect any trends in game play.  Perhaps the way we used to play gave an advantage to Silver, but the times have changed and now the playing style gives an advantage to Gold.  The results according to the new data only:

Rated Pairs from 11/15/2006 to 3/23/2008
 
Game Type  Pairs  Gold Wins  Mismatch  Gold Adv.  # Std. Dev.
---------  -----  ---------  --------  ---------  -----------
ALL        10347    10392         251        2.4         0.79
B v B        143      151         128       22.2         0.95
H v B       9872     9896         251        1.4         0.43
H v H        332      345         309       27.5         1.33

Sadly, limiting the dataset to only games of the last sixteen months means that all of the results are statistically insignificant.  On just that data, the Gold advantage (if any) is too small to measure.  Most of the data is HvB, which shows a negligible advantage to Gold, whereas the BvB and HvH data, which show a larger advantage to Gold, cover so few games that the difference could just be a fluke.  We are looking for 2 standard deviations but have at most 1.33.
 
The one thing this data does show to me, however, is that in November 2006 when I measured a "statistically significant" advantage for Silver in HvH games, it was probably a fluke after all, just accidentally among those 5% of times when the result is an outlier due to chance.
 
In an effort to achieve greater statistical significance, I gave up on trying to detect a recent trend.  (Craps players and stock pickers are always sure of finding short-term "trends".  Tongue)  Instead I ran the test on the entire database, i.e. all pairs of games since the beginning of time.
 

Rated Pairs from 11/22/2002 to 3/23/2008
 
Game Type  Pairs  Gold Wins  Mismatch  Gold Adv.  # Std. Dev.
---------  -----  ---------  --------  ---------  -----------
ALL        21926    21985         231        1.4         0.69
B v B       1872     1896         185        5.8         0.88
H v B      19112    19163.5       233        1.4         0.65
H v H        942      925.5       278      -10.9         1.05

 
Even taking all the data together, we have no statistical significance.  For all we know the advantage of setting up second exactly balances the advantage of moving first.
 
I persist in my belief that Gold probably has the advantage, but it appears to be so small that we won't be able to measure it until we have a game database five or ten times the size we have now.  I predict that in the long run, between evenly matched players, Gold will win less than 51%, which is to say playing Gold is worth less than 7 rating points.
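
The "51% is worth less than 7 rating points" conversion can be checked with the standard logistic rating formula. A small sketch, assuming the gameroom ratings follow the usual 400-point scale (the helper name `rating_advantage` is made up for illustration):

```python
import math

def rating_advantage(win_fraction):
    """Rating-point edge implied by a win fraction between evenly matched players.

    Assumption: logistic curve on a 400-point scale, where 400 points = 10:1 odds.
    """
    return 400.0 * math.log10(win_fraction / (1.0 - win_fraction))

gold_at_51 = rating_advantage(0.51)
print(f"51% wins is worth about {gold_at_51:.1f} rating points")
```

An exactly even split maps to 0 points, and the mapping is symmetric, so 49% gives the same figure with a minus sign.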
 
There is a possibility that once we know what we are doing, the advantage of the first move will be magnified.  Maybe now our incompetence adds so much noise to the signal, we simply can't see the true worth of tempo.  Note, however, that this is not true for chess.  In amateur chess games white scores about 55% and in expert chess games white also scores about 55%.  The difference is just that the chess experts have more draws.  If Arimaa is similar, and the balance between the colors doesn't change with increasing skill, then we will perpetually have the luxury of not worrying too much about color balance in tournaments, or changing the setup rule to even things out.  
« Last Edit: Mar 26th, 2008, 10:53am by Fritzlein »

The_Jeh
Forum Guru
*****



Arimaa player #634

   


Gender: male
Posts: 460
Re: First Move Advantage vs. Second Setup Advantag
« Reply #22 on: Mar 26th, 2008, 12:08pm »

Let me try to understand your math. Would it be correct to say that, if gold and silver were in fact completely even, the chance of seeing results as good or better for gold in the ALL category is 28.5%, which is far from statistically significant?
« Last Edit: Mar 26th, 2008, 12:14pm by The_Jeh »
Fritzlein
Forum Guru
*****



Arimaa player #706

   

Gender: male
Posts: 5928
Re: First Move Advantage vs. Second Setup Advantag
« Reply #23 on: Mar 26th, 2008, 12:54pm »

on Mar 26th, 2008, 12:08pm, The_Jeh wrote:
Let me try to understand your math. Would it be correct to say that, if gold and silver were in fact completely even, the chance of seeing results as good or better for gold in the ALL category is 28.5%, which is far from statistically significant?

Well, my math is kind of fuzzier than that.  Let me walk through the all case step by step.
 
We have 21926 pairs of games, i.e. 43852 games total, of which Gold won 21985.  If Gold and Silver were completely equal, then it would be just like flipping a coin to test if the coin is fair.  If you do a large number of coin flips, the results will fall into a normal distribution (whether or not the coin is fair) with a standard deviation of sqrt(npq), where n is the number of trials, p the probability of heads, and q is the probability of tails.
 
Let's say we want to test the null hypothesis that the coin is fair.  If we flip a fair coin 43852 times, we expect heads 21926 times, with a standard deviation of sqrt(43852*(1/2)*(1/2)) = 104.7.  I'm sure you know how to convert standard deviations into probabilities, e.g. there is 68.3% chance of the actual number of heads falling within 104.7 of the mean of 21926.  Our actual number of Gold wins was only 59 away from the mean, i.e. only 0.56 standard deviations.  There is a 28.66% chance that Gold would do this well or better just due to random chance, which I think is the number you are referring to in your post.
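
For anyone who wants to reproduce the fair-coin numbers, here is a minimal Python sketch using the normal approximation described above. The helper `upper_tail` is an illustrative name; it relies only on `math.erfc` from the standard library:

```python
import math

def upper_tail(z):
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

games = 2 * 21926                    # 21926 pairs of games
gold_wins = 21985
mean = games * 0.5                   # expected Gold wins if the colors are even
sd = math.sqrt(games * 0.5 * 0.5)    # sqrt(npq) with p = q = 1/2
z = (gold_wins - mean) / sd
p_one_tailed = upper_tail(z)
print(f"sd = {sd:.1f}, z = {z:.2f}, one-tailed p = {p_one_tailed:.4f}")
```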
 
However, Gold and Silver are usually not even, which affects the significance of the result.  Suppose the data comes from repeated games of players who are 400 rating points apart.  Usually the stronger player will win both as Gold and as Silver, since he has 10 to 1 odds if there is no color advantage.  This helps force the results towards being 50-50 between Gold and Silver, i.e. it conceals the color advantage.  If the observed results deviate from even, we should attribute more significance to them.  The standard deviation for this mismatched case is sqrt(43852*(10/11)*(1/11)) = 60.2, so an observed difference of 59 extra wins for Gold is almost a full standard deviation.
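
The mismatched case can be checked the same way. This sketch assumes the standard logistic rating curve, under which a 400-point gap gives the stronger player 10-to-1 odds (the function name `win_prob` is hypothetical):

```python
import math

def win_prob(rating_gap):
    """Expected score of the stronger player for a given rating gap."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))

games = 43852
p = win_prob(400)                          # 10/11
sd = math.sqrt(games * p * (1.0 - p))      # sqrt(npq) with p = 10/11, q = 1/11
z = 59 / sd                                # 59 extra Gold wins
print(f"p = {p:.4f}, sd = {sd:.1f}, z = {z:.2f}")
```

The bigger the mismatch, the smaller sqrt(npq) becomes, so the same 59 extra wins count for more standard deviations.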
 
Unfortunately, the true skill gap between pairs of players is unknown, and only inaccurately measured by game room ratings.  Furthermore, the skill gap varies from game to game, so I really should sum up the variance introduced by each pair of games, but I was lazy and instead just calculated an average mismatch.  Over all games the stronger player is rated an average of 231 points higher than the weaker player.  So I did my calculation of the size of one standard deviation as if each game had a 231-point mismatch.
 
The last column in the row for all games is the number of standard deviations.  For all games, 0.69 standard deviations means that Gold would have scored 59 extra wins by chance with 24.5% probability.  But actually, since we don't know whether Gold or Silver has an advantage, we really should use a two-tailed statistic, and say that there was a 49% chance of a result at least this extreme.
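
Putting the pieces together for the ALL row, here is a hedged sketch of the shortcut described above: every game is treated as if it had the average 231-point mismatch. It is a reconstruction from the description in this post, not the actual script run against the database:

```python
import math

games, extra_gold_wins, avg_mismatch = 43852, 59, 231

# Stronger player's expected score at the average mismatch (logistic curve).
p = 1.0 / (1.0 + 10.0 ** (-avg_mismatch / 400.0))
sd = math.sqrt(games * p * (1.0 - p))
z = extra_gold_wins / sd

one_tailed = 0.5 * math.erfc(z / math.sqrt(2.0))   # Gold does this well or better
two_tailed = 2.0 * one_tailed                      # either color this far ahead
print(f"z = {z:.2f}, one-tailed = {one_tailed:.3f}, two-tailed = {two_tailed:.3f}")
```

Summing the per-pair variances, as the post says would be more correct, would mean replacing the single `p` with one value per game.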
 
Not only is that number not statistically significant, it is right in the middle of "average extremeness", so to speak.  If the actual game results had been exactly half Gold wins and half Silver wins, or only off by one or two, that would be a really spooky result in a different way from showing a color advantage.  It would be like strong evidence of cosmic Karma trying to even out wins.  I would have to double check my code for errors or look for reasons why the split was so exact and there wasn't more random variation.
 
All in all, the overall result of 0.69 standard deviations is about as non-informative as you can get.
« Last Edit: Mar 26th, 2008, 12:56pm by Fritzlein »

Fritzlein
Forum Guru
*****



Arimaa player #706

   

Gender: male
Posts: 5928
Re: First Move Advantage vs. Second Setup Advantag
« Reply #24 on: Mar 26th, 2008, 1:09pm »

on Dec 14th, 2003, 7:43am, clauchau wrote:
The most elementary strategy consists in randomly picking up a valid step among the immediately available steps, making the selected step on the board, and repeating this four times for every move.
 
random stepper / random stepper
 
Gold won 50.3%,  Silver won 49.7%
 
The winner reached the goal 62.5%
The loser moved an opposing rabbit to the goal 26.0%
The loser was unable to move 11.5%
Loss by 3-times repetition: 0 (no such loss).
 
Shortest game: 11 half-moves
Mean length: 76.9 half-moves (standard deviation = 21.7)
Longest game: 194 half-moves
 
(figures based on 100,000 games run in 4 minutes)

The standard deviation for 100,000 games between equal players is 158 wins, so an actual result of 50,300 wins for Gold is almost two standard deviations above the mean, i.e. on the borderline of statistical significance.  Clauchau, would you be willing to run a larger test so we can say with greater confidence that moving first is an advantage to a random stepper?
 
Won't it be a hoot if we someday measure that among equally matched experts Gold has a 50.3% chance of winning?  That would be an advantage of 2 rating points to Gold Tongue.
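
Both figures for the random-stepper experiment can be checked in a few lines. A sketch, assuming equal players (p = 1/2) for the standard deviation and the usual 400-point logistic scale for the rating conversion:

```python
import math

games, gold_share = 100_000, 0.503

sd = math.sqrt(games * 0.25)                        # sqrt(npq) with p = q = 1/2
z = (gold_share * games - games / 2) / sd           # 300 extra Gold wins
rating_points = 400.0 * math.log10(gold_share / (1.0 - gold_share))
print(f"sd = {sd:.0f} wins, z = {z:.2f}, Gold edge = {rating_points:.1f} rating points")
```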

The_Jeh
Forum Guru
*****



Arimaa player #634

   


Gender: male
Posts: 460
Re: First Move Advantage vs. Second Setup Advantag
« Reply #25 on: Mar 26th, 2008, 1:44pm »

The number I was referring to was the binomial probability if gold and silver are each equally likely to win. It's been a while since I took statistics and I had forgotten how to calculate what the standard deviation is. Thanks for the review.
 
Perhaps you could get some perfectly even games by pitting a bot against itself several times. The results might be confounded by the style of the bot itself, but the results could be interesting.
« Last Edit: Mar 26th, 2008, 1:52pm by The_Jeh »
mistre
Forum Guru
*****





   


Gender: male
Posts: 553
Re: First Move Advantage vs. Second Setup Advantag
« Reply #26 on: Mar 26th, 2008, 3:12pm »

Or we could just have Karl play a match against himself.  Whichever color wins has the advantage.  If it is a draw, then there is no advantage for either color.   Wink

The_Jeh
Forum Guru
*****



Arimaa player #634

   


Gender: male
Posts: 460
Re: First Move Advantage vs. Second Setup Advantag
« Reply #27 on: Mar 26th, 2008, 3:41pm »

Let's say that eventually we find that gold wins 60% of the time and silver 40% in master-level games. That is significant and I'm sure we would want a rule change. Or would we? Suppose, hypothetically, that we solve the game and discover that despite gold's 60% rate of victory, it is actually silver who could force a win with perfect play. With this new knowledge, would we want to change the rules or not?
 
My guess is that we would. But why does gold win more often? It must be because it is easier to make the best moves as gold than it is as silver. But why? And how do we measure this "difficulty in choosing?" Is it that gold's move choices lead to a higher average outcome than silver, even though silver's highest forced outcome is better than gold's highest forced outcome? Or is it for some reason easier to spot a good move for gold than for silver?
 
When a game has the intuitive feel of being mechanically perfect, such as chess, I am superstitious of changing its rules just for the sake of statistics obtained from mortal players.
« Last Edit: Mar 26th, 2008, 3:43pm by The_Jeh »
Fritzlein
Forum Guru
*****



Arimaa player #706

   

Gender: male
Posts: 5928
Re: First Move Advantage vs. Second Setup Advantag
« Reply #28 on: Mar 26th, 2008, 3:43pm »

on Mar 26th, 2008, 1:44pm, The_Jeh wrote:
The number I was referring to was the binomial probability if gold and silver are each equally likely to win. It's been a while since I took statistics and I had forgotten how to calculate what the standard deviation is. Thanks for the review.

Ah, that's another way in which I'm careless: I assume the normal curve as an approximation of the binomial.  Also I vaguely recall that when I use the continuous approximation instead of the exact discrete, I should have my boundary at 59.5 instead of 59.  Oh, well.  It's a good thing we got nearly the same answer!
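
To see how close the normal curve gets to the exact binomial here, the two can be compared directly. The sketch below sums the exact tail with log-gamma and then evaluates the normal tail at three boundaries. Which half-step correction applies depends on whether the event is "at least 59 extra wins" (boundary 58.5) or "more than 59 extra wins" (boundary 59.5); that reading is an assumption on my part, so both are shown:

```python
import math

n, k = 43852, 21985            # games played, Gold wins
sd = math.sqrt(n * 0.25)

def log_pmf(i):
    """log P(X = i) for X ~ Binomial(n, 1/2), via log-gamma to avoid overflow."""
    return (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + n * math.log(0.5))

def normal_tail(extra_wins):
    """Normal-approximation P(wins - n/2 >= extra_wins)."""
    return 0.5 * math.erfc(extra_wins / (sd * math.sqrt(2.0)))

exact_at_least = math.fsum(math.exp(log_pmf(i)) for i in range(k, n + 1))
exact_more_than = exact_at_least - math.exp(log_pmf(k))

print(f"exact  P(X >= {k}) = {exact_at_least:.4f}")
print(f"exact  P(X >  {k}) = {exact_more_than:.4f}")
for boundary in (58.5, 59.0, 59.5):
    print(f"normal, boundary {boundary} = {normal_tail(boundary):.4f}")
```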
« Last Edit: Mar 26th, 2008, 3:45pm by Fritzlein »

aaaa
Forum Guru
*****



Arimaa player #958

   


Posts: 768
Re: First Move Advantage vs. Second Setup Advantag
« Reply #29 on: Apr 13th, 2008, 3:01pm »

Based on my calculations, Gold's winning percentage varies wildly depending on how the games are weighted by rating. In order for me to come to a more reliable number, Fritzlein and chessandgo would need to play a bunch of games against each other with alternating colors (cf. Karpov vs. Kasparov).