Author |
Topic: 2009 Arimaa Events (Read 4180 times) |
|
aaaa
Forum Guru
Arimaa player #958
Posts: 768
|
|
Re: 2009 Arimaa Events
« Reply #15 on: Nov 7th, 2008, 8:12pm » |
|
You could mitigate the problem a bit by having the seeding be determined to a certain extent by the performance of a player in last year's championship.
|
|
|
|
|
RonWeasley
Forum Guru
Harry's friend (Arimaa player #441)
Posts: 882
|
|
Re: 2009 Arimaa Events
« Reply #16 on: Nov 12th, 2008, 7:43am » |
|
Omar, I went to register for the Spectator contest. The description says the fee is $5 but PayPal asks for $10. I'll pay either one, but which is correct? (Or is it $10 for me and $5 for everybody else?)
|
|
|
|
|
chessandgo
Forum Guru
Arimaa player #1889
Posts: 1244
|
|
Re: 2009 Arimaa Events
« Reply #17 on: Nov 12th, 2008, 10:26am » |
|
This is tax on Time-Turners probably.
|
|
|
|
|
Adanac
Forum Guru
Arimaa player #892
Posts: 635
|
|
Re: 2009 Arimaa Events
« Reply #18 on: Nov 14th, 2008, 8:26am » |
|
on Nov 12th, 2008, 7:43am, RonWeasley wrote:Omar, I went to register for the Spectator contest. The description says the fee is $5 but PayPal asks for $10. I'll pay either one, but which is correct? (Or is it $10 for me and $5 for everybody else?) |
| I noticed the same thing as well, but Omar has now fixed the problem. The time-turner-tax has been eliminated.
|
|
|
|
|
omar
Forum Guru
Arimaa player #2
Posts: 1003
|
|
Re: 2009 Arimaa Events
« Reply #19 on: Nov 14th, 2008, 11:41am » |
|
I forgot to change the settings on the PayPal button. Thanks for reminding me. It's been fixed now. There go my hopes of winning the spectator contest by being the only contestant.
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: 2009 Arimaa Events
« Reply #20 on: Nov 21st, 2008, 6:14am » |
|
It's official. There are nine players registered for the World Championship, therefore we can't skip straight to the finals. There will have to be a preliminary round. Yay!
|
|
|
|
|
aaaa
Forum Guru
Arimaa player #958
Posts: 768
|
|
Re: 2009 Arimaa Events
« Reply #21 on: Nov 25th, 2008, 2:45pm » |
|
Can the format of the Computer Championship be changed to floating quadruple elimination? We now have what appears to be four closely matched bots, so I think that in order for the final result to be a sufficiently accurate reflection of strength, it's now more important than ever that the tournament should truly be discerning, especially in light of the excessive influence we have currently attributed to its seeding.
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: 2009 Arimaa Events
« Reply #22 on: Nov 26th, 2008, 8:22am » |
|
I also like quadruple elimination. The pairing program can handle it, because the program is limited by participants, not by rounds. The extra elimination makes the seeding less relevant and the determination of the winner more accurate. The only drawback is that the tournament lasts longer and is more work for Omar. There are too many possible pairings to calculate by hand, but I estimate that for eight participants triple elimination has 7-10 rounds and 21-23 games, while quadruple elimination has 9-13 rounds and 28-31 games. To compress the time, Omar might want to do two rounds per day in the late rounds. Omar has talked about not compressing the rounds, so that spectators have some time to see the schedule in advance and watch the games live, but I don't think this should be a large factor in the pacing of the tournament. The slow time control is not good for spectating anyway, plus the different time zones already mean each game is impossible for some spectators to watch, so I don't see a problem with running through the CC games as quickly as is convenient.
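The game counts in that estimate can be checked with a little arithmetic. A minimal sketch, assuming every game is decisive (no draws) and the tournament runs until only one player is left: each eliminated player has exactly k losses, the winner has between 0 and k-1, and every game produces exactly one loss. The function name `game_bounds` is made up for this illustration. The round counts depend on how the pairing program floats players, so they are not derived here.

```python
def game_bounds(players: int, eliminations: int) -> tuple[int, int]:
    """Fewest and most games in a k-loss elimination event with no draws."""
    if players < 2 or eliminations < 1:
        raise ValueError("need at least 2 players and 1 elimination")
    # Each of the (players - 1) eliminated players loses exactly k games.
    fewest = eliminations * (players - 1)
    # The winner can absorb up to k - 1 further losses without going out.
    most = fewest + eliminations - 1
    return fewest, most


for name, k in (("triple", 3), ("quadruple", 4)):
    low, high = game_bounds(8, k)
    print(f"{name} elimination, 8 players: {low}-{high} games")
# prints:
# triple elimination, 8 players: 21-23 games
# quadruple elimination, 8 players: 28-31 games
```

Both ranges agree with the 21-23 and 28-31 figures quoted above.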
|
|
|
|
|
omar
Forum Guru
Arimaa player #2
Posts: 1003
|
|
Re: 2009 Arimaa Events
« Reply #23 on: Nov 29th, 2008, 12:27pm » |
|
on Nov 25th, 2008, 2:45pm, aaaa wrote:Can the format of the Computer Championship be changed to floating quadruple elimination? We now have what appears to be four closely matched bots, so I think that in order for the final result to be a sufficiently accurate reflection of strength, it's now more important than ever that the tournament should truly be discerning, especially in light of the excessive influence we have currently attributed to its seeding. |
| You might be surprised to know that ratings can be better at picking the best player than even a double round robin tournament. Just depends on how accurate the ratings are. Also the gain in higher probability of picking the best player is only about 2% between FTE and FQE. http://arimaa.com/arimaa/forum/cgi/YaBB.cgi?board=talk;action=display;num=1114794077;start=84#84 It is quite surprising, but there is much more to be gained by improving the initial seeding than by extending the number of rounds. However, these simulations were done for 16 player tournaments; they need to be rerun for 8 player tournaments.
|
« Last Edit: Nov 29th, 2008, 12:28pm by omar » |
|
|
|
aaaa
Forum Guru
Arimaa player #958
Posts: 768
|
|
Re: 2009 Arimaa Events
« Reply #24 on: Nov 29th, 2008, 2:31pm » |
|
on Nov 29th, 2008, 12:27pm, omar wrote:You might be surprised to know that ratings can be better at picking the best player than even a double round robin tournament. |
| Oh, but I'm not surprised by this at all; see below. Quote:Just depends on how accurate the ratings are. |
| And how do we arrive at accurate ratings? By having games be played between the contestants in the first place! You must surely see that there is some sort of "no free lunch" theorem at work here? Wouldn't it in that case be much better that it's more the games in the controlled environment of the tournament itself that are going to determine the result rather than a rating system that represents other games, full of the severe flaws that would come with it, not the least of which is the extreme vulnerability to self-selection distortions? I just met an Arimaa player who saw maximizing his own rating as a goal in itself and for that reason refused to play me rated. Quote:Also the gain in higher probability of picking the best player is only about 2% between FTE and FQE. Even accepting that number for the sake of argument, that doesn't cover the additional advantage that, even with no change in winner, the inclination towards the "right" result would still be more likely to become manifest in the degree of decisiveness of the final result, i.e. it would likely make us more confident in a "right" result and less in a "wrong" one. The importance of this should not be underestimated. Quote:It is quite surprising, but there is much more to be gained by improving the initial seeding than by extending the number of rounds. |
| I still think that, for the purpose of seeding, trying to fix the ratings rather than outright minimizing their influence would be beating the air. It might even be a dangerous pursuit, as it could lead to a false confidence in a solution that would only (temporarily) hide the weaknesses. Quote:However, these simulations were done for 16 player tournaments; they need to be rerun for 8 player tournaments. |
| More simulations can never hurt of course, but I, for one, would like to see what happens if there are several closely matched players (as I consider the top-4 bots to be) in them. Chances are, we'll see the percentages fall significantly, even with more reliable ratings.
|
|
|
|
|
jdb
Forum Guru
Arimaa player #214
Posts: 682
|
|
Re: 2009 Arimaa Events
« Reply #25 on: Nov 29th, 2008, 4:51pm » |
|
Just a quick comment, and then I'll go back to lurking on this thread. If a bot is under constant development, the bot that got its rating in the game room and the one that enters the tournament, a couple weeks later, could be very different.
|
|
|
|
|
omar
Forum Guru
Arimaa player #2
Posts: 1003
|
|
Re: 2009 Arimaa Events
« Reply #26 on: Dec 3rd, 2008, 9:02am » |
|
I ran some simulations to compare triple elimination with quad elimination when there are 8 players. It seems like the improvement is closer to about 4%.
run3 'formats/floatTripElim' 1000 8 500 50 10000000 means: 1000 trials, 8 players, range of true ratings is 500 elo points, measured rating inaccuracy of 50 elo points, and a draw ratio of 1:10000000.
Code:
run3 'formats/floatTripElim' 1000 8 500 50 10000000
1 52.4%
average number of rounds = 8.97
average rating from best = 32.4
run3 'formats/floatQuadElim' 1000 8 500 50 10000000
1 56.3%
average number of rounds = 11.43
average rating from best = 29.5
run3 'formats/floatTripElim' 1000 8 500 100 10000000
1 51.0%
average number of rounds = 9.04
average rating from best = 34.2
run3 'formats/floatQuadElim' 1000 8 500 100 10000000
1 54.9%
average number of rounds = 11.41
average rating from best = 28.7
run3 'formats/floatTripElim' 1000 8 500 200 10000000
1 52.0%
average number of rounds = 9.03
average rating from best = 34.2
run3 'formats/floatQuadElim' 1000 8 500 200 10000000
1 50.1%
average number of rounds = 11.46
average rating from best = 33.3
run3 'formats/floatTripElim' 1000 8 500 400 10000000
1 51.3%
average number of rounds = 9.00
average rating from best = 34.8
run3 'formats/floatQuadElim' 1000 8 500 400 10000000
1 53.2%
average number of rounds = 11.38
average rating from best = 33.0
|
| I didn't see what happens yet if the true rating range is smaller. The programs I used are available from: http://arimaa.com/arimaa/tourn/compare/sim.tar or in ZIP format: http://arimaa.com/arimaa/tourn/compare/sim.zip
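For readers who want a feel for the parameters without unpacking the archive, here is a rough sketch of the ingredients such a simulation needs. It is not the code behind `run3`. The uniform spread of true ratings, the Gaussian noise on measured ratings, and all function names are assumptions made for illustration; draws are ignored, which seems harmless given the 1:10000000 draw ratio.

```python
import random


def elo_win_probability(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def make_field(players: int, rating_range: float, inaccuracy: float,
               rng: random.Random) -> tuple[list[float], list[float]]:
    """Return (true, measured) ratings for one trial.

    Assumptions, not taken from run3: true ratings are uniform over
    rating_range, and each measured rating is the true rating plus
    Gaussian noise with standard deviation `inaccuracy`.
    """
    true = [rng.uniform(0.0, rating_range) for _ in range(players)]
    measured = [t + rng.gauss(0.0, inaccuracy) for t in true]
    return true, measured


def a_beats_b(true_a: float, true_b: float, rng: random.Random) -> bool:
    """Decide one game from the true ratings; draws are ignored."""
    return rng.random() < elo_win_probability(true_a, true_b)


def top_seed_hit_rate(trials: int, players: int, rating_range: float,
                      inaccuracy: float, seed: int = 0) -> float:
    """Fraction of trials in which the top seed is truly the best player."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        true, measured = make_field(players, rating_range, inaccuracy, rng)
        if measured.index(max(measured)) == true.index(max(true)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for inaccuracy in (50, 100, 200, 400):
        rate = top_seed_hit_rate(1000, 8, 500, inaccuracy)
        print(f"inaccuracy {inaccuracy}: top seed is best in {rate:.1%} of trials")
```

`top_seed_hit_rate` only measures how often seeding alone would pick the best player. A full comparison of formats would put a pairing scheme on top and call `a_beats_b` for every game.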
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: 2009 Arimaa Events
« Reply #27 on: Dec 3rd, 2008, 9:48am » |
|
The 400-point inaccuracy for gameroom ratings might be the most realistic of the trials you ran. Remember in 2008 Sharp was the lowest seed with a rating of 1500, but then came in second place, ahead of OpFor who had a pre-tournament rating of 1890. That's a 390-point swing right there. Also in 2006, the top two seeds by gameroom rating were GnoBot and Loc, while Bomb was the bottom seed, but Bomb took first while GnoBot and Loc took the bottom two spots. Right now we have a situation where GnoBot and Rat each have a Bomb-beating formula, so either developer could push down Bomb to the bottom seed if they wanted, while giving their own bot the top seed. In such a situation, it makes much more sense to me to assume wildly inaccurate ratings in the simulations.
As for the range of true strengths of the bots, I would guess that it has been 500 points in the past, but this year there are a lot of bots that are close to each other at the top, so we might be trying to make a finer discrimination than in the past.
For the statistic "average rating from best", do you average in a zero when the best player wins, or is it the average only from the times when the best player doesn't win? If the former, then the two formats aren't that different in their misses, but if the latter, quadruple elimination is not only more likely to determine the best player, it is also more likely to miss only by a little. As I think about it, if triple elimination is right over half the time and, when it is wrong, is only wrong by an average of 35 rating points, it's not doing so badly.
Finally, for the Computer Championship, aren't you more concerned about the average number of games than the average number of rounds, since you can only play one game at a time no matter how many games are in the round? But maybe the increase in games is essentially proportional to the increase in rounds.
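The two readings of "average rating from best" are related by a simple rescaling, sketched below. This assumes, purely for illustration, that the reported figure is the first reading (a zero is averaged in whenever the best player wins); the post does not establish which reading the simulator uses. The function name is made up.

```python
def miss_only_average(overall_average: float, best_win_rate: float) -> float:
    """Turn an average that counts best-player wins as zero into the
    average taken over the misses only."""
    if not 0.0 <= best_win_rate < 1.0:
        raise ValueError("best_win_rate must be in [0, 1)")
    # overall = (1 - p) * miss_only + p * 0, so miss_only = overall / (1 - p)
    return overall_average / (1.0 - best_win_rate)


# Figures from the 8-player runs with 400 points of rating inaccuracy.
for name, win_rate, average in (("floatTripElim", 0.513, 34.8),
                                ("floatQuadElim", 0.532, 33.0)):
    print(f"{name}: {miss_only_average(average, win_rate):.1f} points when it misses")
```

Under that reading the typical miss would be roughly twice the reported figure, because a little over half of the trials contribute a zero.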
|
« Last Edit: Dec 3rd, 2008, 10:03am by Fritzlein » |
|
|
|
aaaa
Forum Guru
Arimaa player #958
Posts: 768
|
|
Re: 2009 Arimaa Events
« Reply #28 on: Dec 3rd, 2008, 10:22am » |
|
If we restrict our attention to the top-4 bots (with all due respect to the developers of the others), we see that the current maximum difference in rating is 160. If for each run, the true rating range is set to this number plus the given rating inaccuracy we get the following:
Code:
./run3 'formats/floatTripElim' 1024 4 190 30 1000000000
1 46.4%
2 26.4%
3 17.8%
4 9.5%
average number of rounds = 6.80
average rating from best = 29.5
./run3 'formats/floatQuadElim' 1024 4 190 30 1000000000
1 52.5%
2 26.5%
3 13.7%
4 7.3%
average number of rounds = 8.86
average rating from best = 23.7
./run3 'formats/floatTripElim' 1024 4 220 60 1000000000
1 47.4%
2 29.8%
3 15.2%
4 7.6%
average number of rounds = 6.81
average rating from best = 30.3
./run3 'formats/floatQuadElim' 1024 4 220 60 1000000000
1 54.2%
2 25.7%
3 14.3%
4 5.9%
average number of rounds = 8.89
average rating from best = 25.9
./run3 'formats/floatTripElim' 1024 4 280 120 1000000000
1 53.3%
2 26.9%
3 13.4%
4 6.4%
average number of rounds = 6.79
average rating from best = 30.9
./run3 'formats/floatQuadElim' 1024 4 280 120 1000000000
1 54.3%
2 27.6%
3 12.1%
4 6.0%
average number of rounds = 8.83
average rating from best = 27.8
|
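With about a thousand trials per run, it may help to check how much of each gap could be sampling noise. A minimal sketch using the usual normal approximation for the difference of two proportions. It assumes the two runs are independent and reads the `1 …%` line as the share of trials won by the best player; the function name is made up for this illustration.

```python
import math


def difference_z_score(p1: float, p2: float, trials1: int, trials2: int) -> float:
    """z-score of (p2 - p1) for two independently estimated proportions."""
    standard_error = math.sqrt(p1 * (1.0 - p1) / trials1
                               + p2 * (1.0 - p2) / trials2)
    return (p2 - p1) / standard_error


# (triple, quadruple) best-player win rates from the three pairs of runs above.
runs = {"190/30": (0.464, 0.525),
        "220/60": (0.474, 0.542),
        "280/120": (0.533, 0.543)}
for label, (triple, quadruple) in runs.items():
    z = difference_z_score(triple, quadruple, 1024, 1024)
    print(f"range/inaccuracy {label}: z = {z:.2f}")
```

The standard error of each difference is a little over two percentage points, so a gap of one or two points between formats cannot be told apart from noise at this trial count. More trials would narrow that.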
|
|
|
|
|
|
aaaa
Forum Guru
Arimaa player #958
Posts: 768
|
|
Re: 2009 Arimaa Events
« Reply #29 on: Dec 3rd, 2008, 10:34am » |
|
on Dec 3rd, 2008, 9:48am, Fritzlein wrote:For the statistic "average rating from best", do you average in a zero when the best player wins, or is it the average only from the times when the best player doesn't win? |
| From a cursory glance at the code it appears that it's simply the average distance from every true rating to the best one and that the tournament results don't come into it.
|
« Last Edit: Dec 3rd, 2008, 11:11am by aaaa » |
|
|
|
|