Author |
Topic: 2012 Computer Championship (Read 8138 times) |
|
ingwa
Forum Full Member
Arimaa player #573
Posts: 13
|
|
Re: 2012 Computer Championship
« Reply #30 on: Dec 6th, 2011, 11:49am » |
|
There was a discussion about this in the chat today, and for what it's worth I also agree that quadruple elimination makes sense. -Inge (co-author of Badger)
|
|
|
|
|
omar
Forum Guru
Arimaa player #2
Posts: 1003
|
|
Re: 2012 Computer Championship
« Reply #31 on: Dec 9th, 2011, 12:27pm » |
|
The nice thing about the round robin format is that there is no need for seeding. However, it comes at the cost of many games. Since those extra games don't really improve the performance of the format over FTE, it's hard to justify using round robin, especially since FTE does a good job even when the seeding is practically random.

I ran some simulations. Using round robin with 8 players, a true rating distribution of 200, a rating inaccuracy of 50 and 0 chance of a draw. Each result is from 1000 tournaments and measures the probability of picking the player with the highest true rating.

./run4 'formats/roundRobin' 1000 8 200 50 0 1
26.1%

With a measured rating inaccuracy of 0. This should not make any difference, since roundRobin does not depend on measured ratings:
./run4 'formats/roundRobin' 1000 8 200 0 0 1
25.5%

With a measured rating inaccuracy of 500. Again, this should not make any difference, since roundRobin does not depend on measured ratings:
./run4 'formats/roundRobin' 1000 8 200 500 0 1
25.2%

Now using double round robin:
./run4 'formats/roundRobinDouble' 1000 8 200 50 0 1
31.4%
./run4 'formats/roundRobinDouble' 1000 8 200 0 0 1
31.7%
./run4 'formats/roundRobinDouble' 1000 8 200 500 0 1
29.8%

Now using FTE:
./run4 'formats/floatTripElim' 1000 8 200 50 0 1
32.3%

With a measured rating inaccuracy of 0. Perfect seeding.
./run4 'formats/floatTripElim' 1000 8 200 0 0 1
33.6%

With a measured rating inaccuracy of 500. Almost random seeding.
./run4 'formats/floatTripElim' 1000 8 200 500 0 1
30.8%

Now using FQE (quad elimination):
./run4 'formats/floatQuadElim' 1000 8 200 50 0 1
35.3%
./run4 'formats/floatQuadElim' 1000 8 200 0 0 1
37.5%
./run4 'formats/floatQuadElim' 1000 8 200 500 0 1
32.0%

Even though the results are reported to a tenth of a percent, they can vary by about 2% from one run to another, so the measurement error is at least +/-1%. Even with bad seeding, FTE performs almost as well as double round robin. FQE performs slightly better.
Certainly FQE could be justified, since it does improve performance with minimal additional rounds (about 11.2 rounds). But it is really cutting it close, since I would have liked the number of rounds to be 10 or less. However, I am willing to go with FQE next year. I will also use it this year if all the participating bot developers agree.

Just for fun I tried FPE (penta elimination), and it is still slightly better than FQE.
./run4 'formats/floatPentElim' 1000 8 200 50 0 1
35.6%
./run4 'formats/floatPentElim' 1000 8 200 0 0 1
37.9%
./run4 'formats/floatPentElim' 1000 8 200 500 0 1
33.4%

If you want to try your own experiments, the simulator can be downloaded from here: http://arimaa.com/arimaa/tourn/compare/
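For readers who want a feel for this kind of experiment without downloading the simulator, here is a minimal Python sketch of the single round robin case. It is not the run4 code and is not expected to reproduce the percentages in this post. It assumes that the "true rating distribution of 200" can be modeled as a normal distribution with a standard deviation of 200 Elo points, that game results follow the standard Elo win probability with no draws, and that ties for first place are broken at random. All function names are hypothetical.

```python
import random


def win_probability(rating_a, rating_b):
    """Standard Elo expected score for A against B (draws are ignored)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def play_round_robin(true_ratings, rng):
    """Play one single round robin and return the index of the winner.

    Ties for the most wins are broken at random (an assumption).
    """
    n = len(true_ratings)
    wins = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < win_probability(true_ratings[i], true_ratings[j]):
                wins[i] += 1
            else:
                wins[j] += 1
    best = max(wins)
    leaders = [i for i, w in enumerate(wins) if w == best]
    return rng.choice(leaders)


def estimate_pick_rate(num_tournaments, num_players, rating_spread, seed=0):
    """Fraction of simulated tournaments won by the player with the highest true rating."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_tournaments):
        # Assumption: true ratings are drawn from a normal distribution.
        ratings = [rng.gauss(1500.0, rating_spread) for _ in range(num_players)]
        strongest = max(range(num_players), key=lambda i: ratings[i])
        if play_round_robin(ratings, rng) == strongest:
            hits += 1
    return hits / num_tournaments


if __name__ == "__main__":
    print(estimate_pick_rate(1000, 8, 200.0))
```

Round robin never looks at measured ratings, which is why this sketch has no rating inaccuracy parameter. Running it with several different seeds also gives a sense of the run-to-run variation at 1000 tournaments.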
|
|
|
|
|
rbarreira
Forum Guru
Arimaa player #1621
Posts: 605
|
|
Re: 2012 Computer Championship
« Reply #32 on: Dec 9th, 2011, 2:02pm » |
|
Thanks for all the simulations, Omar, and for offering to make the tournament quadruple elimination (depending on the developers' votes)! According to your results, it seems perfect seeding also makes a noticeable difference compared with random seeding, almost as much as adding another round.
on Dec 9th, 2011, 12:27pm, omar wrote: However, I am willing to go with FQE next year. Will also use it this year if all the participating bot developers agree. |
| So it seems that's two votes already (me and ingwa, although he hasn't registered bot_badger yet). Three participating developers to go (if no one else registers).
|
« Last Edit: Dec 9th, 2011, 2:19pm by rbarreira » |
|
|
|
jdb
Forum Guru
Arimaa player #214
Posts: 682
|
|
Re: 2012 Computer Championship
« Reply #33 on: Dec 9th, 2011, 6:07pm » |
|
I support quad. elimination format.
|
|
|
|
|
lightvector
Forum Guru
Arimaa player #2543
Posts: 197
|
|
Re: 2012 Computer Championship
« Reply #34 on: Dec 9th, 2011, 7:41pm » |
|
FQE sounds good to me. on Dec 6th, 2011, 11:13am, rbarreira wrote: I fed the 2010 WCC and 2011 WCC games (separately) into BayesElo and it gave a probability of superiority of the winner over the 2nd place of just 67% and 58% respectively. |
| It always amazes me how many games it takes to get statistical significance one way or the other.
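To put rough numbers on that, here is a small Python sketch of a head-to-head "probability of superiority". It is not BayesElo's actual computation. It assumes a simplified model: only decisive games between the two players count, each game is an independent trial with an unknown win probability p, and p has a uniform prior, so the posterior after w wins and l losses is Beta(w+1, l+1). For whole-number counts, P(p > 0.5) then reduces to a binomial sum. The function name is hypothetical.

```python
from math import comb


def probability_of_superiority(wins, losses):
    """P(p > 0.5) under a uniform prior on p, after `wins` wins and `losses` losses.

    Uses the identity P(Beta(w+1, l+1) > 0.5) = P(Binomial(w+l+1, 0.5) <= w).
    """
    n = wins + losses + 1
    return sum(comb(n, k) for k in range(wins + 1)) / 2 ** n


if __name__ == "__main__":
    # The same 60% score at increasing numbers of games.
    for wins, losses in [(3, 2), (6, 4), (12, 8), (30, 20), (60, 40)]:
        print(wins, losses, round(probability_of_superiority(wins, losses), 3))
```

Under this model a 3-2 head-to-head score gives a probability of superiority of only about 0.66, and the loop shows how slowly the figure rises when the same 60% score is sustained over more games.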
|
|
|
|
|
tize
Forum Guru
Arimaa player #3121
Posts: 118
|
|
Re: 2012 Computer Championship
« Reply #35 on: Dec 10th, 2011, 1:48pm » |
|
I'm also willing to accept a change to FQE.
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: 2012 Computer Championship
« Reply #36 on: Dec 10th, 2011, 2:16pm » |
|
Omar, thanks for being willing to add a fourth elimination. That will be great for the participants. Even though the games are a bit slow-paced for spectators, I think the additional games will please the audience as well. In each of the last four years, the first-place bot lost between zero and two times to the second-place bot, but not to any other bot, while the second-place bot lost three times to the winner and zero times to anyone else. That made it relatively clear which bots were the top two and therefore should advance to qualifying, but there was presumably still some luck involved. I'm going to go out on a limb and predict that with four lives for each bot, the situation won't repeat. At least one of the top two bots (presumably sharp and marwin) will lose a game to a bot that doesn't finish in the top two.
|
|
|
|
|
omar
Forum Guru
Arimaa player #2
Posts: 1003
|
|
Re: 2012 Computer Championship
« Reply #37 on: Dec 13th, 2011, 11:32pm » |
|
OK, quad elimination it is. Seeding will still be based on games against the benchmark bots, as opposed to previous years' results. Any suggestions for the benchmark bots?
|
|
|
|
|
rbarreira
Forum Guru
Arimaa player #1621
Posts: 605
|
|
Re: 2012 Computer Championship
« Reply #38 on: Dec 14th, 2011, 2:25am » |
|
on Dec 13th, 2011, 11:32pm, omar wrote: Any suggestions for the benchmark bots. |
| I suggest removing Arimaazilla and Aamira2006P2. That still leaves Gnobot2005Blitz to differentiate between weak potential entrants that can't beat any of the better bots.
|
|
|
|
|
omar
Forum Guru
Arimaa player #2
Posts: 1003
|
|
Re: 2012 Computer Championship
« Reply #39 on: Dec 17th, 2011, 9:30pm » |
|
on Dec 14th, 2011, 2:25am, rbarreira wrote: I suggest removing Arimaazilla and Aamira2006P2. That still leaves Gnobot2005Blitz to differentiate between weak potential entrants that can't beat any of the better bots. |
| That would reduce the number of bots and make the qualifying phase a bit less work. I am OK with this. If no objections are raised soon, I'll update the WCC page.
|
|
|
|
|
tize
Forum Guru
Arimaa player #3121
Posts: 118
|
|
Re: 2012 Computer Championship
« Reply #40 on: Dec 20th, 2011, 3:11pm » |
|
I have no objection to reducing the number of qualification bots. Actually, I wouldn't even object to seeding by the sum of the letters in each bot's name (a=1, b=2, ...) modulo some nice number, as long as no more bots enter the tournament. The test simulations that I ran here at home indicate that the winner (in a tight 4-bot tournament) will be picked 49-53 percent of the time as the seeding goes from random to perfect.
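For the curious, the name-sum idea can be sketched in a few lines of Python. The modulus of 7, the alphabetical tie-break and the function names are all hypothetical choices for illustration, and characters other than the letters a-z are simply skipped.

```python
def name_sum(bot_name):
    """Sum of letter values (a=1, b=2, ..., z=26), ignoring case and non-letters."""
    return sum(ord(c) - ord("a") + 1 for c in bot_name.lower() if "a" <= c <= "z")


def seed_order(bot_names, modulus=7):
    """Order bots by name sum modulo `modulus`; ties fall back to alphabetical order."""
    return sorted(bot_names, key=lambda name: (name_sum(name) % modulus, name))


if __name__ == "__main__":
    print(seed_order(["badger", "sharp", "marwin"]))
```

With these choices, "badger" sums to 37, "sharp" to 62 and "marwin" to 78, so the order comes out as marwin, badger, sharp.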
|
|
|
|
|
aaaa
Forum Guru
Arimaa player #958
Posts: 768
|
|
Re: 2012 Computer Championship
« Reply #41 on: Dec 23rd, 2011, 2:15pm » |
|
Will there be any plans to prevent games from being affected by undue time loss, maybe by means of a verification script parsing the game log?
|
|
|
|
|
omar
Forum Guru
Arimaa player #2
Posts: 1003
|
|
Re: 2012 Computer Championship
« Reply #42 on: Dec 28th, 2011, 11:46am » |
|
on Dec 23rd, 2011, 2:15pm, aaaa wrote:Will there be any plans to prevent games from being affected by undue time loss, maybe by means of a verification script parsing the game log? |
| If a game times out, we will have humans look at the logs, determine what might have caused the problem, present it to the TD, and proceed based on that. If it is a case covered in the Technical Problems section of the rules, then a decision by the TD is not required and it can be resolved as directed in that section. I don't plan on writing a script to parse the game logs.
|
|
|
|
|
aaaa
Forum Guru
Arimaa player #958
Posts: 768
|
|
Re: 2012 Computer Championship
« Reply #43 on: Dec 28th, 2011, 2:09pm » |
|
The problem, then, is that, perversely enough, it is exactly the bots programmed to recognize and account for unfairly lost time that end up suffering for it in silence.
|
|
|
|
|
omar
Forum Guru
Arimaa player #2
Posts: 1003
|
|
Re: 2012 Computer Championship
« Reply #44 on: Dec 28th, 2011, 2:42pm » |
|
on Dec 28th, 2011, 2:09pm, aaaa wrote:The problem then is, that it is, perversely enough, exactly those bots that are programmed to recognize and take into account unfairly lost time, that end up suffering for it in silence. |
| Can you elaborate on this?
|
|
|
|
|
|