Author |
Topic: 2006 World Championship (Read 6105 times) |
|
99of9
Forum Guru
Gnobby's creator (player #314)
Posts: 1413
|
|
2006 World Championship
« on: Nov 13th, 2005, 8:49pm » |
|
Looks like it's time for a new thread so we can discuss the guts of the tourney as they spill. Having seen the pairings, I've appreciated for the first time the serious strength in this year's championship. We'll either see some big upsets in round 1, or some very big clashes already in round 2. If things go according to ratings, 2005-world-champ plays 2004&2005-third-place in round 2 of a 7 round tourney!?! Unheard of. And it's not just because Omar is underrated... nearly all the round 2 clashes look set to be almost as big. (Even round 1 has naveed vs robinson and jdb vs Ryan ...) So, having seen the depth of the field, I think this year's tourney is really up for grabs. I could readily imagine any of 10 players actually winning the comp! And well done to everyone who entered, it's really great to see even some beginners throwing in their hand. I think participating in something like this will be a great experience for all of us.
|
|
|
|
|
Ryan_Cable
Forum Guru
Arimaa player #951
Posts: 138
|
|
Re: 2006 World Championship
« Reply #1 on: Nov 14th, 2005, 2:50am » |
|
The tournament has 16 players:

Seed  Player       Rating
1     Fritzlein    2273
2     99of9        2142
3     Ryan_Cable   2105
4     PMertens     2045
5     Adanac       2039
6     robinson     2026
7     Belbo        2009
8     omar         1860
9     Paul         1827
10    BlackKnight  1818
11    naveed       1678
12    megamau      1651
13    grey_0x2A    1609
14    jdb          1584
15    acheron      1386
16    MrBrain      1369

Median 1843.5, Mean 1838.8, StDev 271.2

I wrote a Python script to simulate the tournament and estimate people’s chances. I estimated people’s true-rating by taking a Gaussian distribution about their rating. For each case below, I ran the simulation for 100,000 trials, which took about 100s per case. I am not sure exactly what the expected error of these simulations is, but I would guess it is about 1/Sqrt(100000) ~= 0.00316.

Chance of winning the tournament for each StDev of the true-rating distribution (StDev 0 means true-rating = rating):

Seed  Player       StDev 0  StDev 50  StDev 100  StDev 200
1     Fritzlein    0.65264  0.61176   0.52828    0.37288
2     99of9        0.16353  0.17013   0.17758    0.17004
3     Ryan_Cable   0.08959  0.09942   0.11514    0.12907
4     PMertens     0.03478  0.03888   0.05532    0.08139
5     Adanac       0.02731  0.03592   0.05059    0.07862
6     robinson     0.01959  0.02624   0.03991    0.06899
7     Belbo        0.01170  0.01601   0.02780    0.05777
8     omar         0.00049  0.00099   0.00286    0.01611
9     Paul         0.00021  0.00044   0.00148    0.01110
10    BlackKnight  0.00015  0.00021   0.00097    0.00949
11    naveed       0.00001  0         0.00003    0.00205
12    megamau      0        0         0.00002    0.00127
13    grey_0x2A    0        0         0          0.00070
14    jdb          0        0         0.00002    0.00045
15    acheron      0        0         0          0.00006
16    MrBrain      0        0         0          0.00001

I also tested the effect of lowering my rating to a more realistic 1800, but keeping the ranking the same. (I think StDev 50 is probably closest to the actual odds.)

Seed  Player       StDev 0  StDev 50  StDev 100  StDev 200
1     Fritzlein    0.68330  0.64724   0.56945    0.40832
2     99of9        0.19008  0.19530   0.20130    0.19373
3     Ryan_Cable   0.00063  0.00125   0.00285    0.01350
4     PMertens     0.04343  0.05255   0.07013    0.09553
5     Adanac       0.03471  0.04239   0.05938    0.08964
6     robinson     0.02959  0.03649   0.05246    0.08267
7     Belbo        0.01695  0.02210   0.03601    0.06632
8     omar         0.00077  0.00137   0.00431    0.01866
9     Paul         0.00036  0.00085   0.00242    0.01383
10    BlackKnight  0.00018  0.00042   0.00138    0.01139
11    naveed       0        0.00002   0.00016    0.00257
12    megamau      0        0.00001   0.00010    0.00180
13    grey_0x2A    0        0         0.00001    0.00111
14    jdb          0        0.00001   0.00004    0.00087
15    acheron      0        0         0          0.00003
16    MrBrain      0        0         0          0.00003

Then, I tested the effect of lowering my rating to 1800, and changing the rankings to match.

Seed  Player       StDev 0  StDev 50  StDev 100  StDev 200
1     Fritzlein    0.68136  0.64623   0.56895    0.40815
2     99of9        0.19066  0.19603   0.20362    0.19326
3     PMertens     0.04616  0.05494   0.07017    0.09713
4     Adanac       0.03802  0.04499   0.06153    0.08911
5     robinson     0.02455  0.03186   0.04906    0.08172
6     Belbo        0.01785  0.02322   0.03713    0.06930
7     omar         0.00072  0.00130   0.00402    0.01908
8     Paul         0.00033  0.00073   0.00236    0.01497
9     BlackKnight  0.00020  0.00045   0.00184    0.01230
10    Ryan_Cable   0.00014  0.00021   0.00116    0.00954
11    naveed       0.00001  0.00004   0.00007    0.00226
12    megamau      0        0         0.00007    0.00152
13    grey_0x2A    0        0         0          0.00089
14    jdb          0        0         0.00002    0.00067
15    acheron      0        0         0          0.00009
16    MrBrain      0        0         0          0.00001

Multiplicatively the effect of my being overrated is as follows: I, robinson, omar, and Paul are notably improved. Adanac, PMertens, and Belbo are notably harmed. Others mostly vanish into the noise, though I expect jdb is substantially improved. The absolute sum of the additive effect of my being overrated is about 0.015 for all cases. Each entry below is the chance with the ranking kept the same divided by the chance with the rankings changed to match; "?" marks a division by zero.

Seed  Player       StDev 0  StDev 50  StDev 100  StDev 200
1     Fritzlein    1.00284  1.00156   1.00087    1.00041
2     99of9        0.99695  0.99627   0.98860    1.00243
3     PMertens     0.94085  0.95649   0.99942    0.98352
4     Adanac       0.91294  0.94220   0.96505    1.00594
5     robinson     1.20529  1.14532   1.06930    1.01162
6     Belbo        0.94957  0.95176   0.96983    0.95699
7     omar         1.06944  1.05384   1.07213    0.97798
8     Paul         1.09090  1.16438   1.02542    0.92384
9     BlackKnight  0.9      0.93333   0.75       0.92601
10    Ryan_Cable   4.5      5.95238   2.45689    1.41509
11    naveed       0        0.5       2.28571    1.13716
12    megamau      ?        ?         1.42857    1.18421
13    grey_0x2A    ?        ?         ?          1.24719
14    jdb          ?        ?         2          1.29850
15    acheron      ?        ?         ?          0.33333
16    MrBrain      ?        ?         ?          3

If anyone is interested in a copy of my script I would be happy to provide it, but you can probably do the same things with Omar’s scripts. I will try to post new odds after each round.
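A note on the error estimate: for a probability p estimated from N independent trials, the binomial standard error is sqrt(p(1-p)/N), which is never larger than 0.5/sqrt(N), so the 1/Sqrt(100000) guess is on the safe side by about a factor of two. The sketch below is not the script used for the numbers in this post. It only illustrates the two ingredients described (a Gaussian draw about the listed rating, then repeated trials) on a single game instead of the full tournament. The Elo-style win formula and every function name here are assumptions made for illustration.

```python
import math
import random


def win_probability(rating_a, rating_b):
    """Elo-style chance that A beats B (assumed model, 400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def simulate_game(rating_a, rating_b, stdev, trials, seed=0):
    """Estimate A's chance of beating B when each true rating is drawn
    from a Gaussian centred on the listed rating.

    Returns (estimate, standard_error).
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        true_a = rng.gauss(rating_a, stdev)
        true_b = rng.gauss(rating_b, stdev)
        if rng.random() < win_probability(true_a, true_b):
            wins += 1
    estimate = wins / trials
    standard_error = math.sqrt(estimate * (1.0 - estimate) / trials)
    return estimate, standard_error


def worst_case_standard_error(trials):
    """Upper bound on the standard error, reached at p = 0.5."""
    return 0.5 / math.sqrt(trials)


if __name__ == "__main__":
    est, se = simulate_game(2273, 2142, stdev=50, trials=100_000)
    print(f"estimate {est:.4f} +/- {se:.4f}")
    print(f"worst case: {worst_case_standard_error(100_000):.5f}")
```

For entries near zero the standard error is far smaller in absolute terms, but large relative to the entry itself, which is why the bottom rows of the tables jump around from case to case.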
|
|
|
|
|
PMertens
Forum Guru
Arimaa player #692
Posts: 437
|
|
Re: 2006 World Championship
« Reply #2 on: Nov 14th, 2005, 3:13am » |
|
I really do not feel harmed by your rating :-D
|
|
|
|
|
Ryan_Cable
Forum Guru
Arimaa player #951
Posts: 138
|
|
Re: 2006 World Championship
« Reply #3 on: Nov 14th, 2005, 3:53am » |
|
That is good to hear. I do think we should improve the rating system before the next WC. There was a big discussion of this a while back, but nothing was decided.
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: 2006 World Championship
« Reply #4 on: Nov 14th, 2005, 8:08am » |
|
I think Omar is still committed to improving the rating system, but he got distracted discussing tournament formats. I'm glad he did, because (IMHO) floating double elimination with inaccurate seeding is much better than single elimination with accurate seeding. The above numbers substantiate this somewhat. Looking forward, I think improved ratings are still important, but a greater priority should be server stability and capacity. And of course even before that it is important that the computer championship, challenge match, and postal championship run smoothly again this year.
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: 2006 World Championship
« Reply #5 on: Nov 14th, 2005, 8:29am » |
|
With as many strong players as there are in this tourney, it is really hard for me to believe that I have a 60% or better shot of winning the whole thing. When the possibility of error is introduced, I feel obliged to point out that my true strength is much more likely to be 2170 than 2370. Is this just my psychological hangup that I have trouble believing in myself, or do other people also feel that the World Championship is more open than the ratings-based simulation suggests?
|
|
|
|
|
Adanac
Forum Guru
Arimaa player #892
Posts: 635
|
|
Re: 2006 World Championship
« Reply #6 on: Nov 14th, 2005, 10:18am » |
|
on Nov 14th, 2005, 8:29am, Fritzlein wrote:With as many strong players as there are in this tourney, it is really hard for me to believe that I have a 60% or better shot of winning the whole thing. When the possibility of error is introduced, I feel obliged to point out that my true strength is much more likely to be 2170 than 2370. Is this just my psychological hangup that I have trouble believing in myself, or do other people also feel that the World Championship is more open than the ratings-based simulation suggests? |
| Overconfidence has been the downfall of many favourites in sports, elections, etc., so your "psychological hangup" may be more of an asset than anything. As long as you're not reading the newspaper during your games or anything, I'd give you at least a 50% shot at 1st place.
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: 2006 World Championship
« Reply #7 on: Nov 14th, 2005, 3:05pm » |
|
on Nov 14th, 2005, 2:50am, Ryan_Cable wrote:Multiplicatively the effect of my being overrated is as follows: I, robinson, omar, and Paul are notably improved. Adanac, PMertens, and Belbo are notably harmed. Others mostly vanish into the noise, though I expect jdb is substantially improved. The absolute sum of the additive effect of my being overrated is about 0.015 for all cases. |
| When you calculate ratios, it may seem like the system has a major flaw: you have doubled your odds of winning or more than doubled them by pumping up your rating to get a better seed. However, if the effect in absolute terms is to raise your chances of becoming World Champion from 0.1% to 0.2%, I don't think anyone will begrudge you the extra one-in-a-thousand chance. As for other people being helped or hurt by an inaccuracy in your rating, I think there are just too many inaccuracies in everyone's ratings to make that a useful calculation. Maybe Omar is slightly harmed by your inflated rating, but would have been slightly helped by your inflated rating if Naveed's rating hadn't been deflated, except if a person rated 1477 entered the tournament at the last minute to create a first-round bye... It makes me think of a football game which ends up 33-31, and everyone talks only about the missed field goal in the final seconds. In such a close game you could change any one of fifty different plays and reverse the overall outcome. It's a bit silly to focus on only one variable in a huge, unstable system which can't be accurately measured or predicted in the first place.
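The instability of ratios between tiny simulated probabilities can be made concrete. The sketch below puts an approximate 95% interval on such a ratio, using the standard log-ratio (relative risk) approximation for two binomial estimates. It assumes the two simulation runs were independent and used the same number of trials; the function name is hypothetical.

```python
import math


def ratio_with_interval(p_num, p_den, trials, z=1.96):
    """Ratio of two simulated probabilities with an approximate interval.

    Uses the log-ratio approximation for two independent binomial
    estimates; it needs both win counts to be non-zero.
    """
    wins_num = p_num * trials
    wins_den = p_den * trials
    if wins_num <= 0 or wins_den <= 0:
        raise ValueError("ratio is undefined when either count is zero")
    ratio = p_num / p_den
    se_log = math.sqrt(1.0 / wins_num - 1.0 / trials
                       + 1.0 / wins_den - 1.0 / trials)
    low = ratio * math.exp(-z * se_log)
    high = ratio * math.exp(z * se_log)
    return ratio, low, high


if __name__ == "__main__":
    # Ryan_Cable at StDev 0: 0.00063 (seeded 3rd) vs 0.00014 (seeded 10th).
    ratio, low, high = ratio_with_interval(0.00063, 0.00014, 100_000)
    print(f"ratio {ratio:.2f}, roughly {low:.1f} to {high:.1f}")
```

With only 63 and 14 tournament wins behind the two estimates, the ratio of 4.5 is compatible with anything from about 2.5 to about 8, while the absolute gain stays near 0.0005.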
|
« Last Edit: Nov 15th, 2005, 9:07am by Fritzlein » |
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: 2006 World Championship
« Reply #8 on: Nov 14th, 2005, 3:13pm » |
|
on Nov 14th, 2005, 10:18am, Adanac wrote: Overconfidence has been the downfall of many favourites in sports, elections, etc. so your "psychological hangup" may be more of an asset than anything. |
| That's a nice way to put it. Thank you. Usually sports commentators say things like, "He's playing with confidence now," as if a lack of confidence were responsible for all of some athlete's mistakes up to that point.
|
|
|
|
|
Ryan_Cable
Forum Guru
Arimaa player #951
Posts: 138
|
|
Re: 2006 World Championship
« Reply #9 on: Nov 14th, 2005, 8:33pm » |
|
on Nov 14th, 2005, 8:08am, Fritzlein wrote:I'm glad he did, because (IMHO) floating double elimination with inaccurate seeding is much better than single elimination with accurate seeding. The above numbers substantiate this somewhat. |
| That is definitely true: in a single elimination tournament the chance of you or 99of9 winning drops ~7% (additively). Going to floating triple elimination would add another 5% to 7% to your combined chances, potentially leaving as little as 4% for the field.
on Nov 14th, 2005, 8:29am, Fritzlein wrote: With as many strong players as there are in this tourney, it is really hard for me to believe that I have a 60% or better shot of winning the whole thing. When the possibility of error is introduced, I feel obliged to point out that my true strength is much more likely to be 2170 than 2370. Is this just my psychological hangup that I have trouble believing in myself, or do other people also feel that the World Championship is more open than the ratings-based simulation suggests? |
| The thing to remember is that this format was specifically selected to maximize the probability of picking the player with the highest true-rating. I am 99%+ confident that person is Fritzlein or 99of9 and 75% confident it is Fritzlein. I had planned to do simulations with more rating adjustments after the first round, but I will go ahead and do it now.

With Fritzlein dropped 50 points and myself at 1800:

Seed  Player       StDev 0  StDev 50  StDev 100
1     Fritzlein    0.57179  0.54342   0.47359
2     99of9        0.24952  0.24756   0.24228
3     Ryan_Cable   0.00108  0.00149   0.00355
4     PMertens     0.06147  0.06912   0.08405
5     Adanac       0.04780  0.05657   0.07460
6     robinson     0.04197  0.04755   0.06607
7     Belbo        0.02432  0.03045   0.04462
8     omar         0.00122  0.00216   0.00572
9     Paul         0.00049  0.00104   0.00320
10    BlackKnight  0.00032  0.00058   0.00211
11    naveed       0.00001  0.00004   0.00012
12    megamau      0        0.00002   0.00006
13    grey_0x2A    0        0         0.00001
14    jdb          0.00001  0         0.00002
15    acheron      0        0         0
16    MrBrain      0        0         0

With Fritzlein dropped 100 points and myself at 1800:

Seed  Player       StDev 0  StDev 50  StDev 100
1     Fritzlein    0.45215  0.42771   0.37609
2     99of9        0.31091  0.30430   0.28095
3     Ryan_Cable   0.00142  0.00203   0.00461
4     PMertens     0.07898  0.08492   0.10079
5     Adanac       0.06333  0.07194   0.08941
6     robinson     0.05820  0.06280   0.07849
7     Belbo        0.03227  0.04085   0.05569
8     omar         0.00175  0.00304   0.00723
9     Paul         0.00059  0.00160   0.00393
10    BlackKnight  0.00038  0.00077   0.00251
11    naveed       0.00001  0.00002   0.00013
12    megamau      0.00001  0.00002   0.00010
13    grey_0x2A    0        0         0.00004
14    jdb          0        0         0.00003
15    acheron      0        0         0
16    MrBrain      0        0         0

Now let’s look at Fritzlein’s record and see how ridiculous it is to think he is 100 points overrated:

Name         Record  Win Rate  Predicted Win Rate
99of9        6-9     0.40      0.680
Ryan_Cable   1-0     1         0.938 (at 1800)
PMertens     17-5    0.7727    0.788
Adanac       11-1    0.9167    0.794
robinson     19-3    0.8636    0.806
Belbo        8-1     0.8889    0.820
omar         8-0     1         0.915

Except for 99of9, Fritzlein performs above expectations against everyone but PMertens, where he performs as expected (and the record against PMertens is somewhat distorted by browser crash timeouts on both sides).

The record between 99of9 and Fritzlein is interesting but tragically small. If the games before 2005 are removed (in 2004, 99of9 had the top rating and Fritzlein just joined in August), the record is 6-5, 0.5455 in favor of 99of9. However, if we count only interactive games of 30s+ per move in 2005, the record is 5-1 in favor of 99of9, 0.833. Clearly, if Fritzlein has an Achilles’ heel, it is 99of9. If the World Championship were going to be decided by a match (say best of 9), 99of9 might even be a slight favorite.

Now let’s look at 99of9’s record:

Name         Record  Win Rate  Predicted Win Rate
Fritzlein    9-6     0.60      0.320
Ryan_Cable   3-0     1         0.877 (at 1800)
PMertens     14-1    0.9333    0.636
Adanac       1-4     0.20      0.644
robinson     11-6    0.6471    0.661
Belbo        28-7    0.80      0.683
omar         9-4     0.6923    0.835

(Corrected: I have put 99of9 25 points less underrated, see 99of9's post below)

The games against Adanac were four blitz games and one postal game. Otherwise, 99of9 outperforms against everyone except robinson, where he performs as expected, and the underrated omar, where he performs below expectations. I think 99of9 is probably 50 to 75 points underrated relative to the population under study. If 50 to 75 points were added to 99of9’s rating, I think Fritzlein would be more likely to be underrated than overrated relative to the population under study.

With Fritzlein unchanged and myself at 1800, I raised 99of9 50 points. And just to make things as accurate as possible, I raised omar 100 points as well:

Seed  Player       StDev 0  StDev 50  StDev 100
1     Fritzlein    0.61057  0.58171   0.51385
2     99of9        0.27491  0.27857   0.27358
3     Ryan_Cable   0.00059  0.00075   0.00247
4     PMertens     0.03841  0.04451   0.06003
5     Adanac       0.03085  0.03637   0.05275
6     robinson     0.02558  0.03208   0.04632
7     Belbo        0.01395  0.01831   0.03245
8     omar         0.00479  0.00686   0.01555
9     Paul         0.00024  0.00053   0.00172
10    BlackKnight  0.00011  0.00028   0.00114
11    naveed       0        0.00003   0.00006
12    megamau      0        0         0.00001
13    grey_0x2A    0        0         0.00004
14    jdb          0        0         0.00003
15    acheron      0        0         0
16    MrBrain      0        0         0

With Fritzlein unchanged, myself at 1800, and omar raised 100, I raised 99of9 75 points:

Seed  Player       StDev 0  StDev 50  StDev 100
1     Fritzlein    0.57088  0.54817   0.49273
2     99of9        0.32846  0.32487   0.31409
3     Ryan_Cable   0.00046  0.00087   0.00212
4     PMertens     0.03416  0.04079   0.05623
5     Adanac       0.02654  0.03337   0.04735
6     robinson     0.02273  0.02801   0.04284
7     Belbo        0.01222  0.01666   0.02836
8     omar         0.00420  0.00668   0.01362
9     Paul         0.00024  0.00035   0.00147
10    BlackKnight  0.00010  0.00023   0.00108
11    naveed       0        0         0.00005
12    megamau      0.00001  0         0.00003
13    grey_0x2A    0        0         0.00001
14    jdb          0        0         0.00002
15    acheron      0        0         0
16    MrBrain      0        0         0

With Fritzlein unchanged, myself at 1800, and omar raised 100, I raised 99of9 100 points:

Seed  Player       StDev 0  StDev 50  StDev 100
1     Fritzlein    0.52674  0.51155   0.46331
2     99of9        0.38388  0.37558   0.35655
3     Ryan_Cable   0.00034  0.00070   0.00172
4     PMertens     0.03079  0.03623   0.05241
5     Adanac       0.02366  0.02966   0.04380
6     robinson     0.02025  0.02538   0.03999
7     Belbo        0.01038  0.01512   0.02748
8     omar         0.00376  0.00524   0.01233
9     Paul         0.00012  0.00032   0.00128
10    BlackKnight  0.00007  0.00019   0.00103
11    naveed       0.00001  0.00003   0.00004
12    megamau      0        0         0.00004
13    grey_0x2A    0        0         0.00001
14    jdb          0        0         0.00001
15    acheron      0        0         0
16    MrBrain      0        0         0

After some reflection, I think most of the statistical information about this tournament can be summed up in a few observations:

For either Fritzlein or 99of9 to win the tournament, one almost certainly must beat the other at least once. To win, Fritzlein will probably have to beat 99of9 twice. To win, 99of9 is notably more likely to have to beat Fritzlein twice.

For anyone else to win, that person will almost certainly need a total of at least 2 wins against Fritzlein and 99of9, and will probably need a total of at least 3 wins against them. I believe there is actually a significant chance that this person would have to beat Fritzlein and 99of9 twice each, a truly amazing feat!

If either Fritzlein or 99of9 is undefeated after round 4, he will have a ~70% chance of winning the tournament (needing to go 2 out of 3 against equal or weaker opponents).

If someone else is undefeated after round 4 (and Fritzlein and 99of9 are still in the tournament), he will have a ~30% chance of winning the tournament (needing to go 2 out of 3 against opponents that are probably stronger).

(Slightly changed, see 99of9's post below and the correction above) Based on this, I think the odds for the tournament are roughly Fritzlein: 57%, 99of9: 37%, field: 6%. Note that all of my simulations give the field ~10% or better; I humbly disagree. Alone, Fritzlein and 99of9 are beatable, but together, in a floating double elimination tournament, they cover the road to the Championship with nearly impenetrable interlocking fires. Arguably the greatest advantage Fritzlein and 99of9 have is that they only have to face one titan, while everyone else has to face two.
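The "Predicted Win Rate" columns above agree to three decimals with the standard Elo expected-score formula on a 400-point scale, applied to the ratings listed at the start of the thread. That the server's predictor is exactly this formula is an inference from the numbers, not something stated in the post, and the function name below is made up for the sketch.

```python
def expected_score(rating, opponent_rating):
    """Standard Elo expected score for `rating` against `opponent_rating`."""
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / 400.0))


# Ratings from the seeding list, with Ryan_Cable taken at 1800 as in the post.
RATINGS = {
    "Fritzlein": 2273,
    "99of9": 2142,
    "Ryan_Cable": 1800,
    "PMertens": 2045,
    "Adanac": 2039,
    "robinson": 2026,
    "Belbo": 2009,
    "omar": 1860,
}


def predicted_table(player):
    """Predicted win rate of `player` against every other listed player."""
    own = RATINGS[player]
    return {
        name: expected_score(own, rating)
        for name, rating in RATINGS.items()
        if name != player
    }


if __name__ == "__main__":
    for name, value in predicted_table("Fritzlein").items():
        print(f"{name:<12} {value:.3f}")
```

Comparing a record such as 6-9 (0.40) with the matching prediction (0.680) is then a direct check of how far a rating is from observed results, subject to the small sample sizes noted above.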
|
« Last Edit: Nov 24th, 2005, 9:30pm by Ryan_Cable » |
|
|
|
99of9
Forum Guru
Gnobby's creator (player #314)
Posts: 1413
|
|
Re: 2006 World Championship
« Reply #10 on: Nov 14th, 2005, 9:48pm » |
|
on Nov 14th, 2005, 8:33pm, Ryan_Cable wrote: The thing to remember is that this format was specifically selected to maximize the probability of picking the player with the highest true-rating. I am 99%+ confident that person is Fritzlein or 99of9 and 75% confident it is Fritzlein. |
| But of course you mean "true rating in 90-120 sec games". I am significantly weaker at postal than at 45s. Most of the interactive human games played on the server are 45s-60s, smack bang in the middle of my competency, and my psychological tricks sometimes work. So I think that in a 90/120 tourney I am probably not underrated. On the contrary I think Fritz's postal record demonstrates that he is an Arimaa player like no other. But in a tourney, you never know what might happen - for all we know, someone might have prepared a never-before-imagined opening that leaves us all for dead.
Quote: Now let’s look at Fritzlein’s record and see how ridiculous it is to think he is 100 points overrated:

Name         Record  Win Rate  Predicted Win Rate
99of9        6-9     0.40      0.680
Ryan_Cable   1-0     1         0.938 (at 1800)
PMertens     17-5    0.7727    0.788
Adanac       11-1    0.9167    0.794
robinson     19-3    0.8636    0.806
Belbo        8-1     0.8889    0.820
omar         8-0     1         0.915

Except for 99of9, Fritzlein performs above expectations against everyone but PMertens, where he performs as expected (and the record against PMertens is somewhat distorted by browser crash timeouts on both sides). |
| I agree with you, it's quite possible that he is underrated. We will not truly know that until someone comes along that is rated similarly to him. It's very hard to rise when you are already way off the scale, because something as little as a browser crash can set you back 30 points.
Quote: The record between 99of9 and Fritzlein is interesting but tragically small. |
| That is my fault. As you can see from his table Fritz has a much higher work-rate than me. In my table, the quantity of games vs Fritz looks perfectly normal!
Quote: If the World Championship were going to be decided by a match (say best of 9), 99of9 might even be a slight favorite. |
| Perhaps I should start a rebel "Classical World Arimaa Champion" system like in chess. I think there's a little truth in what you say about me being his possible Achilles heel. I'm lucky that despite us all cajoling him to play, one of my Achilles heels (blue22) is not in the WC (One other is Adanac as you rightly point out, and one other is Omar - see below).
Quote: Now let’s look at 99of9’s record:

Name         Record  Win Rate  Predicted Win Rate
Fritzlein    9-6     0.60      0.320
Ryan_Cable   3-0     1         0.877 (at 1800)
PMertens     14-1    0.9333    0.636
Adanac       1-4     0.20      0.644
robinson     11-6    0.6471    0.661
Belbo        28-7    0.80      0.683
omar         9-4     0.692     0.835
|
| I've corrected the Belbo and omar entries - I think you cut and pasted from Fritz's record for them. One of the games against PMertens was the Lose-Arimaa variant. I think I may have been his Achilles heel so far. Many of the games against Belbo were when he was learning. As you can see I underperform against Omar, but that's because he is underrated in your expectation calculation.
Quote: I think 99of9 is probably 75 to 100 points underrated relative to the population under study. |
| I doubt it, especially at this longer time control. Remember last year I was knocked out in round 1 by naveed.
Quote: To win, 99of9 is notably more likely to have to beat Fritzlein twice. |
| I have great hopes for Omar and PMertens on Fritz's side of the draw.
Quote: Based on this, I think the odds for the tournament are roughly Fritzlein: 55%, 99of9: 40%, field: 5%. Note that all of my simulations give the field ~10% or better; I humbly disagree. |
| I think unexpected surprises are even more likely around tourney time, so I'd give the "field" about 20%, Fritz 50%, and me 30%.
Quote: they cover the road to the Championship with nearly impenetrable interlocking fires |
| I just wanted to quote this
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: 2006 World Championship
« Reply #11 on: Nov 14th, 2005, 11:27pm » |
|
on Nov 14th, 2005, 9:48pm, 99of9 wrote:I'd give the "field" about 20%, Fritz 50%, and me 30%. |
| I guess if I had to set down odds where I would take any side of the bet, this is what I would say too. If Ryan would lay me 19 to 1 odds against the fourteen, I'd take the field any day. Of course then I would try to make sure I lost the bet...
|
|
|
|
|
Ryan_Cable
Forum Guru
Arimaa player #951
Posts: 138
|
|
Re: 2006 World Championship
« Reply #12 on: Nov 15th, 2005, 2:17am » |
|
on Nov 14th, 2005, 9:48pm, 99of9 wrote:I've corrected the Belbo and omar entries - I think you cut and pasted from Fritz's record for them. |
| Thanks, I made the corrections you noted in my post above. Cutting and pasting from three different windows, while running simulations, with 10+ windows open is a bit tricky. The corrections slightly changed my guesstimate of the odds to roughly Fritzlein: 57%, 99of9: 37%, field: 6%.
on Nov 14th, 2005, 9:48pm, 99of9 wrote: But in a tourney, you never know what might happen |
| In a tournament this big, there will definitely be upsets. However, due to the design of the FDE, there need to be several upsets for a non-titan to win. This seems to be particularly true for the 16 player FDE. Look at Fritzlein’s description of the tournament endgame in the World Championship format for 2006 thread: http://arimaa.com/arimaa/forum/cgi/YaBB.cgi?board=talk;action=display;num=1124140602;start=15

I believe that Fritzlein and 99of9, as seeds 1 and 2, can potentially play each other at only two specific times: The first opportunity is in round 4 in a critical battle to be the last remaining undefeated player. The only other time they can meet is at the end of the tournament playing for the win. If the round 4 game does come to pass, the tournament will almost certainly be won by a titan, because it implies that only 3 non-titans will survive round 4, and further implies that at most 2, and probably only 1 non-titan will survive round 5. However, if the round 4 game does not come to pass, the titans will never meet in a tournament that is won by a non-titan, which then implies that the titans must each lose 2 games to non-titans.

Can the titans suffer 4 collective upsets? Of course, but it is quite unlikely. The titans can collectively play at most 13 games and still lose the tournament. The first round games are against opponents rated ~800 points below them, and it will cost me 300 prediction points if either of them loses those games. If a titan loses in round 2, the highest rated opponent he is likely to play in round 3 is rated in the 1600s. Thus it is very unlikely that an individual titan will lose in round 2 and in round 3 and be eliminated. Thus, in the first 3 rounds the titans will almost certainly not suffer more than 2 collective defeats. More likely is that at least one titan will remain undefeated and be in the critical game for the final undefeated spot. After round 3 there are at most 7 games left in which to knock out the titans.

Collectively, the titans have three advantages: First, they are significantly stronger than non-titans. Fritzlein defeats non-titans at 3/4 or better. 99of9 defeats most non-titans at 2/3 or better. Second, they almost always get to play the weakest players with their tournament record. Third, they only have to play each other in situations where they are very well rewarded for winning. These advantages can be overcome, but not 20% of the time.
on Nov 14th, 2005, 9:48pm, 99of9 wrote: I have great hopes for Omar and PMertens on Fritz's side of the draw. |
| The main reason I wanted to save this analysis for after round 1 is that the number of possible future pairings becomes significantly smaller. If any of the round 1 games are upsets, Fritzlein does not play omar in round 2. If exactly one of seeds 3 through 7 is the only upset, 99of9 plays omar in round 2 and Fritzlein plays the winner of the upset. The only way Fritzlein plays PMertens in round 3 is if I survive to round 3 and that has a <20% chance of happening. Otherwise, it is 99of9 who should expect to play the winner of the evenly matched PMertens - Adanac game. Fritzlein most likely will play robinson in round 3. The floating nature of this tournament makes the winner more predictable, and most everything else less predictable.
on Nov 14th, 2005, 11:27pm, Fritzlein wrote: I guess if I had to set down odds where I would take any side of the bet, this is what I would say too. If Ryan would lay me 19 to 1 odds against the fourteen, I'd take the field any day. Of course then I would try to make sure I lost the bet... |
| Well, I think the 9 to 1 odds given by the simulations would make a fair bet, but I don’t gamble.
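For reference, the betting odds mentioned here map onto the probabilities discussed earlier in the thread: 19 to 1 against corresponds to a 5% chance for the field, and 9 to 1 against corresponds to the roughly 10% the simulations give it. A small sketch of the conversion (the function names are illustrative, and "fair odds" ignores any bookmaker margin):

```python
def odds_against(probability):
    """Fair odds against an event, e.g. 0.05 -> 19.0 (19 to 1 against)."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    return (1.0 - probability) / probability


def probability_from_odds_against(odds):
    """Inverse conversion, e.g. 9.0 (9 to 1 against) -> 0.10."""
    if odds < 0.0:
        raise ValueError("odds must be non-negative")
    return 1.0 / (1.0 + odds)


if __name__ == "__main__":
    for p in (0.05, 0.06, 0.10, 0.20):
        print(f"field at {p:.0%}: {odds_against(p):.1f} to 1 against")
```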
|
|
|
|
|
99of9
Forum Guru
Gnobby's creator (player #314)
Posts: 1413
|
|
Re: 2006 World Championship
« Reply #13 on: Nov 15th, 2005, 4:32am » |
|
You've made some very good points that I hadn't considered. And I also realise that I'd misinterpreted the way the pairings would work out. I presumed the first 4 rounds were exactly like a knockout tree, but you're right that the sliding means that upsets make the pairings less predictable.
|
|
|
|
|
PMertens
Forum Guru
Arimaa player #692
Posts: 437
|
|
Re: 2006 World Championship
« Reply #14 on: Nov 15th, 2005, 5:03am » |
|
by the way: I am officially impressed by that analysis ... but I still want to win
|
|
|
|
|
|