Topic: 2010 World Championship Rules (Read 9602 times) |
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
2010 World Championship Rules
« on: Oct 16th, 2009, 7:08pm » |
I would like to revive a minor rule change proposal that I made during the 2009 tournament. The ranking method that is used both to seed the finals and to pair the tournament is number of wins, with a tiebreaker of number of opponents' wins. In the past, top players have suffered a bit of added randomness based on their first-round pairing. If one player has the misfortune of being paired with an opponent who drops out of the tournament in the first round, he will get no tiebreaker points, whereas another player who is paired against a committed opponent might get two tiebreaker points by tournament end. The difference between zero and two tiebreaker points is not commensurate with the difference between having a 100% chance of winning and a 99% chance of winning.

The problem of dropouts and forfeits is not something we can cure with rules, but I believe that a fixable feature of the tiebreak points is exacerbating the problem. The root of the evil is that differences in strength in close matches are much more important than differences in strength in mismatches. If I have won four of five matches, then it matters a great deal whether my opponents won three, four, or five games, whereas it matters hardly at all whether they won zero, one, or two games.

To put it in terms of ratings, let's say my rating is 2000. The difference between playing an opponent rated 2200, 2000, or 1800 is a winning chance of 24%, 50%, or 76%. That's a huge span. But the difference between playing an opponent rated 1600, 1400, or 1200 is a winning chance of 91%, 97%, or 99%. In the former case a swing of 52 percentage points in winning chance results in a difference of two tiebreak points, and in the latter case a swing of 8 percentage points results in a difference of two tiebreak points. It isn't fair.

The solution is to have a tiebreak formula that overweights opponents near your own score, where the relative strength is critical, and underweights mismatches where the outcome is a foregone conclusion.
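The rating percentages above are consistent with the standard Elo expected-score curve on a 400-point scale. That the post uses exactly this curve is an assumption, but it reproduces the quoted figures. A minimal sketch (the name win_chance is illustrative only):

```python
def win_chance(own: float, opp: float) -> float:
    """Expected score under the standard Elo logistic curve (400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** ((opp - own) / 400.0))


# Winning chance of a 2000-rated player against each opponent rating.
chances = {opp: win_chance(2000, opp) for opp in (2200, 2000, 1800, 1600, 1400, 1200)}

for opp, p in chances.items():
    print(f"vs {opp}: {p:.0%}")
```

The 200-point steps near an even match move the winning chance by about 26 percentage points each, while the same steps at the bottom of the list move it by only a few points.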
Here is such a tiebreak formula:

Let N = Number of rounds played
Let P = Number of wins by player
Let O = Number of wins by opponent
Then T = 1/(1+10^(4*(P-O)/(N+1)))

Each opponent contributes T points, and a player's tiebreak score is the sum of T over all of his opponents.

In the first two rounds this tiebreaker works identically to the method of sum of opponents' scores. There are no cases in which the ranking ends up different. After three rounds, however, the first differences appear. For example, let's take two players, A and B, who have both won all three games, and both have a sum of opponent scores equal to three. However, A's opponents have won 0, 1, and 2, whereas B's opponents have won 1, 1, and 1. Intuitively, A has had slightly tougher opponents overall, by virtue of having played a 2-0 opponent in the third round, whereas B got lucky to be dropped to a lower group to play a 1-1 opponent in the third round. Admittedly, B played a slightly stronger opponent in the first round, but that seems less important. By the above formula, the tiebreak points are

A = 0.001 + 0.010 + 0.091 = 0.102
B = 0.010 + 0.010 + 0.010 = 0.030

So the formula rewards A for his tougher schedule. Mismatches have a negligible contribution compared to close matches.

By the end of a five-round tournament, the formula is not only breaking ties that the sum-of-opponents' scores fails to break, it is even overturning some rankings between players with equal score. Let's compare two four-win players C and D, with the following opponent scores by round:

C: 2 + 2 + 3 + 4 + 5 = 16
D: 2 + 2 + 3 + 4 + 4 = 15

OK, from that it is totally obvious that C played the tougher schedule. But what if C had the misfortune that his first-round opponent forfeited and dropped out of the tournament? Then the sum of opponents' scores says that D should rank ahead of C.

C: 0 + 2 + 3 + 4 + 5 = 14
D: 2 + 2 + 3 + 4 + 4 = 15

This is terrible luck for C!
He had to play the eventual (undefeated) champion in round 5, while player D didn't have to play the top dog, and somehow this weighs less than the difference between two first-round schloobs? Preposterous! Fortunately, my proposed formula rides to the rescue, giving a tiebreak of

C: 0.002 + 0.044 + 0.177 + 0.500 + 0.823 = 1.546
D: 0.044 + 0.044 + 0.177 + 0.500 + 0.500 = 1.265

So C's tougher schedule is reflected in a better tiebreak score, and a better seed into the finals.

So what do people think? I admit that my formula is more complicated than a simple sum of scores, and I admit that it doesn't completely solve the issue of forfeits. On the other hand, it does go some distance to rectifying an issue that has popped up in each of the last two years.
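For anyone who wants to check the arithmetic, here is a minimal sketch of the proposed tiebreak in Python. The function name and argument layout are illustrative assumptions, not anything from the actual pairing program; the only thing taken from the post is the formula itself, summed over a player's opponents.

```python
from typing import Sequence


def tiebreak(own_wins: int, opp_wins: Sequence[int], rounds: int) -> float:
    """Sum of T = 1 / (1 + 10^(4*(P-O)/(N+1))) over all opponents.

    own_wins is P, each entry of opp_wins is an opponent's O, rounds is N.
    """
    return sum(
        1.0 / (1.0 + 10.0 ** (4.0 * (own_wins - o) / (rounds + 1)))
        for o in opp_wins
    )


# Three-round example: A and B are both 3-0.
a = tiebreak(3, [0, 1, 2], rounds=3)
b = tiebreak(3, [1, 1, 1], rounds=3)

# Five-round example: C and D both have four wins; C's first opponent dropped out.
c = tiebreak(4, [0, 2, 3, 4, 5], rounds=5)
d = tiebreak(4, [2, 2, 3, 4, 4], rounds=5)

print(f"A={a:.3f} B={b:.3f} C={c:.3f} D={d:.3f}")
```

The unrounded sums can differ from the figures in the post in the last digit, since the post adds up terms that were already rounded to three places. The orderings (A ahead of B, C ahead of D) are the same either way.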
|
|
|
Adanac
Forum Guru
Arimaa player #892
Gender:
Posts: 635
|
|
Re: 2010 World Championship Rules
« Reply #1 on: Oct 17th, 2009, 5:39am » |
on Oct 16th, 2009, 7:08pm, Fritzlein wrote: So what do people think? I admit that my formula is more complicated than a simple sum of scores, and I admit that it doesn't completely solve the issue of forfeits. On the other hand, it does go some distance to rectifying an issue that has popped up in each of the last two years.

I'd be happy with minimizing the forfeit-opponent problem, and it's not really that complicated a change to the Strength of Schedule formula. I like the way it distinguishes opponents that are close to you in the standings more than opponents at either extreme. Most players on this website are good-natured, so I don't expect we'd get any complaining if one player missed the Top 8 because of a .001 difference in a formula they don't understand.

Would it make a big difference for players in the middle of the pack? I remember that last year we had 9 players finish with records of 3-2 or better, but the players ranked between #9 and #11 had very different SoS paths to their 3 victories, creating a bit of unfairness. With more players competing in 2010 (no doubt about it) and more rounds in the Open Classic, these things will hopefully iron out. We also had an incredibly low number of upsets last year, causing the same players to move up or move down whenever there was an odd number of players in their win category.
|
« Last Edit: Oct 17th, 2009, 11:03am by Adanac » |
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: 2010 World Championship Rules
« Reply #2 on: Oct 17th, 2009, 3:50pm » |
on Oct 17th, 2009, 5:39am, Adanac wrote: Would it make a big difference for players in the middle of the pack? I remember that last year we had 9 players finish with records of 3-2 or better but the players ranked between #9 and #11 had very different SoS paths to their 3 victories, creating a bit of unfairness.

Yes, it would have made a difference last year. The three players who were automatically in on score were chessandgo with five wins, and myself and arimaa_master with four wins. That left six players with three wins competing for the last five places. The actual tiebreaker was:

4. 99of9: 15 = 1 + 2 + 3 + 4 + 5
5. Adanac: 13 = 0 + 2 + 2 + 4 + 5
6. The_Jeh: 13 = 0 + 2 + 2 + 4 + 5
7. Tuks: 13 = 1 + 2 + 2 + 3 + 5
8. camelback: 12 = 0 + 2 + 3 + 3 + 4
9. woh: 11 = 1 + 2 + 2 + 2 + 4

The unfairness is that Tuks got a boost from having a first-round opponent that got him a point, while Adanac, The_Jeh, and camelback all suffered from getting paired with a first-round opponent who dropped out.

Comparing Adanac and The_Jeh to Tuks, canceling out equal scores, the former two had 0-win and 4-win opponents, whereas Tuks had 1-win and 3-win opponents. For a 3-win player, the difference between a 0-win and a 1-win opponent is very small compared to the difference between a 3-win opponent and a 4-win opponent. So intuitively Adanac and The_Jeh each had a clearly tougher schedule.

Between camelback and Tuks there is more of a question. Stripping out equal scores, camelback's opponents were 0-win, 3-win, and 4-win, whereas Tuks's opponents were 1-win, 2-win, and 5-win. One way to look at it is that camelback had to play one worse, one equal, and one better opponent, whereas Tuks had to play two worse and one better opponent. True, Tuks's worse opponent wasn't as worse and his better opponent was more better. But in keeping with the principle that differences far away are less important than differences nearby, camelback has a good claim to a stronger schedule.

Plugging into the formula T = 1/(1+10^(4*(P-O)/(N+1))) with P = 3 and N = 5 gives the following points for opponents with each number of wins:

5 -> 0.956
4 -> 0.823
3 -> 0.5
2 -> 0.177
1 -> 0.044
0 -> 0.010

So the SoS converts to

4. 99of9: 2.500
5. Adanac: 2.143
6. The_Jeh: 2.143
7. camelback: 2.010
8. Tuks: 1.854
9. woh: 1.364

In both systems, the most important thing didn't change, namely that woh didn't qualify for the finals. This makes sense any way you slice it, because woh had four weaker opponents, and nobody else had more than three weaker opponents. Next year, though, having a better tiebreaker could make the difference between the right/wrong person making the finals.
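The conversion above can be reproduced with a few lines of Python. This is only a sketch: the opponent win totals are copied from the sum-of-opponents list in the post, while the helper name tiebreak and the dictionary layout are assumptions made for illustration.

```python
from typing import Sequence


def tiebreak(own_wins: int, opp_wins: Sequence[int], rounds: int) -> float:
    """Sum of T = 1 / (1 + 10^(4*(P-O)/(N+1))) over all opponents."""
    return sum(
        1.0 / (1.0 + 10.0 ** (4.0 * (own_wins - o) / (rounds + 1)))
        for o in opp_wins
    )


# Final win totals of each three-win player's five opponents (2009 preliminaries).
opponents = {
    "99of9": [1, 2, 3, 4, 5],
    "Adanac": [0, 2, 2, 4, 5],
    "The_Jeh": [0, 2, 2, 4, 5],
    "Tuks": [1, 2, 2, 3, 5],
    "camelback": [0, 2, 3, 3, 4],
    "woh": [1, 2, 2, 2, 4],
}

scores = {name: tiebreak(3, wins, rounds=5) for name, wins in opponents.items()}

# Highest tiebreak first; Python's sort is stable, so exact ties keep list order.
ranking = sorted(scores, key=scores.get, reverse=True)

for place, name in enumerate(ranking, start=4):
    print(f"{place}. {name}: sum={sum(opponents[name])} formula={scores[name]:.3f}")
```

With the opponents as listed, camelback moves ahead of Tuks, as in the post. For woh's listed opponents (1, 2, 2, 2, 4) the same arithmetic gives roughly 1.40 rather than the quoted 1.364; the ordering is unaffected either way.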
|
« Last Edit: Oct 17th, 2009, 3:53pm by Fritzlein » |
|
aaaa
Forum Guru
Arimaa player #958
Posts: 768
|
|
Re: 2010 World Championship Rules
« Reply #3 on: Oct 17th, 2009, 6:13pm » |
It wouldn't surprise me if this tie-breaker system gave results similar to those of an intra-tournament rating system (where everybody starts with the same rating). I think the rating system I have been using could be well suited to this purpose. It's Glicko with the rating uncertainty fixed and optimized for human-human games.
|
|
|
omar
Forum Guru
Arimaa player #2
Gender:
Posts: 1003
|
|
Re: 2010 World Championship Rules
« Reply #4 on: Oct 18th, 2009, 8:04am » |
Thanks for putting so much thought into this, Karl. I am on board with going with your proposal. It shouldn't be much effort to modify the pairing program, and so it could be done this year.

A thought that crossed my mind was that since we are using better ratings now, we might want to also consider using the WHR ratings as the second tie breaker. Some of the advantages are that it is not affected by dropouts, forfeits or SoS, it encourages more HH games before the tournament to improve ratings, and it is very easy to understand. The biggest disadvantage might be that someone could inflate their rating by repeatedly winning against fake human accounts. But it would be pretty easy to see this in the gameroom and disqualify someone who does this from registering.
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: 2010 World Championship Rules
« Reply #5 on: Oct 18th, 2009, 4:05pm » |
on Oct 18th, 2009, 8:04am, omar wrote: Thanks for putting so much thought into this Karl. I am aboard on going with your proposal. It shouldn't be much effort to modify the pairing program and so it could be done this year.

I'm glad it doesn't sound too hard to implement.

Quote: A thought that crossed my mind was that since we are using better ratings now, we might want to also consider using the WHR ratings as the second tie breaker.

This is equivalent to using the initial seeding as second tiebreaker, right? I have no problems with that. It won't come into play very often in a five-round tournament, and as the tournament grows, each additional round makes it less likely the second tiebreak will ever be reached. If the second tiebreaker is reached, between two players with identical records against identical SoS, then it is reasonably likely that the sliding pairing gave slightly harder opponents to the higher seed throughout, so using the seed as a tiebreaker is justifiable.

I'm not very worried about people inflating their WHR to get a better finish in the World Championship. Partly this is because it is harder to manipulate ratings undetected once bots are out of the way. Mostly, though, I don't worry because inaccurate seeding will be largely corrected for in the preliminaries, and more so in the finals, to the point that pre-tournament cheating will be damped out by in-tournament results.
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: 2010 World Championship Rules
« Reply #6 on: Oct 18th, 2009, 4:15pm » |
on Oct 17th, 2009, 6:13pm, aaaa wrote: It wouldn't surprise me if this tie-breaker system would give results similar to that of an intra-tournament rating system (where everybody starts with the same rating). I think the rating system I have been using could suit well for this purpose. It's Glicko with the rating uncertainty fixed and optimized for human-human games.

I am amenable to using a rating system to rank the performance of individuals within a tournament, but one strike against Glicko is that later results count for more than earlier ones. Against identical opposition, three wins followed by three losses rates worse than three losses followed by three wins, according to Glicko ratings. The assumption that playing strength varies over time is essential for continually-updated ratings but is a liability for event scoring. I would be more sympathetic to a rating system which viewed all games as occurring simultaneously and found the maximum likelihood ratings for the tournament results plus a prior assumption that all players are equal.
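To make the last idea concrete, here is a rough sketch of order-independent ratings with an "all players are equal" prior. It assumes a Bradley-Terry model (winning odds proportional to a strength ratio) and expresses the prior as virtual games: each player is credited with prior_games wins and prior_games losses against a fixed anchor of strength 1. Both the model choice and this form of prior are assumptions for illustration, not something specified in the thread, and the function and parameter names are made up. The fixed-point update is the usual minorization-maximization step for Bradley-Terry.

```python
import math
from collections import defaultdict


def map_ratings(games, prior_games=1.0, iterations=500):
    """games is a list of (winner, loser); game order does not matter.

    prior_games must be positive, otherwise a winless player has no finite rating.
    Returns ratings on an Elo-like scale where the anchor sits at 0.
    """
    players = sorted({p for game in games for p in game})
    wins = defaultdict(float)
    played = defaultdict(float)  # games between each unordered pair
    for winner, loser in games:
        wins[winner] += 1.0
        played[frozenset((winner, loser))] += 1.0

    gamma = {p: 1.0 for p in players}  # strengths; the anchor is fixed at 1.0
    for _ in range(iterations):
        updated = {}
        for p in players:
            # Virtual games against the anchor: prior_games wins and losses.
            denom = 2.0 * prior_games / (gamma[p] + 1.0)
            for q in players:
                n = played.get(frozenset((p, q)), 0.0)
                if n:
                    denom += n / (gamma[p] + gamma[q])
            updated[p] = (wins[p] + prior_games) / denom
        gamma = updated
    return {p: 400.0 * math.log10(gamma[p]) for p in players}


# Hypothetical results with no upsets: A is undefeated, C is winless.
results = [("A", "B"), ("A", "C"), ("B", "C")]
ratings = map_ratings(results)
for name in sorted(ratings, key=ratings.get, reverse=True):
    print(f"{name}: {ratings[name]:+.0f}")
```

Because every game enters the likelihood symmetrically, shuffling the results list leaves the ratings unchanged, which is the property Glicko lacks. The size of prior_games is a free choice, and the ratings will shift with it.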
|
|
|
woh
Forum Guru
Arimaa player #2128
Gender:
Posts: 254
|
|
Re: 2010 World Championship Rules
« Reply #7 on: Oct 19th, 2009, 3:30am » |
on Oct 18th, 2009, 4:15pm, Fritzlein wrote: I would be more sympathetic to a rating system which viewed all games as occurring simultaneously and found the maximum likelihood ratings for the tournament results plus a prior assumption that all players are equal.

The WHR is capable of that. But I think it would be hard to integrate with the pairing program.
|
|
|
woh
Forum Guru
Arimaa player #2128
Gender:
Posts: 254
|
|
Re: 2010 World Championship Rules
« Reply #8 on: Oct 19th, 2009, 9:33am » |
WHR ranks the 6 players like this:

4. | Adanac | 1622.3 |
5. | 99of9 | 1613.9 |
6. | The_Jeh | 1594.0 |
7. | camelback | 1592.3 |
8. | Tuks | 1547.3 |
9. | woh | 1501.7 |
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: 2010 World Championship Rules
« Reply #9 on: Oct 19th, 2009, 10:10am » |
That ranking is based on all the games in the tournament? Very cool, thanks for generating it. The difference between yours and mine is that Adanac has passed 99of9 in yours. Looking back at their opponents, and removing the common opponents omar and chessandgo, their schedules were

Adanac: thefrankinator(0), naveed(2), Fritzlein(4)
99of9: Bildstein(1), camelback(3), arimaa_master(4)

From the in-tournament number of wins, it sure looks like 99of9 had the tougher schedule, but if you had asked the players pre-tournament which schedule would be softer, I'll bet both would have said 99of9's schedule would be slightly easier. So this seems to be some anecdotal justification for using tournament-based ratings rather than sum-of-opponents score.

What did you do about forfeits, byes, and withdrawn players? We would have to be careful not to penalize players for getting a bye.

Also it occurs to me that ratings aren't just a tiebreaker after number of wins, but would occasionally have, for example, a 2-3 player finish ahead of a 3-2 player. I might be comfortable with that, but I expect it would be unacceptable to a lot of chess players. (A good test case for this would be the 2008 World Championship preliminaries, which had woh (4-2) in eighth place and jdb (3-3) in ninth place. Would WHR swap them based on jdb's monstrously difficult schedule? It seems to be your fate to be on the bubble, woh.)

Would you be so good as to post the whole result list from 2009 for comparison with the actual tournament standings?
|
« Last Edit: Oct 19th, 2009, 10:23am by Fritzlein » |
|
aaaa
Forum Guru
Arimaa player #958
Posts: 768
|
|
Re: 2010 World Championship Rules
« Reply #10 on: Oct 20th, 2009, 4:58pm » |
on Oct 18th, 2009, 4:15pm, Fritzlein wrote: I would be more sympathetic to a rating system which viewed all games as occurring simultaneously and found the maximum likelihood ratings for the tournament results plus a prior assumption that all players are equal.

Do you mean finding optimized performance ratings of the players, Chessmetrics style? That would be very appealing, as the sole assumption concerning playing strength would be that winning odds follow a multiplicative chain rule. One would of course first have to repeatedly remove from the considered set of players those with only wins or only losses against other members. Special care should also be taken for the unlikely case that there are players with symmetrical results.
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: 2010 World Championship Rules
« Reply #11 on: Oct 20th, 2009, 5:49pm » |
on Oct 20th, 2009, 4:58pm, aaaa wrote: One would of course first have to repeatedly remove from the considered set of players those with only wins or only losses against other members.

When combined with a prior assumption about playing strength, the maximum likelihood ratings are finite. This is ugly because the strength of the prior is arbitrary, and we have to argue about what strength would be enough without being too much. On the positive side, though, bringing in this extra assumption (in addition to the transitive ratio of odds assumption) means we don't have to remove any players.

Without a prior distribution one can only get reasonable ratings out of results that contain loops, but there is no guarantee loops will be present. Adanac pointed out that there were few upsets in last year's preliminaries; that lack of upsets resulted in a graph with no loops at all!

1. chessandgo lost to nobody
2. Fritzlein lost only to chessandgo
3. arimaa_master and Adanac lost only to levels 1-2
4. 99of9 and The_Jeh lost only to levels 1-3
5. omar and camelback lost only to levels 1-4
6. Tuks and woh lost only to levels 1-5
7. naveed and LevB lost only to levels 1-6
8. soldier and Sana lost only to levels 1-7
9. Bildstein didn't win

So we're not going to get far by iteratively peeling off players until all remaining players are involved in loops.

By the way, this leveling provides an additional justification to put Adanac ahead of 99of9, but then it also puts omar (2-3) ahead of Tuks (3-2) and woh (3-2), which is harder to justify. Also these levels are top-down, and would be different if built bottom-up, or from both ends simultaneously, which makes them look somewhat arbitrary.
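The top-down leveling can be sketched in a few lines. The thread does not reproduce the 2009 game list, so the results below are hypothetical placeholders (players P1 to P5, plus a three-player loop), and the function name is made up for illustration.

```python
def top_down_levels(games):
    """games is a list of (winner, loser).

    Level 1 holds the players with no losses; each later level holds the players
    whose every loss was to someone in an earlier level. Stops if only loops remain.
    """
    players = {p for game in games for p in game}
    beaten_by = {p: set() for p in players}
    for winner, loser in games:
        beaten_by[loser].add(winner)

    placed, levels = set(), []
    while len(placed) < len(players):
        level = sorted(p for p in players - placed if beaten_by[p] <= placed)
        if not level:  # everyone left has lost to another unplaced player
            break
        levels.append(level)
        placed.update(level)
    return levels


# Hypothetical upset-free results.
no_upsets = [("P1", "P2"), ("P1", "P3"), ("P2", "P4"),
             ("P3", "P4"), ("P2", "P5"), ("P4", "P5")]
print(top_down_levels(no_upsets))

# Hypothetical loop: X beat Y, Y beat Z, Z beat X, so nobody can be placed.
loop = [("X", "Y"), ("Y", "Z"), ("Z", "X")]
print(top_down_levels(loop))
```

Building the levels bottom-up would mean swapping the roles of winner and loser, which in general gives a different grouping, in line with the remark that the levels look somewhat arbitrary.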
|
« Last Edit: Oct 20th, 2009, 6:02pm by Fritzlein » |
|
aaaa
Forum Guru
Arimaa player #958
Posts: 768
|
|
Re: 2010 World Championship Rules
« Reply #12 on: Oct 28th, 2009, 8:58am » |
Obviously, the number of losses should be the first consideration with respect to ranking. However, it seems hard to argue against the principle that, given two players A and B, A should always be ranked above B if A has no more losses than B, there is a path of victories from A to B, and there is no path back.
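Read as a graph condition, this criterion is a reachability check on the "who beat whom" graph. A minimal sketch follows; the function names and the sample results are hypothetical, and the reading of "path of victories" as a chain of wins leading from A to B is an interpretation of the post.

```python
from collections import defaultdict


def reachable(beat, start):
    """Everyone that start has beaten directly or through a chain of victories."""
    seen, stack = set(), [start]
    while stack:
        player = stack.pop()
        for victim in beat[player]:
            if victim not in seen:
                seen.add(victim)
                stack.append(victim)
    return seen


def must_outrank(games, a, b):
    """True if a has no more losses than b, a chain of wins leads from a to b,
    and no chain of wins leads from b back to a."""
    beat = defaultdict(set)
    losses = defaultdict(int)
    for winner, loser in games:
        beat[winner].add(loser)
        losses[loser] += 1
    return (
        losses[a] <= losses[b]
        and b in reachable(beat, a)
        and a not in reachable(beat, b)
    )


# Hypothetical results: A beat C, C beat B, D beat both A and B.
games = [("A", "C"), ("C", "B"), ("D", "A"), ("D", "B")]
print(must_outrank(games, "A", "B"))

# Adding an upset of D by B creates a path back from B to A (B beat D, D beat A).
with_upset = games + [("B", "D")]
print(must_outrank(with_upset, "A", "B"))
```

The criterion only settles some pairs; when neither player can reach the other, or each can reach the other, it says nothing and another tiebreaker is still needed.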
|
|
|
woh
Forum Guru
Arimaa player #2128
Gender:
Posts: 254
|
|
Re: 2010 World Championship Rules
« Reply #13 on: Nov 1st, 2009, 9:33am » |
on Oct 19th, 2009, 10:10am, Fritzlein wrote: That ranking is based on all the games in the tournament?

Yes, all games of the 2009 Open Classic (but not the games of the final).

on Oct 19th, 2009, 10:10am, Fritzlein wrote: What did you do about forfeits, byes, and withdrawn players?

Forfeits were treated like regular losses, and as wins for their opponents.

Byes were ignored. This places players with 3 wins/1 loss/1 bye behind players with a 4 wins/1 loss record and ahead of players with 3 wins and 2 losses, assuming their opponents are of about the same strength.

Withdrawn players are also ignored in the following rounds, so their rating stays about the same. If the opponents they did play in the previous rounds do well, the ratings of the withdrawn players may go up a bit; otherwise they drop a little. In the 2009 Open Classic three players didn't show up for their first-round game and were withdrawn from the tournament. As you can see in the full standings, this puts them amongst the players with a 2-3 record. This explains why Adanac is ahead of 99of9. His first-round opponent is no longer valued as a player of strength 0 but as a player of about strength 2.

on Oct 19th, 2009, 10:10am, Fritzlein wrote: Also it occurs to me that ratings aren't just a tiebreaker after number of wins, but would occasionally have, for example, a 2-3 player finish ahead of a 3-2 player.

Using ratings would only be acceptable IMO when all players have played the same number of games. Some players receiving one bye should also be acceptable. But unless there is some way to handle players not playing multiple rounds (like withdrawn players), ratings should only be used as a tiebreaker after the number of wins.
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: 2010 World Championship Rules
« Reply #14 on: Nov 1st, 2009, 4:06pm » |
Thanks for explaining and posting the full list, woh. I see that 2-3 Omar with a tough schedule didn't leap over 3-2 woh with an easy schedule. I think that would make the ratings more acceptable as the actual ranking as opposed to just a tiebreaker. My instinct would be to not include forfeits in the ratings, but then a player could get hurt for having an opponent that forfeits, which is undesirable. Withdrawn players are a difficulty for ratings in any case, as you point out.
|
|