Arimaa Forum « World Championship tournament format »


(Moderator: supersamu)
   Author  Topic: World Championship tournament format  (Read 9306 times)
omar
Forum Guru
*****



Arimaa player #2

   


Gender: male
Posts: 1003
Re: World Championship tournament format
« Reply #60 on: May 28th, 2005, 9:47am »

Changing
 
for (p=0; p<PLAYERS; p++) diff[p] = est[p] - real[p];
 
to
 
for (p=0; p<PLAYERS; p++) diff[p] = abs(est[p] - real[p]);
 
gives a value of about 30.
 
99of9
Forum Guru
*****




Gnobby's creator (player #314)

  toby_hudson  


Gender: male
Posts: 1413
Re: World Championship tournament format
« Reply #61 on: May 29th, 2005, 12:38am »

But I think it's better to think of the error rather than the absolute value of the error.  One big advantage is that the distribution is a bell curve so standard statistical rules-of-thumb can be used.
Fritzlein
Forum Guru
*****



Arimaa player #706

   

Gender: male
Posts: 5928
Re: World Championship tournament format
« Reply #62 on: May 29th, 2005, 7:36am »

If I understand how the code works, I agree with 99of9.  You need to keep the sign on the error to get a correct calculation of standard deviation.  Taking absolute values first will give a smaller number, but that smaller number won't be what statisticians call standard deviation.
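A minimal sketch of the difference, not the code from the simulation package: the helper names below are made up for illustration, and the functions return the variance so that the only extra step for a standard deviation is sqrt() from <math.h>. For reference, if the signed errors are roughly normal with mean 0 and standard deviation s, the standard deviation of their absolute values is s*sqrt(1 - 2/pi), about 0.60*s, which is consistent with a figure near 51 dropping to about 30. Note also that C's abs() takes an int; for floating-point arrays fabs() would be the appropriate call.

```c
#include <stddef.h>

/* Population variance of x[0..n-1]; the standard deviation is its square root. */
double variance(const double *x, size_t n)
{
    double mean = 0.0, sumsq = 0.0;
    for (size_t i = 0; i < n; i++)
        mean += x[i];
    mean /= (double)n;
    for (size_t i = 0; i < n; i++)
        sumsq += (x[i] - mean) * (x[i] - mean);
    return sumsq / (double)n;
}

/* Fills diff[] with est - real. With keep_sign == 0 the sign is discarded,
   which mirrors the abs() variant quoted earlier in the thread. */
void rating_errors(const double *est, const double *real, double *diff,
                   size_t n, int keep_sign)
{
    for (size_t p = 0; p < n; p++) {
        double d = est[p] - real[p];
        diff[p] = (keep_sign || d >= 0.0) ? d : -d;
    }
}
```

With made-up errors of +50, -50, +20 and -20 the signed variance is 1450 (SD about 38), while the variance of the absolute values is only 225 (SD 15), because taking absolute values moves the mean from 0 to 35.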
 
If you are simulating a standard deviation of 51 with a uniform distribution, then you need to pick a random number between -102 and +102.  This should be the minimum amount of error in simulated trials of which tournament format is best.  If we think there is additional error from sources other than random variation of the system functioning as well as possible, then the range of error should increase.  However, as you may see in the other thread, I haven't yet found evidence of other such error, even though I expect it exists.
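As a cross-check on the half-width, here is the standard result for a uniform distribution, with a as the half-width of the interval. By this identity a range of -102 to +102 corresponds to a standard deviation of about 59, and a standard deviation of 51 needs a range of roughly -88 to +88, so "twice the standard deviation" is best read as a rough rule of thumb rather than an exact conversion.

```latex
\[
  X \sim \mathrm{Uniform}(-a,\, a)
  \quad\Longrightarrow\quad
  \operatorname{Var}(X) = \frac{(2a)^2}{12} = \frac{a^2}{3},
  \qquad
  \sigma = \frac{a}{\sqrt{3}} \approx 0.577\,a .
\]
\[
  a = 102 \;\Rightarrow\; \sigma \approx 58.9,
  \qquad
  \sigma = 51 \;\Rightarrow\; a = 51\sqrt{3} \approx 88.3 .
\]
```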

omar
Forum Guru
*****



Arimaa player #2

   


Gender: male
Posts: 1003
Re: World Championship tournament format
« Reply #63 on: May 30th, 2005, 6:54pm »

Oh OK. I'd better mention that in my simulation programs the rating inaccuracy parameter is just used to define the range of the uniform random numbers to add to the true ratings to get the measured ratings. I guess that number should be entered as twice the standard deviation value. So in my results with the rating inaccuracy set to 200 it really means an SD of 100.
 
--- 2005.06.17 ---
 
I just realized that I said this wrong. The 'rating distribution range' defines the range of measured ratings. The 'rating inaccuracy range' defines the range of the uniform random numbers to add to the measured ratings to set the true ratings. I did it this way because I figured it would not matter which way you do it, but in practice we can limit entries in a tournament based on measured ratings and not true ratings.
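A rough sketch of the rating generation described above, not the actual 'run' program. It assumes each "range" is the full width of an interval and that the inaccuracy interval is centred on zero; the thread leaves open whether the number is a width or a half-width. The names unit_random, uniform_offset and make_ratings, and the base rating parameter, are hypothetical.

```c
#include <stdlib.h>

/* Uniform random number in [0, 1) built on the C library's rand(). */
double unit_random(void)
{
    return rand() / ((double)RAND_MAX + 1.0);
}

/* Maps u in [0, 1) to a uniform offset in [-range/2, +range/2).
   ASSUMPTION: "range" is the total width of the interval. */
double uniform_offset(double range, double u)
{
    return (u - 0.5) * range;
}

/* Hypothetical helper: draws measured ratings first, then derives the
   true ratings from them, in the order described in the post.
   base is the lowest possible measured rating (an assumed parameter). */
void make_ratings(double *measured, double *truth, int n,
                  double base, double dist_range, double inacc_range)
{
    for (int p = 0; p < n; p++) {
        measured[p] = base + dist_range * unit_random();
        truth[p] = measured[p] + uniform_offset(inacc_range, unit_random());
    }
}
```

Under these assumptions, make_ratings(measured, truth, 8, 1500.0, 500.0, 200.0) would give eight measured ratings between 1500 and 2000, each with a true rating within 100 points of it. If the program treats the inaccuracy value as a half-width instead, the 0.5 in uniform_offset would become 1.0 after doubling the scale.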
 
« Last Edit: Jun 17th, 2005, 2:40pm by omar »
omar
Forum Guru
*****



Arimaa player #2

   


Gender: male
Posts: 1003
Re: World Championship tournament format
« Reply #64 on: Jun 11th, 2005, 10:17am »

Has anyone tried out any other tournament formats?
 
I have not tried out the double or triple elimination formats proposed by Toby and Karl yet since they are not easy to implement.
 
However, I did try out another seemingly strange tournament format. It works about as well as swissKnife, but it continues to perform well even when the number of players and the rating error range are increased.
 
The basic idea is to allow all players to play two rounds of games (details of how the pairing is done are described later) and update the players' ratings using the Arimaa rating formula, but set the rating uncertainty of all players to the same fixed value (more on this later). After updating the ratings, the two players with the lowest ratings are removed and the tournament continues, repeating the same procedure again. When only two players are left, the player with the highest rating after the final two rounds is selected as the winner.
 
In some ways this is similar to swissKnife because it is selecting the winner based primarily on ratings. But unlike swissKnife, it actually uses the tournament to refine the original ratings before making the final selection.
 
I've found that a good value to use for the rating uncertainty of the players is about 1/5 of the 'rating inaccuracy range'. But it is not very sensitive to the value used and even values that are 1/10 to 1/3 of the rating inaccuracy range produce good results.
 
For the pairing of the players I tried two variations and they both seem to produce very similar results. The first one, which I refer to as swissSaw, pairs #1-#2, #3-#4, ... in the first round and #2-#3, #4-#5, ... in the second round, with #1 and the lowest rated getting a bye. The second pairing method, which I refer to as swissOmatic, pairs the lower half against the upper half by sliding in the first round and by folding in the second round.
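The two pairing schemes can be sketched as follows. This is an illustration only, not the code from the simulation package: it assumes an even number of players, seeds numbered from 1 (highest rated) to n, and the usual meaning of "slide" (#1 meets the top of the lower half) and "fold" (#1 meets the lowest seed). All function names are made up.

```c
/* Each function writes pairs of 1-based seeds into pairs[][2] and returns
   the number of pairs written. n is assumed to be even. */

/* swissSaw, first round: 1-2, 3-4, ... */
int saw_round1(int n, int pairs[][2])
{
    int k = 0;
    for (int s = 1; s + 1 <= n; s += 2) {
        pairs[k][0] = s;
        pairs[k][1] = s + 1;
        k++;
    }
    return k;
}

/* swissSaw, second round: 2-3, 4-5, ...; seeds 1 and n sit out with a bye. */
int saw_round2(int n, int pairs[][2])
{
    int k = 0;
    for (int s = 2; s + 1 <= n - 1; s += 2) {
        pairs[k][0] = s;
        pairs[k][1] = s + 1;
        k++;
    }
    return k;
}

/* Slide pairing: seed i meets seed i + n/2 (1 vs n/2+1, 2 vs n/2+2, ...). */
int slide_pairs(int n, int pairs[][2])
{
    int half = n / 2;
    for (int i = 0; i < half; i++) {
        pairs[i][0] = i + 1;
        pairs[i][1] = half + i + 1;
    }
    return half;
}

/* Fold pairing: seed i meets seed n + 1 - i (1 vs n, 2 vs n-1, ...). */
int fold_pairs(int n, int pairs[][2])
{
    int half = n / 2;
    for (int i = 0; i < half; i++) {
        pairs[i][0] = i + 1;
        pairs[i][1] = n - i;
    }
    return half;
}
```

For 8 players this gives 1-2, 3-4, 5-6, 7-8 and then 2-3, 4-5, 6-7 for swissSaw, and 1-5, 2-6, 3-7, 4-8 followed by 1-8, 2-7, 3-6, 4-5 for swissOmatic.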
 
For comparison I also tried a format that I call 'roundRobinRated'. After running a round robin the ratings are updated and the highest rated player is selected as the winner.
 
Here are the results of some simulations:
 
 
2000 trials with each format, 8 players, 500 point rating distribution range, 200 point rating inaccuracy range, 1 in 9999 draw ratio:
 
46.6%     run swissKnife 2000 8 500 200 9999
45.8%     run singleElimRand 2000 8 500 200 9999
45.5%     run roundRobin 2000 8 500 200 9999
46.1%     run roundRobinDouble 2000 8 500 200 9999
60.7%     run 'roundRobinRated 40' 2000 8 500 200 9999
62.8%     run 'swissOmatic 40' 2000 8 500 200 9999
63.7%     run 'swissSaw 40' 2000 8 500 200 9999
 
The value of 40 being passed to some of the tournament formats is the rating uncertainty to be used in the rating formula. It is set to 1/5 of the rating inaccuracy range.
 
 
same as above, but with 16 players instead of 8:
 
33.3%     run swissKnife 2000 16 500 200 9999
44.9%     run roundRobin 2000 16 500 200 9999
57.0%     run 'roundRobinRated 40' 2000 16 500 200 9999
59.0%     run 'swissOmatic 40' 2000 16 500 200 9999
61.8%     run 'swissSaw 40' 2000 16 500 200 9999
 
 
same as above, but with 32 players:
 
25.6%     run swissKnife 2000 32 500 200 9999
43.2%     run roundRobin 2000 32 500 200 9999
53.1%     run 'roundRobinRated 40' 2000 32 500 200 9999
52.3%     run 'swissOmatic 40' 2000 32 500 200 9999
56.9%     run 'swissSaw 40' 2000 32 500 200 9999
 
 
Feel free to download the simulation package and try out your own experiments.
 
« Last Edit: Jun 11th, 2005, 10:21am by omar »
Fritzlein
Forum Guru
*****



Arimaa player #706

   

Gender: male
Posts: 5928
Re: World Championship tournament format
« Reply #65 on: Jun 13th, 2005, 1:49am »

Omar, you are as creative in thinking up tournament formats as you are in thinking up Arimaa moves.  Using the pre-tournament ratings as a handicapping system is extremely interesting, because it unites some very plausible ideas: (1) We want the player crowned as World Champion to actually be the best player in the world, (2) The ratings are a pretty good indicator of who the best player is, and (3) By forcing certain matchups and by rating those games, we can iron out some rating inaccuracy.
 
Nevertheless, I am obliged to oppose the SwissSaw tournament format because it could make the World Championship tournament almost pointless.  Supposing the tournament were held right now with the top 16 players (by rating) participating.  There would be four rounds of two games each, but I would get one bye every round due to having the top rating, so I would only play four times.  With an RU of 40, I could lose at most 160 points relative to my opponents, but I would be starting 222 points ahead of 99of9.  He would have to pick up the other 63 points in his games against Robinson, and even winning 3 out of 4 wouldn't be enough for that: he would have to win all 4.  So the conditions to win for the various opponents would be thus:  Fritz is World Champion if he wins any one of four games, 99of9 is World Champion only if he wins all eight of his eight games, and everyone else has no chance, so one wonders why they even signed up to play.  And if everyone under 2000 wisely declined to participate, then only 4 of us would be playing, reducing the tournament to only 2 rounds, and making me automatic winner!  
 
But worse than that, the tournament wouldn't be played with the current ratings, it would be played with the ratings achieved by players knowing that the tournament was upcoming.  If I knew that having a ridiculously high rating going into the tournament could make me the automatic World Champion, I would never play humans at all between now and then.  Instead I would beat the bots over and over again by rote formula until I was 700 points above the highest-rated bot, i.e. until my rating was well over 2500.  If we believe ratings at present are somewhat distorted, you have only to announce that SwissSaw will be used in the next World Championship to drive those distortions to ridiculous extremes.
 
Indeed, I would like to propose a new axiom for the World Championship Tournament format:  The number of wins required to become champion (and likewise the number of losses permitted before elimination) should be the same for all entrants.  Note that I don't want everyone to have an equal chance of winning, and my axiom doesn't require it.  For example, a single-elimination tournament with 16 players paired by the folding method is most likely to be won by the best player and least likely to be won by the worst player, but everyone is on equal footing in the sense that anyone must win four times to win overall, and that one loss knocks anyone out.
 
I know, Omar, that you encouraged us to assume that problems in the rating system will have been worked out before the next tournament, but I think that that is a dangerously optimistic assumption.  We should put a new rating system in place and have at least a year to test it and refine it before relying so heavily on ratings in determining the World Championship.  And even with a superb rating system the potential for abuse would remain, for example the #3 player intentionally losing repeatedly to his friend the #2 player to pump that friend's rating way up just before the tournament, thus granting an unfair advantage (headstart) in rating relative to the previously #1 player.
 
Truly, I don't know how one could ever make pre-tournament ratings important in determining the World Champion without inviting ratings abuse.  I would therefore strongly discourage not only SwissSaw, but all similar formats as well.  When one considers the possibility of collusion and other forms of ratings manipulation, such formats are likely to disadvantage everyone who doesn't try to gimmick the system.

omar
Forum Guru
*****



Arimaa player #2

   


Gender: male
Posts: 1003
Re: World Championship tournament format
« Reply #66 on: Jun 15th, 2005, 10:41am »

I'm pretty confident that the new rating system will be very immune to distortions. But until we set it up and get comfortable with it, I also would not suggest using it in a WC format that relies on ratings so heavily.
 
Also, I think I had mentioned earlier that only H-H games would be used in computing the rating for the WC qualifier. We could also add other restrictions, such as only counting interactive games with an effective time control of 45 sec per move or more, only counting games against players with an RU of less than 60, etc.
 
Even so, as Karl mentioned, collusion could still be a problem, more so in the swissSaw type formats than in other formats that use ratings. This is probably the biggest road block for a system that relies so heavily on ratings. But perhaps the rating for the WC qualifier should be based on the highest rating one can establish against one or two very strong bots, for example bot_Bomb2005CC. This would definitely prevent collusion, and all players would have an equal chance of trying to master the bot and achieve the highest rating they can before the start of the tournament. So depending on how we define the rating for the WC qualifier, we may be able to overcome such obstacles and possibly use a format that relies heavily on ratings.
 
For this year I am thinking of going with the single elimination format, but using fold for the pairings instead of slide and randomly assigning the colors (singleElimFold). I'm still open to using another format if anyone wants to implement it and show that it performs better. The pairing part of the floating triple elimination format is not defined in a constructive way, so I'm not sure how to implement it; it may require generating all possible pairings to select the best one. We've seen from the comparison of singleElimFold and singleElimSlide that the method of pairing can make a big difference. So the details of the choices made for pairing in the floating triple elimination format can make a difference in how it performs.
 
If a format that can fit into about 8 weeks or so and is better than singleElimFold isn't found in about a month or so, I am seriously considering going with singleElimFold for this year.
 
For next year I will consider a format that has a WC qualifier based on some rating scheme like the one mentioned earlier. I think it will also be good to have an "Open Classic" type tournament next year where the focus of the tournament is not so much on selecting the best player, but rather on having a faster, more unpredictable tournament which more people can participate in. I'm thinking that perhaps the time per move for such a tournament would be about 45 sec and the format would be single elimination with random pairing. The time per move for the WC next year could then be something higher, like 2 min, and the format very focused on selecting the best player.
 
jdb
Forum Guru
*****



Arimaa player #214

   


Gender: male
Posts: 682
Re: World Championship tournament format
« Reply #67 on: Jun 15th, 2005, 11:08am »

Quote:
We've seen from the comparison of singleElimFold and singleElimSlide that the method of pairing can make a big difference.

 
Also, if I understand correctly, the current singleElim formats re-pair the field after every round. Sometimes the pairings are finalized at the start, as in tennis.
 
I will give a go at implementing the double elim format. I'll use Fritzlein's posting of the phoosball brackets as a guide.  
 
I am not familiar with the floating triple elim, so I'll leave that to someone else.
 
jdb
Forum Guru
*****



Arimaa player #214

   


Gender: male
Posts: 682
Re: World Championship tournament format
« Reply #68 on: Jun 15th, 2005, 3:56pm »

Omar,
 
I was working on the tournament simulations and I have some questions.
 
As a test, I created a very simple tournament. It reads in the player list from the tournament state file. It sorts them, and declares the highest rated player the winner. This all works. :)
 
I have a couple of questions.
 
1) What rating is the number in the tournament state file? Is it the real or the measured rating (i.e. taken from the rating list)?
 
2) I expected this tournament format to find the best player 100% of the time. This does not happen! Any idea why?
 
Thanks
omar
Forum Guru
*****



Arimaa player #2

   


Gender: male
Posts: 1003
Re: World Championship tournament format
« Reply #69 on: Jun 15th, 2005, 7:39pm »

Jeff caught me in the gameroom and chatted with me to clear his doubts about these questions. I'll answer them here also for others who may be interested.
 
1. The ratings passed to the tournament format program are the measured ratings. The true ratings and the measured ratings are generated by the 'run' program. The measured ratings are passed to the tournament format program in case it wants to use them. The true ratings are passed to the 'simGames' program to generate the outcomes for pairs of players.
 
2. A format that picks the highest rated player will not be right 100% of the time, since it only sees the measured ratings, which are generated by adding a random number to the true ratings. If the tournament format program were given the true ratings then it would be able to pick the correct player 100% of the time.
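A tiny illustration of point 2, with made-up numbers rather than output from the simulation package; highest_rated is a hypothetical helper name.

```c
/* Index of the largest value in r[0..n-1]; the first one wins ties. */
int highest_rated(const double *r, int n)
{
    int best = 0;
    for (int p = 1; p < n; p++)
        if (r[p] > r[best])
            best = p;
    return best;
}
```

Suppose the true ratings are {2000, 1950, 1800} and the measured ratings come out as {1960, 2010, 1790}. Applied to the measured ratings the function returns player 1, while applied to the true ratings it returns player 0, so a format that only sees the measured list picks the wrong player in this trial.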
jdb
Forum Guru
*****



Arimaa player #214

   


Gender: male
Posts: 682
Re: World Championship tournament format
« Reply #70 on: Jun 16th, 2005, 8:01am »

The double elim seems to be working. It uses fold re-pairing for every round.
 
100 trials on a 16-person tournament
 
1 43%
2 33%
3 11%
4 4%
5 2%
 
However there is a fair bit of variance in the results from run to run.
 
I'll try and convince WinZip to bundle everything up, to send to Omar.
 
 
Tarr
Forum Newbie
*



Arimaa player #1239

   


Gender: male
Posts: 5
Re: World Championship tournament format
« Reply #71 on: Jun 16th, 2005, 11:44pm »

FULL DISCLOSURE:  I do not know how to play Arimaa.  I happened upon this forum just by googling "tournament formats" and perusing the links.  I have since read up on the origin and nature of the game, which is quite interesting.
 
But what I DO know is tournament formats.  The reason I was googling that was that I just finished the latest edit of the UPA's manual of tournament formats.  UPA stands for Ultimate Players Association - the national (USA) organization for a sport some of you may know as Ultimate Frisbee.  I am in charge of coming up with tournament formats for the national championship series.
 
Tournaments in Ultimate start at the sectional level and proceed to a national championship, with teams being eliminated at each stage.  The number of teams, and the number of teams eliminated at each stage, varies from event to event.  As such, we need many different formats to handle these variables.  The result is a manual with a ton of different formats.  
 
At any rate, the format you seem to be most interested in is 16 teams, 1 champion ("1 advances"). Here is that format.  The basic idea is to use initial round robins in small groups to sort out the seeding and get a balanced bracket.
 
Players are divided into the following four groups by initial seeding:
 
Group A: 1, 8, 12, 13
Group B: 2, 7, 11, 14
Group C: 3, 6, 10, 15
Group D: 4, 5, 9, 16
 
All players in each group play all the other players in each group.  This comprises the first three rounds of play.
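The group assignment above follows a simple pattern that can be written down directly. This is only a sketch of that pattern for the 16-player case; group_seeds is a made-up name, not something from the UPA manual.

```c
/* Writes the four initial seeds of group g (0 = A, 1 = B, 2 = C, 3 = D)
   for the 16-player format listed above. */
void group_seeds(int g, int seeds[4])
{
    seeds[0] = 1 + g;   /* 1, 2, 3, 4: forward      */
    seeds[1] = 8 - g;   /* 8, 7, 6, 5: reversed     */
    seeds[2] = 12 - g;  /* 12, 11, 10, 9: reversed  */
    seeds[3] = 13 + g;  /* 13, 14, 15, 16: forward  */
}
```

One consequence of this layout is that the seeds in every group add up to 34, so no group is stronger than another on paper.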
 
After this, the players are re-seeded into the following bracket:
 
A1vD4
B2vC3
 
C2vB3
D1vA4
 
C1vB4
D2vA3
 
A2vD3
B1vC4
 
 (NOTE: in the format for Ultimate, the 4th place finishers in each group are eliminated, and the first place finishers in each group get a bye to the quarter finals.  Here I show all 16 teams advancing to the bracket, but it is trivial to trim out the 4th place teams and get the bracket we use in Ultimate.
 
I would actually suggest dropping the 4th place players and giving byes to the top finishers, as this adds some "teeth" to the initial group results, which could otherwise be seen as harmless preliminaries.)
 
If the winner of the tournament won their group with a perfect record, they will have gone 7-0 overall (6-0 if you have a bye round for group winners, like I suggest), since there are three group games followed by four bracket rounds.
 
One obvious question looking at the group assignments is why they are not the more traditional 1,8,9,16 and so on.  The reason is that this switched seeding set gets the right matchups (1v16, 2v15, ... 8v9) in the round of 16, where it is more important.  The quarterfinal matchups (if seeds hold) are 1v7, 4v6, 3v5, and 2v8.  Not quite perfect, but as close as you can get without a group play rematch.  Semis are the perfect 1v4 and 2v3, and finals are of course 1v2.
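The matchup claims can be checked mechanically. The sketch below encodes the groups and the bracket as posted and assumes that adjacent bracket lines (for example A1vD4 and B2vC3) feed the same quarterfinal; the type and function names are made up for illustration.

```c
/* One bracket slot: group index (0 = A .. 3 = D) and finishing place (1..4). */
struct slot { int group, place; };

/* Round-of-16 bracket as posted, top to bottom. */
static const struct slot bracket[8][2] = {
    {{0, 1}, {3, 4}}, {{1, 2}, {2, 3}},   /* A1vD4, B2vC3 */
    {{2, 2}, {1, 3}}, {{3, 1}, {0, 4}},   /* C2vB3, D1vA4 */
    {{2, 1}, {1, 4}}, {{3, 2}, {0, 3}},   /* C1vB4, D2vA3 */
    {{0, 2}, {3, 3}}, {{1, 1}, {2, 4}},   /* A2vD3, B1vC4 */
};

/* Initial seed of the player in a slot, if every group finishes in seed order. */
int seed_if_holds(struct slot s)
{
    static const int seeds[4][4] = {
        {1, 8, 12, 13}, {2, 7, 11, 14}, {3, 6, 10, 15}, {4, 5, 9, 16}
    };
    return seeds[s.group][s.place - 1];
}

/* Returns 1 if every round-of-16 pairing is seed k against seed 17 - k. */
int bracket_is_balanced(void)
{
    for (int m = 0; m < 8; m++)
        if (seed_if_holds(bracket[m][0]) + seed_if_holds(bracket[m][1]) != 17)
            return 0;
    return 1;
}

/* Better (lower) seed of match m, i.e. the expected winner if seeds hold. */
int favorite(int m)
{
    int a = seed_if_holds(bracket[m][0]);
    int b = seed_if_holds(bracket[m][1]);
    return a < b ? a : b;
}
```

bracket_is_balanced() returns 1, and the favorites of matches 0 to 7 are 1, 7, 6, 4, 3, 5, 8 and 2, which reproduces the quarterfinals 1v7, 4v6, 3v5 and 2v8 quoted above.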
 
Anyway, interested to hear any comments, and/or how this format stacks up in computer simulation against other formats of similar length (7 total rounds, 35 or 39 total games).  Comparing to a format that has more games is of course a bit unfair, since any format can be made more robust by adding more games.  For example, in this case, the final four could be made into double elimination, which would add 3 or 4 more games but add quite a bit of robustness.
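The game counts quoted above can be reproduced with a little arithmetic: a round robin among n players has n(n-1)/2 games, and a single-elimination bracket with n entrants has n-1 games because each game eliminates exactly one player. The function names are made up for illustration.

```c
/* Games in a round robin among n players. */
int round_robin_games(int n)
{
    return n * (n - 1) / 2;
}

/* Games in a single-elimination bracket with n entrants, byes included:
   every game eliminates one player and only the champion is never eliminated. */
int knockout_games(int n)
{
    return n - 1;
}

/* Total games for group play followed by a knockout among the advancing players. */
int total_games(int groups, int group_size, int advancing_per_group)
{
    return groups * round_robin_games(group_size)
         + knockout_games(groups * advancing_per_group);
}
```

total_games(4, 4, 4) gives 24 + 15 = 39 when all 16 players advance, and total_games(4, 4, 3) gives 24 + 11 = 35 when the fourth-place finishers are dropped.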
 
Adding games like that is not really an option for Ultimate, since games take a while, and fatigue piles up over a weekend event.  But in Arimaa it might be a possibility.
 
One issue is what to do in the case of a cyclic tie in the initial groups.  For example, you could have three 2-1 players, and one 0-3 player.  You now need to determine who is the top finisher, and so on.  In Ultimate, point differential is used here.  I don't see an equivalent concept in Arimaa, but you could simply use initial ranking to decide it.  Just a thought.
« Last Edit: Jun 16th, 2005, 11:56pm by Tarr »
99of9
Forum Guru
*****




Gnobby's creator (player #314)

  toby_hudson  


Gender: male
Posts: 1413
Re: World Championship tournament format
« Reply #72 on: Jun 17th, 2005, 1:11am »

Thanks for stopping by!  I hope you have a game or two of arimaa one day... it's almost as fun as Ultimate ;).  [Coincidentally these are my two favourite sports]
 
It definitely seems an interesting tournament format.  I think one big benefit of including pool groups as you do is that you're more likely to get the lower rankings correct.  Since the arimaa world championship has prizes for 2nd and 3rd places, perhaps we need to think about that! (Since so far we have focussed on only the rating of the winner.)
Fritzlein
Forum Guru
*****



Arimaa player #706

   

Gender: male
Posts: 5928
Re: World Championship tournament format
« Reply #73 on: Jun 17th, 2005, 12:10pm »

Thanks a bunch for your comments, Tarr.  We're open to any kind of good tournament format.
 
I have been fairly staunchly opposed to using round robins because of possible collusion, and possible indifference on the part of players already out of contention.  However, the fact that it is single-elimination after the opening rounds sharply diminishes the impact of any collusion, and the fact that nobody gets eliminated takes care of the indifference problem.  Even if the bottom team were eliminated, an 0-2 player could have something to play for, because a single win will usually advance, and only possibly be eliminated on tiebreaks.
 
on Jun 16th, 2005, 11:44pm, Tarr wrote:
One issue is what to do in the case of a cyclic tie in the initial groups.  For example, you could have three 2-1 players, and one 0-3 player.  You now need to determine who is the top finisher, and so on.  In Ultimate, point differential is used.

 
We had one case where we used a tiebreak rather than a playoff, namely the 2005 computer championship, and it worked pretty well, because we all thought the bot who played better won the title.  You can take the number of moves in each game, where winning quickly is an advantage, winning slowly is a disadvantage, losing slowly is an advantage, and losing quickly is a disadvantage.
 
It is great that initial rounds are used to determine appropriate seeding, because (as we have discussed in numerous places) the ratings can be wildly inaccurate, and shouldn't be too heavily relied on.
 
Overall, I think your proposal is great except for one BIG problem: The number of participants in the Arimaa World Championship is not known in advance.  We may have 11 participants or 17.  Whatever format we choose needs to be flexible enough to handle this fairly.  How would you suggest we deal with this special need?

jdb
Forum Guru
*****



Arimaa player #214

   


Gender: male
Posts: 682
Re: World Championship tournament format
« Reply #74 on: Jun 17th, 2005, 12:41pm »

An interesting tournament format. One good point is that every team is guaranteed four games. That way everyone feels they got their money's worth.
 
One potential problem with round robins is, what to do if a game is defaulted or cannot be played.  
 
One league I am in uses a four team double round robin for the playoffs. The top two teams then play a single game for the championship. This year one of the games was cancelled due to a snow storm. There was no way to reschedule the game.  
 
The league president decided the game was a double forfeit (both teams getting zero points). Some people felt it should have been treated as a tie game (each team getting one point). There was a lot of controversy, since, as it turned out, this game decided who finished second and third.
 
For computer simulations, the probability of the best team winning would be slightly better than in a straight single elimination tournament. The round robin phase effectively improves the quality of the seeding going into the elimination phase. If the ratings of the teams are well known in advance then the round robin phase only improves the accuracy a little. However, if the teams' ratings are not known in advance, then it should improve over single elimination. (Just my 2 cents)
 
 