Arimaa Forum « World Championship tournament format »


Fritzlein
Forum Guru
*****



Arimaa player #706

   
Email

Gender: male
Posts: 5928
Re: World Championship tournament format
« Reply #45 on: May 24th, 2005, 6:09pm »

I am curious about the swissknife pairing scheme, but I can't untar files on my current Windows machine.  I'll have to download some utility for that.
 
I wanted to weigh in further about pairing people based on ratings.  Suppose there are four players in the tournament with accurate ratings of
 
Player A: 2100
Player B: 2000
Player C: 1900
Player D: 1800
 
Now if it is known that the tournament pairings will be "sliding", i.e. #1 vs. #3 in one game and #2 vs. #4 in the other, then Player A has a strong incentive to purposely lose games to get his rating lower.  If he can lower his rating by 150 points to 1950, he will be seeded #2 and get to play Player D instead of Player C in the first round, a significantly easier matchup.
 
Or suppose that Player A doesn't realize this opportunity and keeps his high rating.  In that case Player C has an incentive to intentionally lower his rating to 1750 so that he will be seeded #4 and get to play Player B instead of Player A.
 
I think that we must reject in advance any system which would reward players (even if only a few of the players) for purposely throwing their games to get a lower rating before the tournament begins.  Yes, there is a bias towards the higher-rated players in "folding" pairing, but that is exactly the way it should be.  A higher rating should always be beneficial or at least neutral, or else we could see some very strange sandbagging going on in advance of the tournament.
« Last Edit: May 24th, 2005, 6:11pm by Fritzlein »

Fritzlein
« Reply #46 on: May 24th, 2005, 6:13pm »

on May 22nd, 2005, 10:32am, omar wrote:

 
Karl, while I'm working on the programs for comparing tournament systems, could you please do some kind of analysis of the games archive to see how well the ratings predict the outcome of games, and see whether that can be converted to a number we can use for rating inaccuracies in the simulations?

 
I'll see whether I can come up with some interesting and possibly useful numbers.

99of9
Forum Guru
*****




Gnobby's creator (player #314)

  toby_hudson  


Gender: male
Posts: 1413
Re: World Championship tournament format
« Reply #47 on: May 24th, 2005, 6:38pm »

Haha, very good Omar ;)  Now I see why Fritz is so curious about this one!
 
Actually that is exactly what I was thinking of writing yesterday until I realised I couldn't understand the language they were written in!
 
Now that you've enabled us to specify ratings, I think I will be able to give a second quantitative criterion on which we can rate the tournaments.  (In addition to "highest rated player should win most often".)
omar
Forum Guru
*****



Arimaa player #2

   


Gender: male
Posts: 1003
Re: World Championship tournament format
« Reply #48 on: May 24th, 2005, 7:42pm »

on May 24th, 2005, 6:09pm, Fritzlein wrote:
I am curious about the swissknife pairing scheme, but I can't untar files on my current Windows machine.  I'll have to download some utility for that.

 
Actually I meant for the second link to be to a ZIP file. I've fixed it now.
 
On a Windows PC you will need to install Perl if you don't already have it. You can get it from:
 
http://activestate.com/Products/Download/Download.plex?id=ActivePerl
 
Get the latest version for Windows with the MSI installer.
 
omar
« Reply #49 on: May 24th, 2005, 7:46pm »

on May 24th, 2005, 6:38pm, 99of9 wrote:

Actually that is exactly what I was thinking of writing yesterday until I realised I couldn't understand the language they were written in!

 
You can write the program to implement a tournament format in any language you want. Just put your program in the 'formats' directory and you can start using it. Just read the 2nd README file to see what your program needs to print out.
« Last Edit: May 24th, 2005, 7:47pm by omar »

99of9
« Reply #50 on: May 24th, 2005, 11:15pm »

on May 24th, 2005, 6:38pm, 99of9 wrote:
Now that you've enabled us to specify ratings, I think I will be able to give a second quantitative criterion on which we can rate the tournaments.  (In addition to "highest rated player should win most often".)

 
 
Actually, I'm not sure I can do it unless I can fix the real ratings - is that possible as well as fixing the predicted ratings?  
 
Here's the idea:
 
If the two (or three...) top players have nearly equal real rating, but one has a slightly higher predicted rating, then their probability of winning the tournament should be as similar as possible.
 
In fact this applies to any two players in the tournament (e.g. players 8 and 9).  Ideally a good tournament scheme should give two players with an equal real rating but different predicted ratings the same chance of winning.
 
I agree with Fritz that if there is a difference, you always want to give the advantage to the higher predicted rating, to prevent rating manipulation.   But it is also preferable to minimize the difference in the first place.
 
On this condition I predict that:
  • Swissknife will fail miserably.
  • Sliding scale will fail for players 8 and 9.
  • Crossover will give a gradual linear bias (the higher the predicted position, the stronger the bias).
  • Round robins will perform perfectly.
  • How will double / triple elimination do, I wonder?

omar
« Reply #51 on: May 25th, 2005, 6:12am »

on May 24th, 2005, 11:15pm, 99of9 wrote:

If the two (or three...) top players have nearly equal real rating, but one has a slightly higher predicted rating, then their probability of winning the tournament should be as similar as possible.

 
This gets back to the question of what we want the objective of the tournament to be. Some possibilities are:
 
1. Give the player with the highest real rating the greatest chance of winning. This is what I thought a WC type tournament tries to achieve.
 
2. Give all players an equal chance of winning (just have a lottery, as Arimaanator suggested :) ).
 
3. Give players a chance of winning the tournament proportional to their real rating. Round robin achieves this, as Toby mentioned.
 
4. Give the player who showed the best performance at the event relative to their real rating the highest chance of winning. Maybe something like singleElimOrd achieves this.
 
5. Something else.
 
Different tournaments might have different objectives. For example, I've always thought a WC type tournament should try for objective #1. But an open classic type tournament might want to use objective #4.
 
So I think we need to decide on what the objective of the WC tournament should be.
 
99of9
« Reply #52 on: May 25th, 2005, 8:33am »

But that is exactly my point: SwissKnife proves that if your sole aim is #1, then one of the best ways to achieve it is not to play a tournament at all.
 
Actually I wasn't suggesting #3 either.  I was suggesting that #1 is a good aim, but needs to be complemented by success at another aim:
 
The effect of rating error on a person's chances at winning the tournament should be minimized.  i.e. if a person has a real rating of R, and a predicted rating of R+E, their chances of winning should depend almost entirely on R, and as little as possible on E.  (It's still fine to aim for the person with the highest R to have the highest possible chance of a win.)  (I would certainly suggest that the probability of winning as a function of R should be monotonic and smooth.)
 
So in the example I gave where 2 people have the same R, but different E, a good tournament would give them both similar chances of winning.
« Last Edit: May 25th, 2005, 8:38am by 99of9 »

omar
« Reply #53 on: May 26th, 2005, 6:06pm »

on May 25th, 2005, 8:33am, 99of9 wrote:

So in the example I gave where 2 people have the same R, but different E, a good tournament would give them both similar chances of winning.

 
Yes, but only tournaments which do not seed players and do not make use of ratings will be able to give players with the exact same real rating (R) the same chance of winning. Any tournament that makes use of measured ratings (M) will not be able to satisfy this.
 
Quote:

But that is exactly my point: SwissKnife proves that if your sole aim is #1, then one of the best ways to achieve it is not to play a tournament at all.

 
Actually it turns out that the performance of swissKnife degrades very fast as the number of players increases or the rating inaccuracy increases. My earlier posting only looked at the case where the number of players was 16 and the rating inaccuracy was 50. The performance of roundRobin also degrades as the number of players increases, but very gradually, and it is not affected by rating inaccuracies (since it doesn't make use of ratings). For the 16-player case the crossover point between swissKnife and roundRobin seems to be at a rating inaccuracy of about 90.
 
For the case of 16 players, ratings in the range of 1500 to 2000, and a rating inaccuracy of 200 points, we get:
  swissKnife: 33.3%
  roundRobin: 44.9%
 
So the difference can be quite significant. I wouldn't suggest handing over the trophy to the highest rated player without having a tournament just yet :)
 
So the bigger picture now says that if you have very few players and the rating inaccuracies are low, you would be better off just using swissKnife. But if you have a lot of players or if rating inaccuracies are high, then roundRobin is better. So we might not need additional criteria for judging tournament formats after all. The number that Karl comes up with for the actual rating inaccuracies will be critical. I expect it to be around 130 or so. The Arimaa rating system has an intrinsic error of about 30, and the error introduced by other factors such as selection of opponents and game speeds probably adds another 100 points.
 
Let's see what Karl finds.
99of9
« Reply #54 on: May 26th, 2005, 7:23pm »

on May 26th, 2005, 6:06pm, omar wrote:
Yes, but only tournaments which do not seed players and do not make use of ratings will be able to give players with the exact same real rating (R) the same chance of winning. Any tournament that makes use of measured ratings (M) will not be able to satisfy this.

 
I agree, nothing else will satisfy it perfectly, but some bad tournaments are worse than others.  Swissknife will perform much worse on this test than any knockout for example.

Fritzlein
« Reply #55 on: May 27th, 2005, 12:24am »

on May 26th, 2005, 6:06pm, omar wrote:

The number that Karl comes up with for the actual rating inaccuracies will be critical. I expect it to be around 130 or so. The Arimaa rating system has an intrinsic error of about 30, and the error introduced by other factors such as selection of opponents and game speeds probably adds another 100 points.
 
Let's see what Karl finds.

 
I'm not finding anything statistical for you yet, Omar, but I have an empirical project for you.  Assume for a moment that the ratings model we use is perfectly correct, i.e. assume that bots, selection of opponents, and selection of time controls make no difference.  Take 16 players with true strengths randomly distributed between 1500 and 2000.  Give each of those players an exactly accurate rating and an RU of 30.  Play 1000 games between those players, randomly choosing pairings, basing winning probability on the true strengths, and adjusting their ratings as if those players were playing on the server.  At the end of 1000 games, record the error in rating for each player.  Repeat 100 times, each time with a new 16 players and a new 1000 games.  I expect your 1600 recorded errors will form a bell curve of sorts -- calculate the standard deviation of this bell curve.
 
The number you get out of this trial will be the absolute rock-bottom error in ratings that you can hope for, because that is the best the system as it exists can possibly perform.  Every other factor (and I haven't even mentioned that players may actually be changing in true strength) can only increase that error.
« Last Edit: May 27th, 2005, 12:27am by Fritzlein »

99of9
« Reply #56 on: May 27th, 2005, 1:43am »

I'll volunteer to do this sometime this weekend if nobody beats me to it.

omar
« Reply #57 on: May 27th, 2005, 4:49pm »

on May 27th, 2005, 12:24am, Fritzlein wrote:

The number you get out of this trial will be the absolute rock-bottom error in ratings that you can hope for, because that is the best the system as it exists can possibly perform.

 
Yes, this is what I was referring to as intrinsic error in my previous post. It is about 30 points. Back in 2003, when I was evaluating various rating formulas to use for the Arimaa rating system, this was one of the criteria that I used in the simulations. Another was the convergence rate, i.e. the average number of games it takes for a player's rating error to reach this level. For the Arimaa rating system the convergence rate is about 90 games.
 
omar
« Reply #58 on: May 27th, 2005, 4:50pm »

on May 27th, 2005, 1:43am, 99of9 wrote:
I'll volunteer to do this sometime this weekend if nobody beats me to it.

 
It will be good to see if you come up with about the same numbers.

99of9
« Reply #59 on: May 28th, 2005, 6:19am »

Standarddev 51.06979
 
 
Here is my program; feel free to look for bugs or use it in any way you see fit.  It's public domain.
 

/*
Program written by Toby Hudson, 28 May 2005.
The specifications for the program were provided by Karl Juhnke, and are given here:
 
Assume for a moment that the ratings model we use is perfectly correct, i.e. assume that bots, selection of opponents, and selection of time controls make no difference.  Take 16 players with true strengths randomly distributed between 1500 and 2000.  Give each of those players an exactly accurate rating and an RU of 30.  Play 1000 games between those players, randomly choosing pairings, basing winning probability on the true strengths, and adjusting their ratings as if those players were playing on the server.  At the end of 1000 games, record the error in rating for each player.  Repeat 100 times, each time with a new 16 players and a new 1000 games.  I expect your 1600 recorded errors will form a bell curve of sorts -- calculate the standard deviation of this bell curve.
*/
 
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
 
#define PLAYERS 16
int minrating=1500;
int maxrating=2000;
 
#define ENSEMBLE 100
int numgames=1000;
 
int realratings[PLAYERS];
int estimatedratings[PLAYERS];
 
int errors[ENSEMBLE][PLAYERS];
 
void ChooseRatings(int ratings[PLAYERS]) {
 int p;
 for (p=0; p<PLAYERS; p++) ratings[p] = minrating + rand()%(maxrating-minrating+1);  
}
 
void CopyRatings(int copy[PLAYERS], int orig[PLAYERS]) {
 int p;
 for (p=0; p<PLAYERS; p++) copy[p] = orig[p];  
}
 
void GetErrors(int diff[PLAYERS], int real[PLAYERS], int est[PLAYERS]) {
 int p;
 for (p=0; p<PLAYERS; p++) diff[p] = est[p] - real[p];  
}
 
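/* Uniform random number in [0, 1]. */
double Rand_Double () {
 return ((double)rand()/(double)RAND_MAX);
}
 
/* Elo-style expected score: probability that a player rated rA beats one rated rB. */
double WinProb (int rA, int rB) {
 return (1.0/(1.0+pow(10.0,(rB-rA)/400.0)));
}
 
/* Draw one game result: 1.0 if player A wins, 0.0 if player B wins. */
double ChooseWinner (int rA, int rB) {
 double chi;
 
 chi = Rand_Double();
 
 if(chi<WinProb(rA,rB)) return 1.0;
 else return 0.0;
}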
 
void SimulateGames(int games) {
 int n;
 int pA, pB, r_estA, r_estB;
 double ww;
 
 for (n=0; n<games; n++) {
  pA = rand()%PLAYERS;
  pB = pA;
  while (pB==pA) pB = rand()%PLAYERS;
   
  ww = ChooseWinner(realratings[pA], realratings[pB]);
   
  r_estA = estimatedratings[pA];
  r_estB = estimatedratings[pB];
   
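  /* Each game moves a rating by at most 30 points, scaled by how surprising the result was; the +0.5 rounds to the nearest integer. */
  estimatedratings[pA] = (int)(estimatedratings[pA] + 30*(   ww   - WinProb(r_estA, r_estB)) + 0.5);
  estimatedratings[pB] = (int)(estimatedratings[pB] + 30*((1.0-ww)- WinProb(r_estB, r_estA)) + 0.5);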
   
  //printf("game %5d, rA %5d rB %5d, r_estA %5d r_estB %5d, wp=%8.5f ww=%5.1f, new_estA %5d new_estB %5d\n", n, realratings[pA], realratings[pB], r_estA, r_estB, WinProb(realratings[pA], realratings[pB]), ww, estimatedratings[pA], estimatedratings[pB]);
 }
}
 
void ExamineDistribution(int err[ENSEMBLE][PLAYERS]) {
 int e, p;
 int sum=0;
 double sumofsquares=0.0;
 double mean;
 double standarddev;
   
 for (e=0; e<ENSEMBLE; e++) {
  for (p=0; p<PLAYERS; p++) {
   sum += err[e][p];
   
   // printf("%d\n", err[e][p]);
  }  
 }  
 mean = (double)sum/(double)(ENSEMBLE*PLAYERS);
 
 for (e=0; e<ENSEMBLE; e++) {
  for (p=0; p<PLAYERS; p++) {
   sumofsquares += pow(err[e][p]-mean,2.0);
  }  
 }  
 
 standarddev = sqrt( sumofsquares/(double)(ENSEMBLE*PLAYERS-1) );
 
 printf("Mean %e, Standarddev %e\n", mean, standarddev);
 
}
 
int main (void) {
 int e;
 
 /* rand() is never seeded, so every run reproduces the same sequence of games. */
 for (e=0; e<ENSEMBLE; e++) {
  ChooseRatings(realratings);
  CopyRatings(estimatedratings, realratings);
  SimulateGames(numgames);
  GetErrors(errors[e],realratings,estimatedratings);
 }
 
 ExamineDistribution(errors);
 return 0;
}