Arimaa Forum « Will the 2010 Computer Championship be open? »


Fritzlein
« Reply #15 on: Aug 14th, 2009, 6:54pm »

If you play enough games, you can get as long a winning streak as you like, regardless of the rating difference.  The table below shows how many games one should expect to play on average to get a winning streak of a given length at a fixed rating difference:
 
diff\streak       1        2         3           4
 -500          18.8    371.6    6998.0    131461.2
 -400          11.0    132.0    1463.0     16104.0
 -300           6.6     50.5     341.1      2265.6
 -200           4.2     21.5      93.6       393.7
 -100           2.8     10.5      31.9        91.5
    0           2.0      6.0      14.0        30.0
  100           1.6      4.0       7.8        13.8
  200           1.3      3.0       5.3         8.3
  300           1.2      2.6       4.2         6.1
  400           1.1      2.3       3.6         5.1
  500           1.1      2.2       3.4         4.6

 
Of course this assumes the rating formula is correct, including independence of consecutive trials, which clearly doesn't hold, but it should give a general sense of the futility of weak bots trying to get a four-game winning streak against benchmark bots that are too strong for them.
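The figures in the table are consistent with an Elo-style win expectancy on a 400-point scale combined with the standard expected waiting time for a run of consecutive successes. The sketch below is only an illustration under those assumptions (independent games, fixed win probability); the function names are made up here and do not come from any Arimaa tool.

```python
def win_probability(diff):
    """Elo-style win expectancy for a rating difference (own rating minus opponent's)."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


def expected_games_for_streak(diff, streak):
    """Expected number of games until the first run of `streak` consecutive wins.

    For independent games with win probability p this equals
    1/p + 1/p**2 + ... + 1/p**streak.
    """
    p = win_probability(diff)
    return sum(p ** -k for k in range(1, streak + 1))


if __name__ == "__main__":
    # Print one row per rating difference, one column per streak length 1..4.
    for diff in range(-500, 501, 100):
        row = "  ".join(f"{expected_games_for_streak(diff, k):10.1f}" for k in range(1, 5))
        print(f"{diff:5d}  {row}")
```

With evenly matched opponents (p = 0.5) the sum is 2 + 4 + 8 + 16 = 30 games for a four-game streak, which matches the row for a rating difference of 0.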
 
Admittedly, a developer who is willing to have his bot play 400 games during qualifying would have an advantage over a developer who was only willing to have his bot play 40.  However, all that extra time would only get a bot one or two notches further down the ladder of benchmark bots.
 
Is that too much encouragement for playing excessive games?  Would excessive games be worse than the finality of specifying exactly four games against each benchmark bot with those results to determine seeding regardless of later improvement?  Maybe.  I am not sure.

ChrisB
« Reply #16 on: Aug 15th, 2009, 12:55am »

on Aug 14th, 2009, 6:54pm, Fritzlein wrote:
Is that too much encouragement for playing excessive games?  Would excessive games be worse than the finality of specifying exactly four games against each benchmark bot with those results to determine seeding regardless of later improvement?  Maybe.  I am not sure.

 
Perhaps we could limit the number of games against each benchmark bot to, say, three times the maximum-counted streak.  That is, if the maximum-counted streak is four, we could have a cap of, say, 12 games against each benchmark bot.  That would give the developer some opportunity to improve the bot, if it initially doesn't do well against a benchmark bot.
 
Then, if two candidate bots have the same sum-of-the-streaks score, a tiebreaker could be the total games played against all the benchmark bots.
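A rough sketch of how such a capped scheme might be scored. This part of the thread does not say whether the counted streak is the longest one or the most recent one, so the code assumes the longest; the benchmark names and function names are hypothetical.

```python
MAX_COUNTED_STREAK = 4                 # longest streak that earns credit
GAME_CAP = 3 * MAX_COUNTED_STREAK      # 12 games per benchmark bot


def capped_streak(results):
    """Longest run of wins within the first GAME_CAP games, capped at MAX_COUNTED_STREAK.

    `results` is a chronological list of booleans (True = win).
    """
    best = current = 0
    for won in results[:GAME_CAP]:
        current = current + 1 if won else 0
        best = max(best, current)
    return min(best, MAX_COUNTED_STREAK)


def streak_score(results_by_benchmark):
    """Return (sum of capped streaks, total games counted) for one candidate bot."""
    total = sum(capped_streak(r) for r in results_by_benchmark.values())
    games = sum(min(len(r), GAME_CAP) for r in results_by_benchmark.values())
    return total, games


example = {
    "BenchmarkA": [False, True, True, True, True, True],  # streak of 5, credited as 4
    "BenchmarkB": [True, False, True, True],              # streak of 2
}
print(streak_score(example))
```

The second element of the returned pair is the total-games figure that could serve as the tiebreaker.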

tize
« Reply #17 on: Aug 15th, 2009, 5:25am »

Why not just count the last 4 games? That seems easier to check. It would allow bugs to be fixed, and a bot with a streak of 4 wouldn't continue playing anyway.
 
There should also be a status page that shows the current scores, so that every bot developer can see their seeding and whether their bot has qualified.

Fritzlein
« Reply #18 on: Aug 15th, 2009, 9:04am »

That's a good point about the status page, tize.  While we are discussing which qualification format to use, we should keep in mind how difficult it would be for Omar to implement.  Counting the latest four games against the benchmark bots would be quite easy to implement, so I like it.  Unfortunately it loses the requirement of alternating colors.
 
My suggestion of mandating alternating colors is not so easy to enforce in a query, but something that would be fairly easy to do in a query is taking the last two games as Silver and the last two games as Gold.  But then that would imply getting two winning streaks of two games each, which can be much easier than getting one winning streak of four games, so the ease of implementation undermines the intent this way too.
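To put a number on "much easier", here is a small sketch that compares the expected number of games needed for two separate two-game streaks against one four-game streak. It assumes independent games and the same win probability with both colors, neither of which is guaranteed in practice.

```python
def expected_games_for_streak(p, streak):
    """Expected games until `streak` consecutive wins, given win probability p per game."""
    return sum(p ** -k for k in range(1, streak + 1))


def compare(p):
    """Return (games for two separate 2-streaks, games for one 4-streak)."""
    two_short_streaks = 2 * expected_games_for_streak(p, 2)
    one_long_streak = expected_games_for_streak(p, 4)
    return two_short_streaks, one_long_streak


for p in (0.25, 0.5, 0.75):
    short, long_ = compare(p)
    print(f"p={p:.2f}: two 2-streaks ~{short:.1f} games, one 4-streak ~{long_:.1f} games")
```

For an evenly matched bot (p = 0.5) this gives 12 games against 30, and the gap widens quickly as the win probability drops.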
 
Omar's notion of taking a fixed number of games is not undermined by enforcing color: taking the first two games as Gold and the first two games as Silver (instead of the first four games) is consistent with the intent of measuring performance.

omar
« Reply #19 on: Aug 15th, 2009, 4:29pm »

Wow, that's an interesting observation, Karl: if the opponents are fixed and the number of times you must play each opponent is fixed, then the opponent ratings don't matter, only the number of games won does. I tested it and you're right:
  p2 +1800 -1500 gives 1629
  p2 -1800 +1500 gives 1629
 
My reason for fixing the number of games was so that an entrant bot would not play many, many games against the screening bot(s) that it knows how to defeat in order to inflate the rating.
 
What if the requirement was that during the screening period you must play each screening bot at least once with each color and not more than 5 times (numbers could vary) with each color, and only the most recent games which have a corresponding opposite-color game are counted? This way the entrant bots do have some leeway in picking their opponents to try to maximize their performance rating, but can't go wild trying to inflate ratings from the same defeated screening bots. Note that there is no explicit requirement that any screening bot must be defeated in order to qualify. The entrant bots are just competing with each other to maximize their performance rating. I would further refine this by setting the maximum number of games to lower values for lower-rated bots and to higher values for higher-rated bots. For example, against a 1300-rated screening bot the maximum might be 5 games with each color, and against an 1800-rated screening bot it might be 10 games with each color.
 
I can set up a status page to show the current ranking of the bots based on the performance rating.
 
Thank you Karl for volunteering to start the page for 2010 Computer Championship tournament rules.

jdb
« Reply #20 on: Aug 15th, 2009, 8:20pm »

 
So if I understand correctly, if the opponents are restricted to just the screening bots, and there is a maximum number of games for each individual screening bot, then the total number of wins is a valid performance rating?
 
 
 
 

Fritzlein
« Reply #21 on: Aug 16th, 2009, 6:28am »

on Aug 15th, 2009, 8:20pm, jdb wrote:
So if I understand correctly, if the opponents are restricted to just the screening bots, and there is a maximum number of games for each individual screening bot, then the total number of wins is a valid performance rating?

Against a fixed schedule, the number of wins determines the performance rating.  The order of wins is irrelevant.  If I beat Bomb2005Blitz and lose to ArimaaScoreP1 it is equivalent to beating ArimaaScoreP1 and losing to Bomb2005Blitz.  You might say the former performance is more erratic and the latter more consistent, but by the performance rating formula they are the same quality.
 
This is not a mathematical truth so much as a philosophical commitment.  Everyone just seems to agree that wins are fundamental and ratings are derived.  Ratings permit us to compare wins against different opposition (e.g. is 4 of 5 against ArimaaScoreP1 better or worse than 1 of 5 against Bomb2005Blitz?), but when the opposition is constant (e.g. playing five games against each opponent), equal wins must produce equal performance ratings, or else people will say the measurement is broken.  Therefore, starting from that philosophical commitment, a formula for performance rating has been derived that respects the "equal wins means equal performance" mandate.
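One common way to define a performance rating with this property is to look for the rating whose total expected score against the schedule equals the number of wins. The sketch below assumes an Elo-style expectancy on a 400-point scale and a solution between 0 and 4000; it is an illustration of the idea, not the formula behind Omar's tool, so it need not reproduce the 1629 quoted earlier.

```python
def expected_score(rating, opponent):
    """Elo-style expected score of `rating` against `opponent`."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def performance_rating(opponents, wins, lo=0.0, hi=4000.0):
    """Rating whose total expected score against `opponents` equals `wins`.

    Solved by bisection. Requires 0 < wins < len(opponents), because a perfect
    or winless record has no finite solution.
    """
    if not 0 < wins < len(opponents):
        raise ValueError("performance rating is unbounded for all wins or all losses")
    for _ in range(100):
        mid = (lo + hi) / 2.0
        # Total expected score rises with rating, so move the bracket accordingly.
        if sum(expected_score(mid, opp) for opp in opponents) < wins:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


schedule = [1800, 1500]
print(round(performance_rating(schedule, wins=1)))
```

Because the function takes only the schedule and the win total, beating the 1800 bot and losing to the 1500 bot gives the same answer as the reverse, roughly 1650 under these assumptions.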
 
Omar, I strongly recommend that you not calculate performance ratings at all; it introduces an unnecessary confusion and complication.  For starters, calculating performance ratings would require us to fix the ratings of the benchmark bots in order to be fair.  The outcome should not depend on whether a benchmark bot's rating was higher or lower at the particular time a qualifying bot played it.  But we don't know what level to fix their ratings at, and we don't want them to have fixed ratings in general.  Furthermore there is no benefit to dragging ratings into it, because number of wins provides exactly the same ranking of bots that calculating a performance rating would.  And finally, if we allow multiple plays to try to get a better result, then that will make a calculated performance rating not a true measure anyway, i.e. it would be inflated relative to the fixed ratings we chose for the benchmark bots.
 
My latest (still tentative) thought:
1) Fix eight benchmark bots, spanning the range from the weakest to the strongest Omar has available.
2) Fix a starting date.  Before that date no games count.
3) Fix a minimum number of games per benchmark bot per color, and a maximum number of games per benchmark bot per color.  I suggest two for the minimum and five for the maximum.
4) The point total for each qualifying bot is calculated as follows.  For each benchmark bot, for each color, count the number of wins in the two most recent games against the qualifying bot that are after the starting date and within the maximum of five games played.  Thus if the qualifying bot has played the benchmark bot twenty times with that color since the starting date, count only the fourth and fifth games.  The best possible score is 32 and the worst possible is 0.
5) For a tiebreaker, count the number of qualifying games played, with a lower number being better.  The best possible tiebreaker is 32 and the worst possible is 80.
6) For a second tiebreaker, measure the time of the last qualifying game to be played, where earlier is better.
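 
The six rules above could be turned into a ranking key along the following lines. This is only a sketch: the `Game` record, its field names, and the benchmark name are hypothetical, and it assumes that games played exactly at the starting time count.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta

MAX_GAMES = 5   # cap per benchmark bot per color (rule 3)
COUNTED = 2     # most recent games inside the cap that score points (rule 4)


@dataclass(frozen=True)
class Game:
    benchmark: str       # name of the benchmark bot
    color: str           # "gold" or "silver": the qualifying bot's color
    played_at: datetime
    won: bool            # True if the qualifying bot won


def qualification_key(games, start):
    """Return (points, games_counted, last_game_time) for one qualifying bot."""
    by_pairing = defaultdict(list)
    for g in sorted(games, key=lambda g: g.played_at):
        if g.played_at >= start:                      # rule 2: earlier games are ignored
            by_pairing[(g.benchmark, g.color)].append(g)

    points, played, last_time = 0, 0, None
    for pairing_games in by_pairing.values():
        window = pairing_games[:MAX_GAMES]            # games past the cap are ignored
        points += sum(g.won for g in window[-COUNTED:])
        played += len(window)                         # rule 5: first tiebreaker
        if last_time is None or window[-1].played_at > last_time:
            last_time = window[-1].played_at          # rule 6: second tiebreaker
    return points, played, last_time


start = datetime(2009, 9, 1)
# Twenty games as Gold against one benchmark; only the fourth and fifth are wins.
games = [Game("BenchmarkA", "gold", start + timedelta(hours=i), won=(i in (3, 4)))
         for i in range(20)]
result = qualification_key(games, start)
print(result[:2])  # -> (2, 5)
```

Bots would then be ranked by sorting on `(-points, games_counted, last_game_time)`, so that more points, fewer games, and an earlier finish all rank higher.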
 
So, the objective for developers will be to get a two-game winning streak against each benchmark bot with each color.  There is a five-game window in which to achieve this two-game winning streak, so fixable bugs that cause a loss are not fatal.  On the other hand, we have ruled out the qualifying bot playing incessantly until it gets lucky.
 
There is some small element of gambling.  If your first three games against a particular bot with a particular color are loss-win-loss, should you play two more?  You could improve your score with a win-win, but would lower it with a loss-loss.
 
Of course, if your two most recent games are loss-win, then playing one more can't hurt your score and might help by producing the desired two-game winning streak.  And if at any point you get two wins in a row, you should stop playing, not only so as not to risk points for no benefit, but also to preserve a lower tiebreak score.
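 
Both cases can be checked by enumerating the possible continuations. The sketch assumes the two-of-five counting rule from the proposal above; the helper names are invented for illustration.

```python
from itertools import product

MAX_GAMES = 5
COUNTED = 2


def counted_wins(results):
    """Wins among the two most recent games inside the five-game window."""
    return sum(results[:MAX_GAMES][-COUNTED:])


def outcomes_after(history, extra_games):
    """Map each possible continuation (tuple of booleans) to the resulting score."""
    return {
        extra: counted_wins(list(history) + list(extra))
        for extra in product([True, False], repeat=extra_games)
    }


L, W = False, True
# Loss-win-loss: the current score is 1; two more games can yield 0, 1, or 2.
print(counted_wins([L, W, L]), outcomes_after([L, W, L], 2))
# Loss-win: the current score is 1; one more game yields 1 or 2, never less.
print(counted_wins([L, W]), outcomes_after([L, W], 1))
```

If each game is won independently with probability p, the expected score after the two extra games in the first case is 2p against a current score of 1, so on points alone the gamble pays off on average only when p exceeds one half, and it always costs two games on the tiebreaker.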
 
Counter-suggestions are welcome.

jdb
« Reply #22 on: Aug 16th, 2009, 10:13am »

Quote:
My latest (still tentative) thought:
1) Fix eight benchmark bots, spanning the range from the weakest to the strongest Omar has available.
2) Fix a starting date.  Before that date no games count.
3) Fix a minimum number of games per benchmark bot per color, and a maximum number of games per benchmark bot per color.  I suggest two for the minimum and five for the maximum.
4) The point total for each qualifying bot is calculated as follows.  For each benchmark bot, for each color, count the number of wins in the two most recent games against the qualifying bot that are after the starting date and within the maximum of five games played.  Thus if the qualifying bot has played the benchmark bot twenty times with that color since the starting date, count only the fourth and fifth games.  The best possible score is 32 and the worst possible is 0.
5) For a tiebreaker, count the number of qualifying games played, with a lower number being better.  The best possible tiebreaker is 32 and the worst possible is 80.
6) For a second tiebreaker, measure the time of the last qualifying game to be played, where earlier is better.

 
Looks like a good start.
 
If the second tie breaker is number of games played, there is little benefit to setting a maximum number of games against a bot. If an entrant plays 200 extra games against a bot, it will show up in the second tie breaker.
 
Setting a minimum number of games is only useful if an entrant is much weaker than a bot. Assuming there is no restriction on the maximum number of games, an entrant might as well keep trying until they eventually get a win.
 
If there is a maximum number of games, bot developers will wait to play qualifying games until their bot is as strong as possible. Without the maximum restriction, the only penalty is in the second tie breaker, so developers would be much more likely to try their entries against the reference bots earlier. This would make the standings page more meaningful.

Fritzlein
« Reply #23 on: Aug 16th, 2009, 2:39pm »

JDB, are you saying that you would like number of games played to be second tie-breaker rather than first?
 
The thought behind setting a maximum is to not overly reward persistence compared to skill, but maybe it's not a problem to give a large persistence bonus.

jdb
« Reply #24 on: Aug 16th, 2009, 3:02pm »

No, that was my mistake.  
 
The tiebreakers should be as written in your original post.
 

tize
« Reply #25 on: Aug 17th, 2009, 4:48am »

Is there a real benefit to having a minimum number of games?
 
Should a bot that has played only 25 games be automatically out, even if it won all of them?

aaaa
« Reply #26 on: Aug 17th, 2009, 6:34am »

A developer could fraudulently make moves on behalf of his bot.

Janzert
« Reply #27 on: Aug 17th, 2009, 6:45am »

I've been out of town for a bit and haven't had a chance to do more than skim over the discussion so far. One thing I'd like to raise, though, is that I would really like to see the previous year's champion given an automatic slot. It could still be ranked by the qualification process, but even if it falls below the lowest slot I'd like to have it put in as the lowest-seeded entry.
 
Janzert

Arimabuff
« Reply #28 on: Aug 17th, 2009, 7:05am »

on Aug 17th, 2009, 6:34am, aaaa wrote:
A developer could fraudulently make moves on behalf of his bot.

Not during the final phase of the championship (which is what really counts). Otherwise, what good would it do to fraudulently qualify a bot that's not good enough? It won't win anyway. Besides, a bot that has a shot at making the final is most likely better than its programmer.
« Last Edit: Aug 17th, 2009, 7:11am by Arimabuff »

Fritzlein
« Reply #29 on: Aug 18th, 2009, 9:28am »

on Aug 17th, 2009, 4:48am, tize wrote:
Is there a real benefit to having a minimum number of games?
 
Should a bot that has played only 25 games be automatically out, even if it won all of them?

You are right, having a minimum doesn't make sense.  If a bot can accumulate enough points without playing every opponent, then more power to it.