Arimaa Forum « Rating of the bots in the ladder »


(Moderator: supersamu)
   Author  Topic: Rating of the bots in the ladder  (Read 1215 times)
NIC1138
Forum Guru
*****




Arimaa player #65536

   
WWW

Gender: male
Posts: 149
Rating of the bots in the ladder
« on: Apr 26th, 2007, 12:37am »

Why do the ratings of the bots step down at the beginning of level 3? Are they actually playing with the last ones from level 2? They didn't seem easier to me... Or is it just me learning the new paradigm? (Or have the inter-bot games been halted?)
Fritzlein
Forum Guru
*****



Arimaa player #706

   
Email

Gender: male
Posts: 5928
Re: Rating of the bots in the ladder
« Reply #1 on: Apr 26th, 2007, 7:08am »

The inter-bot games have been halted, so the ratings of the bots have drifted away from what they would be based purely on bot vs. bot games.
 
Dividing the bot ladder into levels has some strange effects on the ratings.  For example, Arimaalon is rated 1310 at the top of level one, whereas ShallowBlue is rated 1223 at the bottom of level two.  Yet Arimaalon uses the same evaluation function as ShallowBlue, at a shallower depth.  We can be quite confident that ShallowBlue is actually the stronger bot.
 
Further evidence of ratings weirdness is that Arimaalon was accidentally misplaced on level three of the bot ladder for a long time, where it compiled a rating of 1161.  This is fairly suggestive of a hypothesis: the lower the level of the ladder a bot gets stuck on, the more that bot's rating will be inflated.
 
This hypothesis makes sense to me, because newcomers typically have very inflated ratings.  Everyone starts with a 1500 rating, but the average strength of a player in their first game is closer to 1100.  This discrepancy means that the level one bots are all overrated by nearly 400 points.  Indeed, the main reason they aren't overrated by the full 400 points is that sometimes a more experienced human will go back and play extra games against the level one bots, stealing points in the process.
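To put a number on that discrepancy, here is a minimal sketch that assumes the textbook Elo expected-score formula on a 400-point scale. The server's actual rating formula may differ, and the name expected_score is only illustrative.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Textbook Elo expected score of A against B (an assumption, not the server's exact formula)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

# A newcomer is listed at 1500 but, per the post above, plays closer to 1100.
listed, actual, bot = 1500, 1100, 1100

predicted = expected_score(listed, bot)   # what the rating system expects
realistic = expected_score(actual, bot)   # what the newcomer's real strength implies

print(f"predicted score against a 1100 bot: {predicted:.2f}")
print(f"realistic score against a 1100 bot: {realistic:.2f}")
```

The gap between the two figures is what the bot collects, on average, every time a fresh 1500-rated newcomer sits down against it.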
 
Folks who play the level two bots are often still overrated, but not by nearly as much.  They have often lost some points to the level one bots on the way up the ladder, and often they have significantly gained in playing strength just from having played a few games.  Therefore the bottom bots on level two, although they are stronger than the bots on level one, may end up with lower ratings.
 
There is a similar effect between level two and level three.  Humans will often get stuck on level two, losing many times to BombP1 and CluelessP1.  After a series of such losses (and one win!) the human is rated lower than before the series, but has greater playing strength.  Thus they start on the bottom bots of level three with deflated ratings, stealing points from the weaker P2 bots.  Humans also tend to move on as soon as passing BombP1 and CluelessP1, leaving those bots with inflated ratings, rather than playing extra games to win back the points they donated.
 
On the other hand, although GnobotP2 is probably underrated and BombP1 is probably overrated, I am not sure which is actually the stronger bot.  Maybe BombP1's selective search extensions past one ply are worth more than GnobotP2's full-width search of an extra ply, but with no extensions.

Fritzlein
Re: Rating of the bots in the ladder
« Reply #2 on: Apr 26th, 2007, 7:16am »

Incidentally, I do not favor bringing back the bot vs. bot games.  Yes, such games would make the relative ratings of the bots more accurate.  Games between bots would shift rating points away from the lower-level bots and towards the higher-level bots.  However, this would have the undesirable effect of contributing to general rating inflation.
 
Right now, because the level one bots are overrated, it blunts the inflationary effect of newcomers playing two games, losing both, and leaving permanently, having donated 58 rating points in the process.  There still may be people who leave after losing two games, but now they donate only 38 points or so.
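A rough sketch of that arithmetic, assuming a textbook Elo update with a constant K of 32 applied to both players after every game. Both the formula and the K value are assumptions here; the server's real update rule may differ, so the printed totals are only indicative.

```python
def expected_score(rating_a, rating_b):
    """Textbook Elo expected score of A against B (assumed formula)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

def points_donated(newcomer, bot, losses=2, k=32.0):
    """Total points a newcomer hands to one bot over consecutive losses.

    k=32 is an assumed constant; both ratings are updated after each game.
    """
    donated = 0.0
    for _ in range(losses):
        delta = k * expected_score(newcomer, bot)  # the loser's actual score is 0
        newcomer -= delta
        bot += delta
        donated += delta
    return donated

# The higher the bot is already rated, the less a losing newcomer donates.
for bot_rating in (1100, 1300, 1500):
    print(bot_rating, round(points_donated(1500, bot_rating), 1))
```

The totals shrink as the bot's rating rises, which is the blunting effect described above: an overrated level one bot absorbs fewer points from each departing newcomer.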
 
In the current scheme there is probably a bit of ratings inflation, which is fine, but probably not as much inflation as there would be if bot vs. bot games were busy passing rating points higher up the scale.

aaaa
Forum Guru
*****



Arimaa player #958

   


Posts: 768
Re: Rating of the bots in the ladder
« Reply #3 on: Apr 26th, 2007, 9:10am »

Since it seems obvious that the starting rating should be significantly lower than the average rating, shouldn't rating inflation be in fact welcomed till the discrepancy has been minimized?
 
An idea I have to fight rating drift is to give a new player no rating at all, and to leave the ratings of their opponents unchanged, until that player has both won and lost a game (or drawn one). At that point a rating is established based on all the games played thus far (where, for that purpose, opponents who are likewise unrated are considered to have the starting rating).
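One way this could be sketched, assuming the textbook Elo expected-score formula: once the record is mixed, search for the rating whose expected total score equals the actual total. The function name initial_rating, the search bounds and the tolerance are all hypothetical, and the sketch does not model freezing the opponents' ratings in the meantime.

```python
def expected_score(rating_a, rating_b):
    """Textbook Elo expected score of A against B (assumed formula)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

def initial_rating(results, lo=0.0, hi=4000.0, tol=0.01):
    """Estimate a first rating from (opponent_rating, score) pairs.

    score is 1 for a win, 0.5 for a draw, 0 for a loss. Returns None while
    the record is all wins or all losses, since no finite rating fits that.
    The bounds lo and hi are arbitrary assumptions for the bisection.
    """
    total = sum(score for _, score in results)
    if not results or total == 0 or total == len(results):
        return None
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        expected = sum(expected_score(mid, opp) for opp, _ in results)
        if expected < total:
            lo = mid   # candidate rating is too low to explain the results
        else:
            hi = mid
    return (lo + hi) / 2.0

print(initial_rating([(1300, 0)]))             # still unrated
print(initial_rating([(1300, 0), (1200, 1)]))  # rated once the record is mixed
```

The win-and-loss condition is exactly what makes the search well defined: with only wins or only losses, the best-fitting rating would run off to infinity in one direction.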
NIC1138
Re: Rating of the bots in the ladder
« Reply #4 on: Apr 26th, 2007, 9:46am »

I'm sorry I brought up the rating inflation subject again!! I didn't see it coming! :)
 
But since we are here... We can't give new players a 0 rating, because this would cause deflation in the long term... (unless they lose a couple of games, don't win or lose any RU, and go away... but that can be accomplished by only changing the rating after some games)
 
Now, if we imagine that every day we have something like 10 new players coming in with 1500RU, playing and going away, we are establishing a kind of boundary condition... Everybody else's ratings will be related to theirs. That is in contrast to, for example, fixing Fritzlein's rating at 2500, or fixing a bot's...
 
But let's forget inflation!! My problem is that bump in the ratings of the bots! :( It happened to me: I played BombP1 and CluelessP1 a lot, and now I'm climbing up...
 
Can we see this as a kind of overshooting?
 
If we had a perfect linear ladder, someone climbing it would always be a little underrated, just like the response of a second-order linear filter to an "integrated step function"! But if there is a bump, a discontinuity where the next bot is much stronger than the previous ones, then we might get some kind of overshoot!...
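The lag half of this analogy can be sketched under some strong assumptions: a textbook Elo update with K = 32, a climber whose true strength rises by a fixed 2 points per game, and opponents that always match that true strength. Every one of those numbers is hypothetical. The update below uses expected results instead of random ones, which makes it a first-order model, so it shows the steady underrating but not any overshoot.

```python
def expected_score(rating_a, rating_b):
    """Textbook Elo expected score of A against B (assumed formula)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

def average_rating_path(strengths, start=1500.0, k=32.0):
    """Average rating after each game for a player whose true strength changes."""
    rating = start
    path = []
    for strength in strengths:
        opponent = strength  # assumed: each bot matches the climber's current strength
        true_expectation = expected_score(strength, opponent)
        rated_expectation = expected_score(rating, opponent)
        # On average the rating moves by k times the surprise in the result.
        rating += k * (true_expectation - rated_expectation)
        path.append(rating)
    return path

strengths = [1500.0 + 2.0 * game for game in range(200)]  # steady improvement
ratings = average_rating_path(strengths)
gaps = [s - r for s, r in zip(strengths, ratings)]
print(f"underrated by about {gaps[-1]:.0f} points after {len(gaps)} games")
```

In this model the rating never catches up while the player keeps improving; it settles at a roughly constant distance below the true strength.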
 
If this is right, we could attenuate this bump by introducing an intermediate bot before the strong ones (BombP1 and CluelessP1).
 
NIC1138
Re: Rating of the bots in the ladder
« Reply #5 on: Apr 26th, 2007, 11:09pm »

...I take all that back. Those two bots are much tougher... The ratings are just fine! :D
aaaa
Re: Rating of the bots in the ladder
« Reply #6 on: May 4th, 2007, 6:22pm »

Here's a crazy thought: Why not just do away with all the different levels in the ladder? After all, one can only play the current lowest undefeated bot in a level anyway.
NIC1138
Re: Rating of the bots in the ladder
« Reply #7 on: May 4th, 2007, 8:03pm »

Well, I do feel that the bots in different "levels" play in different ways...