Arimaa Forum (http://arimaa.com/arimaa/forum/cgi/YaBB.cgi)
Arimaa >> General Discussion >> Omar = OmarFast   ,   bot_bomb = bot_spe
(Message started by: 99of9 on Dec 22nd, 2004, 12:53pm)

Title: Omar = OmarFast   ,   bot_bomb = bot_spe
Post by 99of9 on Dec 22nd, 2004, 12:53pm
Interestingly Omar's rating is almost equal to OmarFast's rating, and bot_bomb's is almost equal to bot_speedy's.

Perhaps quality is nothing to do with time control :-)

Title: Re: Omar = OmarFast   ,   bot_bomb = bot
Post by omar on Dec 27th, 2004, 1:06am
I played a lot of fast games with the 'omar' account and lost to speedy. I think that's why it's close to the omarFast account.

In the future I would like to classify games into different speed categories like fast, regular, slow and postal and then have separate ratings for each. Thus each person would have 4 different ratings; one for each speed category.

Omar


Title: Re: Omar = OmarFast   ,   bot_bomb = bot
Post by MrBrain on Dec 27th, 2004, 5:57pm
I love that idea.

I think what you have as "Slow" could be the official time control for an arimaa match.  I would recommend no slower than 2 minutes per move.  If the idea of choosing the winner at the end of the game time limit by who's used less time (rather than the scoring function) is implemented, we could have a time control for an official match of 2/2/100/10/5.  This time control could be used for all official human-human, computer-computer, and human-computer games.  This would ensure that no game would exceed 5 hours.  The end-effect of this control with the new deciding mechanism would be that in a very long game (more than 70 or so moves), players would have to eventually move faster to avoid using more than half of the time.

Title: Re: Omar = OmarFast   ,   bot_bomb = bot_spe
Post by MrBrain on Dec 27th, 2004, 6:05pm
But actually, each category would have to fall within a range.  Here's a proposal:

0:45 per move or faster -- fast
0:46 - 1:30 per move -- regular
1:31 - 1:00:00 per move -- slow
More than 1 hour per move -- postal

How's that seem?

Title: Re: Omar = OmarFast   ,   bot_bomb = bot_spe
Post by MrBrain on Dec 27th, 2004, 7:11pm
Actually, I don't like the connotation of "slow".  Call them instead:
Bullet,
Fast,
Regular,
Postal

Title: Re: Omar = OmarFast   ,   bot_bomb = bot
Post by Fritzlein on Dec 28th, 2004, 12:43pm
There was a wacky idea floated in the discussion of ICC and FICS ratings, which I think has some merit.  The idea is that we want to distinguish skill at fast play from skill at slow play, but we don't want to have to maintain four different ratings (or five or six as ICC does).  So we fix a super fast time, say 15 seconds per move, and have anything that fast or faster contribute only to the fast rating.  Also we fix a super-slow time, say 4 minutes per move, and have anything that slow or slower contribute only to the slow rating.  For any time control in between, i.e. for most games, we have it contribute to both ratings in geometric proportion.  The formula for how much the slow rating is affected would be (lg(seconds-per-move/15))/4, where lg is the base 2 logarithm.  For example
time control   fast rating   slow rating
15 seconds     100%          0%
30 seconds     75%           25%
45 seconds     60.4%         39.6%
1 minute       50%           50%
2 minutes      25%           75%
3 minutes      10.4%         89.6%
4 minutes      0%            100%
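
The percentages in this table follow directly from the formula above. Here is a minimal Python sketch that regenerates them; the name `slow_weight` and the clamping outside the 15-second and 4-minute endpoints are one illustrative reading of the post, not code from the Arimaa server.

```python
import math

FAST_END = 15.0   # seconds per move counted as 100% fast (from the post)
DOUBLINGS = 4.0   # 15 s doubled four times is 240 s, the 100% slow end


def slow_weight(seconds_per_move):
    """Fraction of a game that counts toward the slow rating."""
    raw = math.log2(seconds_per_move / FAST_END) / DOUBLINGS
    # Clamp so anything at or beyond an endpoint counts fully toward one rating.
    return min(1.0, max(0.0, raw))


for secs in (15, 30, 45, 60, 120, 180, 240):
    slow = slow_weight(secs)
    print(f"{secs:>3} s  fast {1 - slow:6.1%}  slow {slow:6.1%}")
```

The fast share is simply one minus the slow share, so only one function is needed.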

You may wonder why I chose 15 seconds per move as the fast end rather than the fastest time control currently available.  My hunch is that if you start having a "fast" rating, and you offer 15 seconds per move as an option for people, then they will play a lot of it.  On the Internet chess club, the most popular time control is 5 minutes per game, and a great many games are played at even faster speeds, so I think there would be a popular demand for 15-second games.  I certainly would like to try out speedy at that time control!  :-)

I chose 4 minutes per move as the slow end in order to distinguish postal games from slow tournament games.  Actually, I doubt there will be many games slower than 2 minutes per move except for postal games.  It might make as much sense to put the upper limit at 3 minutes per move, but I'm partial to widely spacing the extremes.

[Edit: I messed up the following description in my original post.]

Anyway, to calculate an example, suppose Bomb has a fast rating of 2200 and a slow rating of 1600, whereas I have a fast rating of 2100 and a slow rating of 2000.   Suppose I play  Bomb at a time control 2 minutes per move.  Then my effective rating is 2000*0.75 + 2100*0.25 and Bomb's effective rating is 1600*0.75 + 2200*0.25.   So for this particular speed my edge in rating is 2025 to 1750.  If I lose, I would lose about 25 points, and 75% of that adjustment would go on our respective slow ratings, while 25% of that adjustment would go on our respective fast ratings.
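
The arithmetic in this example can be checked with a short Python sketch. The function names are hypothetical, and the 25-point loss is taken from the post as given; how the underlying rating formula arrives at that number is not modeled here.

```python
import math


def slow_weight(seconds_per_move):
    """Share of the game assigned to the slow rating (15 s -> 0, 240 s -> 1)."""
    return min(1.0, max(0.0, math.log2(seconds_per_move / 15.0) / 4.0))


def effective_rating(fast, slow, seconds_per_move):
    """Blend the fast and slow ratings for the time control being played."""
    w = slow_weight(seconds_per_move)
    return slow * w + fast * (1.0 - w)


def split_adjustment(points, seconds_per_move):
    """Divide a rating change into (fast_part, slow_part)."""
    w = slow_weight(seconds_per_move)
    return points * (1.0 - w), points * w


fritzlein = effective_rating(fast=2100, slow=2000, seconds_per_move=120)
bomb = effective_rating(fast=2200, slow=1600, seconds_per_move=120)
print(fritzlein, bomb)               # 2025.0 1750.0
print(split_adjustment(-25, 120))    # (-6.25, -18.75)
```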

Well, maybe the math is too weird, but I like the general idea.  Bots could play at a variety of time controls without needing extra accounts (e.g. bomb/speedy) and without messing up the ratings (e.g. clueless vs. speedy at 30 seconds has as much chance of winning as clueless vs. bomb at 120 seconds, but the former gives clueless more rating points for no reason).   All time controls are distinguished, in that they affect the ratings in different proportion.  The scale is continuous without weird jumps like 59 seconds per move affecting one rating and 60 seconds per move affecting a completely different rating, plus you don't have to have four (or more) different ratings.  What do y'all think?

Title: Re: Omar = OmarFast   ,   bot_bomb = bot_spe
Post by fotland on Dec 28th, 2004, 1:41pm
I think 15 second games will give speedy a big edge, but give it a try :)

Title: Re: Omar = OmarFast   ,   bot_bomb = bot_spe
Post by Fritzlein on Dec 28th, 2004, 4:57pm
Yeah, it's tough playing that fast against a computer, but it sure is fun.

We need a way for bots to offer and/or accept multiple time controls.  It would be great if speedy would take matches at a time control 30 seconds or less, say, so that people could choose whether to go for the bullet game or merely blitz.

Title: Re: Omar = OmarFast   ,   bot_bomb = bot
Post by 99of9 on Jan 3rd, 2005, 6:18am
I think Fritz's idea is GREAT.

This would totally solve our time-based-ratings conundrum without introducing millions of different volatile ratings.  It also doesn't look too hard to introduce.

The only real assumption is that if someone is better at 1 min than they are at 3 min, then they will be even better still at 30 sec.  That seems quite reasonable to me.

Title: Re: Omar = OmarFast   ,   bot_bomb = bot_spe
Post by MrBrain on Jan 3rd, 2005, 7:52am
I also like the Fritzlein idea.  Perhaps just have one more rating - postal?  If it's just folded into the slow rating, this seems sufficient though.  Might need to tweak the percentages.  45 seconds going almost 40% to slow rating seems not quite right.

Title: Re: Omar = OmarFast   ,   bot_bomb = bot_spe
Post by MrBrain on Jan 3rd, 2005, 7:57am
How about this:

time fast slow
0:15 100% 0%
0:30 95% 5%
0:45 85% 15%
1:00 70% 30%
1:15 50% 50%
1:30 30% 70%
2:00 15% 85%
2:30 5% 95%
3:00, postal 0% 100%

It's not quite as mathematical, but it does incorporate time controls used in the past, and it seems a little closer to how the different time controls feel (at least to me) from playing the game.  What do you think?
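
For comparison with the logarithmic formula, the table above can be written as a lookup. This Python sketch assumes linear interpolation between the listed rows, which the post does not specify; the names `BRAIN_TABLE` and `brain_slow_percent` are made up for illustration.

```python
# (seconds per move, percent counted toward the slow rating), from the table above
BRAIN_TABLE = [(15, 0), (30, 5), (45, 15), (60, 30), (75, 50),
               (90, 70), (120, 85), (150, 95), (180, 100)]


def brain_slow_percent(seconds_per_move):
    """Slow-rating share in percent; interpolation between rows is an assumption."""
    if seconds_per_move <= BRAIN_TABLE[0][0]:
        return 0.0
    if seconds_per_move >= BRAIN_TABLE[-1][0]:
        return 100.0  # 3:00 and slower, including postal
    for (t0, p0), (t1, p1) in zip(BRAIN_TABLE, BRAIN_TABLE[1:]):
        if t0 <= seconds_per_move <= t1:
            return p0 + (p1 - p0) * (seconds_per_move - t0) / (t1 - t0)


for secs in (15, 30, 60, 75, 105, 180):
    print(secs, brain_slow_percent(secs))
```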

Title: Re: Omar = OmarFast   ,   bot_bomb = bot_spe
Post by MrBrain on Jan 3rd, 2005, 8:20am
And all this recent talk about time controls gives me an idea as to how to incorporate a time control into the wording of the Arimaa challenge itself (something I think needs to be done).

The human designee can choose (before the match) the time control that they like best with the following two restrictions:
1.  The amount of time per move shall not exceed 3 minutes (the first 100% slow time control).
2.  The amount of time before the game must end shall not be less than 6 hours.  (Whether you do 3 hours max per player, or 6 hours total with the time-decision mechanism.)

By doing the above, you can avoid what happened last year, which is that Omar got bored with the 3-minute time control.  While the bots are still not up to human level, the human can choose a faster control so as to not drag the games out.  Perhaps 15 years from now, when bots may be quite strong, the human can choose the maximum time to think to maximize their chances against the killer bot.  But while the flexibility is nice for the humans, the two restrictions ensure that a time control is not chosen that is purposely unfair to the bot.

Title: Re: Omar = OmarFast   ,   bot_bomb = bot
Post by Fritzlein on Jan 3rd, 2005, 11:40am

on 01/03/05 at 07:57:29, MrBrain wrote:
How about this:

time fast slow
0:15 100% 0%
0:30 95% 5%
0:45 85% 15%
1:00 70% 30%
1:15 50% 50%
1:30 30% 70%
2:00 15% 85%
2:30 5% 95%
3:00, postal 0% 100%

It's not quite as mathematical, but it does incorporate time controls used in the past, and it seems a little closer to how the different time controls feel (at least to me) from playing the game.  What do you think?


The first thing that jumps out at me is that there should be a much bigger distinction between 15-second games and 30-second games, not just 5%.  The difference in how it feels to play at those two time controls is huge.  In my opinion, it's as great as the difference between a 30-second game and a 1-minute game, or between a 1-minute game and a 2-minute game.  That's why I suggest a logarithmic scale.  An extra 15 seconds per move doesn't matter so much if you've already got 90 seconds to think, but it's huge if you had only 15 seconds in the first place.  In general, each doubling of the time to think should move a game the same fixed step closer to being postal.

Title: Re: Omar = OmarFast   ,   bot_bomb = bot_spe
Post by MrBrain on Jan 3rd, 2005, 12:18pm
Well, again, I'm just going on how the time controls "feel".  To me, 30 seconds doesn't seem anything other than really fast.  It doesn't seem more than 5% slow.  Having 45 seconds going 40% to slow seems really out of whack to me.  I guess it maybe is my preference for being able to calculate variations.  1:15 seemed very fast to me during the championship.  I was rushed on almost every move.  That's why I don't think it can be more than 50% slow.

Title: Re: Omar = OmarFast   ,   bot_bomb = bot_spe
Post by MrBrain on Jan 4th, 2005, 7:26am
And using your same reasoning, if we had a 7-second control, then 15 seconds would be 20% or 25% "slow".  I think just because we're being somewhat insane and adding a 15-second control (which I will probably never play), that shouldn't be an excuse to make 30 seconds then count 25% towards slow.  30 seconds and 15 seconds are both very "fast".  To me it's the difference between ridiculous speed and ludicrous speed.  Just because we have ludicrous speed, it doesn't mean that ridiculous is slow.

You have to have some limit where it's just totally fast.  Before anyone even mentioned 15 seconds, 30 seconds would have been a reasonable fast "boundary".  Anything below could have just been 100% fast.  I'm making a concession in my table by even calling 30 seconds 5% slow.  25% is way too much in my opinion.

Title: Re: Omar = OmarFast   ,   bot_bomb = bot
Post by Fritzlein on Jan 4th, 2005, 9:32am
I know you like to play at slower speeds, Mr. Brain, and I appreciate your arguments for having the World Championship games at a slow speed, but it's going a bit too far to call fifteen seconds per move a ludicrous speed for anyone else to like to play at.  You know perfectly well that lots of people like to play fast games, and in particular that a large majority of games on the Internet Chess Club are played at time controls even faster than 15 seconds per move.  

But in any event, under the logarithmic scheme that I propose, the endpoints are essentially irrelevant.  They could be extrapolated to a time control that is indeed ludicrous, without changing anything important.

Maybe the whole idea of having endpoints is just a distraction.  As we learn in high school algebra, two points define a line, but you can also define a line via a point and a slope.  If the endpoints are a point of contention, we could think of slow-fast ratings in point-slope terms instead.

We could define the "normal" time control as one minute per move, and give people a single number as their rating based on that time control.  Then we add the assumption that some people benefit more than others from having extra time to think (and equivalently that they are hurt more than others by having less time to think), and use that to define a "slope".

For example, let's say my rating should be 2000 at one minute per move, and my slope is -50 rating points per doubling the time.  At two minutes per move my rating would be 1950 while at 30 seconds per move my rating would be 2050.  You may think it is ridiculous to play at 15 seconds per move, but if I actually wanted to play that fast, we would extrapolate that my rating at that speed would be 2100.  I would join you in thinking that it is ridiculous to play at 3.75 seconds per move, but if we did, we can guess that my rating would be 2200 at that speed.
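
The point-slope idea can be sketched in a few lines of Python. The function name `rating_at` and its parameter names are illustrative only; the numbers are the ones from the example above.

```python
import math


def rating_at(seconds_per_move, base_rating, slope_per_doubling, base_seconds=60.0):
    """Rating extrapolated along a line in the base-2 logarithm of time per move."""
    doublings = math.log2(seconds_per_move / base_seconds)
    return base_rating + slope_per_doubling * doublings


# 2000 at one minute per move, -50 points per doubling of the time.
for secs in (120, 60, 30, 15, 3.75):
    print(secs, rating_at(secs, base_rating=2000, slope_per_doubling=-50))
```

Nothing in this model caps the extrapolation, which is exactly why the choice of endpoints only matters for games played beyond them.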

Now whether we try to determine two endpoints or try to determine a point and a slope, either way we determine a line.  The only way endpoints would be relevant is if people played games beyond the endpoints.  If all games were played between 30 seconds per move and two minutes per move, it wouldn't matter practically whether we chose those as our endpoints or 1 second per move and 1 week per move as our endpoints.  (Well, it would flatten all the slopes somewhat, but that's just changing the scale, like multiplying everything in the current system by two wouldn't have any practical effect.)

The reason to have endpoints at all would be that we wanted to cap the effect of time on a game.  We might reason that beyond giving both players four minutes per move to think, giving extra time isn't going to change the balance of playing strength.  By the same token we could say that shortening the time below 15 seconds per move also won't change the relative chances of the two players.  I'm not sure that I buy either of these arguments.  In particular, I guarantee you from personal experience that there is a significant difference in Bomb's playing strength between 15 seconds per move and 30 seconds per move.  To you both time controls are "faster than I want to play", but those of us who are interested in playing that fast need to draw distinctions.

Title: Re: Omar = OmarFast   ,   bot_bomb = bot_spe
Post by MrBrain on Jan 4th, 2005, 10:24am
That's a very good explanation.  But I disagree that you can really fit one's performance linearly onto a time-logarithmic scale.  Realistically, there is a limit to performance change at very high speeds, and at very long controls.  This means there will be horizontal asymptotes at the ends of a graph of performance vs. time control for any entity.  (For a computer program, the fast-end asymptote will be reached more slowly, but there is an asymptote there nonetheless.)

Compare our two schemes.  Mine tries to take into account a limit on what a reasonable speed (for a human) is, on both ends.  Yours does not.  If one was to plot the performance (rating) of most players at different speeds versus the percentages that both you and I have proposed, I would wager that the more linear graph would be with my percentages.

(Is there any way that we can extract performance data from existing games at different controls?  This would likely be complicated by the facts that players improve over time, and that they tend to play the same time control repeatedly over short periods.)

Title: Re: Omar = OmarFast   ,   bot_bomb = bot
Post by MrBrain on Jan 4th, 2005, 10:30am

on 01/04/05 at 09:32:20, Fritzlein wrote:
For example, let's say my rating should be 2000 at one minute per move, and my slope is -50 rating points per doubling the time.  At two minutes per move my rating would be 1950 while at 30 seconds per move my rating would be 2050.  You may think it is ridiculous to play at 15 seconds per move, but if I actually wanted to play that fast, we would extrapolate that my rating at that speed would be 2100.  I would join you in thinking that it is ridiculous to play at 3.75 seconds per move, but if we did, we can guess that my rating would be 2200 at that speed.


And according to this type of reasoning (not taking into account leveling off of performance), one could incorrectly conclude that given enough time to think, someone whose slope is steep will be able to beat any player whose slope is not as steep.  On the ICC, my chess rating difference between bullet (1 minute) and standard (15 minutes) is about 900 points.  Therefore, I could conclude that if I were to play a game against Garry Kasparov with a 1-week-per-move control, I should be able to win the game.  This is obviously not the case.

Title: Re: Omar = OmarFast   ,   bot_bomb = bot_spe
Post by MrBrain on Jan 4th, 2005, 10:37am
Well, despite the disagreement here, I still think we're on the right track.  If we want to do the "normal" rating with another parameter for rating change per doubling of time, just as a starting point to try it out, I'd not have any problem with that.  And by the way, my reference to "ridiculous" and "ludicrous" was not to criticize, but to be a little humorous (i.e. SpaceBalls) and simply to make a point.

Title: Re: Omar = OmarFast   ,   bot_bomb = bot
Post by Fritzlein on Jan 4th, 2005, 11:13am
One reason to have a cutoff at the fast end of the time control is that some time is used up by Internet lag, moving the pieces with the mouse, the client displaying the previous move, etc.  At some point one is measuring physical reflexes or connection speed or something other than thinking fast about Arimaa.  As much as I like fast games, I might want to consider, say, 10 seconds per move as the fastest measurable speed, given the current client, Internet, and server situation.

One reason to have a cutoff at the slow end is that at some point one stops using all the time available.  Omar reports that at 3 minutes per move against Bomb, he started doing other things in the middle of the game.  When we start the postal tournament at one move per day, I seriously doubt that I will ever spend more than ten or fifteen minutes thinking about a particular move; I expect that usually I will spend less.  I guess I could see extending the slow endpoint to 6 or 8 minutes per move at a stretch.

Assuming that a player's strength is linear in the logarithm of time allowed is certainly an approximation, and I agree that we can expect that the approximation gets worse and worse as you go towards the extremes, which is a more general reason for wanting to have a cutoff.  It's a model which is guaranteed to break down if you test it rigorously (just like the Elo model of chess performance itself breaks down under scrutiny).  The advantage of the proposal is that it does something somewhat reasonable to recognize that fast time controls affect some players more than others (particularly computers versus humans) and yet doesn't need four or five different ratings per person, with abrupt cutoffs between the categories.

Title: Re: Omar = OmarFast   ,   bot_bomb = bot
Post by 99of9 on Jan 4th, 2005, 11:14am

on 01/04/05 at 10:24:39, MrBrain wrote:
I disagree that you can really fit one's performance linearly onto a time-logarithmic scale.  Realistically, there is a limit to performance change at very high speeds, and at very long controls.  This means there will be horizontal asymptotes at the ends of a graph of performance vs. time control for any entity.  (For a computer program, the fast-end asymptote will be reached more slowly, but there is an asymptote there nonetheless.)


Please attempt to justify this asymptote hypothesis - I don't believe it.

The depth a computer program can search as a function of time limit is theoretically exactly logarithmic.  It's hard to say exactly how depth is related to strength, but it's probably fair to guess that they're basically proportional.

So yes, a computer's performance gets better logarithmically with increasing time controls.  
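
The logarithmic claim can be illustrated with a short Python sketch of an idealized full-width search, where the number of nodes grows as the branching factor raised to the depth. The node rate and branching factor below are made-up numbers, and real engines with pruning will not follow this exactly.

```python
import math


def search_depth(seconds, nodes_per_second, branching_factor):
    """Depth reachable by an idealized full-width search in the given time."""
    nodes = seconds * nodes_per_second
    return math.log(nodes) / math.log(branching_factor)


def depth_gain_per_doubling(branching_factor):
    """Extra depth bought by doubling the time; independent of the starting time."""
    return math.log(2) / math.log(branching_factor)


# Hypothetical engine: 100,000 nodes per second, branching factor 20.
for secs in (15, 30, 60, 120):
    print(secs, round(search_depth(secs, 100_000, 20), 3))
print("gain per doubling:", round(depth_gain_per_doubling(20), 3))
```

Each doubling of time adds the same increment of depth, so if strength is roughly proportional to depth, rating is roughly linear in the logarithm of time.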

A human's ability changes even more as a function of time (ie a higher gradient on the log plot).  This is clear when you consider that when you get down to very short times you become unable to even complete the full 4 steps in time.  I'd argue it's not an asymptotic plateau, it's in fact closer to an asymptotic fall away!!!

By the way, we are confusing the issue when we say that computers get "better" at fast time controls.  What we mean there is that humans get worse quicker so fall behind.

Unless there is good empirical evidence otherwise, in my opinion it is better to go with Fritzlein's 2-parameter model rather than Brain's 9-parameter model.

Title: Re: Omar = OmarFast   ,   bot_bomb = bot
Post by Fritzlein on Jan 4th, 2005, 11:25am

on 01/04/05 at 11:14:23, 99of9 wrote:
The depth a computer program can search as a function of time limit is theoretically exactly logarithmic.  It's hard to say exactly how depth is related to strength, but it's probably fair to guess that they're basically proportional.


Actually, I believe that there have been studies done on the playing strength of chess programs which conclude that their rating (using a fixed time control and humans as a yardstick) is roughly linear in search depth.  This would suggest that a variable time control has an effect on computers proportional to the logarithm of search time.  However, I read this long ago, before chess computers were World Championship caliber, so I wouldn't be surprised if there has been some asymptotic falling off at the high end.

[Edit] A quick Google search gave me this link, which disputes the notion of asymptotic falling off, and implies that a linear relation between search depth and playing strength may well hold indefinitely:

http://supertech.lcs.mit.edu/~heinz/dt/node49.html

[Further Edit] 99of9 points out that my proposal would over time give everyone a different slope, but that the varying slopes would only be relative.  Computers would have negative slopes, not because they get worse at higher speeds, but because humans get better faster.

An alternative would be to use computer self-play to fix their own positive slope, (which might be around 100 points per doubling of search time)  and thusly anchor the slopes of the entire system in a way similar to the plan for using computers to anchor the ratings in another thread.   Presumably then fixed-depth searchers like Arimaazilla would have a zero slope and Humans would have a steep slope of 200 or more points per doubling of time.

At first blush, my intuition is that computers would actually be more accurate in anchoring the scale of the slopes than in anchoring the scale of the ratings, because I expect any computer with any reasonable time management to have approximately the same slope.

Title: Re: Omar = OmarFast   ,   bot_bomb = bot_spe
Post by MrBrain on Jan 4th, 2005, 12:30pm
Don't confuse search depth with rating.  Yes, a computer can think more ply, but so can its opponent, human or mechanical.  Again, if you compare two players, and there's a slope difference between the two at a certain speed, common sense dictates that that slope difference cannot continue indefinitely.  Just look at my chess example to see why.  And as was correctly pointed out, there are limits for humans (interface, boredom).  These same limits may not apply for computers, since they can move much faster and don't get bored.  But regardless, I still think we should have some limit that approximates what happens in reality.

As for the asymptotes, a simple example should suffice:  If we play a game with a 10-day limit per move, player A should expect a certain win % versus player B.  If we make it a 20-day limit, is this really going to change the % by as much as 1-minute versus 2-minutes?  Of course not.  Why?  It's leveling out.  There's some limit to how good a player can be, regardless of how much time per move is allotted.  Again, I'm never going to beat Kasparov, even if I had a year to think about each move.  Why?  Because the rating gain I get by going from 1 to 15 minutes for the game cannot hold up indefinitely.

Title: Re: Omar = OmarFast   ,   bot_bomb = bot
Post by omar on Jan 5th, 2005, 1:29am
Great discussion.

I was contemplating what times to use for the boundaries between the different categories of speed and also how many speed categories to have. This method of using two ratings with percent contributions solves those problems.

I think we have to keep in mind that the pace of the game is not just controlled by time per move, but also by the starting reserve. So we might need to do something like divide the number of seconds in the starting reserve by some number like 45 that represents the average number of moves per game and add that to the time per move (in seconds). Also I know from experience that having a higher max reserve limit makes the game feel slower (i.e. 1/2/100/5 feels slower than 1/2/100/2) but I'm not sure how to incorporate that.
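
As a rough Python sketch of that adjustment (the function name is hypothetical, and the max reserve limit is deliberately left out since it is not clear how to incorporate it):

```python
AVERAGE_MOVES_PER_GAME = 45  # the divisor suggested above


def effective_seconds_per_move(seconds_per_move, starting_reserve_seconds,
                               average_moves=AVERAGE_MOVES_PER_GAME):
    """Time per move plus the starting reserve spread over an average game."""
    return seconds_per_move + starting_reserve_seconds / average_moves


# One minute per move with a two-minute starting reserve.
print(round(effective_seconds_per_move(60, 120), 2))
```

The result could then be fed into whichever fast/slow weighting is chosen in place of the raw time per move.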

One other thing to keep in mind is that maybe some humans play best at say a pace of 1 min per move but get worse as speeds get faster or slower than that. A line may not be the best model for humans. So in this case maintaining different speed categories would model the situation better (the more categories the better it can be modeled). Or maybe with the other method we can model it with two lines or a more complex function. For computers though I think a single line model is a pretty good fit.

The category division method has the advantage that we don't have to predetermine how to model the effects of speed on ratings. Also we can have more categories if we want the model to be better. But it also has a lot of disadvantages; like having to maintain a lot of rating numbers and predetermining boundaries between the categories.

So as you can probably tell, I'm still contemplating which method to use.


Title: Re: Omar = OmarFast   ,   bot_bomb = bot_spe
Post by MrBrain on Jan 5th, 2005, 7:22am
In general, I don't feel that starting reserve is a big issue, but it does come into play at the start of the game (e.g. Omar vs. Brain 2004 WC).

Title: Re: Omar = OmarFast   ,   bot_bomb = bot_spe
Post by clauchau on Jan 6th, 2005, 1:31pm
Yes, interesting discussion. One improvement with the available time controls is needed in any case I think - the gameroom should allow the two players to play according to distinct time controls.

Firstly because otherwise we are getting isolated groups of players -- one group for each speed -- and their ratings may be going to get relatively wrong between different groups. Let's get accurate by allowing two different time settings for each opponent.

Secondly because I'd like to play at 1 min per move against bots playing at 15 sec per move.

Then, as the current discussion hints at, we might update the ratings accordingly, taking the times into account. The lightning bots ratings wouldn't decrease much against slow-playing humans.

Title: Re: Omar = OmarFast   ,   bot_bomb = bot
Post by omar on Jan 12th, 2005, 4:05pm
Hi Claude,

I think you are suggesting that with each game we maintain two separate time controls; one for each player. We can then use this when computing the ratings. I kind of like the idea, but implementation wise it's going to require a lot of changes at all levels; database, game server and game clients.

Let me throw out another proposal that tries to achieve the same thing, but a little differently.

After a game is over, we can calculate from the event log the average number of seconds each player took to make a move. If you open up a recently played game and scroll down the chat area you will see the average move times that were calculated from the event log.

I could much more easily record these numbers with each game in the database. A program could be written to fill in these numbers for the older games. We could then use these numbers in calculating the ratings.

Remember that with each game we also record what each player's rating was going into the game.

Our current rating system uses something like this:
 myRatingAfter = rating(myRatingBefore, oppRatingBefore, didIWin)

With the new performance rating system that we had discussed earlier a player's rating is calculated on the fly using the player's complete game history and not just the rating after the previous game. It uses something like this:
 myRatingAfterGameX = rating(oppRatingFromGameX, didIWinGameX, weightOfGameX, oppRatingFromGameX-1, didIWinGameX-1, weightOfGameX-1, ... up to Game1)

Notice that only the opponents' ratings are used in computing our rating. Also weights are used to give varying amounts of importance to the games. We discussed a lot about how to set the weights based on factors such as how many unique opponents you have played, how old the game is, etc.

I suppose we could also use the performance rating system to ask a question like 'what is my rating at 20 seconds per move'. To answer such a question we could adjust the weights of the games so that games where the player used exactly 20 seconds per move (average move time) are given a full weight of 1 and games that are farther (time wise) from the time in question are given a lesser weight (down to a limit of 0). These weights are multiplied by the weights that would otherwise be used. We could also compute some uncertainty of the rating at the given time, based on the average of the weights. So if the uncertainty was 1 then it would mean that in all the games used in computing the rating the player happened to have a time per move that is the same as the time in question.
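
A minimal Python sketch of such time-based weights follows. The post does not say how quickly the weight should fall off, so the halving per doubling of time difference is purely an assumption, as are all the function names.

```python
import math


def time_weight(game_seconds, query_seconds):
    """1.0 when the game's average move time equals the queried time.

    Assumption: the weight halves for every doubling (or halving) of the
    time away from the query, so it approaches 0 but never reaches it.
    """
    distance = abs(math.log2(game_seconds / query_seconds))
    return 0.5 ** distance


def combined_weights(games, query_seconds):
    """games is a list of (average_seconds_per_move, base_weight) pairs."""
    return [base * time_weight(secs, query_seconds) for secs, base in games]


def average_time_weight(games, query_seconds):
    """Average of the time weights; 1.0 only if every game matches the query."""
    weights = [time_weight(secs, query_seconds) for secs, _ in games]
    return sum(weights) / len(weights)


history = [(20, 1.0), (40, 0.8), (80, 0.5)]  # hypothetical game history
print(combined_weights(history, query_seconds=20))
print(average_time_weight(history, query_seconds=20))
```

Because the weight never reaches exactly zero, a rating can still be computed for a speed the player has never played, only with a lower average weight.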

The rating of a player that we store with the game is computed based on the average time per move the player took in that game. For example if a player took 52 seconds per move in this game then we use the performance rating of that player computed for 52 seconds per move. We might also compute and store the rating uncertainty with the game and could also use the uncertainty when determining the weight of the game.

So we don't need to store any ratings with each player. Rather we just compute them on the fly from the player's game history. We should also be able to compute what the player's rating would be at a given time per move speed. When computing the rating of a player that is displayed in the gameroom it is computed without regard to any times associated with past games; thus the time component of all the weights is 1. When computing the rating of a player that is stored with the game we use the average time per move the player took in that game to bias the rating to give more weight to previous games where the player took similar time.

We could have a form where given the player name and average time per move it shows what the player's rating is at that speed along with the uncertainty at that speed. For example if a player has played all the games at 60 seconds per move and we ask what the player's rating would be at 30 seconds per move, it would be the same rating, but the uncertainty would be higher. The uncertainty would be higher still if we asked what the rating would be at 15 seconds per move; the rating would still be the same.


Title: Re: Omar = OmarFast   ,   bot_bomb = bot
Post by Fritzlein on Jan 18th, 2005, 3:23am
Omar, I'm suspicious of rating the game based on how much time each player actually uses, rather than on how much time was allotted.  I just now played Bomb/2004 at 3 minutes per move, but once I got a strong position I didn't need to use all my time, so I moved faster.  By your proposal this would affect my rating at a faster speed, not my rating at 3 minutes per move.  I could win a half dozen games against Bomb and they would all count towards my (let's say) 90 second rating, but then one time when I get in trouble and have to use up my entire time and lose anyway, that loss would be the only game counting towards my 3-minute rating.  Your proposed system might well conclude that the slower I play the worse I am, when the causality is in fact reversed.

I like the idea of allowing games with a time handicap.  However, that would force us to try to create a system where the ratings at different time controls were on the same scale.  Presumably then both bots and humans would have ratings that increased steadily with time allowed.

On the other hand, I don't like the notion of having to calculate a player's rating at a given speed from scratch for every single speed.  If I understand what you were proposing, Omar, the calculated rating at each speed, if plotted on a graph, wouldn't necessarily be linear or any reasonable curve.  To keep the calculation time sane, I would strongly suggest making the assumption of a linear increase in rating with the logarithm of time allowed.  Then we can pay attention to the endpoints only.

Incorporating this into the other changes to the rating system we had talked about would then involve only adding one extra saved parameter per player per game.  Instead of saving each player's rating for each game, you would save each player's "lightning rating" and "postal rating".  If you wanted to also save the "effective rating" for each player for that game you could do so, but it wouldn't be necessary since it could be inferred from the time control and the saved endpoint ratings.

If I understand how your testratings Perl script worked (for the new system, pre-time-control-dependent), you just tried different ratings until you found the best match for the game history as weighted.  To add a time control dependency, you would instead have to be testing pairs of endpoint ratings (rather than a single rating) to find the best fit.  Unfortunately the best fit can no longer be ensured by making the total wins match expected wins, since more than one pair of endpoints will do that exactly.  To sop up the extra degree of freedom you would have to divide the actual wins into fast wins and slow wins by the proportionality suggested earlier, and also have the expected wins be thusly divided, and then make both sides match.
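
To make the "make both sides match" condition concrete, here is a minimal sketch in Python. It assumes an Elo-style logistic expectation on a 400-point scale and a per-game `slow_share` between 0 and 1 saying how much of that game counts as slow. The function names and the tuple layout are illustrative only; they are not taken from the testratings script.

```python
def expected_score(rating, opp_rating):
    """Elo-style expected score (assumed 400-point logistic scale)."""
    return 1.0 / (1.0 + 10.0 ** ((opp_rating - rating) / 400.0))


def split_totals(fast, slow, games):
    """Split actual and expected wins into fast and slow portions.

    games: list of (slow_share, opp_rating, score) tuples, where
    slow_share is in [0, 1], opp_rating is the opponent's effective
    rating at that time control, and score is 1 for a win, 0 for a loss.
    """
    totals = {"actual_fast": 0.0, "expected_fast": 0.0,
              "actual_slow": 0.0, "expected_slow": 0.0}
    for slow_share, opp_rating, score in games:
        # Player's effective rating at this game's time control.
        effective = slow_share * slow + (1.0 - slow_share) * fast
        exp = expected_score(effective, opp_rating)
        totals["actual_slow"] += slow_share * score
        totals["expected_slow"] += slow_share * exp
        totals["actual_fast"] += (1.0 - slow_share) * score
        totals["expected_fast"] += (1.0 - slow_share) * exp
    return totals
```

A fitting loop would then adjust the candidate `fast` and `slow` endpoints until the actual and expected totals agree on the fast side and on the slow side at the same time.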

One potential problem is that if every single game a player has played is at the same time control, there will be no way to get a handle on the slope.  I suggest solving that by having one of our fictitious draws be at lightning speed and the other at postal speed.

Is it clear what I'm suggesting overall?  If I'm not mistaken the formulas I'm suggesting could be retroactively applied to our archived data.  The results might be somewhat weird because most bots play at one time control only, but I'm curious what numbers would pop out nonetheless.

Title: Re: Omar = OmarFast   ,   bot_bomb = bot
Post by omar on Jan 22nd, 2005, 11:23am
That's a really good point Karl. Humans do tend to play a bit faster when they are in a winning position and take more time to think when they are losing. It would tend to be a bit misleading. However, I don't think the distortion will be too much and maybe we can even compensate for it by adding 10% more to the time when the player wins and subtracting 10% when the player loses.

Regardless of what system we eventually go with, we first need to implement it and test it out. I will eventually provide the code for any system that I propose. That way people can try it out, check the code for any bugs and get familiar with it. I will also provide a condensed version of the games database so we can run the systems on the real data.

This way we can try out all the proposed systems before selecting one.



Title: Re: Omar = OmarFast   ,   bot_bomb = bot
Post by 99of9 on Jun 8th, 2005, 8:05pm
I've done some simulations.

Although it's possible that there is still a bug or two in my code (let me know if you want the code to help look for them!), I wanted to share the early results.

I used Fritz's fast/slow interpolation method, with cutoffs at 15s on the fast side, and 8 minutes on the slow side.  The 8 minute thing is a concession to MrBrain's arguments in this thread.  I agree with him that a 4 minute game is different to a postal game.  Correspondence players could reasonably argue that they feel "rushed" in a 4 minute game.

Because this simulation just uses the game results, it does not include things like:
1) Rating Uncertainties increasing 1 point per week.
2) Bots inserted into the system with a rating matching that of their "parent".  All new players are given a rating of 1500.

Other technical issues:
3) Abandoned games do not count in these ratings at all, they're treated like unrated games.
4) I agree with omar that the initial time reserve matters.  It has been shared out over the number of moves in the game to determine the effective number of seconds per move, as has the 60s for the first move (minus the normal time per move, which is not awarded on the first move).
5) Results are sorted according to "Slow" (8 minute+) rating.
6) I left out anyone who didn't play any rated games.
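
The time accounting in point 4 can be sketched roughly as follows. This is my reading of the description, not the actual simulation code: the total time is taken to be the initial reserve, plus 60s for the first move, plus the normal increment for every later move, all divided by the number of moves. The function and parameter names are made up for illustration.

```python
def effective_seconds_per_move(per_move, reserve, num_moves, first_move=60.0):
    """Average thinking time available per move over a whole game.

    Assumed accounting: the player starts with `reserve` seconds, gets
    `first_move` seconds for the first move instead of the normal
    increment, and `per_move` seconds for each later move.
    """
    if num_moves < 1:
        raise ValueError("a game needs at least one move")
    total = reserve + first_move + per_move * (num_moves - 1)
    return total / num_moves
```

Under these assumptions a 15s-per-move game with no reserve that lasts 45 moves works out to 16 seconds per move, so short games drift further from their nominal time control than long ones.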

I will save my commentary on the actual ratings for a subsequent post.  For now, here they are:


SLOW FAST NAME
2111 2130 Fritzlein
2036 1608 Belbo
2000 1610 omar
1963 1972 99of9
1896 1832 bot_speedy
1853 1706 bleitner
1830 1625 bot_Bomb2004CC
1815 1529 clauchau
1813 1913 robinson
1813 1742 mouse
1791 1649 RonWeasley
1781 1683 Adanac
1777 1709 bot_bomb
1745 1625 Paul
1725 1797 omarFast
1717 1529 fotland
1709 1586 BlackKnight
1690 1619 ytri
1678 1561 naveed
1677 1805 PMertens
1677 1450 bot_firsttry
1672 1574 OLTI
1652 1638 Ryan_Cable
1647 1541 bot_clueless
1641 1590 bot_Bomb2005CC
1636 1520 schmoe
1634 1578 bot_IIIT
1630 1582 bot_Arimaanator
1624 1550 Hannibal
1616 1518 dtj
1612 1594 bot_Bomb2005Fast
1598 1539 deselby
1597 1452 CeeJay
1596 1584 DorianGaray
1594 1534 Groumpf
1594 1522 inylong
1589 1547 Asturianuco
1587 1517 qsasha
1585 1454 Tore
1581 1531 xabiron
1577 1522 jdb
1576 1484 bot_Occam
1573 1639 bot_Bomb2005Blitz
1573 1527 adsyed01
1573 1521 camperman
1573 1516 TheMadHair
1572 1541 YunK
1570 1528 Keitam
1562 1566 ajedrezDude2
1561 1785 bot_lightning
1556 1521 Jonked
1556 1487 kissl
1556 1470 rajah
1555 1490 88of8
1554 1521 tj1777
1552 1532 TheUglyDog
1552 1508 travis
1549 1519 Paranoia
1546 1528 bot_Clueless2005CC
1546 1510 kerdamdam
1545 1559 spela
1543 1517 pikachamp
1541 1496 bot_Clueless2005Fast
1541 1479 arvindn
1539 1541 Jonathan
1537 1523 BLooodyANgel
1536 1537 mohabbatse
1535 1514 Jojo
1533 1462 trevor
1533 1412 boris_toplak
1531 1557 NativeOne
1531 1521 zman
1529 1482 blue22
1529 1461 Darkenrahl
1529 1438 jemicobel
1528 1511 fdailey
1527 1517 Agt
1526 1538 bot_Bomb2005P2
1526 1510 Guest1637
1526 1363 bot_Clueless2005Blitz
1525 1517 Booyah
1524 1515 KovirWyttcliffGerra
1521 1513 Tobi
1518 1478 bot_GnoBot2005Blitz
1517 1478 Threlicus
1514 1509 domi6
1514 1509 Simon
1514 1509 someone
1514 1505 msz
1513 1508 ktartandude
1512 1536 Yusei
1512 1514 bot_Clueless2005P1
1509 1509 Ander
1509 1506 Mauret
1509 1469 Spunk
1508 1505 eyvin
1508 1503 Josh
1507 1504 ntroncos
1505 1510 dbriggs
1504 1538 PatoGuy
1503 1491 sutzli
1502 1519 bot_Aamira
1502 1501 Miki
1502 1485 novacat
1499 1496 CraigS
1499 1463 Vinvin
1498 1499 bot_Clueless2005P2
1497 1503 Renaissance
1497 1498 bot_Loc2005P2
1497 1497 chess77
1497 1491 UltraWeak
1494 1494 jurc
1493 1521 gsyed
1493 1243 bot_Loc2005Blitz
1492 1497 cloakski
1491 1494 Kuritzky
1490 1524 Lucky
1490 1492 Scott
1488 1495 flea13
1488 1491 GenghisKhan
1486 1539 JoelMcNary
1486 1495 logosity
1486 1494 chengjj
1486 1463 bot_ArimaanatorFast
1484 1512 junaid
1484 1494 Asarel
1484 1494 bot_txapeldun
1484 1490 Guest1869
1484 1490 Sebastien
1484 1489 bot_GnoBot2005CC
1484 1468 bot_speedtrap
1483 1493 bobby
1483 1493 Guest1313
1483 1493 JesperTK
1483 1491 rotem
1483 1486 Monkeybush
1483 1475 Ytterbium
1483 1465 Dacar
1482 1488 Michael
1482 1488 Russell
1482 1479 wassupbiloxi
1481 1487 carolaina
1480 1488 dethwing
1480 1488 tryingarimaa
1480 1474 GoldenBear
1479 1676 kamikazeking
1479 1502 Yaron
1479 1487 mattc
1479 1481 Minkus_27
1478 1485 jmartinezot
1478 1480 ziroby
1478 1469 Arpad
1477 1492 robstar
1477 1492 terra
1477 1480 Troak
1476 1485 U_WIL_LOOS
1476 1467 AmitChaturvedi
1475 1497 Damien
1475 1490 Giszmo
1474 1494 Guest207
1474 1479 Juha
1474 1463 Kaffinator
1473 1507 bot_Brain
1473 1443 carlsquared
1472 1482 lyc123
1472 1476 Guest602
1471 1492 Magik
1471 1475 xxFLAWLESSxx
1470 1476 Adrian
1470 1456 bot_Bomb2005P1
1469 1491 naveed4
1469 1334 bot_GnoBot
1468 1471 Ralek
1466 1478 palladino86
1466 1463 richard
1466 1449 Janzert
1466 1447 antoniotheripper
1465 1441 Aaron
1464 1487 nightshadejim
1464 1486 Lev
1464 1476 bot_GnoBot2005P2
1464 1476 miri
1464 1476 Yold
1463 1508 siddiqna
1463 1485 angrytuna
1463 1478 RainMan
1463 1477 Kobold
1463 1476 thinkinmetal
1463 1474 Zach457
1463 1458 botkiller
1462 1485 jshira
1462 1468 Stanza
1461 1474 the_hooligan
1460 1536 6sense
1459 1488 ecaronus
1459 1473 nofxz
1459 1466 bot_Loc2005P1
1458 1490 taral
1458 1457 Diep
1457 1483 ih8evilstuff
1457 1483 Kristijan
1456 1484 Guest383
1455 1470 bot_2xv7
1455 1468 Hopalong
1454 1482 Guest1097
1454 1482 tarot
1454 1471 v_dhanasekar
1452 1488 illz
1452 1484 Kanakuk
1452 1483 ixpfah
1452 1395 bot_Arimaazilla
1452 1385 bot_Arimaazon
1451 1482 bot_haizhi
1451 1481 bot_Viper
1451 1481 lemmy
1451 1461 Moi
1451 1460 cam43031
1450 1647 haizhi
1450 1458 whiteKnight
1449 1480 knosuke2001
1449 1469 Orc
1448 1492 ugaiW
1447 1479 Chad_Starr
1447 1479 sdude26
1447 1463 xquezme
1446 1479 Joolz
1446 1467 yusuf
1446 1459 swamplor
1446 1454 Mr_Rabbit
1445 1479 Guest1601
1445 1479 Squirrel
1445 1478 Farkov
1445 1444 kraj
1444 1479 Getafix
1444 1458 keita
1444 1448 FANTAZM
1443 1479 Guest1762
1443 1463 joel
1443 1463 laotsi
1442 1463 santiago
1442 1462 Scheffer
1442 1330 bot_Loc2005Fast
1441 1464 agner_g
1441 1462 Guest2764
1441 1461 transience
1441 1436 IMath
1440 1463 Rtan
1440 1462 sillyhat
1439 1460 MiNd_Of_ThE_uSeR
1438 1460 Guest2182
1438 1459 BusinessEnd
1438 1452 Frank
1438 1449 kousukejp
1438 1294 bot_GnoBot2005Fast
1437 1463 rhy
1437 1460 eyeoft
1436 1505 maker
1436 1462 Kitte
1436 1461 comabomber
1434 1501 AjedrezDude
1434 1475 Triffnix
1434 1342 WagnerK
1431 1488 brownsugar726
1431 1438 Magrathean
1431 1325 Gregorius
1428 1399 bot_GnoBot2005P1
1427 1472 Guest1824
1426 1472 Brick_Salad
1426 1453 N8IVTXUN
1425 1471 wogdog
1424 1478 Sophi05
1424 1454 brad
1424 1435 mapu
1424 1430 lihanzo
1424 1401 Sameer
1423 1450 AllAmericanAlligator
1423 1450 AnthonyR
1422 1471 eggsalad
1422 1466 Vanilla
1421 1469 Mop
1420 1440 pcpdams
1419 1470 karmaGfa
1418 1482 Shell
1418 1469 DarkWIng
1417 1431 IamCoach
1416 1532 craig
1416 1452 tough_to_win
1414 1471 Blyx
1414 1466 bsdude
1413 1444 Robert
1410 1461 Aamir
1410 1461 adannada
1410 1438 Jimmy_Newtron
1410 1416 Merlin
1410 1387 chlydra
1409 1466 Greytle
1409 1466 Ratte
1409 1404 boronbye
1408 1443 jello
1405 1476 quicky
1405 1435 Arimaardvark
1404 1463 gamer
1404 1436 nbarriga
1404 1415 hhornet
1397 1457 novicehex
1397 1429 matjaz
1396 1459 jip
1396 1435 ImranG
1392 1432 zaf
1389 1458 chess_master
1389 1445 coachbudka
1389 1430 Keith
1389 1429 Terrabang
1389 1404 oali
1388 1423 Bluewolf
1385 1457 freddy
1385 1406 Keeps
1384 1412 megamau
1383 1365 leo
1382 1412 TipAndMe2
1381 1421 RSA
1381 1388 Buellgirl
1380 1428 alexispoquiz
1380 1426 nyc769
1378 1422 Albright
1377 1454 Shogubaba
1377 1368 Shadowolf
1375 1452 lesmisrules
1374 1452 TRauMa
1369 1496 asyed94
1368 1407 Nate1729
1364 1447 KAMUI
1364 1445 Yuri
1363 1447 Jiodek
1362 1412 henriksh
1361 1522 MrBrain
1361 1428 david
1359 1409 doe
1356 1412 csquared
1354 1466 Imran
1354 1405 bot_Loc2005CC
1354 1337 Vincent
1353 1434 Mitja
1351 1404 aanghelescu
1350 1423 CrazyMan04
1350 1379 The_Jeh
1348 1401 HEx
1347 1399 Lou
1343 1379 bot_pmertens_1
1342 1399 Eduardo
1338 1444 bot_Loc
1333 1435 gerde
1332 1385 Sirwol
1330 1437 Brillo
1329 1472 Rabbitball
1328 1411 Tom
1328 1392 msdawy
1326 1379 tanukitzu
1323 1386 handsomestofall
1319 1414 yarnalito
1318 1427 semelis
1318 1385 Guest2229
1317 1386 jt4ur
1316 1429 Suke
1315 1432 sanno
1315 1401 markluffel
1314 1400 essemegy
1311 1417 jarrausi
1307 1406 gern
1306 1412 Monedero
1302 1373 sooreams
1300 1423 bot_Sleepy
1299 1368 Shadow_Knight
1297 1358 Saviola
1294 1367 icyrail
1290 1418 tomcstein
1290 1410 HybrdShdw
1290 1361 Sage
1288 1418 Derek
1288 1393 BenW
1284 1363 minhaj
1280 1416 TripleD
1280 1374 bot_leapfrog
1275 1400 Deathlace
1275 1359 Mythmon
1265 1350 netzmacht
1253 1334 graken
1250 1372 asheesh
1246 1729 Arimanator
1243 1400 bot_GnoBot2004CC
1241 1403 LuckyLarry
1235 1400 halidecyphon
1234 1399 Wouter
1231 1381 iceflower
1228 1395 mykmox
1221 1392 sigma1
1208 1389 Seanner27
1190 1349 bot_Arimaalon
1189 1381 BlackJackal
1189 1377 pso
1184 1352 bot_ShallowBlue
1184 1297 Rileyjal
1178 1346 eric
1172 1382 sip8980

Title: Re: Omar = OmarFast   ,   bot_bomb = bot
Post by 99of9 on Jun 8th, 2005, 9:00pm
On looking at the results, I think Fritz's scheme is probably a success.  One thing to keep in mind as you read this post is how to mix fast and slow ratings for intermediate time controls:


slow  fast  control
0%    100%  15s or under
20%   80%   30s
40%   60%   1min
60%   40%   2min
80%   20%   4min
100%  0%    8min or over
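
The table above is linear in the logarithm of the time per move: each doubling from 15s adds 20% to the slow side. A minimal sketch of that mixing rule in Python follows; the names `slow_share` and `effective_rating` are illustrative and are not from the simulation code.

```python
import math

FAST_CUTOFF = 15.0   # seconds per move; at or below this a game is 100% fast
SLOW_CUTOFF = 480.0  # 8 minutes; at or above this a game is 100% slow


def slow_share(seconds_per_move):
    """Fraction of a game counted as 'slow', linear in log(time)."""
    if seconds_per_move <= FAST_CUTOFF:
        return 0.0
    if seconds_per_move >= SLOW_CUTOFF:
        return 1.0
    return (math.log(seconds_per_move / FAST_CUTOFF)
            / math.log(SLOW_CUTOFF / FAST_CUTOFF))


def effective_rating(slow, fast, seconds_per_move):
    """Blend the two endpoint ratings for a given time control."""
    share = slow_share(seconds_per_move)
    return share * slow + (1.0 - share) * fast
```

For instance, with omar's entries from the list (slow 2000, fast 1610), `effective_rating(2000, 1610, 30)` comes out at about 1688.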


Here are a few points of analysis which I think are interesting:

1) Some of the most obviously fast/slow oriented people do indeed show up as significantly different on the two scales.  

On the "Slow" side we have Omar (+390), Belbo (+428),  Clauchau (+286), and Fotland, bleitner, CeeJay, RonWeasley, ...

On the "Fast" side we have Arimanator (-483), Sip8980 (-210), KamikazeKing (-197), Haizhi (-197), ... (PMertens, Rabbitball, and Aamir Syed are also strongly on this side of the equation)

2)  It is interesting that the slow list (as shown in previous post) starts with our two world champions, Fritzlein and Belbo, and follows up with Omar.  I agree with this ratings assessment that these 3 players are the most dangerous at long time controls.  In fact these players were 3 of the top 4 in the postal tourney.  JDB also did very well, but had significantly easier opponents because of his low rating coming in.

3) Don't pay too much attention to the differences in ratings of bots.  As Fritz mentioned earlier, because they are usually constrained to only play one time control, only their weighted average at their preferred control matters.  But they act correctly at sucking/giving a correct mixture of ratings to people who play poorly/well at their particular time control.

4) Here's the current order of favourites these ratings suggest for WC time controls if we conducted a tournament now (2 min = 60% slow / 40% fast):
Fritzlein
99of9
Belbo
robinson
Omar
bleitner
mouse
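
The order above can be reproduced from the first list with the 60% slow / 40% fast mix. A small sketch, restricted to the seven humans named (bots are left out here, as they appear to be in the list above); the helper names are illustrative:

```python
# (slow, fast) pairs copied from the first ratings list in this thread.
RATINGS = {
    "Fritzlein": (2111, 2130),
    "99of9": (1963, 1972),
    "Belbo": (2036, 1608),
    "robinson": (1813, 1913),
    "omar": (2000, 1610),
    "bleitner": (1853, 1706),
    "mouse": (1813, 1742),
}


def mixed(slow, fast, slow_share=0.6):
    """Blend slow and fast ratings; 0.6 corresponds to 2 min per move."""
    return slow_share * slow + (1.0 - slow_share) * fast


def favourites(ratings, slow_share=0.6):
    """Names sorted from highest to lowest blended rating."""
    return sorted(ratings,
                  key=lambda name: mixed(*ratings[name], slow_share),
                  reverse=True)
```

Calling `favourites(RATINGS)` returns the seven names in the same order as the list in point 4.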

5) PLEASE NOBODY TAKE THIS PERSONALLY, YOU MAY NOT LIKE THE EXAMPLE, BUT IT IS THE BEST EXAMPLE AVAILABLE TO ME.
Because this system in some sense keeps track of fast and slow ratings separately, strong "slow" players (eg Omar) already have a low "fast" rating, so are not penalised much when they play blitz games and lose.  This means that for example, bomb_blitz doesn't steal all Omar's points, and is instead only moderately rewarded for being able to beat Omar at blitz (which is, at the moment, not such a hard thing to achieve :-)).  A corollary is that people who can beat bomb at blitz (eg Arimanator) do not then channel those many points on to themselves.  Arimanator earns a reasonable blitz rating for beating a bot which can beat omar at blitz, but does not earn anything on his slow rating for it.

6) An anomaly explained:  MrBrain may be expected to have a real rating that is higher on slow than it is on fast.  Problem is, he resigned all his postals, so the exact opposite shows up.

7) It's interesting that this effect is bigger than any that Fritz has found in his analysis of the ratings inaccuracies.  Perhaps this should be the first thing we focus on in reforming the rating system?

Title: Re: Omar = OmarFast   ,   bot_bomb = bot_spe
Post by jdb on Jun 8th, 2005, 9:48pm
First of all, a fine piece of work. Your analysis was very illuminating for me.

In my opinion, the percentages and cutoffs for the times look about right. Certainly usable for a trial run.

How does the interpolated rating compare to the rating using only games at a certain time control? For example, Omar's interpolated rating is 1688 for 30 sec games. (Is this correct?) How does this compare to his 30sec rating? Or is it not feasible to compute a rating based only on a certain time control? Could the difference in these ratings be used as a measure of how well the interpolation system works?

Title: Re: Omar = OmarFast   ,   bot_bomb = bot_spe
Post by PMertens on Jun 8th, 2005, 10:05pm
I like it.

Especially the part that people like Omar don't go down in rating for losing 15s games when it's absolutely clear how good they really are.

About that sucking from / giving to bots: would it be possible to separate vbot/vhuman ranking?
I can bash certain bots in certain proven ways for hours - that does not make me a good player either.

kudos for the work

Title: Re: Omar = OmarFast   ,   bot_bomb = bot
Post by 99of9 on Jun 8th, 2005, 10:11pm
Yes, that kind of analysis could certainly be used to check how good the interpolation method is (i.e. whether performance varies in proportion to the logarithm of time; this was Fritz's approximation based on some knowledge of bot performance in chess).

I guess to be totally accurate though you'd have to base all *opponents'* ratings on only their 30s games as well.  Now the problem becomes that people wouldn't have played enough games to establish accurate ratings for the time control.  Perhaps there would be sufficient 45s games to try this, I'm not sure.

Title: Re: Omar = OmarFast   ,   bot_bomb = bot
Post by 99of9 on Jun 8th, 2005, 11:01pm
Paul, Fritz did a fairly complete analysis of that with the current real ratings system, and found that ratings were not strongly affected by vs bot or vs human play.  I am of the opinion that once you learn methods to beat bots, you automatically obtain a few tactics you can use against humans, so you are automatically better.  In other words, skill is reasonably transferrable.  When Fritz arrived on the Arimaa scene, he claimed he would not be good against humans because he had only trained up against David's Bot Offline.  Obviously Fritz had learnt a few things... and his rating never stopped rising.

I agree that a player's skill/rating can be a bit different vs bots and vs humans, but then again, my rating vs fritz is different to my rating vs robinson, because I find Robinson's style hard to understand, so he springs more traps on me - even though Fritz is clearly a better player than Robinson against the rest of the pool of players.

Anyway, as requested, here is the same list, this time compiled in a simulation excluding all games vs bots... basically treating them as unrated games.  There are certainly differences with the first list, but many of the differences come about from the fact that there are just a whole lot fewer HvH games played, so the statistics are much more uncertain.  Also the humans don't receive the constant inflation we get when newbies arrive, lose points to bots, then quit arimaa (the points are then "redistributed" from the bots by regular players).


SLOW FAST NAME
1845 1946 Fritzlein
1769 1822 99of9
1750 1457 omar
1730 1780 robinson
1701 1594 Spunk
1697 1516 clauchau
1662 1581 Belbo
1647 1401 jdb
1611 1521 rajah
1602 1481 Adanac
1600 1560 Paul
1596 1552 Asturianuco
1594 1534 Groumpf
1589 1559 kerdamdam
1588 1527 qsasha
1585 1527 boris_toplak
1584 1542 siddiqna
1577 1554 dethwing
1570 1527 adsyed01
1566 1519 naveed
1564 1549 Jonathan
1557 1495 Magrathean
1552 1534 YunK
1552 1508 travis
1549 1651 PMertens
1545 1559 spela
1543 1517 xabiron
1543 1517 pikachamp
1543 1517 brad
1537 1523 BLooodyANgel
1536 1524 Yaron
1531 1479 Yuri
1529 1518 asyed94
1529 1509 mouse
1528 1511 fdailey
1526 1504 Blyx
1525 1535 Orc
1522 1561 omarFast
1518 1371 Tore
1512 1500 RonWeasley
1512 1487 BlackKnight
1511 1521 Yusei
1508 1503 Josh
1507 1494 botkiller
1506 1503 ntroncos
1500 1463 ytri
1498 1536 6sense
1497 1687 kamikazeking
1497 1503 WagnerK
1496 1490 jarrausi
1496 1453 antoniotheripper
1495 1468 adannada
1494 1469 Sameer
1494 1516 craig
1494 1497 nbarriga
1494 1470 camperman
1492 1497 cloakski
1492 1446 trevor
1489 1501 OLTI
1488 1422 bleitner
1486 1524 Lucky
1484 1464 Vinvin
1483 1510 dtj
1483 1477 arvindn
1483 1393 Arimanator
1482 1484 gsyed
1481 1473 Moi
1478 1483 Scott
1478 1470 Rileyjal
1477 1466 Ytterbium
1476 1491 Renaissance
1475 1460 BenW
1475 1465 RainMan
1474 1494 Guest207
1471 1466 fotland
1470 1457 TheMadHair
1469 1500 Juha
1469 1359 megamau
1468 1485 AjedrezDude
1466 1453 AmitChaturvedi
1465 1450 kissl
1464 1476 miri
1464 1476 RSA
1464 1476 Robert
1464 1476 Yold
1463 1477 Kobold
1462 1485 jshira
1458 1490 taral
1458 1472 HEx
1457 1483 Kristijan
1457 1483 Keeps
1457 1483 ih8evilstuff
1457 1483 robstar
1457 1483 handsomestofall
1457 1489 xquezme
1456 1484 Guest383
1455 1423 Aaron
1454 1471 v_dhanasekar
1454 1397 Ryan_Cable
1453 1468 Darkenrahl
1452 1488 illz
1452 1481 Arimaardvark
1451 1500 brownsugar726
1450 1435 whiteKnight
1450 1481 ImranG
1449 1499 CeeJay
1448 1492 ugaiW
1448 1480 TipAndMe2
1448 1448 kraj
1444 1458 keita
1442 1500 PatoGuy
1439 1461 carolaina
1434 1458 yarnalito
1433 1470 Nate1729
1430 1350 jemicobel
1429 1447 novicehex
1427 1430 lihanzo
1426 1337 carlsquared
1424 1478 Sophi05
1422 1454 Jimmy_Newtron
1422 1471 blue22
1421 1414 haizhi
1418 1469 naveed4
1416 1468 Greytle
1414 1470 coachbudka
1411 1441 pcpdams
1409 1462 CrazyMan04
1404 1448 Monedero
1397 1429 matjaz
1376 1483 Imran
1361 1465 junaid
1307 1406 gern
1305 1184 Gregorius
1299 1369 eric
1262 1363 Keith
1222 1454 MrBrain
1216 1470 Rabbitball

Title: Re: Omar = OmarFast   ,   bot_bomb = bot_spe
Post by PMertens on Jun 9th, 2005, 1:04am

Quote:
Paul, Fritz did a fairly complete analysis of that with the current real ratings system, and found that ratings were not strongly affected by vs bot or vs human play


I know ;-) And the way I remembered it I was on the one extreme side of his chart  8)

Well - since your new calculations make me look even worse than the previous ones ... just forget that I asked for it  ;D

Title: Re: Omar = OmarFast   ,   bot_bomb = bot
Post by omar on Jun 12th, 2005, 11:22am
Thanks for trying this out Toby. I was planning to get back to this once we finished the WC format discussion. I would still like to try out the model where the actual time used per move is considered.

If you can send me the code for this I can upload it to the arimaa site so others can download it and try it out also. Or if it's not too long, just post it here.


Title: Re: Omar = OmarFast   ,   bot_bomb = bot
Post by Fritzlein on Jun 13th, 2005, 12:25am
Thanks a bundle for trying this out!  The results don't seem implausible.  The big differences in fast and slow ratings for some people give us an incentive to get this sorted out as a high priority for the rating system, probably even higher than the "selection of opponents" problem and the "bots don't learn" problem.

A technical question: Why doesn't bot_Bomb2005Blitz have a slow rating of 1500?  If everyone starts with both ratings at 1500, and every game that bot_Bomb2005Blitz plays counts 100% towards fast and 0% towards slow, then shouldn't the slow rating of bot_Bomb2005Blitz remain unchanged throughout?


on 06/08/05 at 21:00:10, 99of9 wrote:
3) Don't pay too much attention to the differences in ratings of bots.  As Fritz mentioned earlier, because they are usually constrained to only play one time control, only their weighted average at their preferred control matters.  But they act correctly at sucking/giving a correct mixture of ratings to people who play poorly/well at their particular time control.


Yes, it would seem that _every_ bot should have a "slow" rating lower than its "fast" rating.  The fact that this doesn't happen is plausibly attributable to almost every bot playing all its games at the same speed, and that some of the bot games played at varying speeds were actually bot vs. bot games.

But there is one case where there might be enough data for a meaningful test, namely the championship bot Bomb2005.  Could you run the program again treating Bomb2005CC and Bomb2005Fast and Bomb2005Blitz all as the same bot playing at three different time controls?


Quote:
7) It's interesting that this effect is bigger than any that Fritz has found in his analysis of the ratings inaccuracies.  Perhaps this should be the first thing we focus on in reforming the rating system?


I will repeat the analysis I did when I grouped the games by opponent type (vs. bot / vs. human), only this time I will group the games by speed (30 seconds and under / 45 seconds and over).

Title: Re: Omar = OmarFast   ,   bot_bomb = bot
Post by 99of9 on Jun 13th, 2005, 1:47am

on 06/13/05 at 00:25:49, Fritzlein wrote:
A technical question: Why doesn't bot_Bomb2005Blitz have a slow rating of 1500?  If everyone starts with both ratings at 1500, and every game that bot_Bomb2005Blitz plays counts 100% towards fast and 0% towards slow, then shouldn't the slow rating of bot_Bomb2005Blitz remain unchanged throughout?

I shared out the extra time due to the initial time reserve and the 60s first move, throughout the game (ie the number of moves in the game matters).  Most 15s games actually end up being about 17s games.
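
As a rough illustration of why that matters, assuming the same log-time mix with 15s and 8min cutoffs described earlier in the thread (the function below is a sketch with made-up names, not the simulation code):

```python
import math


def slow_share(seconds_per_move, fast_cutoff=15.0, slow_cutoff=480.0):
    """Fraction of a game counted toward the slow rating (log-time mix)."""
    if seconds_per_move <= fast_cutoff:
        return 0.0
    if seconds_per_move >= slow_cutoff:
        return 1.0
    return (math.log(seconds_per_move / fast_cutoff)
            / math.log(slow_cutoff / fast_cutoff))


# A nominal 15s game that is effectively played at 17s per move.
share_17s = slow_share(17.0)
```

Here `share_17s` is a little under 4%, so each such game still nudges the slow rating slightly away from 1500.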


Quote:
Yes, it would seem that _every_ bot should have a "slow" rating lower than its "fast" rating.

This phenomenon in real ratings would certainly not show up in this system if a bot was rated above 1500 and only ever played 2 min time controls.  But it wouldn't adversely affect humans anyway (unless later it suddenly started playing a different time control).

Title: Re: Omar = OmarFast   ,   bot_bomb = bot
Post by Fritzlein on Jun 13th, 2005, 1:55am
When I say that every bot should have a slow rating lower than its fast rating, I mean that that's how bots actually perform relative to humans.  But I don't expect fast/slow ratings to behave properly when bots only play at a single time control.

I was saying that we shouldn't be alarmed if some bots have a slow rating higher than their fast rating, since playing at only one time control can cause weirdness.  Also I was saying that the only good test of whether the system correctly shows that bots are better fast than slow is to run the numbers with all three Bomb2005 bots considered as a single bot.  It will be only one data point, but unlike all the other data points, it will be worth something.  :-)

Title: Re: Omar = OmarFast   ,   bot_bomb = bot
Post by 99of9 on Jun 13th, 2005, 2:04am
Yes, you are right, that's what I meant by "real fast/slow rating".  Sorry I didn't adequately communicate my agreement.  I will have a go at bomb sometime - I think your assumption will be shown correct, but there are ways to understand if it isn't.

Title: Re: Omar = OmarFast   ,   bot_bomb = bot
Post by 99of9 on Sep 3rd, 2008, 5:10am
Just for interest I ran this program over an up-to-date database; here are the results.

HvH slow ratings

1      chessandgo      2122
2      Fritzlein      2089
3      RonWeasley      1985
4      UltraWeak      1919
5      99of9      1874
6      Adanac      1867
7      arimaa_master      1855
8      jdb      1807
9      blue22      1782
10      mistre      1780
11      clauchau      1778
12      omar      1767
13      PMertens      1735
14      Belbo      1713
15      Spunk      1701
16      robinson      1698
17      Soter      1695
18      mdk      1673
19      Rabbit      1672
20      thorin      1669
21      Swynndla      1659
22      OLTI      1654
23      The_Jeh      1643
24      ArifSyed      1641
25      Brendan      1629
26      Ryan_Cable      1625
27      Arimabuff      1622
28      Tanker_JD      1621
29      woh      1617
30      petitprince      1614
31      rajah      1611
32      IceD      1606
33      xquezme      1603
34      Asturianuco      1596
35      Grey      1596
36      Groumpf      1594
37      JacquesB      1593
38      acroninj      1592
39      Ump      1591
40      qsasha      1588
41      mightybyte      1585
42      boris_toplak      1585
43      siddiqna      1584
44      ChrisB      1583
45      LAbiuso      1583
46      Stonkie      1583
47      xepo      1581
48      DorianGaray      1578
49      dethwing      1577
50      art      1576
51      Tuks      1572
52      adsyed01      1570
53      ArimaaCap      1567
54      kurthyl      1565
55      KT2006      1564
56      Guest5409      1564
57      Jonathan      1564
58      arishiki      1561
59      BlackKnight      1561
60      jonaspojken      1560
61      klabe      1560
62      ttt      1560
63      DanTilkin      1559
64      Magrathean      1557
65      ntroncos      1554
66      YunK      1552
67      KamiKazeKiwi3      1552
68      travis      1552
69      Archonn      1548
70      tize      1548
71      BlackShadow      1546
72      Paul      1545
73      Asubfive      1545
74      rusty      1545
75      Yzaxtol      1545
76      spela      1545
77      brad      1543
78      xabiron      1543
79      pikachamp      1543
80      GordonBlack      1541
81      Tau      1540
82      knarl      1539
83      Gesuma      1538
84      i_am_you      1537
85      gert7      1537
86      glitch      1537
87      ZeroOne      1537
88      marcgb      1537
89      BLooodyANgel      1537
90      Agt      1537
91      Ice      1537
92      Virgeist      1536
93      Raymond      1536
94      Yaron      1536
95      mikbuster      1535
96      Heidissimo      1535
97      kerdamdam      1532
98      nauboone      1532
99      Yuri      1531
100      DieInnereMelone      1530


HvH fast ratings

1      Fritzlein      2075
2      chessandgo      2022
3      PMertens      1854
4      robinson      1838
5      Adanac      1811
6      99of9      1786
7      mdk      1785
8      Soter      1771
9      arimaa_master      1746
10      IceD      1698
11      Brendan      1680
12      Arimabuff      1649
13      Ryan_Cable      1645
14      MrObvious      1641
15      ArifSyed      1640
16      naveed      1639
17      Swynndla      1630
18      kamikazeking      1620
19      Tuks      1612
20      ttt      1608
21      UltraWeak      1606
22      Spunk      1594
23      Guest5409      1585
24      willwould      1585
25      DorianGaray      1585
26      seanmcl      1585
27      Belbo      1583
28      challenger      1582
29      mistre      1579
30      BlackKnight      1570
31      Gesuma      1570
32      xepo      1569
33      Adlai      1568
34      Raymond      1566
35      BlackShadow      1565
36      Yzaxtol      1563
37      omarFast      1561
38      nbarriga      1560
39      spela      1559
40      Gorgapor      1555
41      dethwing      1554
42      jonaspojken      1554
43      Asturianuco      1552
44      Jonathan      1549
45      chessdiva27      1549
46      gert7      1548
47      petitprince      1546
48      Lautresault      1545
49      nauboone      1545
50      tize      1544
51      kurthyl      1543
52      OLTI      1543
53      Kraizy_Dave      1543
54      Sir_Twit      1543
55      sakano      1542
56      siddiqna      1542
57      Tau      1541
58      aXiom      1539
59      DieInnereMelone      1538
60      ntroncos      1537
61      JCricket      1537
62      art      1535
63      emeryaj      1535
64      Rancca      1535
65      Orc      1535
66      megamau      1535
67      Groumpf      1534
68      YunK      1534
69      snetnis      1534
70      xquezme      1532
71      Grey      1531
72      jl_      1531
73      archigavr      1531
74      BilalQ      1529
75      Garyth      1528
76      ArimaaCap      1527
77      GordonBlack      1527
78      qsasha      1527
79      boris_toplak      1527
80      adsyed01      1527
81      Fleisch      1526
82      Ice      1525
83      knarl      1525
84      Virgeist      1524
85      Lucky      1524
86      Yaron      1524
87      BLooodyANgel      1523
88      LifeBlade      1523
89      i_am_you      1523
90      glitch      1523
91      marcgb      1523
92      ZeroOne      1523
93      smonroy      1522
94      acroninj      1521
95      drleper      1521
96      rajah      1521
97      Yusei      1521
98      mightybyte      1521
99      pikachamp      1517
100      xabiron      1517


