||||||||||||||||||||||||
Title: Omar = OmarFast , bot_bomb = bot_spe Post by 99of9 on Dec 22nd, 2004, 12:53pm Interestingly Omar's rating is almost equal to OmarFast's rating, and bot_bomb's is almost equal to bot_speedy's. Perhaps quality is nothing to do with time control :-) |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot Post by omar on Dec 27th, 2004, 1:06am I played a lot of fast games with the 'omar' account and lost to speedy. I think that's why it's close to the omarFast account. In the future I would like to classify games into different speed categories like fast, regular, slow and postal and then have separate ratings for each. Thus each person would have 4 different ratings; one for each speed category. Omar |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot Post by MrBrain on Dec 27th, 2004, 5:57pm I love that idea. I think what you have as "Slow" could be the official time control for an arimaa match. I would recommend no slower than 2 minutes per move. If the idea of choosing the winner at the end of the game time limit by who's used less time (rather than the scoring function) is implemented, we could have a time control for an official match of 2/2/100/10/5. This time control could be used for all official human-human, computer-computer, and human-computer games. This would ensure that no game would exceed 5 hours. The end-effect of this control with the new deciding mechanism would be that in a very long game (more than 70 or so moves), players would have to eventually move faster to avoid using more than half of the time. |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot_spe Post by MrBrain on Dec 27th, 2004, 6:05pm But actually, each category would have to fall within a range. Here's a proposal:
0:45 per move or faster -- fast
0:46 - 1:30 per move -- regular
1:31 - 1:00:00 per move -- slow
More than 1 hour per move -- postal
How's that seem? |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot_spe Post by MrBrain on Dec 27th, 2004, 7:11pm Actually, I don't like the connotation of "slow". Call them instead: Bullet, Fast, Regular, Postal |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot Post by Fritzlein on Dec 28th, 2004, 12:43pm There was a wacky idea floated in the discussion of ICC and FICS ratings, which I think has some merit. The idea is that we want to distinguish skill at fast play from skill at slow play, but we don't want to have to maintain four different ratings (or five or six as ICC does). So we fix a super fast time, say 15 seconds per move, and have anything that fast or faster contribute only to the fast rating. Also we fix a super-slow time, say 4 minutes per move, and have anything that slow or slower contribute only to the slow rating. For any time control in between, i.e. for most games, we have it contribute to both ratings in geometric proportion. The formula for how much the slow rating is affected would be (lg(seconds-per-move/15))/4, where lg is the base 2 logarithm. For example
You may wonder why I chose 15 seconds per move as the fast end rather than the fastest time control currently available. My hunch is that if you start having a "fast" rating, and you offer 15 seconds per move as an option for people, then they will play a lot of it. On the Internet chess club, the most popular time control is 5 minutes per game, and a great many games are played at even faster speeds, so I think there would be a popular demand for 15-second games. I certainly would like to try out speedy at that time control! :-) I chose 4 minutes per move as the slow end in order to distinguish postal games from slow tournament games. Actually, I doubt there will be many games slower than 2 minutes per move except for postal games. It might make as much sense to put the upper limit at 3 minutes per move, but I'm partial to widely spacing the extremes. [Edit: I messed up the following description in my original post.] Anyway, to calculate an example, suppose Bomb has a fast rating of 2200 and a slow rating of 1600, whereas I have a fast rating of 2100 and a slow rating of 2000. Suppose I play Bomb at a time control of 2 minutes per move. Then my effective rating is 2000*0.75 + 2100*0.25 and Bomb's effective rating is 1600*0.75 + 2200*0.25. So for this particular speed my edge in rating is 2025 to 1750. If I lose, I would lose about 25 points, and 75% of that adjustment would go on our respective slow ratings, while 25% of that adjustment would go on our respective fast ratings. Well, maybe the math is too weird, but I like the general idea. Bots could play at a variety of time controls without needing an extra account (e.g. bomb/speedy) and without messing up the ratings (e.g. clueless vs. speedy at 30 seconds has as much chance of winning as clueless vs. bomb at 120 seconds, but the former gives clueless more rating points for no reason). All time controls are distinguished, in that they affect the ratings in different proportion.
The scale is continuous without weird jumps like 59 seconds per move affecting one rating and 60 seconds per move affecting a completely different rating, plus you don't have to have four (or more) different ratings. What do y'all think? |
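The fast/slow split described in the post above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are made up for the example, and clamping the fraction to the range 0 to 1 outside the 15-second and 4-minute endpoints is one reading of "contribute only to the fast rating" and "only to the slow rating".

```python
import math

FAST_SPM = 15.0   # seconds per move counted as 100% fast (from the post)
SLOW_SPM = 240.0  # 4 minutes per move, counted as 100% slow (from the post)

def slow_fraction(seconds_per_move):
    """Share of a game that counts toward the slow rating.

    Equals lg(seconds_per_move / 15) / 4 between the endpoints;
    clamping to [0, 1] outside them is an assumption of this sketch.
    """
    frac = math.log2(seconds_per_move / FAST_SPM) / math.log2(SLOW_SPM / FAST_SPM)
    return min(1.0, max(0.0, frac))

def effective_rating(fast, slow, seconds_per_move):
    """Blend the two ratings for one particular time control."""
    s = slow_fraction(seconds_per_move)
    return slow * s + fast * (1.0 - s)

def split_adjustment(delta, seconds_per_move):
    """Divide a rating change into (fast part, slow part)."""
    s = slow_fraction(seconds_per_move)
    return delta * (1.0 - s), delta * s

# The worked example from the post: 2 minutes per move.
print(effective_rating(2100, 2000, 120))  # Fritzlein
print(effective_rating(2200, 1600, 120))  # Bomb
print(split_adjustment(-25, 120))         # a 25-point loss, split fast/slow
```

At 2 minutes per move the slow fraction is lg(8)/4 = 0.75, which reproduces the 2025 and 1750 figures quoted in the post.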
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot_spe Post by fotland on Dec 28th, 2004, 1:41pm I think 15 second games will give speedy a big edge, but give it a try :) |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot_spe Post by Fritzlein on Dec 28th, 2004, 4:57pm Yeah, it's tough playing that fast against a computer, but it sure is fun. We need a way for bots to offer and/or accept multiple time controls. It would be great if speedy would take matches at a time control 30 seconds or less, say, so that people could choose whether to go for the bullet game or merely blitz. |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot Post by 99of9 on Jan 3rd, 2005, 6:18am I think Fritz's idea is GREAT. This would totally solve our time-based-ratings conundrum without introducing millions of different volatile ratings. It also doesn't look too hard to introduce. The only real assumption is that if someone is better at 1 min than they are at 3 min, then they will be even better still at 30 sec. That seems quite reasonable to me. |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot_spe Post by MrBrain on Jan 3rd, 2005, 7:52am I also like the Fritzlein idea. Perhaps just have one more rating - postal? If it's just folded into the slow rating, this seems sufficient though. Might need to tweak the percentages. 45 seconds going almost 40% to slow rating seems not quite right. |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot_spe Post by MrBrain on Jan 3rd, 2005, 7:57am How about this:
time            fast    slow
0:15            100%      0%
0:30             95%      5%
0:45             85%     15%
1:00             70%     30%
1:15             50%     50%
1:30             30%     70%
2:00             15%     85%
2:30              5%     95%
3:00, postal      0%    100%
It's not quite as mathematical, but it does incorporate time controls used in the past, and it seems a little closer to how the different time controls feel (at least to me) from playing the game. What do you think? |
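To see how far MrBrain's table sits from Fritzlein's logarithmic formula, the two can be printed side by side. The snippet below is a rough sketch: the dictionary simply transcribes the slow column of the table, and `log_slow_pct` is a hypothetical helper that applies lg(seconds/15)/4 with the 15-second and 4-minute endpoints from the earlier post.

```python
import math

# Slow percentages from MrBrain's table, keyed by seconds per move.
TABLE_SLOW_PCT = {15: 0, 30: 5, 45: 15, 60: 30, 75: 50,
                  90: 70, 120: 85, 150: 95, 180: 100}

def log_slow_pct(seconds_per_move, fast=15.0, slow=240.0):
    """Slow percentage under the logarithmic scheme, clamped to 0..100."""
    frac = math.log2(seconds_per_move / fast) / math.log2(slow / fast)
    return 100.0 * min(1.0, max(0.0, frac))

def compare():
    """Return (seconds, table %, logarithmic %) for every row of the table."""
    return [(t, pct, round(log_slow_pct(t), 1))
            for t, pct in sorted(TABLE_SLOW_PCT.items())]

for t, table_pct, log_pct in compare():
    print(f"{t // 60}:{t % 60:02d}  table {table_pct:3d}%  log {log_pct:5.1f}%")
```

The largest gaps fall at the fast end, which is exactly where the two posters disagree.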
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot_spe Post by MrBrain on Jan 3rd, 2005, 8:20am And all this recent talk about time controls gives me an idea as to how to incorporate a time control into the wording of the Arimaa challenge itself (something I think needs to be done). The human designee can choose (before the match) the time control that they like best with the following two restrictions: 1. The amount of time per move shall not exceed 3 minutes (the first 100% slow time control). 2. The amount of time before the game must end shall not be less than 6 hours. (Whether you do 3 hours max per player, or 6 hours total with the time-decision mechanism.) By doing the above, you can avoid what happened last year, which is that Omar got bored with the 3-minute time control. While the bots are still not up to human level, the human can choose a faster control so as to not drag the games out. Perhaps 15 years from now, when bots may be quite strong, the human can choose the maximum time to think to maximize their chances against the killer bot. But while the flexibility is nice for the humans, the two restrictions ensure that a time control is not chosen that is purposely unfair to the bot. |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot Post by Fritzlein on Jan 3rd, 2005, 11:40am on 01/03/05 at 07:57:29, MrBrain wrote:
The first thing that jumps out at me is that there should be a much bigger distinction between 15-second games and 30-second games, not just 5%. The difference in how it feels to play at those two time controls is huge. In my opinion, it's as great as the difference between a 30-second game and a 1-minute game, or between a 1-minute game and a 2-minute game. That's why I suggest a logarithmic scale. An extra 15 seconds per move doesn't matter so much if you've already got 90 seconds to think, but it's huge if you had only 15 seconds in the first place. In general doubling the time to think should double its closeness to being postal. |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot_spe Post by MrBrain on Jan 3rd, 2005, 12:18pm Well, again, I'm just going on how the time controls "feel". To me, 30 seconds doesn't seem anything other than really fast. It doesn't seem more than 5% slow. Having 45 seconds going 40% to slow seems really out of whack to me. I guess it maybe is my preference for being able to calculate variations. 1:15 seemed very fast to me during the championship. I was rushed on almost every move. That's why I don't think it can be more than 50% slow. |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot_spe Post by MrBrain on Jan 4th, 2005, 7:26am And using your same reasoning, if we had a 7-second control, then 15 seconds would be 20% or 25% "slow". I think just because we're being somewhat insane and adding a 15-second control (which I will probably never play), that shouldn't be an excuse to make 30 seconds then count 25% towards slow. 30 seconds and 15 seconds are both very "fast". To me it's the difference between ridiculous speed and ludicrous speed. Just because we have ludicrous speed, it doesn't mean that ridiculous is slow. You have to have some limit where it's just totally fast. Before anyone even mentioned 15 seconds, 30 seconds would have been a reasonable fast "boundary". Anything below could have just been 100% fast. I'm making a concession in my table by even calling 30 seconds 5% slow. 25% is way too much in my opinion. |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot Post by Fritzlein on Jan 4th, 2005, 9:32am I know you like to play at slower speeds, Mr. Brain, and I appreciate your arguments for having the World Championship games at a slow speed, but it's going a bit too far to call fifteen seconds per move a ludicrous speed for anyone else to like to play at. You know perfectly well that lots of people like to play fast games, and in particular that a large majority of games on the Internet Chess Club are played at time controls even faster than 15 seconds per move. But in any event, under the logarithmic scheme that I propose, the endpoints are essentially irrelevant. They could be extrapolated to a time control that is indeed ludicrous, without changing anything important. Maybe the whole idea of having endpoints is just a distraction. As we learn in high school algebra, two points define a line, but you can also define a line via a point and a slope. If the endpoints are a point of contention, we could think of slow-fast ratings in point-slope terms instead. We could define the "normal" time control as one minute per move, and give people a single number as their rating based on that time control. Then we add the assumption that some people benefit more than others from having extra time to think (and equivalently that they are hurt more than others by having less time to think), and use that to define a "slope". For example, let's say my rating should be 2000 at one minute per move, and my slope is -50 rating points per doubling of the time. At two minutes per move my rating would be 1950 while at 30 seconds per move my rating would be 2050. You may think it is ridiculous to play at 15 seconds per move, but if I actually wanted to play that fast, we would extrapolate that my rating at that speed would be 2100. I would join you in thinking that it is ridiculous to play at 3.75 seconds per move, but if we did, we can guess that my rating would be 2200 at that speed.
Now whether we try to determine two endpoints or try to determine a point and a slope, either way we determine a line. The only way endpoints would be relevant is if people played games beyond the endpoints. If all games were played between 30 seconds per move and two minutes per move, it wouldn't matter practically whether we chose those as our endpoints or 1 second per move and 1 week per move as our endpoints. (Well, it would flatten all the slopes somewhat, but that's just changing the scale, like multiplying everything in the current system by two wouldn't have any practical effect.) The reason to have endpoints at all would be that we wanted to cap the effect of time on a game. We might reason that beyond giving both players four minutes per move to think, giving extra time isn't going to change the balance of playing strength. By the same token we could say that shortening the time below 15 seconds per move also won't change the relative chances of the two players. I'm not sure that I buy either of these arguments. In particular, I guarantee you from personal experience that there is a significant difference in Bomb's playing strength between 15 seconds per move and 30 seconds per move. To you both time controls are "faster than I want to play", but those of us who are interested in playing that fast need to draw distinctions. |
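The point-slope version of the idea is easy to check numerically. Below is a minimal sketch, assuming a base time control of one minute per move as in the post; the function names and the optional capping at 15 seconds and 4 minutes are illustrative choices, not part of any agreed system.

```python
import math

def rating_at(seconds_per_move, base_rating, slope_per_doubling, base_spm=60.0):
    """Rating extrapolated along a line in log2(time), anchored at base_spm."""
    return base_rating + slope_per_doubling * math.log2(seconds_per_move / base_spm)

def capped_rating_at(seconds_per_move, base_rating, slope_per_doubling,
                     fastest=15.0, slowest=240.0, base_spm=60.0):
    """Same line, but the time is clamped to [fastest, slowest] first,
    so time has no further effect beyond the endpoints."""
    clamped = min(slowest, max(fastest, seconds_per_move))
    return rating_at(clamped, base_rating, slope_per_doubling, base_spm)

# Fritzlein's example: 2000 at one minute per move, slope -50 per doubling.
for spm in (120, 60, 30, 15, 3.75):
    print(spm, rating_at(spm, 2000, -50))
```

With the endpoints capped, the 3.75-second case would stay at the 15-second value instead of continuing to 2200, which is the practical difference the two posters are arguing about.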
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot_spe Post by MrBrain on Jan 4th, 2005, 10:24am That's a very good explanation. But I disagree that you can really fit one's performance linearly onto a time-logarithmic scale. Realistically, there is a limit to performance change at very high speeds, and at very long controls. This means there will be horizontal asymptotes at the ends of a graph of performance vs. time control for any entity. (For a computer program, the fast-end asymptote will be reached more slowly, but there is an asymptote there nonetheless.) Compare our two schemes. Mine tries to take into account a limit on what a reasonable speed (for a human) is, on both ends. Yours does not. If one were to plot the performance (rating) of most players at different speeds versus the percentages that both you and I have proposed, I would wager that the more linear graph would be with my percentages. (Is there any way that we can extract performance data from existing games at different controls? This would likely be complicated by the facts that players improve over time, and that they tend to play the same time control repeatedly over short periods.) |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot Post by MrBrain on Jan 4th, 2005, 10:30am on 01/04/05 at 09:32:20, Fritzlein wrote:
And according to this type of reasoning (not taking into account leveling off of performance), one could incorrectly conclude that, given enough time to think, someone whose slope is steep will be able to beat any player whose slope is not as steep. On the ICC, my chess rating difference between bullet (1 minute) and standard (15 minutes) is about 900 points. Therefore, I could conclude that if I were to play a game against Garry Kasparov with a 1-week-per-move control, I should be able to win the game. This is obviously not the case. |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot_spe Post by MrBrain on Jan 4th, 2005, 10:37am Well, despite the disagreement here, I still think we're on the right track. If we want to do the "normal" rating with another parameter for rating change per doubling of time, just as a starting point to try it out, I'd not have any problem with that. And by the way, my reference to "ridiculous" and "ludicrous" was not to criticize, but to be a little humorous (i.e. SpaceBalls) and simply to make a point. |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot Post by Fritzlein on Jan 4th, 2005, 11:13am One reason to have a cutoff at the fast end of the time control is that some time is used up by Internet lag, moving the pieces with the mouse, the client displaying the previous move, etc. At some point one is measuring physical reflexes or connection speed or something other than thinking fast about Arimaa. As much as I like fast games, I might want to consider, say, 10 seconds per move as the fastest measurable speed, given the current client, Internet, and server situation. One reason to have a cutoff at the slow end is that at some point one stops using all the time available. Omar reports that at 3 minutes per move against Bomb, he started doing other things in the middle of the game. When we start the postal tournament at one move per day, I seriously doubt that I will ever spend more than ten or fifteen minutes thinking about a particular move; I expect that usually I will spend less. I guess I could see extending the slow endpoint to 6 or 8 minutes per move at a stretch. Assuming that a player's strength is linear in the logarithm of time allowed is certainly an approximation, and I agree that we can expect that the approximation gets worse and worse as you go towards the extremes, which is a more general reason for wanting to have a cutoff. It's a model which is guaranteed to break down if you test it rigorously (just like the Elo model of chess performance itself breaks down under scrutiny). The advantage of the proposal is that it does something somewhat reasonable to recognize that fast time controls affect some players more than others (particularly computers versus humans) and yet doesn't need four or five different ratings per person, with abrupt cutoffs between the categories. |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot Post by 99of9 on Jan 4th, 2005, 11:14am on 01/04/05 at 10:24:39, MrBrain wrote:
Please attempt to justify this asymptote hypothesis - I don't believe it. The depth a computer program can search as a function of time limit is theoretically exactly logarithmic. It's hard to say exactly how depth is related to strength, but it's probably fair to guess that they're basically proportional. So yes, a computer's performance gets better logarithmically with increasing time controls. A human's ability changes even more as a function of time (i.e. a higher gradient on the log plot). This is clear when you consider that when you get down to very short times you become unable to even complete the full 4 steps in time. I'd argue it's not an asymptotic plateau, it's in fact closer to an asymptotic fall away!!! By the way, we are confusing the issue when we say that computers get "better" at fast time controls. What we mean there is that humans get worse quicker so fall behind. Unless there is good empirical evidence otherwise, in my opinion it is better to go with Fritzlein's 2-parameter model rather than Brain's 9-parameter model. |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot Post by Fritzlein on Jan 4th, 2005, 11:25am on 01/04/05 at 11:14:23, 99of9 wrote:
Actually, I believe that there have been studies done on the playing strength of chess programs which conclude that their rating (using a fixed time control and humans as a yardstick) is roughly linear in search depth. This would suggest that a variable time control has an effect on computers proportional to the logarithm of search time. However, I read this long ago, before chess computers were World Championship caliber, so I wouldn't be surprised if there has been some asymptotic falling off at the high end. [Edit] A quick Google search gave me this link, which disputes the notion of asymptotic falling off, and implies that a linear relation between search depth and playing strength may well hold indefinitely: http://supertech.lcs.mit.edu/~heinz/dt/node49.html [Further Edit] 99of9 points out that my proposal would over time give everyone a different slope, but that the varying slopes would only be relative. Computers would have negative slopes, not because they get worse at higher speeds, but because humans get better faster. An alternative would be to use computer self-play to fix their own positive slope (which might be around 100 points per doubling of search time) and thus anchor the slopes of the entire system in a way similar to the plan for using computers to anchor the ratings in another thread. Presumably then fixed-depth searchers like Arimaazilla would have a zero slope and humans would have a steep slope of 200 or more points per doubling of time. At first blush, my intuition is that computers would actually be more accurate in anchoring the scale of the slopes than in anchoring the scale of the ratings, because I expect any computer with any reasonable time management to have approximately the same slope. |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot_spe Post by MrBrain on Jan 4th, 2005, 12:30pm Don't confuse search depth with rating. Yes, a computer can think more ply, but so can its opponent, human or mechanical. Again, if you compare two players, and there's a slope difference between the two at a certain speed, common sense dictates that that slope difference cannot continue indefinitely. Just look at my chess example to see why. And as was correctly pointed out, there are limits for humans (interface, boredom). These same limits may not apply for computers, since they can move much faster and don't get bored. But regardless, I still think we should have some limit that approximates what happens in reality. As for the asymptotes, a simple example should suffice: If we play a game with a 10-day limit per move, player A should expect a certain win % versus player B. If we make it a 20-day limit, is this really going to change the % by as much as 1-minute versus 2-minutes? Of course not. Why? It's leveling out. There's some limit to how good a player can be, regardless of how much time per move is allotted. Again, I'm never going to beat Kasparov, even if I had a year to think about each move. Why? Because the rating gain I get by going from 1 to 15 minutes for the game cannot hold up indefinitely. |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot Post by omar on Jan 5th, 2005, 1:29am Great discussion. I was contemplating what times to use for the boundaries between the different categories of speed and also how many speed categories to have. This method of using two ratings with percent contributions solves those problems. I think we have to keep in mind that the pace of the game is not just controlled by time per move, but also the starting reserve. So we might need to do something like divide the number of seconds in the starting reserve by some number like 45 that represents the average number of moves per game and add that to the time per move (in seconds). Also I know from experience that having a higher max reserve limit makes the game feel slower (i.e. 1/2/100/5 feels slower than 1/2/100/2) but I'm not sure how to incorporate that. One other thing to keep in mind is that maybe some humans play best at say a pace of 1 min per move but get worse as speeds get faster or slower than that. A line may not be the best model for humans. So in this case maintaining different speed categories would model the situation better (the more categories the better it can be modeled). Or maybe with the other method we can model it with two lines or a more complex function. For computers though I think a single line model is a pretty good fit. The category division method has the advantage that we don't have to predetermine how to model the effects of speed on ratings. Also we can have more categories if we want the model to be better. But it also has a lot of disadvantages, like having to maintain a lot of rating numbers and predetermining boundaries between the categories. So as you can probably tell, I'm still contemplating which method to use. |
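Omar's suggestion for folding the starting reserve into the pace can be written out as a one-line calculation. The sketch below is only that: the function name is hypothetical, the divisor 45 is the example figure from the post, and the max reserve limit is left out because the post itself says it is unclear how to include it.

```python
AVG_MOVES_PER_GAME = 45  # example figure from the post, not a measured value

def effective_seconds_per_move(seconds_per_move, starting_reserve_seconds,
                               avg_moves=AVG_MOVES_PER_GAME):
    """Time per move plus the starting reserve spread over a typical game."""
    return seconds_per_move + starting_reserve_seconds / avg_moves

# One minute per move with a two-minute starting reserve.
print(effective_seconds_per_move(60, 120))
```

The result could then be fed into whichever fast/slow weighting is chosen, in place of the raw time per move.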
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot_spe Post by MrBrain on Jan 5th, 2005, 7:22am In general, I don't feel that starting reserve is a big issue, but it does come into play at the start of the game (e.g. Omar vs. Brain 2004 WC). |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot_spe Post by clauchau on Jan 6th, 2005, 1:31pm Yes, interesting discussion. One improvement with the available time controls is needed in any case I think - the gameroom should allow the two players to play according to distinct time controls. Firstly because otherwise we are getting isolated groups of players -- one group for each speed -- and their ratings may be going to get relatively wrong between different groups. Let's get accurate by allowing two different time settings for each opponent. Secondly because I'd like to play at 1 min per move against bots playing at 15 sec per move. Then, as the current discussion hints at, we might update the ratings accordingly, taking the times into account. The lightning bots ratings wouldn't decrease much against slow-playing humans. |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot Post by omar on Jan 12th, 2005, 4:05pm Hi Claude, I think you are suggesting that with each game we maintain two separate time controls; one for each player. We can then use this when computing the ratings. I kind of like the idea, but implementation-wise it's going to require a lot of changes at all levels: database, game server and game clients. Let me throw out another proposal that tries to achieve the same thing, but a little differently. After a game is over, we can calculate from the event log the average number of seconds each player took to make a move. If you open up a recently played game and scroll down the chat area you will see the average move times that were calculated from the event log. I could much more easily record these numbers with each game in the database. A program could be written to fill in these numbers for the older games. We could then use these numbers in calculating the ratings. Remember that with each game we also record what each player's rating was going into the game. Our current rating system uses something like this:
myRatingAfter = rating(myRatingBefore, oppRatingBefore, didIWin)
With the new performance rating system that we had discussed earlier, a player's rating is calculated on the fly using the player's complete game history and not just the rating after the previous game. It uses something like this:
myRatingAfterGameX = rating(oppRatingFromGameX, didIWinGameX, weightOfGameX, oppRatingFromGameX-1, didIWinGameX-1, weightOfGameX-1, ... up to Game1)
Notice that only the opponents' ratings are used in computing our rating. Also weights are used to give varying amounts of importance to the games. We discussed a lot about how to set the weights based on factors such as how many unique opponents you have played, how old the game is, etc. I suppose we could also use the performance rating system to ask a question like 'what is my rating at 20 seconds per move'.
To answer such a question we could adjust the weights of the games so that games where the player used exactly 20 seconds per move (average move time) are given a full weight of 1 and games that are farther (time-wise) from the time in question are given a lesser weight (down to a limit of 0). These weights are multiplied by the weights that would otherwise be used. We could also compute some uncertainty of the rating at the given time, based on the average of the weights. So if the uncertainty was 1 then it would mean that in all the games used in computing the rating the player happened to have a time per move that is the same as the time in question. The rating of a player that we store with the game is computed based on the average time per move the player took in that game. For example if a player took 52 seconds per move in this game then we use the performance rating of that player computed for 52 seconds per move. We might also compute and store the rating uncertainty with the game and could also use the uncertainty when determining the weight of the game. So we don't need to store any ratings with each player. Rather we just compute them on the fly from the player's game history. We should also be able to compute what the player's rating would be at a given time per move speed. When computing the rating of a player that is displayed in the gameroom it is computed without regard to any times associated with past games; thus the time component of all the weights is 1. When computing the rating of a player that is stored with the game we use the average time per move the player took in that game to bias the rating to give more weight to previous games where the player took similar time. We could have a form where given the player name and average time per move it shows what the player's rating is at that speed along with the uncertainty at that speed.
For example if a player has played all the games at 60 seconds per move and we ask what the player's rating would be at 30 seconds per move, it would be the same rating, but the uncertainty would be higher. The uncertainty would be higher still if we asked what the rating would be at 15 seconds per move; the rating would still be the same. |
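The time-based weighting in the post above leaves the shape of the weight function open. The sketch below fills that in with one possible choice, purely as an assumption: distance is measured in doublings of time, and the weight falls linearly from 1 to 0 over `width_doublings`. None of these names or parameters come from the thread.

```python
import math

def time_weight(game_spm, query_spm, width_doublings=2.0):
    """1.0 when the game's average move time equals the query time,
    falling linearly (in doublings of time) to 0.0. Shape is assumed."""
    distance = abs(math.log2(game_spm / query_spm))
    return max(0.0, 1.0 - distance / width_doublings)

def combined_weights(games, query_spm):
    """games: list of (average_seconds_per_move, base_weight).
    The time weight multiplies whatever weight the game already had."""
    return [base * time_weight(spm, query_spm) for spm, base in games]

def mean_time_weight(games, query_spm):
    """Average time weight; 1.0 means every game was played at the query speed."""
    weights = [time_weight(spm, query_spm) for spm, _ in games]
    return sum(weights) / len(weights)

history = [(60, 1.0), (60, 0.8), (60, 0.5)]  # made-up game history
for query in (60, 30, 15):
    print(query, combined_weights(history, query), mean_time_weight(history, query))
```

With this particular width, a player who has only played at 60 seconds per move ends up with all weights at zero for a 15-second query, so a real implementation would need a wider kernel or a fallback to keep the rating defined there, as the post's own example expects.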
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot Post by Fritzlein on Jan 18th, 2005, 3:23am Omar, I'm suspicious of rating the game based on how much time each player actually uses, rather than on how much time was allotted. I just now played Bomb/2004 at 3 minutes per move, but once I got a strong position I didn't need to use all my time, so I moved faster. By your proposal this would affect my rating at a faster speed, not my rating at 3 minutes per move. I could win a half dozen games against Bomb and they would all count towards my (let's say) 90 second rating, but then one time when I get in trouble and have to use up my entire time and lose anyway, that loss would be the only game counting towards my 3-minute rating. Your proposed system might well conclude that the slower I play the worse I am, when the causality is in fact reverse. I like the idea of allowing games with a time handicap. However, that would force us to try to create a system where the ratings at different time controls were on the same scale. Presumably then both bots and humans would have ratings that increased steadily with time allowed. On the other hand, I don't like the notion of having to calculate a player's rating at a given speed from scratch for every single speed. If I understand what you were proposing, Omar, the calculated rating at each speed, if plotted on a graph, wouldn't necessarily be linear or any reasonable curve. To keep the calculation time sane, I would strongly suggest making the assumption of a linear increase in rating with the logarithm of time allowed. Then we can pay attention to the endpoints only. Incorporating this into the other changes to the rating system we had talked about would then involve only adding one extra saved parameter per player per game. Instead of saving each player's rating for each game, you would save each player's "lightning rating" and "postal rating". 
If you wanted to also save the "effective rating" for each player for that game you could do so, but it wouldn't be necessary since it could be inferred from the time control and the saved endpoint ratings. If I understand how your testratings perl script worked (for the new system, pre-time-control-dependent), you just tried different ratings until you found the best match for the game history as weighted. To add a time control dependency, you would instead have to be testing pairs of endpoint ratings (rather than a single rating) to find the best fit. Unfortunately the best fit can no longer be insured by making the total wins match expected wins, since more than one pair of endpoints will do that exactly. To sop up the extra degree of freedom you would have to divide the actual wins into fast wins and slow wins by the proportionality suggested earlier, and also have the expected wins be thusly divided, and then make both sides match. One potential problem is that if every single game a player has played is at the same time control, there will be no way to get a handle on the slope. I suggest solving that by having one of our fictitious draws be at lightning speed and the other at postal speed. Is it clear what I'm suggesting overall? If I'm not mistaken the formulas I'm suggesting could be retroactively applied to our archived data. The results might be somewhat weird because most bots play at one time control only, but I'm curious what numbers would pop out nonetheless. |
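The fitting condition Fritzlein describes, splitting both actual and expected wins into fast and slow parts and making each pair match, can be sketched as a pair of residuals that a search over endpoint ratings would drive to zero. This is a hedged illustration: the Elo logistic curve used for `expected_score` is an assumption (the thread does not give the formula), the 15-second and 4-minute endpoints come from the earlier post, and game weights and the fictitious draws are left out.

```python
import math

def slow_fraction(spm, fast=15.0, slow=240.0):
    """Share of a game counted as slow, clamped to [0, 1]."""
    frac = math.log2(spm / fast) / math.log2(slow / fast)
    return min(1.0, max(0.0, frac))

def expected_score(my_rating, opp_rating):
    """Standard Elo expectation; assumed here for illustration."""
    return 1.0 / (1.0 + 10.0 ** ((opp_rating - my_rating) / 400.0))

def win_residuals(fast, slow, games):
    """games: list of (seconds_per_move, opponent_effective_rating, result),
    with result 1 for a win and 0 for a loss.
    Returns (fast actual - fast expected, slow actual - slow expected)."""
    fast_res = slow_res = 0.0
    for spm, opp, result in games:
        s = slow_fraction(spm)
        mine = slow * s + fast * (1.0 - s)   # effective rating for this game
        diff = result - expected_score(mine, opp)
        fast_res += (1.0 - s) * diff
        slow_res += s * diff
    return fast_res, slow_res

# A made-up history: one win at 2 minutes per move against an equal opponent.
print(win_residuals(2000, 2000, [(120, 2000, 1)]))
```

A pair of endpoint ratings fits the history when both residuals are zero, which uses up the extra degree of freedom mentioned in the post.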
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot Post by omar on Jan 22nd, 2005, 11:23am That's a really good point Karl. Humans do tend to play a bit faster when they are in a winning position and take more time to think when they are losing. It would tend to be a bit misleading. However, I don't think the distortion will be too much and maybe we can even compensate for it by adding 10% more to the time when the player wins and subtracting 10% when the player loses. Regardless of what system we eventually go with, we first need to implement it and test it out. I will eventually provide the code for any system that I propose. That way people can try it out, check the code for any bugs and get familiar with it. I will also provide a condensed version of the games database so we can run the systems on the real data. This way we can try out all the proposed systems before selecting one. |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot Post by 99of9 on Jun 8th, 2005, 8:05pm I've done some simulations. Although it's possible that there is still a bug or two in my code (let me know if you want the code to help look for them!), I wanted to share the early results.

I used Fritz's fast/slow interpolation method, with cutoffs at 15s on the fast side, and 8 minutes on the slow side. The 8 minute cutoff is a concession to MrBrain's arguments in this thread. I agree with him that a 4 minute game is different to a postal game. Correspondence players could reasonably argue that they feel "rushed" in a 4 minute game.

Because this simulation just uses the game results, it does not include things like:
1) Rating Uncertainties increasing 1 point per week.
2) Bots inserted into the system with a rating matching that of their "parent". All new players are given a rating of 1500.

Other technical issues:
3) Abandoned games do not count in these ratings at all; they're treated like unrated games.
4) I agree with omar that the initial time reserve matters. It has been shared out over the number of moves in the game to determine the effective number of seconds per move, as has the 60s for the first move (minus the normal time per move, which is not awarded on the first move).
5) Results are sorted according to "Slow" (8 minute+) rating.
6) I left out anyone who didn't play any rated games.

I will save my commentary on the actual ratings for a subsequent post.
For now, here they are:

SLOW  FAST  NAME
2111  2130  Fritzlein
2036  1608  Belbo
2000  1610  omar
1963  1972  99of9
1896  1832  bot_speedy
1853  1706  bleitner
1830  1625  bot_Bomb2004CC
1815  1529  clauchau
1813  1913  robinson
1813  1742  mouse
1791  1649  RonWeasley
1781  1683  Adanac
1777  1709  bot_bomb
1745  1625  Paul
1725  1797  omarFast
1717  1529  fotland
1709  1586  BlackKnight
1690  1619  ytri
1678  1561  naveed
1677  1805  PMertens
1677  1450  bot_firsttry
1672  1574  OLTI
1652  1638  Ryan_Cable
1647  1541  bot_clueless
1641  1590  bot_Bomb2005CC
1636  1520  schmoe
1634  1578  bot_IIIT
1630  1582  bot_Arimaanator
1624  1550  Hannibal
1616  1518  dtj
1612  1594  bot_Bomb2005Fast
1598  1539  deselby
1597  1452  CeeJay
1596  1584  DorianGaray
1594  1534  Groumpf
1594  1522  inylong
1589  1547  Asturianuco
1587  1517  qsasha
1585  1454  Tore
1581  1531  xabiron
1577  1522  jdb
1576  1484  bot_Occam
1573  1639  bot_Bomb2005Blitz
1573  1527  adsyed01
1573  1521  camperman
1573  1516  TheMadHair
1572  1541  YunK
1570  1528  Keitam
1562  1566  ajedrezDude2
1561  1785  bot_lightning
1556  1521  Jonked
1556  1487  kissl
1556  1470  rajah
1555  1490  88of8
1554  1521  tj1777
1552  1532  TheUglyDog
1552  1508  travis
1549  1519  Paranoia
1546  1528  bot_Clueless2005CC
1546  1510  kerdamdam
1545  1559  spela
1543  1517  pikachamp
1541  1496  bot_Clueless2005Fast
1541  1479  arvindn
1539  1541  Jonathan
1537  1523  BLooodyANgel
1536  1537  mohabbatse
1535  1514  Jojo
1533  1462  trevor
1533  1412  boris_toplak
1531  1557  NativeOne
1531  1521  zman
1529  1482  blue22
1529  1461  Darkenrahl
1529  1438  jemicobel
1528  1511  fdailey
1527  1517  Agt
1526  1538  bot_Bomb2005P2
1526  1510  Guest1637
1526  1363  bot_Clueless2005Blitz
1525  1517  Booyah
1524  1515  KovirWyttcliffGerra
1521  1513  Tobi
1518  1478  bot_GnoBot2005Blitz
1517  1478  Threlicus
1514  1509  domi6
1514  1509  Simon
1514  1509  someone
1514  1505  msz
1513  1508  ktartandude
1512  1536  Yusei
1512  1514  bot_Clueless2005P1
1509  1509  Ander
1509  1506  Mauret
1509  1469  Spunk
1508  1505  eyvin
1508  1503  Josh
1507  1504  ntroncos
1505  1510  dbriggs
1504  1538  PatoGuy
1503  1491  sutzli
1502  1519  bot_Aamira
1502  1501  Miki
1502  1485  novacat
1499  1496  CraigS
1499  1463  Vinvin
1498  1499  bot_Clueless2005P2
1497  1503  Renaissance
1497  1498  bot_Loc2005P2
1497  1497  chess77
1497  1491  UltraWeak
1494  1494  jurc
1493  1521  gsyed
1493  1243  bot_Loc2005Blitz
1492  1497  cloakski
1491  1494  Kuritzky
1490  1524  Lucky
1490  1492  Scott
1488  1495  flea13
1488  1491  GenghisKhan
1486  1539  JoelMcNary
1486  1495  logosity
1486  1494  chengjj
1486  1463  bot_ArimaanatorFast
1484  1512  junaid
1484  1494  Asarel
1484  1494  bot_txapeldun
1484  1490  Guest1869
1484  1490  Sebastien
1484  1489  bot_GnoBot2005CC
1484  1468  bot_speedtrap
1483  1493  bobby
1483  1493  Guest1313
1483  1493  JesperTK
1483  1491  rotem
1483  1486  Monkeybush
1483  1475  Ytterbium
1483  1465  Dacar
1482  1488  Michael
1482  1488  Russell
1482  1479  wassupbiloxi
1481  1487  carolaina
1480  1488  dethwing
1480  1488  tryingarimaa
1480  1474  GoldenBear
1479  1676  kamikazeking
1479  1502  Yaron
1479  1487  mattc
1479  1481  Minkus_27
1478  1485  jmartinezot
1478  1480  ziroby
1478  1469  Arpad
1477  1492  robstar
1477  1492  terra
1477  1480  Troak
1476  1485  U_WIL_LOOS
1476  1467  AmitChaturvedi
1475  1497  Damien
1475  1490  Giszmo
1474  1494  Guest207
1474  1479  Juha
1474  1463  Kaffinator
1473  1507  bot_Brain
1473  1443  carlsquared
1472  1482  lyc123
1472  1476  Guest602
1471  1492  Magik
1471  1475  xxFLAWLESSxx
1470  1476  Adrian
1470  1456  bot_Bomb2005P1
1469  1491  naveed4
1469  1334  bot_GnoBot
1468  1471  Ralek
1466  1478  palladino86
1466  1463  richard
1466  1449  Janzert
1466  1447  antoniotheripper
1465  1441  Aaron
1464  1487  nightshadejim
1464  1486  Lev
1464  1476  bot_GnoBot2005P2
1464  1476  miri
1464  1476  Yold
1463  1508  siddiqna
1463  1485  angrytuna
1463  1478  RainMan
1463  1477  Kobold
1463  1476  thinkinmetal
1463  1474  Zach457
1463  1458  botkiller
1462  1485  jshira
1462  1468  Stanza
1461  1474  the_hooligan
1460  1536  6sense
1459  1488  ecaronus
1459  1473  nofxz
1459  1466  bot_Loc2005P1
1458  1490  taral
1458  1457  Diep
1457  1483  ih8evilstuff
1457  1483  Kristijan
1456  1484  Guest383
1455  1470  bot_2xv7
1455  1468  Hopalong
1454  1482  Guest1097
1454  1482  tarot
1454  1471  v_dhanasekar
1452  1488  illz
1452  1484  Kanakuk
1452  1483  ixpfah
1452  1395  bot_Arimaazilla
1452  1385  bot_Arimaazon
1451  1482  bot_haizhi
1451  1481  bot_Viper
1451  1481  lemmy
1451  1461  Moi
1451  1460  cam43031
1450  1647  haizhi
1450  1458  whiteKnight
1449  1480  knosuke2001
1449  1469  Orc
1448  1492  ugaiW
1447  1479  Chad_Starr
1447  1479  sdude26
1447  1463  xquezme
1446  1479  Joolz
1446  1467  yusuf
1446  1459  swamplor
1446  1454  Mr_Rabbit
1445  1479  Guest1601
1445  1479  Squirrel
1445  1478  Farkov
1445  1444  kraj
1444  1479  Getafix
1444  1458  keita
1444  1448  FANTAZM
1443  1479  Guest1762
1443  1463  joel
1443  1463  laotsi
1442  1463  santiago
1442  1462  Scheffer
1442  1330  bot_Loc2005Fast
1441  1464  agner_g
1441  1462  Guest2764
1441  1461  transience
1441  1436  IMath
1440  1463  Rtan
1440  1462  sillyhat
1439  1460  MiNd_Of_ThE_uSeR
1438  1460  Guest2182
1438  1459  BusinessEnd
1438  1452  Frank
1438  1449  kousukejp
1438  1294  bot_GnoBot2005Fast
1437  1463  rhy
1437  1460  eyeoft
1436  1505  maker
1436  1462  Kitte
1436  1461  comabomber
1434  1501  AjedrezDude
1434  1475  Triffnix
1434  1342  WagnerK
1431  1488  brownsugar726
1431  1438  Magrathean
1431  1325  Gregorius
1428  1399  bot_GnoBot2005P1
1427  1472  Guest1824
1426  1472  Brick_Salad
1426  1453  N8IVTXUN
1425  1471  wogdog
1424  1478  Sophi05
1424  1454  brad
1424  1435  mapu
1424  1430  lihanzo
1424  1401  Sameer
1423  1450  AllAmericanAlligator
1423  1450  AnthonyR
1422  1471  eggsalad
1422  1466  Vanilla
1421  1469  Mop
1420  1440  pcpdams
1419  1470  karmaGfa
1418  1482  Shell
1418  1469  DarkWIng
1417  1431  IamCoach
1416  1532  craig
1416  1452  tough_to_win
1414  1471  Blyx
1414  1466  bsdude
1413  1444  Robert
1410  1461  Aamir
1410  1461  adannada
1410  1438  Jimmy_Newtron
1410  1416  Merlin
1410  1387  chlydra
1409  1466  Greytle
1409  1466  Ratte
1409  1404  boronbye
1408  1443  jello
1405  1476  quicky
1405  1435  Arimaardvark
1404  1463  gamer
1404  1436  nbarriga
1404  1415  hhornet
1397  1457  novicehex
1397  1429  matjaz
1396  1459  jip
1396  1435  ImranG
1392  1432  zaf
1389  1458  chess_master
1389  1445  coachbudka
1389  1430  Keith
1389  1429  Terrabang
1389  1404  oali
1388  1423  Bluewolf
1385  1457  freddy
1385  1406  Keeps
1384  1412  megamau
1383  1365  leo
1382  1412  TipAndMe2
1381  1421  RSA
1381  1388  Buellgirl
1380  1428  alexispoquiz
1380  1426  nyc769
1378  1422  Albright
1377  1454  Shogubaba
1377  1368  Shadowolf
1375  1452  lesmisrules
1374  1452  TRauMa
1369  1496  asyed94
1368  1407  Nate1729
1364  1447  KAMUI
1364  1445  Yuri
1363  1447  Jiodek
1362  1412  henriksh
1361  1522  MrBrain
1361  1428  david
1359  1409  doe
1356  1412  csquared
1354  1466  Imran
1354  1405  bot_Loc2005CC
1354  1337  Vincent
1353  1434  Mitja
1351  1404  aanghelescu
1350  1423  CrazyMan04
1350  1379  The_Jeh
1348  1401  HEx
1347  1399  Lou
1343  1379  bot_pmertens_1
1342  1399  Eduardo
1338  1444  bot_Loc
1333  1435  gerde
1332  1385  Sirwol
1330  1437  Brillo
1329  1472  Rabbitball
1328  1411  Tom
1328  1392  msdawy
1326  1379  tanukitzu
1323  1386  handsomestofall
1319  1414  yarnalito
1318  1427  semelis
1318  1385  Guest2229
1317  1386  jt4ur
1316  1429  Suke
1315  1432  sanno
1315  1401  markluffel
1314  1400  essemegy
1311  1417  jarrausi
1307  1406  gern
1306  1412  Monedero
1302  1373  sooreams
1300  1423  bot_Sleepy
1299  1368  Shadow_Knight
1297  1358  Saviola
1294  1367  icyrail
1290  1418  tomcstein
1290  1410  HybrdShdw
1290  1361  Sage
1288  1418  Derek
1288  1393  BenW
1284  1363  minhaj
1280  1416  TripleD
1280  1374  bot_leapfrog
1275  1400  Deathlace
1275  1359  Mythmon
1265  1350  netzmacht
1253  1334  graken
1250  1372  asheesh
1246  1729  Arimanator
1243  1400  bot_GnoBot2004CC
1241  1403  LuckyLarry
1235  1400  halidecyphon
1234  1399  Wouter
1231  1381  iceflower
1228  1395  mykmox
1221  1392  sigma1
1208  1389  Seanner27
1190  1349  bot_Arimaalon
1189  1381  BlackJackal
1189  1377  pso
1184  1352  bot_ShallowBlue
1184  1297  Rileyjal
1178  1346  eric
1172  1382  sip8980 |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot Post by 99of9 on Jun 8th, 2005, 9:00pm On looking at the results, I think Fritz's scheme is probably a success. One thing to keep in mind as you read this post is how to mix fast and slow ratings for intermediate time controls:

slow  fast  control
0%    100%  15s or under
20%   80%   30s
40%   60%   1min
60%   40%   2min
80%   20%   4min
100%  0%    8min or over

Here are a few points of analysis which I think are interesting:

1) Some of the most obviously fast/slow oriented people do indeed show up as significantly different on the two scales. On the "Slow" side we have Omar (+390), Belbo (+428), Clauchau (+286), and Fotland, bleitner, CeeJay, RonWeasley, ... On the "Fast" side we have Arimanator (-483), Sip8980 (-210), KamikazeKing (-197), Haizhi (-197), ... (PMertens, Rabbitball, and Aamir Syed are also strongly on this side of the equation)

2) It is interesting that the slow list (as shown in the previous post) starts with our two world champions, Fritzlein and Belbo, and follows up with Omar. I agree with this ratings assessment that these 3 players are the most dangerous at long time controls. In fact these players were 3 of the top 4 in the postal tourney. JDB also did very well, but had significantly easier opponents because of his low rating coming in.

3) Don't pay too much attention to the differences in ratings of bots. As Fritz mentioned earlier, because they are usually constrained to only play one time control, only their weighted average at their preferred control matters. But they act correctly at sucking/giving a correct mixture of ratings to people who play poorly/well at their particular time control.

4) Here's the current order of favourites these ratings suggest for WC time controls if we conducted a tournament now (2 min = 60% slow / 40% fast): Fritzlein, 99of9, Belbo, robinson, Omar, bleitner, mouse.

5) PLEASE NOBODY TAKE THIS PERSONALLY, YOU MAY NOT LIKE THE EXAMPLE, BUT IT IS THE BEST EXAMPLE AVAILABLE TO ME. Because this system in some sense keeps track of fast and slow ratings separately, strong "slow" players (eg Omar) already have a low "fast" rating, so are not penalised much when they play blitz games and lose. This means that for example, bomb_blitz doesn't steal all Omar's points, and is instead only moderately rewarded for being able to beat Omar at blitz (which is, at the moment, not such a hard thing to achieve :-)). A corollary is that people who can beat bomb at blitz (eg Arimanator) do not then channel those many points on to themselves. Arimanator earns a reasonable blitz rating for beating a bot which can beat omar at blitz, but does not earn anything on his slow rating for it.

6) An anomaly explained: MrBrain may be expected to have a real rating that is higher on slow than it is on fast. Problem is, he resigned all his postals, so the exact opposite shows up.

7) It's interesting that this effect is bigger than any that Fritz has found in his analysis of the ratings inaccuracies. Perhaps this should be the first thing we focus on in reforming the rating system? |
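The mixing table in the post above is consistent with a weight that is linear in the logarithm of seconds per move between the 15s and 8min cutoffs (each doubling of time adds 20% slow). The sketch below assumes exactly that and reproduces the 2-minute ordering from point 4 using the simulated slow/fast pairs posted earlier in the thread. The function names are illustrative; this is not 99of9's actual program.

```python
import math

FAST_CUTOFF_S = 15.0    # 15 s per move or faster: game counts 100% as "fast"
SLOW_CUTOFF_S = 480.0   # 8 min per move or slower: game counts 100% as "slow"


def slow_weight(seconds_per_move):
    """Share of a game credited to the slow rating, from 0.0 to 1.0.

    Assumed form: linear in log(time) between the two cutoffs.
    """
    t = min(max(seconds_per_move, FAST_CUTOFF_S), SLOW_CUTOFF_S)
    return math.log(t / FAST_CUTOFF_S) / math.log(SLOW_CUTOFF_S / FAST_CUTOFF_S)


def effective_rating(slow, fast, seconds_per_move):
    """Blend a player's slow and fast ratings for a given time control."""
    w = slow_weight(seconds_per_move)
    return w * slow + (1.0 - w) * fast


# (slow, fast) pairs copied from the simulation results earlier in the thread.
SIMULATED = {
    "Fritzlein": (2111, 2130),
    "Belbo": (2036, 1608),
    "omar": (2000, 1610),
    "99of9": (1963, 1972),
    "bleitner": (1853, 1706),
    "robinson": (1813, 1913),
    "mouse": (1813, 1742),
}


def rank_at(seconds_per_move, ratings=SIMULATED):
    """Names sorted from strongest to weakest at the given time control."""
    return sorted(
        ratings,
        key=lambda name: effective_rating(*ratings[name], seconds_per_move),
        reverse=True,
    )


if __name__ == "__main__":
    for name in rank_at(120):
        print(name, round(effective_rating(*SIMULATED[name], 120), 1))
```

Only the seven humans named in point 4 are included in the dictionary; bots are left out because the post lists human favourites.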
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot_spe Post by jdb on Jun 8th, 2005, 9:48pm First of all, a fine piece of work. Your analysis was very illuminating for me. In my opinion, the percentages and cutoffs for the times look about right. Certainly usable for a trial run. How does the interpolated rating compare to the rating using only games at a certain time control? For example, Omar's interpolated rating is 1688 for 30 sec games. (Is this correct?) How does this compare to his 30sec rating? Or is it not feasible to compute a rating based only on a certain time control? Could the difference in these ratings be used as a measure of how well the interpolation system works? |
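For reference, the 1688 figure asked about above does follow from the posted numbers, taking Omar's simulated ratings (slow 2000, fast 1610) and the 20% slow / 80% fast mix listed for 30s games:

```latex
R_{\text{omar}}(30\,\text{s})
  = 0.2 \times 2000 + 0.8 \times 1610
  = 400 + 1288
  = 1688
```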
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot_spe Post by PMertens on Jun 8th, 2005, 10:05pm I like it. Especially the part that people like Omar don't go down in rating for losing 15s games when it's absolutely clear how good they really are. About that sucking from / giving to bots: would it be possible to separate vbot/vhuman ranking? I can bash certain bots in certain proven ways for hours - that does not make me a good player either. Kudos for the work. |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot Post by 99of9 on Jun 8th, 2005, 10:11pm Yes, that kind of analysis could certainly be used to check how good the interpolation method is (i.e. whether performance varies in proportion to the logarithm of time; this was Fritz's approximation based on some knowledge of bot performance in chess). I guess to be totally accurate though you'd have to base all *opponents'* ratings on only their 30s games as well. Now the problem becomes that people wouldn't have played enough games to establish accurate ratings for the time control. Perhaps there would be sufficient 45s games to try this, I'm not sure. |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot Post by 99of9 on Jun 8th, 2005, 11:01pm Paul, Fritz did a fairly complete analysis of that with the current real ratings system, and found that ratings were not strongly affected by vs bot or vs human play. I am of the opinion that once you learn methods to beat bots, you automatically obtain a few tactics you can use against humans, so you are automatically better. In other words, skill is reasonably transferrable. When Fritz arrived on the Arimaa scene, he claimed he would not be good against humans because he had only trained up against David's Bot Offline. Obviously Fritz had learnt a few things... and his rating never stopped rising. I agree that a player's skill/rating can be a bit different vs bots and vs humans, but then again, my rating vs fritz is different to my rating vs robinson, because I find Robinson's style hard to understand, so he springs more traps on me - even though Fritz is clearly a better player than Robinson against the rest of the pool of players. Anyway, as requested, here is the same list, this time compiled in a simulation excluding all games vs bots... basically treating them as unrated games. There are certainly differences with the first list, but many of the differences come about from the fact that there are just a whole lot less HvH games played, so the statistics are much more uncertain. Also the humans don't receive the constant inflation we get when newbies arrive, lose points to bots, then quit arimaa (the points are then "redistributed" from the bots by regular players). 
SLOW  FAST  NAME
1845  1946  Fritzlein
1769  1822  99of9
1750  1457  omar
1730  1780  robinson
1701  1594  Spunk
1697  1516  clauchau
1662  1581  Belbo
1647  1401  jdb
1611  1521  rajah
1602  1481  Adanac
1600  1560  Paul
1596  1552  Asturianuco
1594  1534  Groumpf
1589  1559  kerdamdam
1588  1527  qsasha
1585  1527  boris_toplak
1584  1542  siddiqna
1577  1554  dethwing
1570  1527  adsyed01
1566  1519  naveed
1564  1549  Jonathan
1557  1495  Magrathean
1552  1534  YunK
1552  1508  travis
1549  1651  PMertens
1545  1559  spela
1543  1517  xabiron
1543  1517  pikachamp
1543  1517  brad
1537  1523  BLooodyANgel
1536  1524  Yaron
1531  1479  Yuri
1529  1518  asyed94
1529  1509  mouse
1528  1511  fdailey
1526  1504  Blyx
1525  1535  Orc
1522  1561  omarFast
1518  1371  Tore
1512  1500  RonWeasley
1512  1487  BlackKnight
1511  1521  Yusei
1508  1503  Josh
1507  1494  botkiller
1506  1503  ntroncos
1500  1463  ytri
1498  1536  6sense
1497  1687  kamikazeking
1497  1503  WagnerK
1496  1490  jarrausi
1496  1453  antoniotheripper
1495  1468  adannada
1494  1469  Sameer
1494  1516  craig
1494  1497  nbarriga
1494  1470  camperman
1492  1497  cloakski
1492  1446  trevor
1489  1501  OLTI
1488  1422  bleitner
1486  1524  Lucky
1484  1464  Vinvin
1483  1510  dtj
1483  1477  arvindn
1483  1393  Arimanator
1482  1484  gsyed
1481  1473  Moi
1478  1483  Scott
1478  1470  Rileyjal
1477  1466  Ytterbium
1476  1491  Renaissance
1475  1460  BenW
1475  1465  RainMan
1474  1494  Guest207
1471  1466  fotland
1470  1457  TheMadHair
1469  1500  Juha
1469  1359  megamau
1468  1485  AjedrezDude
1466  1453  AmitChaturvedi
1465  1450  kissl
1464  1476  miri
1464  1476  RSA
1464  1476  Robert
1464  1476  Yold
1463  1477  Kobold
1462  1485  jshira
1458  1490  taral
1458  1472  HEx
1457  1483  Kristijan
1457  1483  Keeps
1457  1483  ih8evilstuff
1457  1483  robstar
1457  1483  handsomestofall
1457  1489  xquezme
1456  1484  Guest383
1455  1423  Aaron
1454  1471  v_dhanasekar
1454  1397  Ryan_Cable
1453  1468  Darkenrahl
1452  1488  illz
1452  1481  Arimaardvark
1451  1500  brownsugar726
1450  1435  whiteKnight
1450  1481  ImranG
1449  1499  CeeJay
1448  1492  ugaiW
1448  1480  TipAndMe2
1448  1448  kraj
1444  1458  keita
1442  1500  PatoGuy
1439  1461  carolaina
1434  1458  yarnalito
1433  1470  Nate1729
1430  1350  jemicobel
1429  1447  novicehex
1427  1430  lihanzo
1426  1337  carlsquared
1424  1478  Sophi05
1422  1454  Jimmy_Newtron
1422  1471  blue22
1421  1414  haizhi
1418  1469  naveed4
1416  1468  Greytle
1414  1470  coachbudka
1411  1441  pcpdams
1409  1462  CrazyMan04
1404  1448  Monedero
1397  1429  matjaz
1376  1483  Imran
1361  1465  junaid
1307  1406  gern
1305  1184  Gregorius
1299  1369  eric
1262  1363  Keith
1222  1454  MrBrain
1216  1470  Rabbitball |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot_spe Post by PMertens on Jun 9th, 2005, 1:04am Quote:
I know ;-) And the way I remembered it I was on the one extreme side of his chart 8) Well - since your new calculations make me look even worse than the previous ones ... just forget that I asked for it ;D |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot Post by omar on Jun 12th, 2005, 11:22am Thanks for trying this out Toby. I was planning to get back to this once we finished the WC format discussion. I would still like to try out the model where the actual time used per move is considered. If you can send me the code for this I can upload it to the arimaa site so others can download it and try it out also. Or if it's not too long, just post it here. |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot Post by Fritzlein on Jun 13th, 2005, 12:25am Thanks a bundle for trying this out! The results don't seem implausible. The big differences in fast and slow ratings for some people give us an incentive to get this sorted out as a high priority for the rating system, probably even higher than the "selection of opponents" problem and the "bots don't learn" problem. A technical question: Why doesn't bot_Bomb2005Blitz have a slow rating of 1500? If everyone starts with both ratings at 1500, and every game that bot_Bomb2005Blitz plays counts 100% towards fast and 0% towards slow, then shouldn't the slow rating of bot_Bomb2005Blitz remain unchanged throughout? on 06/08/05 at 21:00:10, 99of9 wrote:
Yes, it would seem that _every_ bot should have a "slow" rating lower than its "fast" rating. The fact that this doesn't happen is plausibly attributable to almost every bot playing all its games at the same speed, and that some of the bot games played at varying speeds were actually bot vs. bot games. But there is one case where there might be enough data for a meaningful test, namely the championship bot Bomb2005. Could you run the program again treating Bomb2005CC and Bomb2005Fast and Bomb2005Blitz all as the same bot playing at three different time controls? Quote:
I will repeat the analysis I did when I grouped the games by opponent type (vs. bot / vs. human), only this time I will group the games by speed (30 seconds and under / 45 seconds and over). |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot Post by 99of9 on Jun 13th, 2005, 1:47am on 06/13/05 at 00:25:49, Fritzlein wrote:
I shared out the extra time due to the initial time reserve and the 60s first move, throughout the game (ie the number of moves in the game matters). Most 15s games actually end up being about 17s games. Quote:
This phenomenon in real ratings would certainly not show up in this system if a bot were rated above 1500 and only ever played 2 min time controls. But it wouldn't adversely affect humans anyway (unless later it suddenly started playing a different time control). |
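The "15s games end up being about 17s games" remark can be illustrated with a small sketch of how the initial reserve and the 60s first move might be shared over the moves of a game. The exact accounting is an assumption: the function name, the per-player move count, and the 60s reserve in the example are all hypothetical, chosen only to show the effect.

```python
def effective_seconds_per_move(per_move_s, initial_reserve_s, moves_by_player,
                               first_move_s=60.0):
    """Average seconds available per move once the extras are shared out.

    Assumed accounting: the player gets the initial reserve plus the
    first-move allowance in place of one normal per-move increment, and
    that extra time is spread evenly over all of the player's moves.
    """
    if moves_by_player <= 0:
        raise ValueError("moves_by_player must be positive")
    extra_s = initial_reserve_s + (first_move_s - per_move_s)
    return per_move_s + extra_s / moves_by_player


if __name__ == "__main__":
    # Hypothetical game: 15 s per move, 60 s initial reserve, 50 moves.
    print(effective_seconds_per_move(15.0, 60.0, 50))
```

With those made-up inputs the result is 17.1 seconds per move. Under a 15s to 8min interpolation that is linear in the logarithm of time, 17.1s sits just above the fast cutoff, so roughly 4% of such a game would count toward the slow rating. That is enough for a blitz-only bot's slow rating to drift away from 1500.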
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot Post by Fritzlein on Jun 13th, 2005, 1:55am When I say that every bot should have a slow rating lower than its fast rating, I mean that that's how bots actually perform relative to humans. But I don't expect fast/slow ratings to behave properly when bots only play at a single time control. I was saying that we shouldn't be alarmed if some bots have a slow rating higher than their fast rating, since playing at only one time control can cause weirdness. Also I was saying that the only good test of whether the system correctly shows that bots are better fast than slow is to run the numbers with all three Bomb2005 bots considered as a single bot. It will be only one data point, but unlike all the other data points, it will be worth something. :-) |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot Post by 99of9 on Jun 13th, 2005, 2:04am Yes, you are right, that's what I meant by "real fast/slow rating". Sorry I didn't adequately communicate my agreement. I will have a go at bomb sometime - I think your assumption will be shown correct, but there are ways to understand if it isn't. |
||||||||||||||||||||||||
Title: Re: Omar = OmarFast , bot_bomb = bot Post by 99of9 on Sep 3rd, 2008, 5:10am Just for interest, I ran this program over an up-to-date database; here are the results.

HvH slow ratings
1 chessandgo 2122
2 Fritzlein 2089
3 RonWeasley 1985
4 UltraWeak 1919
5 99of9 1874
6 Adanac 1867
7 arimaa_master 1855
8 jdb 1807
9 blue22 1782
10 mistre 1780
11 clauchau 1778
12 omar 1767
13 PMertens 1735
14 Belbo 1713
15 Spunk 1701
16 robinson 1698
17 Soter 1695
18 mdk 1673
19 Rabbit 1672
20 thorin 1669
21 Swynndla 1659
22 OLTI 1654
23 The_Jeh 1643
24 ArifSyed 1641
25 Brendan 1629
26 Ryan_Cable 1625
27 Arimabuff 1622
28 Tanker_JD 1621
29 woh 1617
30 petitprince 1614
31 rajah 1611
32 IceD 1606
33 xquezme 1603
34 Asturianuco 1596
35 Grey 1596
36 Groumpf 1594
37 JacquesB 1593
38 acroninj 1592
39 Ump 1591
40 qsasha 1588
41 mightybyte 1585
42 boris_toplak 1585
43 siddiqna 1584
44 ChrisB 1583
45 LAbiuso 1583
46 Stonkie 1583
47 xepo 1581
48 DorianGaray 1578
49 dethwing 1577
50 art 1576
51 Tuks 1572
52 adsyed01 1570
53 ArimaaCap 1567
54 kurthyl 1565
55 KT2006 1564
56 Guest5409 1564
57 Jonathan 1564
58 arishiki 1561
59 BlackKnight 1561
60 jonaspojken 1560
61 klabe 1560
62 ttt 1560
63 DanTilkin 1559
64 Magrathean 1557
65 ntroncos 1554
66 YunK 1552
67 KamiKazeKiwi3 1552
68 travis 1552
69 Archonn 1548
70 tize 1548
71 BlackShadow 1546
72 Paul 1545
73 Asubfive 1545
74 rusty 1545
75 Yzaxtol 1545
76 spela 1545
77 brad 1543
78 xabiron 1543
79 pikachamp 1543
80 GordonBlack 1541
81 Tau 1540
82 knarl 1539
83 Gesuma 1538
84 i_am_you 1537
85 gert7 1537
86 glitch 1537
87 ZeroOne 1537
88 marcgb 1537
89 BLooodyANgel 1537
90 Agt 1537
91 Ice 1537
92 Virgeist 1536
93 Raymond 1536
94 Yaron 1536
95 mikbuster 1535
96 Heidissimo 1535
97 kerdamdam 1532
98 nauboone 1532
99 Yuri 1531
100 DieInnereMelone 1530

HvH fast ratings
1 Fritzlein 2075
2 chessandgo 2022
3 PMertens 1854
4 robinson 1838
5 Adanac 1811
6 99of9 1786
7 mdk 1785
8 Soter 1771
9 arimaa_master 1746
10 IceD 1698
11 Brendan 1680
12 Arimabuff 1649
13 Ryan_Cable 1645
14 MrObvious 1641
15 ArifSyed 1640
16 naveed 1639
17 Swynndla 1630
18 kamikazeking 1620
19 Tuks 1612
20 ttt 1608
21 UltraWeak 1606
22 Spunk 1594
23 Guest5409 1585
24 willwould 1585
25 DorianGaray 1585
26 seanmcl 1585
27 Belbo 1583
28 challenger 1582
29 mistre 1579
30 BlackKnight 1570
31 Gesuma 1570
32 xepo 1569
33 Adlai 1568
34 Raymond 1566
35 BlackShadow 1565
36 Yzaxtol 1563
37 omarFast 1561
38 nbarriga 1560
39 spela 1559
40 Gorgapor 1555
41 dethwing 1554
42 jonaspojken 1554
43 Asturianuco 1552
44 Jonathan 1549
45 chessdiva27 1549
46 gert7 1548
47 petitprince 1546
48 Lautresault 1545
49 nauboone 1545
50 tize 1544
51 kurthyl 1543
52 OLTI 1543
53 Kraizy_Dave 1543
54 Sir_Twit 1543
55 sakano 1542
56 siddiqna 1542
57 Tau 1541
58 aXiom 1539
59 DieInnereMelone 1538
60 ntroncos 1537
61 JCricket 1537
62 art 1535
63 emeryaj 1535
64 Rancca 1535
65 Orc 1535
66 megamau 1535
67 Groumpf 1534
68 YunK 1534
69 snetnis 1534
70 xquezme 1532
71 Grey 1531
72 jl_ 1531
73 archigavr 1531
74 BilalQ 1529
75 Garyth 1528
76 ArimaaCap 1527
77 GordonBlack 1527
78 qsasha 1527
79 boris_toplak 1527
80 adsyed01 1527
81 Fleisch 1526
82 Ice 1525
83 knarl 1525
84 Virgeist 1524
85 Lucky 1524
86 Yaron 1524
87 BLooodyANgel 1523
88 LifeBlade 1523
89 i_am_you 1523
90 glitch 1523
91 marcgb 1523
92 ZeroOne 1523
93 smonroy 1522
94 acroninj 1521
95 drleper 1521
96 rajah 1521
97 Yusei 1521
98 mightybyte 1521
99 pikachamp 1517
100 xabiron 1517 |
||||||||||||||||||||||||