Author |
Topic: Improving Arimaa Rates (Read 3123 times) |
|
Tachyon
Forum Guru
Arimaa player #3433
Gender:
Posts: 66
|
|
Improving Arimaa Rates
« on: Nov 5th, 2008, 5:39pm » |
|
There have been numerous threads where the topic of Arimaa game ratings has come up, and it seems that most agree there is room for improvement. A while back Fritzlein suggested I start a thread to raise suggestions as to how this could be done. I thought that a good way to kick this off is to get some consensus about what the problems are and what causes them before looking at possible solutions. So far it seems to me that the main issues are:

1) Standardise the rating method
Not a huge problem, but I think there should be consensus as to which system is used for ratings and why (e.g. P8 or not), and we should stick with it. Having different rating values for the same individual is not very helpful and dilutes the sense of value attached to the rating.

2) Rating manipulation
Here I think we need to identify the ways in which ratings can be distorted and which ones have the most significant impact. So far I am thinking these are:
1) Bot bashing
2) Game platform connectivity issues
3) Player collusion in HvH games
4) Strong players taking advantage of weak or new players in HvH games

Any constructive input will be appreciated.
|
« Last Edit: Nov 5th, 2008, 5:40pm by Tachyon » |
|
|
|
omar
Forum Guru
Arimaa player #2
Gender:
Posts: 1003
|
|
Re: Improving Arimaa Rates
« Reply #1 on: Nov 14th, 2008, 10:45am » |
|
Thanks for starting this, Tachyon.

My current thinking on the Arimaa rating system, and on rating systems in general, is that there are really two parts to a rating system. The first is the mathematical model and the second is the game filter. Most rating systems, including the current Arimaa rating system, are based on the Elo model. This model works pretty well and I don't think there is much to be gained by tweaking it. The game filter is the part of the rating system that determines which games are used as input to the first part. I think this is the part where there is room for improvement.

Take for example the USCF or FIDE rating systems; they are more accurate not because the mathematical model is so different, but because the games used are highly filtered. You can't just take any game you played and apply it towards your USCF rating. It has to be supervised and played under strict conditions, then it has to be submitted through the appropriate channels, and finally it has to be approved by the USCF to be included in the ratings. It can take a while before the game you just played has an effect on your rating, but this is compensated by having more accurate ratings.

In the Arimaa gameroom, players get to pick whether the game will be rated or not, who the opponent will be, what time control will be used, and when the game is played, and even after that they can go back and unrate the game under some conditions. They may even be playing the game just for bot bashing or to experiment with a particular setup. The games going into the rating model are pretty much unfiltered. But the good part is that players get the instant gratification of seeing their ratings move right after the game. Also, we develop a larger pool of games that can be used to compute ratings.

So to really get more accurate ratings I think the rated games we have need to be filtered. Automatically deciding which games should be used and which should not is a bit of an AI problem. Not that different from, say, the problem of approving loan applications. And we know how good banks are at doing that.

More and more I am leaning towards leaving the gameroom rating system as it is (so that players have the freedom to choose which games are rated and get the instant gratification of seeing their ratings change) and filtering the rated games through another system to get more accurate ratings for different needs.

Perhaps we should have a contest to see who can develop the best filtering system. All your system has to do is go through the games archive and pick out which games should be used as rated games. The system is free to look at player histories, the actual moves of the game, and everything else that is available about the game in selecting the rated games. Those games would then be put through the same mathematical model as currently used in the gameroom to see which filter produces the most accurate ratings. But before we could judge the filtering systems we would have to answer the question of what we consider to be accurate ratings.
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: Improving Arimaa Rates
« Reply #2 on: Nov 14th, 2008, 2:38pm » |
|
A game filter could also operate on a continuum. Instead of games either counting or failing to count, they could be weighted anywhere between one and zero. Omar and I talked about the desirability of having repeated games between the same two opponents count less than the same number of games against a variety of opponents. I'm not sure how to implement that, though. Let's say I play a series of 100 games against Bomb. If we institute any kind of "diminishing returns" policy, whereby each game counts less than the previous, then my 100th game will count the least. But shouldn't it count the most, since it is my most recent? On the other hand, the reverse philosophy of counting the most recent game fully and counting older games less and less requires some kind of retroactive recalculation. Folks have resisted historical revisionism on the grounds that it is too complicated, too CPU-intensive, and/or too counter-intuitive.
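The recency-based alternative described above can be pictured with a small sketch. It walks one player's games from newest to oldest and gives the k-th most recent game against each opponent a weight of `decay ** k`, so the latest game against any opponent always counts fully. The function name, the input layout, and the decay value of 0.9 are assumptions made for illustration; nothing like this is specified in the thread.

```python
from collections import defaultdict


def recency_weights(opponents, decay=0.9):
    """Return one weight in [0, 1] per game for a single player.

    opponents: opponent names in chronological order (oldest first).
    decay:     hypothetical per-game discount for older games against
               the same opponent.
    """
    seen = defaultdict(int)            # newer games already seen per opponent
    weights = [0.0] * len(opponents)
    # Walk backwards so the most recent game against each opponent gets 1.0.
    for i in range(len(opponents) - 1, -1, -1):
        opp = opponents[i]
        weights[i] = decay ** seen[opp]
        seen[opp] += 1
    return weights


if __name__ == "__main__":
    print(recency_weights(["Bomb", "Bomb", "Adanac", "Bomb"], decay=0.5))
```

Note that the weight of every older game against an opponent changes each time a new game against that opponent is played, so ratings would have to be recomputed from the archive. That is the retroactive recalculation the post mentions.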
|
|
|
|
|
mistre
Forum Guru
Gender:
Posts: 553
|
|
Re: Improving Arimaa Rates
« Reply #3 on: Nov 14th, 2008, 7:22pm » |
|
Let's look at Tachyon's four categories of potential ratings abuse:
1) Bot bashing
2) Game platform connectivity issues
3) Player collusion in HvH games
4) Strong players taking advantage of weak or new players in HvH games

1) Bot bashing. I think this is the #1 culprit for inaccurate ratings with the current system. Here is a potential fix. Once a player has beaten a particular bot more than x times in a row (in rated games), his future games vs this bot do not count towards his rating until he loses 1. Then he could win another x times in a row and have those wins count before he would have to lose 1 again. Ideally, losing the 1 game would lower his rating enough to prevent the player from just winning x and then losing 1 on purpose. I have no idea what value x should be; an outside guess would be about 7. Once a player can win 7 in a row vs a particular bot, it could be assumed that this player can pretty much win 90% of the time vs it. Any further wins should therefore not raise their rating. I think there are enough different bots available for play that a player can still raise their rating substantially. But once they run out, they will have to face tougher bots to raise their rating.

2) Game platform issues. There is the unrate feature, which helps with this, but there are still instances where a player feels that they are winning and Bomb declares them losing. Or perhaps a player is unaware of the unrate feature. I don't see an easy fix for this, but I don't see it as a big problem either. This problem pales in comparison to #1.

3) Player collusion. This one should probably be handled on a case-by-case basis. I haven't really seen any evidence of this happening, and I think it would be pretty easy to spot if it was.

4) Strong vs weak in human games. Once again, I don't see this as a major issue. Weak players will generally avoid strong players if they lose once or twice vs them. It is also much harder to play multiple games vs human opponents, due to availability, than to bash bots as many times in a row as you want. There has been no evidence that anyone has an inflated rating just by picking on newcomers.

Overall, I really only see #1 as a problem, and hopefully my attempt at a solution has some merit or will help someone else to come up with a solution.
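As a rough sketch of the streak rule proposed for bot bashing, the filter below takes one player's rated results against one bot and returns a weight of 1.0 (counts) or 0.0 (ignored) for each game. It reads the proposal as follows: the first x consecutive wins count, later wins in the same streak do not, and any loss counts and resets the streak. That reading, the function name, and the input layout are assumptions; x = 7 is the "outside guess" from the post.

```python
def streak_filter(results, x=7):
    """Weight each rated game of one player against one bot.

    results: chronological list of booleans, True if the human won.
    x:       number of consecutive wins that still count.
    Returns a list of weights, 1.0 (rated) or 0.0 (ignored).
    """
    weights = []
    streak = 0  # current run of consecutive wins against this bot
    for won in results:
        if won:
            # Only the first x wins of a streak affect the rating.
            weights.append(1.0 if streak < x else 0.0)
            streak += 1
        else:
            # A loss always counts and re-opens the window of x wins.
            weights.append(1.0)
            streak = 0
    return weights


if __name__ == "__main__":
    # Three wins, a loss, three more wins, with x = 2.
    print(streak_filter([True, True, True, False, True, True, True], x=2))
```

The filter keeps no state beyond the current streak, so it could run at the time of each game rather than needing a pass over the whole archive.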
|
|
|
|
|
omar
Forum Guru
Arimaa player #2
Gender:
Posts: 1003
|
|
Re: Improving Arimaa Rates
« Reply #4 on: Nov 16th, 2008, 9:07am » |
|
on Nov 14th, 2008, 2:38pm, Fritzlein wrote: A game filter could also operate on a continuum. Instead of games either counting or failing to count, they could be weighted anywhere between one and zero. |
| Yes, definitely; in fact that would be the desired way to do it. The weight would in effect scale the K factor used in the Elo formula. So if a game is specified to have a weight of 0.5, then only half of the normal K value would be used. For a long time Karl has been suggesting that I change the gameroom rating system so that games between humans and bots use half the normal K value. I've been resisting since it seemed like adding a very ad hoc component to the rating formula. But Karl's suggestion fits nicely in the context of a game filter. The game filter simply returns 0.5 for the weight of games between humans and bots and 1 otherwise.
Quote: Folks have resisted historical revisionism on the grounds that it is too complicated, too CPU-intensive, and/or too counter-intuitive. |
| Yes, more complex filters that look at the complete game histories of the players are pretty compute-intensive, but they can definitely be run in an offline mode.
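To make the "weight scales K" idea concrete, here is a minimal sketch of an Elo update in which the filter's weight multiplies the K factor. The expected-score formula is the standard Elo one; the K value of 32 and all function names are assumptions for illustration, since the gameroom's actual K is not stated in this thread.

```python
def expected_score(rating, opp_rating, scale=400.0):
    """Standard Elo expected score for the player rated `rating`."""
    return 1.0 / (1.0 + 10.0 ** ((opp_rating - rating) / scale))


def game_weight(is_human_vs_bot):
    """The simple filter described above: 0.5 for HvB games, 1 otherwise."""
    return 0.5 if is_human_vs_bot else 1.0


def updated_rating(rating, opp_rating, score, weight=1.0, k=32.0):
    """New rating after one game; score is 1.0 for a win, 0.0 for a loss.

    k is a hypothetical K factor; the weight simply scales it.
    """
    return rating + weight * k * (score - expected_score(rating, opp_rating))


if __name__ == "__main__":
    # Same win against an equally rated opponent, human vs. bot.
    print(updated_rating(1500, 1500, 1.0, weight=game_weight(False)))
    print(updated_rating(1500, 1500, 1.0, weight=game_weight(True)))
```

A weight of 0 drops the game entirely, so a yes/no filter is just the special case where every weight is 0 or 1.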
|
|
|
|
|
omar
Forum Guru
Arimaa player #2
Gender:
Posts: 1003
|
|
Re: Improving Arimaa Rates
« Reply #5 on: Nov 16th, 2008, 9:33am » |
|
on Nov 14th, 2008, 7:22pm, mistre wrote: Once a player has beaten a particular bot more than x times in a row (in rated games), then his future games vs this bot do not count towards his rating until he loses 1. |
| That sounds like an interesting filter to try out.
Quote: Overall, I really only see #1 as a problem and hopefully my attempt at a solution has some merit or will help someone else to come up with a solution. |
| Even though I don't play many bot-bashing games, I tend to play a lot of late-night blitz or fast games without much concern for the effect on my rating. I also occasionally experiment with different setups and don't bother to play these as unrated games. So in addition to bot bashing, these kinds of games are also degrading the rating system. We are also mixing games of vastly different speeds into the same single rating number. I think this contributes significantly to making the ratings inaccurate.
|
« Last Edit: Nov 20th, 2008, 6:45am by omar » |
|
|
|
omar
Forum Guru
Arimaa player #2
Gender:
Posts: 1003
|
|
Re: Improving Arimaa Rates
« Reply #6 on: Nov 16th, 2008, 9:41am » |
|
I would really like to see a situation where different rating lists are posted as a web page and also a web service. To facilitate this I can make available a version of the games database that contains only the rated games and is updated hourly. That way anyone who wants to try out an idea of how to filter the games can download this and try it out. I think this will allow various ideas to be tried out and see what the community likes.
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: Improving Arimaa Rates
« Reply #7 on: Nov 16th, 2008, 10:46am » |
|
on Nov 16th, 2008, 9:07am, omar wrote: For a long time Karl has been suggesting that I change the gameroom rating system so that games between humans and bots use half the normal K value. |
| No, no, that isn't at all what I have been suggesting. I have been advocating that while games between humans use the win probability formula 1/(1+10^((A-B)/400)), games between a human and bot should use the win probability formula 1/(1+10^((A-B)/200)). The K value affects the volatility of the ratings, i.e. it determines how fast ratings change. I don't have strong feelings about the current volatility. The scaling factor of 200 versus 400 is a completely separate matter, and is justified in the following way: Take it as fixed that if I can beat someone 10 times out of 11, I deserve to be 400 points higher than them, and if I lose 10 times out of 11, I deserve to be 400 points lower than them. This is the normal scale of Elo ratings. However, it doesn't transfer well to bots. If I can beat a bot 10 games out of 11, I might not be that much better than the bot. I might be only a little better, and be winning by rote. Therefore the system should put me only 200 points above that bot. Similarly if I lose to a bot 10 times out of 11, I might actually be almost at its level, but still losing lots of games due to my blunders while the bot is infallible. Therefore I should only be rated 200 points below the bot. To repeat, I wasn't suggesting changing the volatility of the ratings, or suggesting counting HvB games for less than HvH games, although both of those ideas are reasonable. The thrust of my suggestion was about scaling HvB games to be twice as compressed as HvH games.
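The effect of the 200 versus 400 scaling factor can be checked numerically. The sketch below writes the win probability from the side of the player who is `rating_diff` points ahead (the formula quoted in the post has the same shape, with the sign of the difference depending on whose probability is meant) and inverts it to find the rating gap implied by a given win rate. The function names are assumptions for illustration.

```python
import math


def win_probability(rating_diff, scale):
    """Win probability for the player who is rating_diff points ahead."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / scale))


def implied_gap(win_rate, scale):
    """Rating gap at which the model predicts the given win rate."""
    return scale * math.log10(win_rate / (1.0 - win_rate))


if __name__ == "__main__":
    ten_of_eleven = 10 / 11
    # Human vs. human (scale 400) and human vs. bot (scale 200).
    print(round(implied_gap(ten_of_eleven, 400)))
    print(round(implied_gap(ten_of_eleven, 200)))
    # With scale 200, a 400-point gap would mean winning about 99% of games.
    print(round(win_probability(400, 200), 3))
```

Winning 10 games out of 11 corresponds to odds of 10 to 1, so the implied gap is exactly one "scale" unit: 400 points between humans, 200 points against a bot. This changes where ratings settle, not how quickly they move, which is why it is separate from the K value.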
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: Improving Arimaa Rates
« Reply #8 on: Nov 16th, 2008, 10:46am » |
|
on Nov 16th, 2008, 9:41am, omar wrote: I would really like to see a situation where different rating lists are posted as a web page and also a web service. To facilitate this I can make available a version of the games database that contains only the rated games and is updated hourly. |
| Sounds like a great idea!
|
|
|
|
|
Tachyon
Forum Guru
Arimaa player #3433
Gender:
Posts: 66
|
|
Re: Improving Arimaa Rates
« Reply #9 on: Nov 16th, 2008, 7:24pm » |
|
Omar: Quote: Also we are mixing games of vastly different speeds all into the same single rating number. I think this also contributes significantly to making the ratings inaccurate. |
| I totally agree. This should be taken into consideration. In a recent chatroom discussion Fritzlein referred to fast and slow games as two different games. I believe we should separate the ratings for fast and slow games.
Omar: Quote: Yes, more complex filters that look at complete game histories of the players are pretty compute intensive, but they can definitely be run in an off line mode. |
| Players' quality of play tends to vary over time, and since I would think we want the rating to reflect their current level of play, I do not see the value of including games older than a certain time limit, say 1 year.
Fritzlein: Quote: ... if I lose to a bot 10 times out of 11, I might actually be almost at its level, but still losing lots of games due to my blunders while the bot is infallible. Therefore I should only be rated 200 points below the bot. |
| Surely not making blunders and playing consistently are key aspects of what constitutes a good player. I do not see why that serves as a reason to differentiate between bot and human play.
Fritzlein: Quote: If I can beat a bot 10 games out of 11, I might not be that much better than the bot. I might be only a little better, and be winning by rote. Therefore the system should put me only 200 points above that bot. |
| I agree. However, another player may have won 10 games out of 11 without using any bot-bashing tactics. How does a system distinguish between the two?
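The age cutoff suggested in this post is about the simplest game filter there is. A minimal sketch, assuming each game is a dict with a `played` timestamp (a made-up layout, not the format of the actual games archive) and taking the one-year limit as 365 days:

```python
from datetime import datetime, timedelta


def recent_games(games, now, max_age_days=365):
    """Keep only games played within max_age_days before `now`.

    games: list of dicts with a 'played' datetime (hypothetical layout).
    """
    cutoff = now - timedelta(days=max_age_days)
    return [g for g in games if g["played"] >= cutoff]


if __name__ == "__main__":
    now = datetime(2008, 11, 16)
    games = [
        {"id": 1, "played": datetime(2007, 6, 1)},   # older than a year
        {"id": 2, "played": datetime(2008, 3, 10)},  # within the last year
    ]
    print([g["id"] for g in recent_games(games, now)])
```

In the weighted-filter terms used earlier in the thread, this is a weight of 1 for games inside the window and 0 for everything older.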
|
« Last Edit: Nov 16th, 2008, 7:26pm by Tachyon » |
|
|
|
Adanac
Forum Guru
Arimaa player #892
Gender:
Posts: 635
|
|
Re: Improving Arimaa Rates
« Reply #10 on: Nov 17th, 2008, 12:48pm » |
|
on Nov 16th, 2008, 7:24pm, Tachyon wrote: I totally agree ... This should be taken into consideration. In a recent chatroom discussion Fritzlein referred to fast and slow games as two different games. I believe we should separate the ratings for fast and slow games. |
| Rather than just slow/fast, I would prefer to have 3 categories (fast/slow/postal), since they all require different skill sets. I tend to think of 15 or 30 seconds/move as “fast”, though some may argue that 45 seconds/move is also “fast”.

There are many redundant bots, such as botXblitz, botXfast, botXP1, botX, etc. It’d be nice to have a single botX with multiple ratings that can play at any speed. I suppose that would get pretty messy, with tens of thousands of archived bot games to re-assign to a consolidated bot.

Fritzlein’s 1/(1+10^((A-B)/200)) formula is a surprisingly simple and effective way to cut in half the number of points that can be “stolen” by rote from any bot; I like it. The (good) unintended side effect is that it encourages new players to play more games against other humans, rather than just sticking to the bot ladder, if they desire a faster climb up the rating chart.

I’m not partial to the idea of reducing the weight for each additional game played between the same 2 players. If World Championship seeding is the greatest concern, then how about using previous World Championship results to generate the seeds? Or create a 4th rating category, “tournament”, which would be in effect only for the WC and possibly the Continuous and Postal tournaments, or any other controlled events.
|
|
|
|
|
Tachyon
Forum Guru
Arimaa player #3433
Gender:
Posts: 66
|
|
Re: Improving Arimaa Rates
« Reply #11 on: Nov 17th, 2008, 4:46pm » |
|
The biggest problem with fast games is that they promote unforced blunders by relatively good players, causing them to lose games that they would not have lost given some more time. I also believe that time controls that force games to be played at a steady pace are advantageous to bots, since humans tend to require a lot more thinking time in some positions than in others. There also seem to be arguments that faster time controls and/or per-move time limits are implemented more for spectator benefit than player benefit.

I think we should define slow and fast games according to the purpose they serve.

Fast games:
1) Strongly favour those who have many hundreds or thousands of games of experience.
2) Favour those who are better at thinking fast than thinking deep.
3) Somewhat favour bots over humans.
4) Put spectator interest above quality of play.

Slow games:
1) Level the playing field, to some extent, between well-practiced players and strong new players.
2) Put those who think slower but deeper on a more equal footing with the fast thinkers (and maybe somewhat favour them).
3) Put bots and humans on a more equal footing.
4) Put quality of play above spectator interest.

Given the above objectives I would define fast games as any games faster than 2 minutes per move. Slow games would be 2 or more minutes per move, up to a game time limit of, say, roughly 8 hours.

I agree with Adanac that postal games need to be treated differently, since the greater time length also allows players to research their moves or elicit some other form of help.
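A sketch of how separate ratings per speed category might be keyed, using the 2-minutes-per-move boundary proposed in this post plus a separate postal category. Where the boundary belongs is exactly what the thread is debating, so `fast_below` is a parameter; the function name, the `is_postal` flag, and the example rating numbers are all made up for illustration.

```python
def speed_category(seconds_per_move, is_postal=False, fast_below=120):
    """Classify a game as 'fast', 'slow', or 'postal'.

    fast_below: per-move time in seconds under which a game is 'fast'.
                120 follows this post; others in the thread would pick
                a much lower value.
    """
    if is_postal:
        return "postal"
    return "fast" if seconds_per_move < fast_below else "slow"


if __name__ == "__main__":
    # One rating per (player, category) instead of a single number.
    # The numbers here are invented placeholders.
    ratings = {("playerA", "fast"): 1850, ("playerA", "slow"): 1700}
    category = speed_category(30)
    print(category, ratings[("playerA", category)])
```

Changing `fast_below` to 45 would reproduce the narrower notion of "fast" mentioned earlier in the thread without touching anything else.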
|
|
|
|
|
Janzert
Forum Guru
Arimaa player #247
Gender:
Posts: 1016
|
|
Re: Improving Arimaa Rates
« Reply #12 on: Nov 18th, 2008, 6:04pm » |
|
on Nov 17th, 2008, 4:46pm, Tachyon wrote: I also believe that time controls that force games to be played at a steady pace are advantageous to bots since humans tend to require a lot more thinking time in some positions than others. |
| Hmm, yet at least in go and chess the opposite end of the spectrum seems to be regarded as very favorable to bots. Absolute time control is often cited as being unreasonably bad for humans, because they aren't very good at leaving themselves enough time to play the endgame out correctly. I think the Arimaa time controls in general are fairly good at getting humans to actually use their time well. They allow a player to make "obvious" moves quickly without losing that time for later use, but generally discourage them from spending too long on one move and running out of time before the end of the game. Janzert
|
|
|
|
|
Tachyon
Forum Guru
Arimaa player #3433
Gender:
Posts: 66
|
|
Re: Improving Arimaa Rates
« Reply #13 on: Nov 18th, 2008, 7:37pm » |
|
I do not think it is the purpose of time controls to assist players in regulating their time usage. Using your time efficiently is part of the skill of the game. It only makes things worse if the computer regulates time in a way that is too inflexible to allow players to play in a style that suits them. I would rather lose a game because I ran out of time through my own bad time management than make blunders or bad moves because of a forced pace in a difficult position.
|
|
|
|
|
jdb
Forum Guru
Arimaa player #214
Gender:
Posts: 682
|
|
Re: Improving Arimaa Rates
« Reply #14 on: Nov 18th, 2008, 7:54pm » |
|
The standard arimaa time controls are nice for spectators. The players play at a regular pace. Watching a game using the chess time control can be horrible for spectators, if one player decides to use an hour for one move.
|
|
|
|
|
|