Author |
Topic: Whole History Ratings (Read 67403 times) |
|
aaaa
Forum Guru
Arimaa player #958
Posts: 768
|
|
Re: Whole History Ratings
« Reply #105 on: Jan 13th, 2011, 2:51pm » |
|
Must be due to an intransitivity in performance. If you were Joe Frazier, Fritzlein would be George Foreman and chessandgo would be Muhammad Ali.
|
« Last Edit: Jan 13th, 2011, 2:51pm by aaaa » |
|
Eltripas
Forum Guru
Meh-he-kah-naw
Posts: 225
|
|
Re: Whole History Ratings
« Reply #106 on: Jan 13th, 2011, 5:39pm » |
|
on Jan 13th, 2011, 1:59pm, Adanac wrote: Fritzlein has beaten me 10 straight games but somehow I just passed him in WHR. I'll take a screenshot of this miraculous event in case it never happens again! |
| I'm sure it has nothing to do with the fact that you defeated the world champion in 4 of your last 5 games against him, not to mention that you also defeated Tuks twice, all within the last 2 days.
|
|
|
megajester
Forum Guru
Istanbul, Turkey
Posts: 710
|
|
Re: Whole History Ratings
« Reply #107 on: Jan 14th, 2011, 2:07am » |
|
on Jan 13th, 2011, 5:39pm, Eltripas wrote: I'm sure it has nothing to do with the fact that you defeated the world champion in 4 of your last 5 games against him, without mentioning that you defeated Tuks twice also. All this happening in the last 2 days. |
| Does this mean that skill at Arimaa is multidimensional? Meaning that one is not "better" or "worse" at Arimaa per se, but there is a spectrum of different styles that interact with each other differently?

Of course as Fritz always says, "tactics beats strategy". But once you get beyond tactics, is it possible that Arimaa strategies are like Rock Paper Scissors? (Rock beats scissors beats paper beats rock = Fritz beats Adanac beats chessandgo beats Fritz.) This would perhaps explain why bot bashers sometimes don't do well against humans and vice versa...

... megajester looks up at the edifice of his theory with pride ...

... of course if chessandgo turns up and says he was having a bad couple of days when he played Adanac then the whole thing comes crashing down.

Edit: Now I see what aaaa meant by "intransitivity". I don't know boxing, so I didn't understand that post when I first read it.

Perhaps I should define my theory better. Of course there are different strategies that different situations call for. I am talking more about playing styles, different philosophies, or approaches to the game. General principles of thought as opposed to knee-jerk reactionary strategies.

Let's say two players, A and B, play a game according to their own styles, A and B. Up until now I would have thought that the winner was determined by whoever made the least mistakes, meaning whoever executed their own style more perfectly. So if A beats B, it means that B must have made a mistake somewhere. I thought that the playing style itself was immaterial, that everything rests on the perfect execution of that style. Now I am proposing that even if a style is perfectly executed it can still be beaten by another, perfectly executed style.

Of course you would also assume that when quantum computers come along and "solve" Arimaa in the technical sense of the word, all this will be bunk. (In fact this has already happened to chess even before computers have solved it. Especially when compared with Arimaa, chess is a tactical slugfest with very little room for strategic variety.) However, as Fritzlein points out in his book, branching factors are not everything. Perhaps Arimaa really does have a richness all its own.
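The rock-paper-scissors idea above can be checked against how a one-number rating works. The sketch below assumes the standard Elo logistic curve (WHR uses the equivalent Bradley-Terry model) and uses made-up head-to-head probabilities, not real game data. It shows that a cycle like Fritzlein > Adanac > chessandgo > Fritzlein can exist in observed results, but no assignment of single ratings can reproduce it. The function names are hypothetical.

```python
from itertools import product

def elo_expected(r_a, r_b):
    """Expected score of A against B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

def has_cycle(p_ab, p_bc, p_ca):
    """True if A is favored over B, B over C, and C over A."""
    return p_ab > 0.5 and p_bc > 0.5 and p_ca > 0.5

# Hypothetical head-to-head win probabilities (illustrative only).
observed = {
    "Fritzlein>Adanac": 0.7,
    "Adanac>chessandgo": 0.6,
    "chessandgo>Fritzlein": 0.6,
}
observed_cycle = has_cycle(*observed.values())

# Try every rating triple on a coarse grid: a single rating per player
# orders the players on a line, so a cycle never appears.
grid = range(1500, 2701, 100)
model_cycle_found = any(
    has_cycle(elo_expected(a, b), elo_expected(b, c), elo_expected(c, a))
    for a, b, c in product(grid, repeat=3)
)

print("cycle in observed results:", observed_cycle)
print("cycle reproducible by one-number ratings:", model_cycle_found)
```

If styles really interact this way, a rating list can only average the cycle out, which is one way a player could pass someone in WHR despite a losing head-to-head record.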
|
« Last Edit: Jan 14th, 2011, 3:37am by megajester » |
|
Adanac
Forum Guru
Arimaa player #892
Posts: 635
|
|
Re: Whole History Ratings
« Reply #108 on: Jan 14th, 2011, 5:41am » |
|
on Jan 14th, 2011, 2:07am, megajester wrote: ... of course if chessandgo turns up and says he was having a bad couple of days when he played Adanac then the whole thing comes crashing down. |
| Yes, my high WHR rating is a temporary fluke and I don't expect to remain number 2 for long. Or maybe I'm becoming the Boris Gulko of Arimaa (5 out of 8 points in 8 games versus Kasparov: +3 =4 -1, but with no success against Karpov, Kramnik, etc.). I don't know if Gulko is an example of intransitivity or just a small sample size, though.

Quote: Let's say two players, A and B, play a game according to their own styles, A and B. Up until now I would have thought that the winner was determined by whoever made the least mistakes, meaning whoever executed their own style more perfectly. So if A beats B, it means that B must have made a mistake somewhere. I thought that the playing style itself was immaterial, that everything rests on the perfect execution of that style. Now I am proposing that even if a style is perfectly executed it can still be beaten by another, perfectly executed style. |
| If we categorize players by level of aggressive play:

High = always attacks and advances pieces into enemy territory
Medium = likes to attack, but nowhere near as enthusiastically as "High"
Low = prefers to defend home traps and hold hostages

From my observations, if I had to guess which style tends to beat others:

High > Medium > Low > High

Or, as a general rule: more aggressive players tend to beat cautious players except in cases of over-exuberance against solid play.

Of course, as megajester pointed out, tactics and execution are the primary factors. For example, we saw an extreme example in round 1 of the World Championship. Hanzack is one of the most aggressive players out there while Harren is known more for his defensive skill and technique. In fact, the defensive player did get the much better position out of the opening but tactics were the deciding factor in the end, not style.

Arimaa is such a new game that I think everyone would benefit from exploring new ideas and experimenting with different strategies & styles. Nobody knows for sure what the future will bring but I'm pretty certain that opening setups & moves will evolve tremendously as we learn more.

It's still amazing to me that, after centuries of analysis by countless thousands of brilliant masters, new ideas can still be found in chess openings. This game here is an example of how computers are finding wild new tactics that even Capablanca and Kasparov never would have dreamed of: sacrificing a central pawn and giving up the ability to castle with both queens on the board just for a passed a-pawn!!? It sounds absurd, but I'll trust the judgement of this 3200-strength computer: http://www.chessgames.com/perl/chessgame?gid=1546726
|
|
|
Hippo
Forum Guru
Arimaa player #4450
Posts: 883
|
|
Re: Whole History Ratings
« Reply #109 on: Jan 14th, 2011, 12:58pm » |
|
on Jan 14th, 2011, 5:41am, Adanac wrote: Wow, what a game. I hardly understand all the hidden tactics ... Black preferred 0-0 to capturing the c5 pawn, but the pawn was captured later without problems ... couldn't Black avoid pinning his bishop in the final "pin exchange"?
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: Whole History Ratings
« Reply #110 on: Jan 14th, 2011, 7:39pm » |
|
on Jan 13th, 2011, 1:59pm, Adanac wrote: Fritzlein has beaten me 10 straight games but somehow I just passed him in WHR. I'll take a screenshot of this miraculous event in case it never happens again! |
| One possible explanation is that you played rather more than I did in Q4 of 2010, so you are entering the World Championship "in form", whereas I am rusty and lagging behind the latest advances. Given how little I have prepared this year, I will be happy if I can just finish on the podium.
|
|
|
Boom
Forum Newbie
Arimaa player #6119
Posts: 1
|
|
Re: Whole History Ratings
« Reply #111 on: Jan 30th, 2011, 1:41pm » |
|
Dear woh, Is there any chance that your implementation of Whole History Ratings could be published as open source software? Thanks, Boom
|
|
|
woh
Forum Guru
Arimaa player #2128
Posts: 254
|
|
Re: Whole History Ratings
« Reply #112 on: Feb 1st, 2011, 10:48am » |
|
Hi Boom, I am afraid I have no plans to do so. woh
|
|
|
omar
Forum Guru
Arimaa player #2
Posts: 1003
|
|
Re: Whole History Ratings
« Reply #113 on: Feb 4th, 2011, 3:52pm » |
|
Boom, you may be interested to know that there was recently a rating system competition on kaggle.com: http://www.kaggle.com/chess If you look in the forum section of that site you will find that some of the top finishers have posted their source and methods. My entry, which was a simplified version of the Chessmetrics rating system, placed 44th out of 258 teams. The source for it is here: http://arimaa.com/arimaa/rating/chessmetrics/cmrs.txt Hope this helps.
|
|
|
aaaa
Forum Guru
Arimaa player #958
Posts: 768
|
|
Re: Whole History Ratings
« Reply #114 on: Feb 6th, 2011, 2:51pm » |
|
woh, could you increase the variability to 90 Elo^2/day? That's what my research is giving me right now.
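For a sense of scale: in the WHR model a player's rating follows a random walk (Wiener process), so the variance of the rating change allowed by the prior grows linearly with elapsed time. The sketch below converts a variability given in Elo^2/day into a standard deviation in Elo over a number of days. The function name is hypothetical and the code only illustrates the prior, not the full WHR fit.

```python
import math

def drift_std_elo(variance_per_day, days):
    """Std dev (in Elo) of the rating change the prior allows over `days` days."""
    if days < 0:
        raise ValueError("days must be non-negative")
    # Wiener process: variance accumulates linearly in time.
    return math.sqrt(variance_per_day * days)

W2 = 90.0  # Elo^2/day, the value proposed in the post above

for days in (1, 30, 365):
    print(f"{days:>3} days: about {drift_std_elo(W2, days):.1f} Elo")
```

A larger variability lets ratings follow recent results more closely, while a smaller one ties each day's rating more tightly to the player's older games.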
|
|
|
aaaa
Forum Guru
Arimaa player #958
Posts: 768
|
|
Re: Whole History Ratings
« Reply #115 on: Jun 10th, 2011, 9:08am » |
|
Given how much the optimal parameter values I find tend to fluctuate over time, I have finally concluded that there is a fairly wide range of reasonable values and that this can actually be considered a virtue of the WHR system, as it demonstrates robustness. In light of that, I will stop calling for changes to the official parameters unless they become really out of whack.

On a different note, people may have noticed that when I use game data for various purposes, I have been pretty insistent on maintaining a purported kind of purity by restricting the games to those that are not only rated but also involve only humans. However, when it comes to rating research, like finding out what the typical rating of a beginner is, this is actually pretty ill-conceived, since the overwhelming majority of games humans play on this site are against bots. In addition, interesting questions involving bots, like how much the strength difference between the top humans and bots has changed over time, become outright impossible to answer using only human games as data.

That brings me to the following idea: instead of letting the WHR system loose on all rated games and simply accepting the already-mentioned distortions that come from humans playing bots, why not provide some compensation by having the system assume that server bots have static performance (which could be done by treating all their games as occurring in a single all-encompassing rating period)? Although this would technically not be entirely sound in light of hardware changes, future bots that adapt over time, or even the fluctuating load on the server, I would still think that such an assumption would be a net winner in terms of informativeness. If this is still too much, or if one would like to keep tabs on how performance changes with hardware speed, the assumption could be restricted to the subset of server bots that are supposed to be of fixed strength.

One concrete use for this hybrid system could be to have the calculated ratings themselves be used to help find optimal parameters for other rating systems, in particular those classified as "incremental" in the WHR paper; these are the comparatively simple "ad-hoc" systems, like the one currently in use by the Arimaa gameroom. Now, given the existence of the WHR list, Omar may have given any further work on the current gameroom rating system a very low priority, but it's clear that the gameroom ratings still serve various purposes, including even server-technical ones. So even if a change to a more technically justified incremental rating system, like Glicko, is too much to ask for right now, I don't think it would be too much to ask for less-involved changes, like different system parameters or starting ratings, for optimization purposes.
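The "single all-encompassing rating period" idea can be sketched as a bookkeeping rule: each game is evaluated at a time index per player, and fixed-strength bots always get the same index, so the model has exactly one rating to estimate for them. This is a minimal illustration under assumptions, not woh's implementation (which is not published). The contents of `FIXED_BOTS`, the human player names, and the games are made up.

```python
from collections import defaultdict
from datetime import date

# Hypothetical set of bots assumed to play at a fixed strength.
FIXED_BOTS = {"bot_ArimaaScoreP1", "bot_Bomb2005P1"}

def rating_period(player, game_day, fixed_bots=FIXED_BOTS):
    """Time index at which `player`'s rating is evaluated for one game."""
    if player in fixed_bots:
        return 0  # one all-encompassing period, so the rating cannot drift
    return game_day.toordinal()  # humans: one period per calendar day

def rating_variables(games, fixed_bots=FIXED_BOTS):
    """Map each player to the sorted periods that need a rating estimate."""
    periods = defaultdict(set)
    for first, second, game_day in games:
        for player in (first, second):
            periods[player].add(rating_period(player, game_day, fixed_bots))
    return {player: sorted(p) for player, p in periods.items()}

# Made-up games: (player, opponent, date); results omitted for brevity.
games = [
    ("human_a", "bot_Bomb2005P1", date(2010, 1, 5)),
    ("human_a", "bot_Bomb2005P1", date(2010, 6, 1)),
    ("human_b", "bot_Bomb2005P1", date(2011, 2, 3)),
    ("human_a", "human_b", date(2011, 2, 3)),
]

variables = rating_variables(games)
for player, p in variables.items():
    print(player, "needs", len(p), "rating estimate(s)")
```

Because every game against a fixed bot points at the same rating variable, results from different years are compared against one common yardstick, which is what makes questions about human progress over time answerable.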
|
|
|
aaaa
Forum Guru
Arimaa player #958
Posts: 768
|
|
Re: Whole History Ratings
« Reply #116 on: Aug 20th, 2011, 8:16am » |
|
Taking the first game of a 2011 server bot as the cutoff point, I applied the WHR system (prior: 2 wins/4, variance: ca. 311 Elo^2/day) to all 160,318 earlier rated games with the modification that for each fixed-performance bot, its games, from its point of view, take place in one perpetual rating period. What follows are their performances relative to the median (retroactive) starting rating of a human player:

Fixed-performance bot | Elo above "beginner" | Standard deviation |
bot_MarwinXP2Blitz | 868 | 23 |
bot_Sharp2010P2 | 773 | 44 |
bot_Marwin2010P2 | 661 | 42 |
bot_Clueless2010P2 | 660 | 53 |
bot_Bomb2005P2 | 652 | 7 |
bot_GnoBot2010P2 | 608 | 56 |
bot_Clueless2008P2 | 554 | 46 |
bot_Clueless2009P2 | 553 | 40 |
bot_Clueless2007P2 | 549 | 13 |
bot_Clueless2005P2 | 520 | 12 |
bot_PragmaticTheory2010P2 | 465 | 55 |
bot_Clueless2009P1 | 455 | 45 |
bot_Sharp2010P1 | 444 | 58 |
bot_Clueless2006P1 | 435 | 11 |
bot_Clueless2006P2 | 434 | 16 |
bot_GnoBot2006P2 | 427 | 22 |
bot_OpFor2008P2 | 412 | 10 |
bot_Clueless2010P1 | 402 | 61 |
bot_Clueless2007P1 | 396 | 9 |
bot_GnoBot2005P2 | 388 | 8 |
bot_Clueless2005P1 | 374 | 8 |
bot_Bomb2005P1 | 358 | 5 |
bot_Badger2010P2 | 357 | 71 |
bot_Loc2006P2 | 321 | 14 |
bot_Marwin2010P1 | 320 | 67 |
bot_Clueless2008P1 | 312 | 54 |
bot_Loc2007P2 | 311 | 9 |
bot_Aamira2006P2 | 292 | 7 |
bot_Sharp2008P2 | 276 | 8 |
bot_PragmaticTheory2010P1 | 250 | 69 |
bot_Arimaazilla | 213 | 4 |
bot_Loc2005P2 | 184 | 13 |
bot_OpFor2008P1 | 181 | 6 |
bot_Bomb2005P3 | 147 | 168 |
bot_GnoBot2010P1 | 96 | 103 |
bot_Loc2007P1 | 79 | 6 |
bot_Loc2006P1 | 62 | 8 |
bot_Badger2010P1 | 37 | 82 |
bot_Loc2005P1 | 20 | 8 |
bot_GnoBot2005P1 | 7 | 5 |
bot_GnoBot2006P1 | 3 | 13 |
bot_Rat2009P1 | 0 | 171 |
bot_Rat2009P2 | 0 | 171 |
bot_ArimaaScoreP3 | -13 | 165 |
bot_Sharp2008P1 | -48 | 7 |
bot_Arimaalon | -66 | 7 |
bot_ShallowBlue | -68 | 6 |
bot_ArimaaScoreP2 | -88 | 5 |
bot_Aamira2006P1 | -90 | 6 |
bot_ArimaaScoreP1 | -166 | 4 |
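To read the table above in terms of results, an Elo difference can be turned into an expected score with the standard logistic Elo formula. The sketch below does this for a few rows copied from the table, and uses the listed standard deviation to show how uncertain the estimate is. It assumes the usual 400-point scale and treats the "beginner" as sitting exactly at 0; the function name is hypothetical.

```python
def expected_score(elo_diff):
    """Expected score for the player who is `elo_diff` Elo above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

# A few rows copied from the table: bot -> (Elo above "beginner", std dev).
rows = {
    "bot_MarwinXP2Blitz": (868, 23),
    "bot_Bomb2005P2": (652, 7),
    "bot_Rat2009P1": (0, 171),
    "bot_ArimaaScoreP1": (-166, 4),
}

for bot, (elo, sd) in rows.items():
    low = expected_score(elo - sd)   # one standard deviation weaker
    high = expected_score(elo + sd)  # one standard deviation stronger
    print(f"{bot}: beats a median beginner with p = {expected_score(elo):.3f} "
          f"(range {low:.3f} to {high:.3f})")
```

Rows with a large standard deviation, such as bot_Rat2009P1, span a wide probability range, which suggests those bots had too few games for the rating to be pinned down.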
|
« Last Edit: Aug 20th, 2011, 8:21am by aaaa » |
|
aaaa
Forum Guru
Arimaa player #958
Posts: 768
|
|
Re: Whole History Ratings
« Reply #117 on: Aug 21st, 2012, 11:56pm » |
|
woh, would it be too much trouble to have the peak ratings be links to (the comment pages of) the games they correspond to? That would be a really great feature to have.
|
|
|
woh
Forum Guru
Arimaa player #2128
Posts: 254
|
|
Re: Whole History Ratings
« Reply #118 on: Aug 22nd, 2012, 3:41pm » |
|
Interesting idea, aaaa. It might take me a while to implement.
|
|
|
clyring
Forum Guru
Arimaa player #6218
Posts: 359
|
|
Re: Whole History Ratings
« Reply #119 on: Aug 27th, 2012, 4:46pm » |
|
I don't think cyborg_briareus should be in WHRH.
|
|
I administer the Endless Endgame Event (EEE). Players welcome!