Arimaa Forum - Whole History Ratings

aaaa
Forum Guru

Arimaa player #958

Posts: 768

Re: Whole History Ratings
« Reply #105 on: Jan 13th, 2011, 2:51pm »

Must be due to an intransitivity in performance. If you were Joe Frazier, Fritzlein would be George Foreman and chessandgo would be Muhammad Ali.

« Last Edit: Jan 13th, 2011, 2:51pm by aaaa »

Eltripas
Forum Guru

Meh-he-kah-naw

Gender: male

Posts: 225

Re: Whole History Ratings
« Reply #106 on: Jan 13th, 2011, 5:39pm »

on Jan 13th, 2011, 1:59pm, Adanac wrote:

Fritzlein has beaten me 10 straight games but somehow I just passed him in WHR

I'll take a screenshot of this miraculous event in case it never happens again!

I'm sure it has nothing to do with the fact that you defeated the world champion in 4 of your last 5 games against him, not to mention that you also defeated Tuks twice. All of this happened in the last 2 days.

megajester
Forum Guru

Istanbul, Turkey

Posts: 710

Re: Whole History Ratings
« Reply #107 on: Jan 14th, 2011, 2:07am »

on Jan 13th, 2011, 5:39pm, Eltripas wrote:

Does this mean that skill at Arimaa is multidimensional? Meaning that one is not "better" or "worse" at Arimaa per se, but that there is a spectrum of different styles that interact with each other differently?

Of course as Fritz always says, "tactics beats strategy". But once you get beyond tactics, is it possible that Arimaa strategies are like Rock Paper Scissors? (Rock beats scissors beats paper beats rock = Fritz beats Adanac beats chessandgo beats Fritz)

This would perhaps explain why bot bashers sometimes don't do well against humans and vice versa...

... megajester looks up at the edifice of his theory with pride ...

... of course if chessandgo turns up and says he was having a bad couple of days when he played Adanac then the whole thing comes crashing down.

Edit: Now I see what aaaa meant by "intransitivity". I don't know boxing, so I didn't understand that post when I first read it.

Perhaps I should define my theory better. Of course there are different strategies that different situations call for. I am talking more about playing styles, different philosophies, or approaches to the game. General principles of thought as opposed to knee-jerk reactionary strategies.

Let's say two players, A and B, play a game according to their own styles, A and B. Up until now I would have thought that the winner was determined by whoever made the fewest mistakes, meaning whoever executed their own style more perfectly. So if A beats B, it means that B must have made a mistake somewhere. I thought that the playing style itself was immaterial, that everything rests on the perfect execution of that style. Now I am proposing that even if a style is perfectly executed it can still be beaten by another, perfectly executed style.

Of course you would also assume that when quantum computers come along and "solve" Arimaa in the technical sense of the word, all this will be bunk. (In fact this has already happened to chess even before computers have solved it. Especially when compared with Arimaa, chess is a tactical slugfest with very little room for strategic variety.) However as Fritzlein points out in his book, branching factors are not everything. Perhaps Arimaa really does have richness all its own.

« Last Edit: Jan 14th, 2011, 3:37am by megajester »

Adanac
Forum Guru

Arimaa player #892

Posts: 635

Re: Whole History Ratings
« Reply #108 on: Jan 14th, 2011, 5:41am »

on Jan 14th, 2011, 2:07am, megajester wrote:

... of course if chessandgo turns up and says he was having a bad couple of days when he played Adanac then the whole thing comes crashing down.

Yes, my high WHR rating is a temporary fluke and I don't expect to remain number 2 for long. Or maybe I'm becoming the Boris Gulko of Arimaa (5 out of 8 points in 8 games versus Kasparov: +3 =4 -1, but with no success against Karpov, Kramnik, etc.). I don't know whether Gulko is an example of intransitivity or just of a small sample size, though.

Quote:

Let's say two players, A and B, play a game according to their own styles, A and B. Up until now I would have thought that the winner was determined by whoever made the fewest mistakes, meaning whoever executed their own style more perfectly. So if A beats B, it means that B must have made a mistake somewhere. I thought that the playing style itself was immaterial, that everything rests on the perfect execution of that style. Now I am proposing that even if a style is perfectly executed it can still be beaten by another, perfectly executed style.

If we categorize players by level of aggressive play:

High = always attacks and advances pieces into enemy territory
Medium = likes to attack, but nowhere near as enthusiastically as “High”
Low = prefers to defend home traps and hold hostages

From my observations, if I had to guess which style tends to beat others:
High > Medium > Low > High

Or, as a general rule: more aggressive players tend to beat cautious players except in cases of over-exuberance against solid play.
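
To illustrate why a cycle like High > Medium > Low > High is awkward for any one-number rating, here is a minimal sketch. The 70-30 results between styles are invented for illustration, not measured from real games, and the fit is a plain Elo (Bradley-Terry) model, not the WHR implementation used for the official list.

```python
def expected_score(r_a, r_b):
    """Elo expected score of a player rated r_a against one rated r_b."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

# Invented data: each style wins 70 of 100 games against the next in the cycle.
WINS = {
    ("High", "Medium"): 70, ("Medium", "High"): 30,
    ("Medium", "Low"): 70, ("Low", "Medium"): 30,
    ("Low", "High"): 70, ("High", "Low"): 30,
}

def fit_ratings(wins, start, steps=500, step_size=1.0):
    """Gradient ascent on the Elo log-likelihood (up to a constant factor)."""
    ratings = dict(start)
    for _ in range(steps):
        grad = {p: 0.0 for p in ratings}
        for (winner, loser), n in wins.items():
            # How far the model's prediction falls short of the observed win.
            surprise = 1.0 - expected_score(ratings[winner], ratings[loser])
            grad[winner] += n * surprise
            grad[loser] -= n * surprise
        for p in ratings:
            ratings[p] += step_size * grad[p]
    return ratings

# Start from deliberately unequal ratings to show where the fit ends up.
ratings = fit_ratings(WINS, start={"High": 100.0, "Medium": 0.0, "Low": -100.0})
for a, b in [("High", "Medium"), ("Medium", "Low"), ("Low", "High")]:
    predicted = expected_score(ratings[a], ratings[b])
    print(f"{a} vs {b}: observed 0.70, predicted {predicted:.3f}")
```

With perfectly cyclic data the best-fitting ratings come out equal, so the model predicts an even game for every pairing even though each observed matchup is lopsided. A scalar rating simply has no room to store that kind of information.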

Of course, as megajester pointed out, tactics and execution are the primary factors. We saw an extreme case in round 1 of the World Championship: Hanzack is one of the most aggressive players out there, while Harren is known more for his defensive skill and technique. In fact, the defensive player did get the much better position out of the opening, but tactics were the deciding factor in the end, not style.

Arimaa is such a new game that I think everyone would benefit from exploring new ideas and experimenting with different strategies & styles. Nobody knows for sure what the future will bring, but I'm pretty certain that opening setups & moves will evolve tremendously as we learn more. It's still amazing to me that, after centuries of analysis by countless thousands of brilliant masters, new ideas can still be found in chess openings. This game is an example of how computers are finding wild new tactics that even Capablanca and Kasparov never would have dreamed of: sacrificing a central pawn and giving up the ability to castle with both queens on the board just for a passed a-pawn!!? It sounds absurd, but I'll trust the judgement of this 3200-strength computer:

http://www.chessgames.com/perl/chessgame?gid=1546726

Hippo
Forum Guru

Arimaa player #4450

Gender: male

Posts: 883

Re: Whole History Ratings
« Reply #109 on: Jan 14th, 2011, 12:58pm »

on Jan 14th, 2011, 5:41am, Adanac wrote:

http://www.chessgames.com/perl/chessgame?gid=1546726

Wow, what a game. I hardly understand all the hidden tactics ... Black preferred 0-0 to capturing the c5 pawn, but the pawn was captured later without problems ... couldn't Black avoid pinning his bishop in the final "pin exchange"?

Fritzlein
Forum Guru

Arimaa player #706

Posts: 5928

Re: Whole History Ratings
« Reply #110 on: Jan 14th, 2011, 7:39pm »

on Jan 13th, 2011, 1:59pm, Adanac wrote:

Fritzlein has beaten me 10 straight games but somehow I just passed him in WHR

I'll take a screenshot of this miraculous event in case it never happens again!

One possible explanation is that you played rather more than I did in Q4 of 2010, so you are entering the World Championship "in form", whereas I am rusty and lagging behind the latest advances. Given how little I have prepared this year, I will be happy if I can just finish on the podium.

Boom
Forum Newbie

Arimaa player #6119

Gender: male

Posts: 1

Re: Whole History Ratings
« Reply #111 on: Jan 30th, 2011, 1:41pm »

Dear woh,

Is there any chance that your implementation of Whole History Ratings could be published as open source software?

Thanks,

Boom

woh
Forum Guru

Arimaa player #2128

Gender: male

Posts: 254

Re: Whole History Ratings
« Reply #112 on: Feb 1st, 2011, 10:48am »

Hi Boom

I am afraid I have no plans to do so.

woh

omar
Forum Guru

Arimaa player #2

Gender: male

Posts: 1003

Re: Whole History Ratings
« Reply #113 on: Feb 4th, 2011, 3:52pm »

Boom, you may be interested to know that recently there was a rating system competition on kaggle.com.

http://www.kaggle.com/chess

If you look in the forum section of that site you will find some of the top finishers have posted their source and methods.

My entry, which was a simplified version of the Chessmetrics rating system, placed 44th out of 258 teams. The source for it is here:

http://arimaa.com/arimaa/rating/chessmetrics/cmrs.txt

Hope this helps.

aaaa
Forum Guru

Arimaa player #958

Posts: 768

Re: Whole History Ratings
« Reply #114 on: Feb 6th, 2011, 2:51pm »

woh, could you increase the variability to 90 Elo^2/day? That's what my research is giving me right now.
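
For a feel of what this parameter means: in the WHR model a player's rating follows a Wiener process (a continuous random walk), so the variance of the rating change grows linearly with elapsed time. The sketch below only evaluates that relationship for 90 Elo^2/day; the function name is illustrative and not taken from any WHR implementation.

```python
import math

def rating_drift_sd(variance_per_day, days):
    """Standard deviation (in Elo) of the rating change after `days` days,
    for a Wiener process whose variance grows by `variance_per_day` each day."""
    return math.sqrt(variance_per_day * days)

for days in (1, 30, 365):
    print(f"{days:>3} days: +/- {rating_drift_sd(90, days):.0f} Elo")
```

So at 90 Elo^2/day the model allows, before seeing any game results, a drift of roughly 9 Elo in a day, 52 Elo in a month and 181 Elo in a year (one standard deviation). A larger value lets ratings react faster to recent games at the cost of noisier estimates.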

aaaa
Forum Guru

Arimaa player #958

Posts: 768

Re: Whole History Ratings
« Reply #115 on: Jun 10th, 2011, 9:08am »

Given how much the values I'm getting while searching for optimal parameter values tend to fluctuate over time, I have finally come to the conclusion that there is a fairly wide range of reasonable values that can be used, and that this can actually be considered a virtue of the WHR system, as it demonstrates robustness. In light of that, I will stop making any further calls to change the official parameters, unless they were to become really out of whack.

On a different note, people may have noticed that when I make use of game data for various purposes, I have been pretty insistent on maintaining a purported kind of purity by restricting the games to those that are not only rated but also involve only humans. However, when it comes to doing rating research, like finding out what the typical rating of a beginner is, this is actually pretty ill-conceived, as, after all, the overwhelming majority of games humans play on this site are against bots. In addition, interesting questions involving bots, like how much the strength difference between the top humans and bots has changed over time, become outright impossible to answer using only human games as data.

That brings me to the following idea: Instead of letting the WHR system loose on all rated games and simply accepting the already-mentioned distortions that come from humans playing bots, why not provide some compensation by having the system assume that server bots have static performance (which could be done by treating all their games as occurring in a single all-encompassing rating period)? Although this would technically not be entirely sound in light of hardware changes, future bots that are adaptive over time or even the fluctuating nature of the load on the server, I would still think that such an assumption would be a net winner in terms of informativeness. If this would still be too much or one would like to be able to keep tabs on how performance changes with hardware speed, the assumption could be restricted to the subset of server bots that are supposed to be of fixed strength.
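
The bookkeeping behind this idea can be sketched as follows. This is a minimal illustration of the grouping step only, not an implementation of WHR itself; the set of fixed bots, the human player names and the game tuples are made up for the example.

```python
from collections import defaultdict

# Hypothetical subset of server bots assumed to have fixed strength.
FIXED_BOTS = {"bot_Bomb2005P1", "bot_Clueless2005P1"}

def rating_period(player, day):
    """Time step under which this player's side of a game is filed.
    Fixed bots always get period 0, i.e. one all-encompassing period."""
    return 0 if player in FIXED_BOTS else day

def group_by_period(games):
    """games: iterable of (day, winner, loser).
    Returns {player: {period: [(opponent, won), ...]}}."""
    history = defaultdict(lambda: defaultdict(list))
    for day, winner, loser in games:
        history[winner][rating_period(winner, day)].append((loser, True))
        history[loser][rating_period(loser, day)].append((winner, False))
    return history

# Made-up games: (day number, winner, loser).
games = [
    (1, "bot_Bomb2005P1", "alice"),
    (40, "alice", "bot_Bomb2005P1"),
    (40, "alice", "bob"),
]
history = group_by_period(games)
for player, periods in history.items():
    print(player, "has", len(periods), "rating variable(s)")
```

Each period corresponds to one rating variable the system has to estimate. The human "alice" gets one per day she played, so her rating is free to move between day 1 and day 40, while the bot's games all land in a single period and are explained by a single static rating.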

One concrete use for this hybrid system could be to have the calculated ratings themselves be used to help find optimal parameters for other rating systems, in particular those classified as being "incremental" in the WHR paper; these are the comparatively simple "ad-hoc" systems, like the one currently in use by the Arimaa gameroom.

Now, given the existence of the WHR list, Omar may have given any further work on the current gameroom rating system a very low priority, but it's clear that the gameroom ratings are still serving various purposes, including even server-technical ones. So even if a change to a more technically justified incremental rating system, like Glicko, is too much to ask for right now, at the very least, I don't think that, for optimization purposes, it would be too much to ask for less-involved changes, like different system parameters or starting ratings.
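
For readers unfamiliar with the term, an "incremental" system updates ratings game by game from the two current ratings alone, without revisiting earlier games. Below is a minimal sketch of the textbook Elo update, shown only to illustrate the idea; the K-factor of 32 and the 1500 starting rating are placeholder assumptions, and the gameroom's actual formula and parameters are not reproduced here.

```python
def expected_score(r_a, r_b):
    """Elo expected score of a player rated r_a against one rated r_b."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

def elo_update(r_winner, r_loser, k=32.0):
    """Return the new (winner, loser) ratings after one decisive game.
    `k` is the tunable parameter that controls how fast ratings move."""
    delta = k * (1.0 - expected_score(r_winner, r_loser))
    return r_winner + delta, r_loser - delta

START = 1500.0  # placeholder starting rating, another tunable choice
print(elo_update(START, START))   # two newcomers, evenly matched on paper
print(elo_update(1700.0, 1500.0)) # favourite wins: small adjustment
print(elo_update(1500.0, 1700.0)) # underdog wins: large adjustment
```

Parameters such as `k` and the starting rating are exactly the kind of knobs that could be tuned against a set of reference ratings without changing the system itself.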

aaaa
Forum Guru

Arimaa player #958

Posts: 768

Re: Whole History Ratings
« Reply #116 on: Aug 20th, 2011, 8:16am »

Taking the first game of a 2011 server bot as the cutoff point, I applied the WHR system (prior: 2 wins/4, variance: ca. 311 Elo^2/day) to all 160,318 earlier rated games, with the modification that for each fixed-performance bot, its games, from its point of view, take place in one perpetual rating period. What follows are their performances relative to the median (retroactive) starting rating of a human player:

Fixed-performance bot	Elo above "beginner"	Standard deviation
bot_MarwinXP2Blitz	868	23
bot_Sharp2010P2	773	44
bot_Marwin2010P2	661	42
bot_Clueless2010P2	660	53
bot_Bomb2005P2	652	7
bot_GnoBot2010P2	608	56
bot_Clueless2008P2	554	46
bot_Clueless2009P2	553	40
bot_Clueless2007P2	549	13
bot_Clueless2005P2	520	12
bot_PragmaticTheory2010P2	465	55
bot_Clueless2009P1	455	45
bot_Sharp2010P1	444	58
bot_Clueless2006P1	435	11
bot_Clueless2006P2	434	16
bot_GnoBot2006P2	427	22
bot_OpFor2008P2	412	10
bot_Clueless2010P1	402	61
bot_Clueless2007P1	396	9
bot_GnoBot2005P2	388	8
bot_Clueless2005P1	374	8
bot_Bomb2005P1	358	5
bot_Badger2010P2	357	71
bot_Loc2006P2	321	14
bot_Marwin2010P1	320	67
bot_Clueless2008P1	312	54
bot_Loc2007P2	311	9
bot_Aamira2006P2	292	7
bot_Sharp2008P2	276	8
bot_PragmaticTheory2010P1	250	69
bot_Arimaazilla	213	4
bot_Loc2005P2	184	13
bot_OpFor2008P1	181	6
bot_Bomb2005P3	147	168
bot_GnoBot2010P1	96	103
bot_Loc2007P1	79	6
bot_Loc2006P1	62	8
bot_Badger2010P1	37	82
bot_Loc2005P1	20	8
bot_GnoBot2005P1	7	5
bot_GnoBot2006P1	3	13
bot_Rat2009P1	0	171
bot_Rat2009P2	0	171
bot_ArimaaScoreP3	-13	165
bot_Sharp2008P1	-48	7
bot_Arimaalon	-66	7
bot_ShallowBlue	-68	6
bot_ArimaaScoreP2	-88	5
bot_Aamira2006P1	-90	6
bot_ArimaaScoreP1	-166	4
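
To read the table in terms of game outcomes, an Elo difference can be converted into an expected score with the standard logistic Elo formula. The sketch below does this for a few rows copied from the table; it ignores the listed standard deviations, so the results are point estimates only.

```python
def expected_score(elo_diff):
    """Expected score for the side that is `elo_diff` Elo stronger
    (standard Elo logistic curve with a 400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

# Elo above "beginner", copied from a few rows of the table above.
bots = {
    "bot_MarwinXP2Blitz": 868,
    "bot_Bomb2005P2": 652,
    "bot_Arimaazilla": 213,
    "bot_Rat2009P1": 0,
    "bot_ArimaaScoreP1": -166,
}

for name, diff in bots.items():
    print(f"{name:<20} {diff:>5} Elo -> "
          f"{expected_score(diff):.1%} expected score vs. a median beginner")
```

On this scale the strongest fixed-performance bot is expected to win more than 99% of its games against a median beginner, while bot_ArimaaScoreP1 is expected to win a bit more than a quarter.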

« Last Edit: Aug 20th, 2011, 8:21am by aaaa »

aaaa
Forum Guru

Arimaa player #958

Posts: 768

Re: Whole History Ratings
« Reply #117 on: Aug 21st, 2012, 11:56pm »

woh, would it be too much trouble to have the peak ratings be links to (the comment pages of) the games they correspond to? That would be a really great feature to have.
