Author |
Topic: Whole History Ratings (Read 67266 times) |
|
omar
Forum Guru
Arimaa player #2
Posts: 1003
|
|
Re: Whole History Ratings
« Reply #45 on: Mar 26th, 2009, 2:31pm » |
|
on Mar 21st, 2009, 6:47am, woh wrote: Omar, at this moment the source of the WHR rating tool is the Arimaa game archive. This archive is only updated on a weekly basis. To make daily updates available I would need another source. What would you suggest? The tool is a Windows executable. Can you run this on the arimaa.com server? |
| I've set it up to run every day now. The earliest time for you to pick up the new data would be 9:10 am GMT.
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: Whole History Ratings
« Reply #46 on: Mar 31st, 2009, 3:03pm » |
|
Woh, your graphs of player ratings over time made me think of an excellent application of WHR. In the Arimaa article on Wikipedia, I have included the ranks of the human defenders of the Challenge. However, I always felt odd that the ranks were based on the inaccurate game room ratings. For known duplicate accounts I included only the account for which the most human games had been played, but still there were many ratings distorted by bot bashing in the actual ratings I used.

One feature of WHR is that we don't just have to use them going forward; we can use information after a point in time to retroactively improve our guess of a player's skill level at that time. We are now in a position to more accurately rank the players at the times they played the Challenge games of past years. If you have time to extract the historical data, I would love to update the Wikipedia article with more realistic rankings. (Incidentally, for this purpose the level of the anchor doesn't matter, since I only want relative positions of the players.)

The dates of interest would be the starting dates of each challenge, namely
February 2, 2004 (omar)
February 7, 2005 (Belbo)
February 5, 2006 (Fritzlein, Adanac, PMertens)
February 11, 2007 (Fritzlein, Brendan, omar, naveed)
April 6, 2008 (chessandgo, Adanac, mistre, omar)

Thanks in advance if you have time for this project!
|
|
|
|
|
woh
Forum Guru
Arimaa player #2128
Posts: 254
|
|
Re: Whole History Ratings
« Reply #47 on: Apr 2nd, 2009, 9:21am » |
|
on Mar 26th, 2009, 2:31pm, omar wrote: I've set it up to run every day now. The earliest time for you to pick up the new data would be 9:10 am GMT. |
| Thank you very much, Omar! It is much faster to generate the new rankings if the game archive is up to date. The rankings are now updated daily around 5 PM GMT.
|
|
|
|
|
woh
Forum Guru
Arimaa player #2128
Posts: 254
|
|
Re: Whole History Ratings
« Reply #48 on: Apr 2nd, 2009, 9:29am » |
|
on Mar 31st, 2009, 3:03pm, Fritzlein wrote: If you have time to extract the historical data, I would love to update the Wikipedia article with more realistic rankings. |
| It was not much trouble to generate these rankings. So here are the results. Follow the links for the full rankings on those dates.
February 2, 2004: omar #1
February 7, 2005: Belbo #5
February 5, 2006: Fritzlein #1, Adanac #2, PMertens #5
February 11, 2007: Fritzlein #1, Brendan #12, omar #9, naveed #23
April 6, 2008: chessandgo #2, Adanac #3, mistre #20, omar #24
|
« Last Edit: Apr 2nd, 2009, 9:33am by woh » |
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: Whole History Ratings
« Reply #49 on: Apr 2nd, 2009, 10:49am » |
|
Thanks again for all your work on this, woh. I have updated the Wikipedia page. I knew that #1/#2 would be close between chessandgo and myself for the 2008 Challenge; by your results I was a whopping 7 rating points ahead. Another interesting side note is that the early-2008 ratings have the two of us a little over 300 points ahead of Adanac at #3, whereas now chessandgo is only 160 points ahead of him and I'm only 90 ahead. Sounds about right given the results then and now.
|
|
|
|
|
mistre
Forum Guru
Posts: 553
|
|
Re: Whole History Ratings
« Reply #50 on: Apr 3rd, 2009, 6:06am » |
|
This might be asking too much, but would it be possible to further separate the WHR for postal vs live games, if just for a one-time analysis rather than an ongoing rating? I have no doubt that I am a stronger player postally than I am live, but it would be interesting to see by how much.
|
|
|
|
|
woh
Forum Guru
Arimaa player #2128
Posts: 254
|
|
Re: Whole History Ratings
« Reply #51 on: Oct 30th, 2009, 10:01am » |
|
on Apr 3rd, 2009, 6:06am, mistre wrote: I have no doubt that I am a stronger player postally than I am live, but it would be interesting to see by how much. |
| Your ranking based on the postal games only is 5 positions higher than your ranking based on all games (18 vs 23). Other players with a better postal ranking (all-games ranking in parentheses) include:
Fritzlein 1 (2)
99of9 3 (7)
jdb 5 (15)
camelback 7 (13)
omar 11 (16)
ChrisB 13 (17)
Simon 14 (46) !
OLTI 15 (30)
Tuks 17 (20)
And me? I move in the opposite direction, dropping from 38 to 49.
|
|
|
|
|
aaaa
Forum Guru
Arimaa player #958
Posts: 768
|
|
Re: Whole History Ratings
« Reply #52 on: Oct 30th, 2009, 3:22pm » |
|
Would you be willing to apply your rating system to rated games involving only developer bots?
|
|
|
|
|
Simon
Forum Guru
Arimaa player #1198
Posts: 125
|
|
Re: Whole History Ratings
« Reply #53 on: Oct 30th, 2009, 7:36pm » |
|
There would be a much larger discrepancy in my ratings if you compared my postal rating with my rating for live games only, as my record is (not counting my accidental resignation against ChrisB) 4-0 in H. v. H postal games and 1-5 in H. v. H live games.

One thing I am wondering about whole history ratings is how the prior works. I take it that there is an imaginary pair of games, one won and one lost, against an opponent with a standard rating. But when is this imaginary pair of games supposed to have occurred? If it is taken to have occurred far in the past, say when a player joined or played their first game, then the effects of the prior would decay away over time, resulting in inflated ratings for players (such as myself) who have a long time gap between their most recent win and their most recent loss or first entry into the system. The way it ought to work is that the prior games are taken to have occurred at the present moment, i.e. at the moment the ratings are calculated. I am not sure if that is how it does work, however.

Edit: or maybe simultaneous with that player's most recent game. Otherwise the ratings of inactive players would tend to move towards the standard rating as they remain inactive. And, actually, maybe even that version could be problematic because a win against a weak player by a long-inactive player would likely result in a sudden rating drop for the winner. On the other hand, continued playing would smooth things out... though maybe the rating convergence of inactive players wouldn't be a bad thing; it would quickly be corrected by resumed play and would encourage (higher-rated than the standard rating) players to remain active.

Further edit: making the prior games occur at the moment the ratings are calculated would also tend to make players who play infrequently have ratings closer to the standard rating than players with similar skill who play frequently (in contrast to the system of having prior games at the beginning, which results in exaggerated ratings for players who play infrequently, particularly if on a winning or losing streak). One possibility might be to have a fraction of a prior pair of games for every game a player plays. In order to avoid making players who play too many uneven games get a rating too close to the standard rating, the weight of the prior game pair could be adjusted based on the expected absolute value of rating change for the player from that game, or some similar metric.
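
To make the virtual-game prior discussed above concrete, here is a minimal sketch under simplifying assumptions: a single static rating per player (not WHR's time-varying model), known opponent ratings, and the usual 400-point logistic Elo scale. The function names and the `prior_weight` parameter are hypothetical, and nothing here is claimed to match how woh's tool works. It only shows how one virtual win and one virtual loss against a 1500 opponent keep an undefeated record from producing an unbounded rating, and how shrinking the weight of that pair loosens the pull toward 1500.

```python
def expected_score(rating, opponent):
    """Logistic (Elo-scale) probability that `rating` beats `opponent`."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def map_rating(games, prior_rating=1500.0, prior_weight=1.0):
    """Most likely static rating given real games plus a virtual win/loss pair.

    games: list of (opponent_rating, score), score 1.0 for a win, 0.0 for a loss.
    prior_weight: hypothetical knob; 1.0 means one full virtual win and one
    full virtual loss against `prior_rating`.
    """
    weighted = [(opp, score, 1.0) for opp, score in games]
    weighted.append((prior_rating, 1.0, prior_weight))  # virtual win
    weighted.append((prior_rating, 0.0, prior_weight))  # virtual loss

    def gradient(r):
        # Derivative of the log-likelihood (up to a positive constant):
        # actual score minus expected score, summed over all games.
        return sum(w * (s - expected_score(r, opp)) for opp, s, w in weighted)

    # The gradient is strictly decreasing in r, so bisection finds its root.
    lo, hi = -2000.0, 6000.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if gradient(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    undefeated = [(1500.0, 1.0)] * 4  # a 4-0 record against 1500-rated opponents
    print(round(map_rating(undefeated)))                     # full virtual pair
    print(round(map_rating(undefeated, prior_weight=0.25)))  # quarter-weight pair
```

With a 4-0 record against 1500-rated opponents this gives roughly 1780 with a full virtual pair and roughly 1992 with a quarter-weight pair, which is the trade-off behind the "fraction of a prior pair" idea: a weaker prior lets short winning streaks swing the rating further.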
|
« Last Edit: Oct 30th, 2009, 8:11pm by Simon » |
|
|
|
woh
Forum Guru
Arimaa player #2128
Posts: 254
|
|
Re: Whole History Ratings
« Reply #54 on: Oct 31st, 2009, 5:51am » |
|
on Oct 30th, 2009, 3:22pm, aaaa wrote: Would you be willing to apply your rating system to rated games involving only developer bots? |
| I am not sure what you mean by 'developer bots'. Do you mean all BvB games or only games between some particular bots?
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: Whole History Ratings
« Reply #55 on: Oct 31st, 2009, 6:24am » |
|
Like Simon, I am curious how the prior is applied relative to time-varying ratings. I had assumed that the win and loss against a 1500 player would be coincident with each player's first real game, and the effect of the prior would damp out over time, but now I see that this could result in a player with a one-game history eventually having a weaker prior than a player with no game history, which would be odd. If that is what happens, how long does it take for it to happen?
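
As a rough illustration of the damping effect raised here, the sketch below assumes that ratings drift as a Wiener process (the model described in the WHR paper) and treats everything as Gaussian. Under that approximation, information anchored at a player's first game is blurred by an extra variance of w² per day when it is used to estimate a later rating. The drift value `w_per_sqrt_day`, the function names, and the placement of the prior at the first game are all assumptions for illustration; this does not describe how woh's tool actually handles the prior.

```python
import math

ELO_SCALE = math.log(10) / 400.0  # converts Elo points to natural-log units


def virtual_pair_sd():
    """Approximate Elo standard deviation implied by one virtual win plus one
    virtual loss against an equal opponent (Gaussian approximation)."""
    # Each game against an equal opponent contributes p * (1 - p) = 0.25
    # units of information on the natural-log scale.
    information = 2 * 0.25 * ELO_SCALE ** 2
    return 1.0 / math.sqrt(information)


def prior_sd_after(days, w_per_sqrt_day, sd_at_first_game=None):
    """Standard deviation of the prior's constraint `days` after the first game.

    w_per_sqrt_day is a hypothetical drift parameter in Elo per sqrt(day).
    """
    sd0 = virtual_pair_sd() if sd_at_first_game is None else sd_at_first_game
    # Wiener process: variance grows linearly with elapsed time.
    return math.sqrt(sd0 ** 2 + w_per_sqrt_day ** 2 * days)


if __name__ == "__main__":
    for days in (0, 30, 365, 3 * 365):
        print(days, round(prior_sd_after(days, w_per_sqrt_day=10.0)))
```

The larger the printed standard deviation, the weaker the prior's hold on the current rating. How quickly that happens depends entirely on the drift parameter, which is why the answer to "how long does it take" hinges on the value the tool uses.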
|
|
|
|
|
aaaa
Forum Guru
Arimaa player #958
Posts: 768
|
|
Re: Whole History Ratings
« Reply #56 on: Oct 31st, 2009, 8:55am » |
|
on Oct 31st, 2009, 5:51am, woh wrote: I am not sure what you mean by 'developer bots'. Do you mean all BvB games or only games between some particular bots? |
| Games between bots not hosted on the server, i.e. those not listed here.
|
|
|
|
|
woh
Forum Guru
Arimaa player #2128
Posts: 254
|
|
Re: Whole History Ratings
« Reply #57 on: Nov 6th, 2009, 8:16am » |
|
on Oct 31st, 2009, 8:55am, aaaa wrote: Games between bots not hosted on the server, i.e. those not listed here. |
| I generated the rankings based on those games.
|
|
|
|
|
Janzert
Forum Guru
Arimaa player #247
Posts: 1016
|
|
Re: Whole History Ratings
« Reply #58 on: Nov 6th, 2009, 8:27am » |
|
on Nov 6th, 2009, 8:16am, woh wrote: I generated the rankings based on those games. |
| Hmm, very interesting. Quite different than I expected. What games were included? Thanks for doing this, Janzert
|
|
|
|
|
woh
Forum Guru
Arimaa player #2128
Posts: 254
|
|
Re: Whole History Ratings
« Reply #59 on: Nov 6th, 2009, 8:34am » |
|
on Nov 6th, 2009, 8:27am, Janzert wrote: What games were included? |
| All rated games between two bots, neither of which is listed on the 'Arimaa Bots Available to Play' page.
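
The selection rule described above could be sketched as follows. The `Game` record, its field names, and the bot names in the example are hypothetical; the real game archive may store this information differently, and the list of server-hosted bots would have to come from the 'Arimaa Bots Available to Play' page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Game:
    # Hypothetical record layout, for illustration only.
    gold: str
    silver: str
    gold_is_bot: bool
    silver_is_bot: bool
    rated: bool


def developer_bot_games(games, server_hosted_bots):
    """Keep rated bot-vs-bot games in which neither bot is server-hosted."""
    hosted = set(server_hosted_bots)
    return [
        g for g in games
        if g.rated
        and g.gold_is_bot and g.silver_is_bot
        and g.gold not in hosted and g.silver not in hosted
    ]


if __name__ == "__main__":
    games = [
        Game("bot_dev_a", "bot_dev_b", True, True, True),     # kept
        Game("bot_dev_a", "bot_hosted_x", True, True, True),  # one bot is hosted
        Game("bot_dev_a", "human_1", True, False, True),      # not bot vs bot
        Game("bot_dev_a", "bot_dev_b", True, True, False),    # unrated
    ]
    print(developer_bot_games(games, ["bot_hosted_x"]))
```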
|
|
|
|
|
|