Arimaa Forum « Whole History Ratings »


   Author  Topic: Whole History Ratings  (Read 67394 times)
woh
Forum Guru
*****



Arimaa player #2128

   


Gender: male
Posts: 254
Re: Whole History Ratings
« Reply #90 on: Apr 22nd, 2010, 3:14am »

Weirdo87, you have a 6-1 win/loss record in event games, including a 1-1 record against the player ranked 22nd and wins against the players ranked 32nd, 53rd and 55th. With such a record I think it is quite acceptable to be in 16th position.

aaaa
Forum Guru
*****



Arimaa player #958

   


Posts: 768
Re: Whole History Ratings
« Reply #91 on: Apr 22nd, 2010, 5:54pm »

I've managed to implement this system myself and tried to find out what optimized values for the parameters might look like. I'm getting about 1.4 (+/- 0.1) wins/losses for the prior and roughly 235 (+/- 20) Elo^2/day for the variance of the Wiener process. The latter value in particular may look like the result of overfitting (since a highly flexible rating could be tailored separately to each game result), but performance was measured through cross-validation.
If any "official" adoption of these values is contingent on my supplying more details, I'll gladly do so.
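
For a sense of scale: under a Wiener process the variance of a rating change grows linearly with elapsed time, so the standard deviation grows with the square root of it. A minimal Python sketch of that arithmetic follows; the function name is purely illustrative and is not taken from any actual WHR implementation.

```python
import math

def rating_drift_sd(variance_per_day, days):
    """Standard deviation (in Elo) of the rating change a Wiener process
    with the given variance per day allows over `days` days."""
    return math.sqrt(variance_per_day * days)

# Drift allowed by the 235 Elo^2/day figure over a few time spans.
for days in (1, 30, 365):
    print(days, "days:", round(rating_drift_sd(235.0, days), 1), "Elo")
```

With 235 Elo^2/day this comes to roughly 15 Elo of standard deviation after one day and roughly 84 Elo after 30 days.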
omar
Forum Guru
*****



Arimaa player #2

   


Gender: male
Posts: 1003
Re: Whole History Ratings
« Reply #92 on: Apr 24th, 2010, 6:00am »

on Apr 22nd, 2010, 5:54pm, aaaa wrote:
I've managed to implement this system myself and tried to find out what optimized values for the parameters might look like. I'm getting about 1.4 (+/- 0.1) wins/losses for the prior and roughly 235 (+/- 20) Elo^2/day for the variance of the Wiener process. The latter value in particular may look like the result of overfitting (since a highly flexible rating could be tailored separately to each game result), but performance was measured through cross-validation.
If any "official" adoption of these values is contingent on my supplying more details, I'll gladly do so.

 
Yes, I am interested to know more about the experiments that you have tried. I will probably have lots of questions. I'll contact you by email.
aaaa
Forum Guru
*****



Arimaa player #958

   


Posts: 768
Re: Whole History Ratings
« Reply #93 on: Apr 25th, 2010, 10:44pm »

It seems to me that it would be better to just discuss it here. Anyway, the following elaboration should hopefully give a complete picture; if not, just post questions here.
 
First of all, the games considered are, obviously, all the rated ones involving only humans.
At the start of a run, each game that is either the earliest or the latest for either player is set aside. The remaining ones are randomly divided into 10 subsamples.
A Nelder–Mead process then tries to home in on the best combination of the two aforementioned parameters (starting with a random triangle of initial guesses).
To evaluate a pair of values, each of the 10 subsamples is, in turn, omitted when the parameters and games are fed into the Whole-History Rating system. Performance is then measured by how well the system predicts the omitted games (which constitute the validation set). These per-subsample errors are summed to give the total error for the pair of parameters in question. This is essentially 10-fold cross-validation, except that games at an extreme time point for any player always have to be part of the training set, so that ratings at time points not occurring in the system can always be interpolated (as described on the last page of the paper).
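
The splitting scheme described so far could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the code actually used: the `(day, player_a, player_b, result)` tuple layout, the function names, and the choice to pin every game played on a player's first or last day (rather than a single game) are all hypothetical.

```python
import random

def split_for_cross_validation(games, k=10, seed=0):
    """Return (pinned, folds) as lists of indices into `games`.

    games: list of (day, player_a, player_b, result) tuples (assumed layout).
    Games played on the first or last day of either participant are pinned
    to the training set; the rest are shuffled into k folds.
    """
    first_day, last_day = {}, {}
    for day, a, b, _result in games:
        for p in (a, b):
            first_day[p] = min(day, first_day.get(p, day))
            last_day[p] = max(day, last_day.get(p, day))

    pinned, rest = [], []
    for idx, (day, a, b, _result) in enumerate(games):
        extreme = any(day == first_day[p] or day == last_day[p] for p in (a, b))
        (pinned if extreme else rest).append(idx)

    random.Random(seed).shuffle(rest)
    folds = [rest[i::k] for i in range(k)]  # some folds may be empty for tiny inputs
    return pinned, folds

def training_and_validation(pinned, folds):
    """Yield (training_indices, validation_indices), one pair per fold."""
    for held_out, validation in enumerate(folds):
        training = pinned + [i for f, fold in enumerate(folds)
                             if f != held_out for i in fold]
        yield training, validation
```

Each yielded training set would be fed to the rating system together with the candidate parameters, and the matching validation set scored afterwards.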
Finally, the error of a game set is calculated by summing, over its games, the value of "(r-1)*log(1-e)-r*log(e)", where 'e' is the expected outcome (as calculated with the usual logistic formula) and 'r' the actual one. This seems to me to be the right formula to use, as it makes minimizing the error coincide with maximizing the likelihood of the parameters with respect to the data.
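
As a sketch, the error term is just the negative log-likelihood of a single game result. The Python below assumes the standard Elo logistic curve with a 400-point scale for 'e'; that scale and all function names are assumptions made for illustration.

```python
import math

def expected_score(rating_a, rating_b):
    """Expected outcome for player A under the usual Elo logistic curve
    (400-point scale assumed here)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

def game_error(e, r):
    """(r-1)*log(1-e) - r*log(e): negative log-likelihood of result r
    (1 = win, 0 = loss) given expectation e. Requires 0 < e < 1."""
    return (r - 1) * math.log(1 - e) - r * math.log(e)

def total_error(predictions):
    """Sum the per-game error over (expected, actual) pairs."""
    return sum(game_error(e, r) for e, r in predictions)
```

For r = 1 the term reduces to -log(e) and for r = 0 to -log(1-e), so a confident wrong prediction is penalized far more heavily than a hesitant one.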
 
Running the test various times led me to the figures I gave earlier.
omar
Forum Guru
*****



Arimaa player #2

   


Gender: male
Posts: 1003
Re: Whole History Ratings
« Reply #94 on: Apr 26th, 2010, 3:42pm »

Thanks, aaaa. Could you send me the code you used to run these tests? I'd like to try it out with different data sets.
aaaa
Forum Guru
*****



Arimaa player #958

   


Posts: 768
Re: Whole History Ratings
« Reply #95 on: Aug 20th, 2010, 5:25pm »

woh, could you add a statistic that shows the geometric mean likelihood of a game outcome for the rating systems? You can calculate it efficiently by calculating the arithmetic mean log-likelihood and raising e to it.
 
I also notice that "Predict Percentage" is a misnomer for the shown values, as they are unaltered ratios.
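
The geometric mean likelihood requested here can be computed as described, by exponentiating the arithmetic mean of the log-likelihoods. A minimal Python sketch, where the input is assumed to be the probability each system assigned to the outcome that actually occurred (the function name is illustrative):

```python
import math

def geometric_mean_likelihood(likelihoods):
    """Geometric mean of per-game likelihoods, computed via the mean log
    to avoid underflow from multiplying many small probabilities.
    Each likelihood must be strictly positive."""
    mean_log = sum(math.log(p) for p in likelihoods) / len(likelihoods)
    return math.exp(mean_log)
```

A system that always predicts 50% scores exactly 0.5 on this measure, which makes it a convenient baseline.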
woh
Forum Guru
*****



Arimaa player #2128

   


Gender: male
Posts: 254
Re: Whole History Ratings
« Reply #96 on: Aug 26th, 2010, 6:31am »

on Aug 20th, 2010, 5:25pm, aaaa wrote:
woh, could you add a statistic that shows the geometric mean likelihood of a game outcome for the rating systems?

 
Hi aaaa,
It might take a while before I can spend some time on this.
 
on Aug 20th, 2010, 5:25pm, aaaa wrote:
I also notice that "Predict Percentage" is a misnomer for the shown values, as they are unaltered ratios.

 
What do you mean by 'unaltered ratios'?

aaaa
Forum Guru
*****



Arimaa player #958

   


Posts: 768
Re: Whole History Ratings
« Reply #97 on: Aug 26th, 2010, 9:15am »

on Aug 26th, 2010, 6:31am, woh wrote:
What do you mean by 'unaltered ratios'?

Well, I assume that the respective systems are correctly predicting the winners of 83% and 75% of the games, not merely 0.83% and 0.75%.
 
In response to your hesitancy to accept any optimized parameters because of the changing nature of Arimaa players, I've changed the evaluation so that predictions of game outcomes are weighted by how recent the games are. The weights decay exponentially in such a way that games at the median time point have half the weight of the newest game (which of course wouldn't itself be evaluated, because of the interpolation restriction mentioned earlier, but you get the idea).
Let me know whether this would satisfy you enough to adopt the figures that come out of this setup.
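
One way to read the weighting scheme is as a half-life equal to the gap between the newest game and the median game time. The Python sketch below follows that reading; the function name, the use of day numbers as time points, and the handling of the degenerate case are assumptions for illustration only.

```python
import statistics

def recency_weights(days):
    """Exponentially decaying weights: 1.0 for the newest game,
    0.5 for a game at the median time point, 0.25 twice as far back."""
    newest = max(days)
    half_life = newest - statistics.median(days)
    if half_life == 0:
        # Degenerate case (assumed handling): median equals newest.
        return [1.0] * len(days)
    return [0.5 ** ((newest - d) / half_life) for d in days]
```

Each game's prediction error would then be multiplied by its weight before being summed into the total.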
woh
Forum Guru
*****



Arimaa player #2128

   


Gender: male
Posts: 254
Re: Whole History Ratings
« Reply #98 on: Aug 27th, 2010, 10:10am »

on Aug 26th, 2010, 9:15am, aaaa wrote:

Well, I assume that the respective systems are correctly predicting the winners of 83% and 75% of the games, not merely 0.83% and 0.75%.

 
OK, thanks aaaa.
Now I've got it.
My bad, I should have given it some more thought.
 
I have changed the title.

woh
Forum Guru
*****



Arimaa player #2128

   


Gender: male
Posts: 254
Re: Whole History Ratings
« Reply #99 on: Aug 27th, 2010, 10:12am »

on Aug 20th, 2010, 5:25pm, aaaa wrote:
woh, could you add a statistic that shows the geometric mean likelihood of a game outcome for the rating systems?

 
Added.

Tuks
Forum Guru
*****



Arimaa player #2626

   


Gender: male
Posts: 203
Re: Whole History Ratings
« Reply #100 on: Aug 30th, 2010, 2:53pm »

Is it possible to have the graphs that janzert made use the WHR ratings instead of the gameroom rating? The graphs are a very cool feature, but because they use the gameroom rating, some players have rather strange and inaccurate graphs, like omar for example, or other players who went through phases of bot bashing or testing.
 
That would actually show accurate progression for each player.
aaaa
Forum Guru
*****



Arimaa player #958

   


Posts: 768
Re: Whole History Ratings
« Reply #101 on: Sep 3rd, 2010, 3:51pm »
Quote Quote Modify Modify

I'm going to recommend that the parameters be set to 1.3 and 200. Unlike with the prior, I cannot justify giving anything other than a round value for the increase in variance, given how imprecise the optimization method is with regard to this parameter.
Perhaps the switch could be made immediately, while keeping a separate temporary page with the old parameter values for the duration of the competition season.
 
Also, I think it would be nice to have a column next to that of the peak ratings that shows the date on which each peak was reached.
 
Thanks.
omar
Forum Guru
*****



Arimaa player #2

   


Gender: male
Posts: 1003
Re: Whole History Ratings
« Reply #102 on: Sep 15th, 2010, 7:09pm »

on Aug 30th, 2010, 2:53pm, Tuks wrote:
Is it possible to have the graphs that janzert made use the WHR ratings instead of the gameroom rating? The graphs are a very cool feature, but because they use the gameroom rating, some players have rather strange and inaccurate graphs, like omar for example, or other players who went through phases of bot bashing or testing.
 
That would actually show accurate progression for each player.

 
The WHR system does not keep a static rating that is updated for the two players after each game. It computes the ratings of all the players at once. So woh would have to store the daily WHR snapshots in a database and make them available as a web service in order to do this.
aaaa
Forum Guru
*****



Arimaa player #958

   


Posts: 768
Re: Whole History Ratings
« Reply #103 on: Dec 22nd, 2010, 5:02pm »

In light of the upcoming use of WHR ratings for the seeding of the championship, and given that a change of parameters shouldn't be disruptive right now, I would like to reiterate my request to change them to values that have at least some empirical basis. The latest values I'm getting (with exponential decay) are 1.3 and 170.
Adanac
Forum Guru
*****



Arimaa player #892

   

Gender: male
Posts: 635
Re: Whole History Ratings
« Reply #104 on: Jan 13th, 2011, 1:59pm »

Fritzlein has beaten me in 10 straight games, but somehow I just passed him in WHR. I'll take a screenshot of this miraculous event in case it never happens again!

