Arimaa Forum - Whole History Ratings


Topic: Whole History Ratings (Read 67394 times)

woh
Forum Guru

Arimaa player #2128

Gender: male

Posts: 254

Re: Whole History Ratings
« Reply #90 on: Apr 22nd, 2010, 3:14am »

Weirdo87, you have a 6-1 win/loss record on event games, including a 1-1 record against the player ranked 22nd and wins against the players ranked 32nd, 53rd and 55th. With such a record I think it is quite acceptable to be in 16th position.

aaaa
Forum Guru

Arimaa player #958

Posts: 768

Re: Whole History Ratings
« Reply #91 on: Apr 22nd, 2010, 5:54pm »

I've managed to implement this system myself and tried to discover what optimized values for the parameters might look like. I'm getting about 1.4 (+/- 0.1) wins/losses for the prior and roughly 235 (+/- 20) Elo^2/day for the variance of the Wiener process. The latter value in particular may look like the result of overfitting, since a highly flexible rating could be tailored separately to each game result, but performance was measured through cross-validation.
If any "official" adoption of these values is contingent on me supplying more details, I'll gladly do so.
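To give a feel for what a Wiener variance in Elo^2/day means, here is a small sketch. The function names are made up for illustration, and the conversion assumes the usual 400-point base-10 Elo scale; it is not taken from the implementation described above.

```python
import math

def drift_std_elo(variance_per_day, days):
    """Standard deviation (in Elo) of the rating drift allowed over `days` days.

    For a Wiener process the variance grows linearly with elapsed time.
    """
    return math.sqrt(variance_per_day * days)

def to_natural_variance(variance_elo2):
    """Convert a variance from Elo^2 to natural-rating units.

    Assumes Elo = r * 400 / ln(10), where r is the natural (log-strength) rating.
    """
    scale = math.log(10) / 400.0
    return variance_elo2 * scale ** 2

# With 235 Elo^2/day, a rating may drift by roughly 15 Elo (one standard
# deviation) in a day and roughly 84 Elo in 30 days.
one_day = drift_std_elo(235, 1)
one_month = drift_std_elo(235, 30)
```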

omar
Forum Guru

Arimaa player #2

Gender: male

Posts: 1003

Re: Whole History Ratings
« Reply #92 on: Apr 24th, 2010, 6:00am »

on Apr 22nd, 2010, 5:54pm, aaaa wrote:

Yes, I am interested to know more about the experiments that you have tried. I will probably have lots of questions. I'll contact you by email.

aaaa
Forum Guru

Arimaa player #958

Posts: 768

Re: Whole History Ratings
« Reply #93 on: Apr 25th, 2010, 10:44pm »

It seems to me that it would be better to just discuss it here. Anyway, the following elaboration should hopefully give a complete picture; if not, just post questions here.

First of all, the games considered are, obviously, all the rated ones involving only humans.
At the start of a run, each game that is the earliest or the latest game of either of its players is set aside. The remaining games are randomly divided into 10 subsamples.
A Nelder–Mead process then tries to home in on the best combination of the two aforementioned parameters (starting with a random triangle of initial guesses).
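A minimal pure-Python sketch of such a Nelder–Mead search over two parameters follows. It is not the code actually used for these experiments. `toy_error` is a hypothetical stand-in whose minimum is placed at (1.4, 235) purely for illustration; in the setup described here, the objective would be the summed cross-validation error.

```python
import random

def nelder_mead(f, simplex, iterations=500):
    """Minimize f, starting from a list of points (the initial simplex)."""
    pts = [list(p) for p in simplex]
    dim = len(pts[0])
    for _ in range(iterations):
        pts.sort(key=f)  # best point first, worst point last
        best, second_worst, worst = pts[0], pts[-2], pts[-1]
        # Centroid of every point except the worst one.
        centroid = [sum(p[i] for p in pts[:-1]) / (len(pts) - 1) for i in range(dim)]
        reflected = [c + (c - w) for c, w in zip(centroid, worst)]
        if f(reflected) < f(best):
            # The reflection is the new best: try going twice as far.
            expanded = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            pts[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(second_worst):
            pts[-1] = reflected
        else:
            # Pull the worst point halfway toward the centroid.
            contracted = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            if f(contracted) < f(worst):
                pts[-1] = contracted
            else:
                # Shrink every other point halfway toward the best one.
                pts = [best] + [[b + 0.5 * (x - b) for b, x in zip(best, p)]
                                for p in pts[1:]]
    return min(pts, key=f)

def toy_error(params):
    """Hypothetical smooth objective standing in for the cross-validation error."""
    prior, variance = params
    return (prior - 1.4) ** 2 + ((variance - 235.0) / 100.0) ** 2

# Random triangle of initial guesses (the ranges are arbitrary assumptions).
rng = random.Random(0)
start = [[rng.uniform(0.5, 3.0), rng.uniform(50.0, 500.0)] for _ in range(3)]
best_params = nelder_mead(toy_error, start)
```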
To evaluate a pair of values, each of the 10 subsamples is omitted in turn when the parameters and games are fed into the Whole-History Rating system. Performance is then measured by how well the system predicts the omitted games (the validation set), and the 10 resulting errors are summed to give the total error for that pair of parameters. This is essentially 10-fold cross-validation, except that the games set aside at the start (those at an extreme time point for any player) always stay in the training set, so that ratings at time points not occurring in the system can always be interpolated (as given on the last page of the paper).
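The splitting and evaluation loop can be sketched roughly as below. The game tuple layout `(time, player_a, player_b, result)` and the `fit_and_score` callback (standing in for fitting WHR on the training games and scoring the held-out ones) are assumptions for illustration, not details given in this thread.

```python
import random

def split_games(games, folds=10, seed=0):
    """Return (anchored, subsamples) as lists of indices into `games`.

    anchored: each player's earliest and latest game, always kept for training.
    subsamples: the remaining games, randomly divided into `folds` groups.
    """
    first, last = {}, {}
    for idx, (t, a, b, _result) in enumerate(games):
        for player in (a, b):
            if player not in first or t < games[first[player]][0]:
                first[player] = idx
            if player not in last or t > games[last[player]][0]:
                last[player] = idx
    anchored = set(first.values()) | set(last.values())
    rest = [i for i in range(len(games)) if i not in anchored]
    random.Random(seed).shuffle(rest)
    subsamples = [rest[k::folds] for k in range(folds)]
    return sorted(anchored), subsamples

def cross_validation_error(games, params, fit_and_score, folds=10, seed=0):
    """Sum the held-out error over all folds for one parameter pair."""
    anchored, subsamples = split_games(games, folds, seed)
    total = 0.0
    for k, held_out in enumerate(subsamples):
        training = anchored + [i for j, s in enumerate(subsamples) if j != k for i in s]
        total += fit_and_score(params,
                               [games[i] for i in training],
                               [games[i] for i in held_out])
    return total
```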
Finally, the error of a game set is calculated by adding for each member the result of "(r-1)*log(1-e)-r*log(e)" where 'e' is the expected outcome (as calculated with the usual logistic formula) and 'r' the actual one. This seems to me to be the right formula to use, as it makes optimizing the parameters coincide with maximizing their likelihood with respect to the data.
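Written out as code, the error measure looks like this. The formula is the one quoted above; the natural logarithm and the 400-point Elo scale in `expected_outcome` are assumptions about what "the usual logistic formula" refers to.

```python
import math

def expected_outcome(rating_a, rating_b):
    """Expected score of player A under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

def game_error(e, r):
    """(r-1)*log(1-e) - r*log(e): the negative log-likelihood of outcome r.

    e is the expected outcome for the first player, r is 1 for a win, 0 for a loss.
    """
    return (r - 1) * math.log(1 - e) - r * math.log(e)

def set_error(games):
    """Total error over (rating_a, rating_b, result) triples."""
    return sum(game_error(expected_outcome(a, b), r) for a, b, r in games)
```

For r = 1 the expression reduces to -log(e) and for r = 0 to -log(1-e), so minimizing the summed error is the same as maximizing the likelihood of the observed results.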

Running the test several times led me to the figures I gave earlier.

omar
Forum Guru

Arimaa player #2

Gender: male

Posts: 1003

Re: Whole History Ratings
« Reply #94 on: Apr 26th, 2010, 3:42pm »

Thanks aaaa. Could you send me the code you used to run these tests? I'd like to try it out with different data sets.

aaaa
Forum Guru

Arimaa player #958

Posts: 768

Re: Whole History Ratings
« Reply #95 on: Aug 20th, 2010, 5:25pm »

woh, could you add a statistic that shows the geometric mean likelihood of a game outcome for the rating systems? You can calculate it efficiently by calculating the arithmetic mean log-likelihood and raising e to it.
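A short sketch of that calculation, assuming each prediction is stored as a pair of the predicted win probability for the first player and the actual result (this layout and the function name are only for illustration):

```python
import math

def geometric_mean_likelihood(predictions):
    """predictions: (e, r) pairs, e = predicted win probability, r = 1 or 0.

    Averages the log-likelihoods arithmetically, then raises e to the mean,
    which avoids underflow from multiplying many small probabilities.
    """
    logs = [math.log(e if r == 1 else 1.0 - e) for e, r in predictions]
    return math.exp(sum(logs) / len(logs))
```

A system that always predicts 50% scores exactly 0.5 on this statistic, so values above 0.5 indicate predictions better than a coin flip.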

I also notice that "Predict Percentage" is a misnomer for the shown values, as they are unaltered ratios.

woh
Forum Guru

Arimaa player #2128

Gender: male

Posts: 254

Re: Whole History Ratings
« Reply #96 on: Aug 26th, 2010, 6:31am »

on Aug 20th, 2010, 5:25pm, aaaa wrote:

woh, could you add a statistic that shows the geometric mean likelihood of a game outcome for the rating systems?

Hi aaaa,
It might take a while before I can spend some time on this.

on Aug 20th, 2010, 5:25pm, aaaa wrote:

I also notice that "Predict Percentage" is a misnomer for the shown values, as they are unaltered ratios.

What do you mean by 'unaltered ratios'?

aaaa
Forum Guru

Arimaa player #958

Posts: 768

Re: Whole History Ratings
« Reply #97 on: Aug 26th, 2010, 9:15am »

on Aug 26th, 2010, 6:31am, woh wrote:

What do you mean with 'unaltered ratios'?

Well, I assume that the respective systems are correctly predicting the winners of 83% and 75% of the games, not merely 0.83% and 0.75%.

In response to your hesitancy to accept any optimized parameters because of the changing nature of Arimaa players, I've changed the evaluation so that predictions of game outcomes are weighted by how recent the games are. The weights decay exponentially such that a game at the median time point has half the weight of the newest one (which of course wouldn't itself be evaluated, due to the aforementioned interpolation restriction, but you get the idea).
Tell me whether this would satisfy you enough to adopt the figures that come out of this setup.
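One way to read this weighting scheme, as a rough sketch: the half-life of the decay is the distance between the newest and the median game time. Taking the median over game times (rather than over distinct days, say) is an assumption, as is the function name.

```python
import statistics

def recency_weights(times):
    """Exponentially decaying weights: newest game -> 1.0, median-time game -> 0.5."""
    newest = max(times)
    half_life = newest - statistics.median(times)
    if half_life == 0:
        # Degenerate case: all weight sits at one time point.
        return [1.0] * len(times)
    return [0.5 ** ((newest - t) / half_life) for t in times]

# Game times in days; the game at the median day 20 gets half the weight
# of the newest game at day 40.
weights = recency_weights([0, 10, 20, 30, 40])
```

Each game's error term would then be multiplied by its weight before summing.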

woh
Forum Guru

Arimaa player #2128

Gender: male

Posts: 254

Re: Whole History Ratings
« Reply #98 on: Aug 27th, 2010, 10:10am »

on Aug 26th, 2010, 9:15am, aaaa wrote:

Well, I assume that the respective systems are correctly predicting the winners of 83% and 75% of the games, not merely 0.83% and 0.75%.

OK, thanks aaaa.
Now I've got it.
My bad, I should have given it some more thought.

I have changed the title.

woh
Forum Guru

Arimaa player #2128

Gender: male

Posts: 254

Re: Whole History Ratings
« Reply #99 on: Aug 27th, 2010, 10:12am »

on Aug 20th, 2010, 5:25pm, aaaa wrote:

woh, could you add a statistic that shows the geometric mean likelihood of a game outcome for the rating systems?

Added.

Tuks
Forum Guru

Arimaa player #2626

Gender: male

Posts: 203

Re: Whole History Ratings
« Reply #100 on: Aug 30th, 2010, 2:53pm »

Is it possible to have the graphs that janzert made use the WHR ratings instead of the gameroom rating? The graphs are a very cool feature, but because they use the gameroom rating, some players have rather strange and inaccurate graphs, like omar for example, or other players who went through phases of bot bashing or testing.

That would actually show accurate progression for each player.

aaaa
Forum Guru

Arimaa player #958

Posts: 768

Re: Whole History Ratings
« Reply #101 on: Sep 3rd, 2010, 3:51pm »

I'm going to recommend that the parameters be set to 1.3 and 200. Unlike with the prior, I cannot justify giving anything other than such a round value for the increase in variance, given how imprecise the optimization method is with regard to this parameter.
Perhaps the switch could be made immediately, while keeping a separate temporary page with the old parameter values for the duration of the competition season.

Also, I think it would be nice to have a column next to that of the peak ratings that shows the respective dates on which they were reached.

Thanks.

omar
Forum Guru

Arimaa player #2

Gender: male

Posts: 1003

Re: Whole History Ratings
« Reply #102 on: Sep 15th, 2010, 7:09pm »

on Aug 30th, 2010, 2:53pm, Tuks wrote:

The WHR system does not keep a static rating which is changed for the two players after the game. It computes the ratings of all the players at once. So woh would have to store the daily WHR snapshots into a database and make it available as a web service in order to do this.

aaaa
Forum Guru

Arimaa player #958

Posts: 768

Re: Whole History Ratings
« Reply #103 on: Dec 22nd, 2010, 5:02pm »

In light of the upcoming use of WHR ratings for the seeding of the championship, and the fact that a change of parameters shouldn't be disruptive right now, I would like to reiterate my request to change them to values that have at least some empirical basis. The latest values I'm getting (with exponential decay) are 1.3 and 170.

Adanac
Forum Guru

Arimaa player #892

Posts: 635

Re: Whole History Ratings
« Reply #104 on: Jan 13th, 2011, 1:59pm »

Fritzlein has beaten me 10 straight games but somehow I just passed him in WHR (embarrassed)

I'll take a screenshot of this miraculous event in case it never happens again!
