Arimaa Forum « Arimaa Top 25 COMPUTER Power Ranking Results »


The_Jeh
Arimaa Top 25 COMPUTER Power Ranking Results
« on: Nov 3rd, 2007, 12:10am »

With the help of Fritzlein, I was able to get data for all rated Human v. Human games (played through noon Oct 28, 2007), and I put the results through a least-squares regression model to obtain the following rankings:
 
All Players:
 
1. xabiron 1-0 2.26468
2. dethwing 2-0 1.76469
3. spela 2-0 1.36315
4. Sameer 1-2 1.26475
5. omarFast 1-0 1.13060
6. GordonBlack 1-0 1.11907
7. acroninj 2-0 1.0
7. archigavr 1-0 1.0
7. Virgeist 1-0 1.0
7. Yaron 1-0 1.0
7. pikachamp 1-0 1.0
7. BLooodyANgel 1-0 1.0
7. marcgb 1-0 1.0
7. emeryaj 1-0 1.0
7. glitch 1-0 1.0
7. Gesuma 2-0 1.0
7. brad 1-0 1.0
7. i_am_you 1-0 1.0
7. ZeroOne 1-0 1.0
7. Yzaxtol 1-0 1.0
7. mightybyte 2-0 1.0
7. Asturianuco 3-0 1.0
7. Guest5409 3-0 1.0
7. travis 1-0 1.0
25. Fritzlein 385-51 .86405
 
Now, you may notice that the list above is pretty meaningless, and I have to agree. For a better ranking, I will ignore players with fewer than 15 rated HvH games played. The first number is the rating, the second is the schedule strength. This is actually a list of all players who have played 15+ rated HvH games:
 
1. Fritzlein 385-51   .86405   .09799
2. 99of9 182-55   .71591   .18005
3. chessandgo 227-98   .56717   .17025
4. robinson 185-112   .55218   .30639
5. Adanac 105-80   .45143   .31630
6. PMertens 246-144   .41130   .14976
7. RonWeasley 49-24   .33907   -.00340
8. Belbo 60-67   .27918   .33430
9. omar 81-64   .26476   .14752
10. UltraWeak 10-5   .26066   -.07268
11. thorin 13-9   .22853   .04671
12. Paul 26-26   .21480   .21480
13. BlackKnight 7-11   .19097   .41319
14. Akhenaten 10-7   .17647   0.00000
15. clauchau 23-21   .17627   .13082
16. Ryan_Cable 76-78   .16039   .17337
17. petitprince 11-6   .14689   -.14723
18. mdk 15-13   .14515   .07372
19. naveed 117-129   .13060   .17938
20. kamikazeking 37-38   .11908   .13241
21. Brendan 30-46   .10497   .31550
22. OLTI 59-58   .02792   .01937
23. jdb 104-112   .02441   .06145
24. blue22 51-66   -0.00199   .12622
25. arimaa_master 170-83   -.04366   -.38753
26. Swynndla 72-38   -.08838   -.39747
27. nbarriga 14-9   -.16625   -.38364
28. appalachia 7-10   -.17647   0.00000
29. kerdamdam 18-18   -.20366   -.20366
30. KT2006 10-10   -.20886   .20886
31. Soter 23-7   -.20965   -.74298
32. megamau 21-41   -.23047   .09211
33. camelback 11-8   -.23434   -.39224
34. mistre 22-16   -.26178   -.41967
35. Tanker_JD 15-11   -.29379   -.44763
36. The_Jeh 13-48   -.33582   .23795
37. Arimanator 23-47   -.36339   -.02054
38. IdahoEv 31-42   -.38449   -.23381
39. purplebaron 7-11   -.42379   -.20156
40. woh 21-37   -.42449   -.14863
41. Mr. Brain 6-18   -.43030   .06970
42. H_Bobbeltoff 8-35   -.45670   .17121
43. rick 8-17   -.46878   -.10878
44. Chegorimaa 10-34   -.46912   .07633
45. seanick 62-159   -.52921   -.09030
46. frostlad 8-22   -.53164   -.06497
47. friztforpresident 6-17   -.53320   -.05494
48. Tore 10-17   -.54907   -.28981
49. grey_0x2A 3-20   -.57682   .16231
50. dtj 8-25   -.62151   -.10636
51. Keith 3-14   -.65800   -.01094
52. NIC1138 25-97   -.70551   -.11535
53. aaaa 3-19   -.82908   -.10181
54. Gregorius 0-17   -.83280   .16720
55. Slowstorm 7-20   -.84630   -.36482
56. Erezap 3-18   -.89784   -.18356
57. mentalsurge 4-13   -1.01733   -.48792
58. proselyte 7-19   -1.07071   -.60917
59. Calumet45 3-27   -1.10712   -.30712
60. Kruschak 0-17   -1.14554   -.14554
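
A quick check on the figures above suggests that each rating equals the player's net win fraction, (wins - losses) / games, plus the schedule strength. The sketch below tests that relation on the top four rows. It is an observation drawn from the posted numbers, not the program that produced them, and the helper name `net_win_fraction` is made up for illustration.

```python
# Rows copied from the 15+ game list:
# (player, wins, losses, rating, schedule strength)
rows = [
    ("Fritzlein", 385, 51, 0.86405, 0.09799),
    ("99of9", 182, 55, 0.71591, 0.18005),
    ("chessandgo", 227, 98, 0.56717, 0.17025),
    ("robinson", 185, 112, 0.55218, 0.30639),
]


def net_win_fraction(wins, losses):
    """Hypothetical helper: average result, counting a win as +1 and a loss as -1."""
    return (wins - losses) / (wins + losses)


# Posted rating minus (net win fraction + schedule strength)
residuals = {
    name: rating - (net_win_fraction(w, l) + sched)
    for name, w, l, rating, sched in rows
}

for name, diff in residuals.items():
    print(f"{name:12s} residual {diff:+.6f}")
```

If the relation holds, every residual should be no larger than the rounding in the posted five-digit figures.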
 
Here are the top 10 players in schedule strength among the 15+ game group:
1. BlackKnight .41319
2. Belbo .33430
3. Adanac .31630
4. Brendan .31550
5. robinson .30639
6. The_Jeh .23795
7. Paul .21480
8. 99of9 .18005
9. naveed .17938
10. Ryan_Cable .17121
 
Extra notes: 376 distinct users have participated in a rated HvH game, and there have been 3068 such games as of noon October 28, 2007.
« Last Edit: Nov 3rd, 2007, 2:25am by The_Jeh »
Fritzlein
Re: Arimaa Top 25 COMPUTER Power Ranking Results
« Reply #1 on: Nov 3rd, 2007, 12:23am »

Very interesting.  I think this ranking is reasonable if all games in history count equally.  However, if you weight more recent games more heavily, and allow for the strength of players to have fluctuated over time, chessandgo would rise and robinson would fall, I expect.
 
Also, is there some elegant way to weight ratings toward the middle, so that 1-0 players don't rate so high?
 
I'm curious to see how this compares to the p8 ratings, since those ratings do have a ballast to hold inexperienced players near 1500, as well as a decay factor to weight older results less heavily.
 
Still, it's good to know that, taking all games at once, I'm the #1 player of all time.  (Well, #25, at least... :P)
Janzert
Re: Arimaa Top 25 COMPUTER Power Ranking Results
« Reply #2 on: Nov 3rd, 2007, 12:39am »

Would it be possible to calculate a confidence interval then rank by the conservative rating, i.e. the rating minus the confidence interval?
 
Interesting result as it stands though.
 
Janzert
omar
Re: Arimaa Top 25 COMPUTER Power Ranking Results
« Reply #3 on: Nov 4th, 2007, 1:41am »

Nice job on producing this list, Jeh. The rankings here seem to match very closely with our intuitive feel for players' strengths. I could probably use this for ordering the players in the Swiss preliminary, if we can't produce a better list before January.
 
If you want to take a crack at generating P8 ratings you can find the code for it here:
http://arimaa.com/arimaa/rating/testRatings.tgz
The_Jeh
Re: Arimaa Top 25 COMPUTER Power Ranking Results
« Reply #4 on: Nov 4th, 2007, 11:57pm »

I thought it might be interesting to do the calculations again using only games played after the conclusion of the last WC, which would mark the start of a new season. Thanks again to Fritzlein. Here are the results of that calculation:
 
1. clauchau 1-0
2. ntroncos 1-0
3. chessandgo 66-7
4. Rabbit 2-0
5. Yzaxtol 2-0
6. ZeroOne 1-0
7. Virgeist 1-0
8. PatoGuy 1-0
9. Fritzlein 66-6
10. challenger 2-0
11. Raymond 2-0
12. RonWeasley 23-6
13. 99of9 11-4
14. Brendan 12-7
15. knarl 1-0
16. PMertens 15-8
17. smonroy 1-0
18. jdb 12-6
19. UltraWeak 2-1
20. blue22 17-10
21. OLTI 5-5
22. robinson 2-2
23. omar 6-3
24. arimaa_master 90-28
25. petitprince 11-6
 
The problem inevitably encountered with these calculations is isolated pools of players who play each other but do not play anyone outside of their circle. For those players who are connected by games, the results are reasonable enough relative to each other. Just to make things look better, I'll cut out players who've played fewer than five games:
 
1. chessandgo 66-7   1.108   .300
2. Fritzlein 66-6   .903   .069
3. RonWeasley 23-6   .732   .146
4. 99of9 11-4   .689   .222
5. Brendan 12-7   .667   .404
6. PMertens 15-8   .582   .278
7. jdb 12-6   .476   .143
8. blue22 17-10   .437   .178
9. OLTI 5-5   .403   .403
10. omar 6-3   .375   .041
11. arimaa_master 90-28   .367   -.156
12. petitprince 11-6   .357   .063
13. mdk 15-13   .324   .252
14. Adanac 6-10   .215   .465
15. nbarriga 5-3   .177   -.073
16. Soter 25-7   .100   -.462
17. mistre 23-16   .090   -.089
18. camelback 11-7   .044   -.178
19. woh 15-17   .043   .106
20. Tanker_JD 15-11   .033   -.121
21. seanmcl 4-4   0   0
21. Asubfive 4-4   0   0
23. JacquesB 5-4   -.008   -.119
24. kerdamdam 5-5   -.061   -.061
25. megamau 2-1   -.113   -.446
26. IdahoEv 16-17   -.127   -.097
27. seanick 9-11   -.149   -.049
28. The_Jeh 13-32   -.154   .268
29. Chegorimaa 7-17   -.170   .247
30. Erezap 3-8   -.433   .021
31. NIC1138 25-85   -.450   .096
32. K_Hayes 3-5   -.493   -.243
33. ChrisB 5-6   -.517   -.426
34. aaaa 3-19   -.547   .180
35. Slowstorm 3-11   -.595   -.023
36. naveed 1-14   -.622   .244
37. Ganesha 0-5   -.623   .377
38. dougk 0-6   -.631   .369
39. nogard 3-6   -.656   -.323
40. BBcardsRI 0-5   -.715   .285
41. gunananda 1-4   -.738   -.138
42. Kruschak 0-17   -.759   .241
43. proselyte 7-19   -.760   -.299
44. froody 6-13   -.815   -.447
45. pcpdams 1-8   -.818   -.041
46. Krasnotron 4-7   -.842   -.569
47. willwould 2-4   -.984   -.650
48. casparix 0-9   -1.076   -.076
 
What are your opinions of the second option compared to the first? If only people would play a variety of opponents, this would work much better.
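
The isolated pools mentioned above can be found before any ratings are computed by treating players as nodes and games as links, then collecting the connected groups. This is only a sketch: the `(winner, loser)` game format, the function name `find_pools`, and the sample games are all assumptions made for illustration.

```python
from collections import defaultdict


def find_pools(games):
    """Group players into pools linked by at least one chain of games.

    `games` is a list of (winner, loser) pairs (an assumed format).
    """
    neighbors = defaultdict(set)
    for winner, loser in games:
        neighbors[winner].add(loser)
        neighbors[loser].add(winner)

    seen = set()
    pools = []
    for start in neighbors:
        if start in seen:
            continue
        # Depth-first walk collecting everyone reachable from `start`
        pool, stack = set(), [start]
        while stack:
            player = stack.pop()
            if player in pool:
                continue
            pool.add(player)
            stack.extend(neighbors[player] - pool)
        seen |= pool
        pools.append(pool)
    # Largest pool first
    return sorted(pools, key=len, reverse=True)


# Made-up example: A, B and C are linked by games; X and Y only played each other.
example_games = [("A", "B"), ("B", "C"), ("A", "C"), ("X", "Y")]
pools = find_pools(example_games)
print(pools)
```

Ratings are only comparable within a pool, so one option is to rank the largest pool and list the others separately.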
« Last Edit: Nov 5th, 2007, 8:16am by The_Jeh »
Fritzlein
Re: Arimaa Top 25 COMPUTER Power Ranking Results
« Reply #5 on: Nov 5th, 2007, 8:01am »

For those of us who have been playing for longer than a year, I think the second list better reflects our results in the most recent year.  All my "learning losses" to 99of9 and chessandgo's learning losses to me are not included, which probably gives a better indication of current playing strength.
 
Still, there are players like mdk and mistre who have improved a great deal within the last year.  I'm not sure what one can do about that, because at some point using only the most recent games makes the sample of games too small to be useful.
 
Fortunately, pre-tournament ratings only have a limited impact, because the preliminary sorts things out better to seed the final, and in the final everyone gets two lives again.  The tournament will be long enough this year that it will be unequivocally settled over the board within the tournament, rather than being too influenced by ratings generated during the rest of the year.
Fritzlein
Re: Arimaa Top 25 COMPUTER Power Ranking Results
« Reply #6 on: Nov 8th, 2007, 10:19pm »

on Nov 4th, 2007, 1:41am, omar wrote:
I could probably use this for ordering the players in the Swiss preliminary, if we can't produce a better list before January.

I like that you are opening up the process, Omar, and that you will possibly use a ranking list from the community.  However, my current preference would be for using the p8 HvH ratings rather than the list produced by The_Jeh's program, because for seeding the Swiss preliminary, we need to be able to seed everyone.  The_Jeh's list is quite reasonable when we cut out everyone who played too few games, but for seeding the tournament we don't have the luxury of omitting players.
 
John, do you think you could tweak your algorithm so that players with few games also have a reasonable rating?  One idea would be to add in an anchor player, and fake results that everyone has one win and one loss against the anchor player.  That will bias everyone towards the mean and (perhaps) produce reasonable seeds for inexperienced players.
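
To get a feel for how much the anchor would pull records toward the middle, here is a small sketch that adds one fake win and one fake loss to a record and recomputes the net win fraction, (wins - losses) / games. It only covers that one part of a rating and ignores how the schedule strength would shift. The helper name is made up, and the two sample records are taken from the post-WC list earlier in the thread.

```python
def net_win_fraction(wins, losses, anchor_games=0):
    """Net win fraction (W - L) / (W + L) after adding `anchor_games` fake
    wins and the same number of fake losses against an anchor player.

    Hypothetical helper for illustration; not the program used in this thread.
    """
    wins += anchor_games
    losses += anchor_games
    return (wins - losses) / (wins + losses)


# Records taken from the post-WC list earlier in the thread
records = {"clauchau": (1, 0), "chessandgo": (66, 7)}

for name, (w, l) in records.items():
    before = net_win_fraction(w, l)
    after = net_win_fraction(w, l, anchor_games=1)
    print(f"{name}: {before:.3f} -> {after:.3f}")
```

The 1-0 record drops from a perfect score to one third, while the 66-7 record barely moves, which is the intended bias toward the mean for inexperienced players.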
The_Jeh
Re: Arimaa Top 25 COMPUTER Power Ranking Results
« Reply #7 on: Nov 8th, 2007, 11:32pm »

Your idea of adding an anchor player might work. It would bring everyone toward the mean, but it would affect players with fewer games more than those with many games. It might punish good players with few games more than we'd like. I'll have to see the results to know for sure.
 
One thing that I know it would help is connecting players into one pool. For example, in a pool of two players who've played one game, there are an infinite number of solutions. If A defeats B, as long as -A=B, any ratings would solve the system. So in my previous posts, players who are 1-0 and have a rating of 1 would have had a rating of 0 had I done an odd number of iterations. In the case of everyone else, the ratings do converge to a single solution that minimizes the squared error. I think with an anchor, everyone will be connected to the big pool that has one solution, so everyone's rating will converge.
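
The two-player case can be reproduced with a small sketch. It assumes the iteration repeatedly sets each rating to the player's net win fraction plus the average rating of the opponents faced, starting everyone at 0 and updating all players at once. That update rule is a guess consistent with the numbers in this thread, not a confirmed description of the actual program, and the names are made up. Which value lands on odd versus even counts depends on how the iterations are counted.

```python
def iterate_ratings(games, n_iter):
    """Repeat: rating = net win fraction + average opponent rating.

    `games` is a list of (winner, loser) pairs. Every rating starts at 0 and
    all ratings are updated simultaneously from the previous round's values.
    The update rule is an assumption made for this sketch.
    """
    players = sorted({p for game in games for p in game})
    ratings = {p: 0.0 for p in players}
    for _ in range(n_iter):
        total = {p: 0.0 for p in players}  # sum of (result + opponent rating)
        count = {p: 0 for p in players}    # games played
        for winner, loser in games:
            total[winner] += 1.0 + ratings[loser]
            total[loser] += -1.0 + ratings[winner]
            count[winner] += 1
            count[loser] += 1
        ratings = {p: total[p] / count[p] for p in players}
    return ratings


# Isolated pool: A beat B once, and neither played anyone else.
isolated = [("A", "B")]
one_pass = iterate_ratings(isolated, 1)    # A = 1, B = -1
two_passes = iterate_ratings(isolated, 2)  # A = 0, B = 0, then it alternates

# Same pool plus a fake anchor that splits two games with each player.
anchored = isolated + [("A", "Anchor"), ("Anchor", "A"),
                       ("B", "Anchor"), ("Anchor", "B")]
settled = iterate_ratings(anchored, 200)
print(one_pass, two_passes, settled)
```

In the isolated pool the ratings never settle, while the anchored version converges to a single answer because every player is now linked to the same pool.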
« Last Edit: Nov 9th, 2007, 12:40am by The_Jeh »
The_Jeh
Re: Arimaa Top 25 COMPUTER Power Ranking Results
« Reply #8 on: Nov 9th, 2007, 10:57am »

With the anchor player added, the rankings are as follows. (And I've consolidated Arimanator's accounts.) The W-L are given without the anchor games. This still uses the data from last time:
 
1. chessandgo 66-7
2. Fritzlein 66-6
3. RonWeasley 23-6
4. Brendan 12-7
5. 99of9 11-4
6. PMertens 15-8
7. clauchau 1-0
8. Arimanator 8-2
9. jdb 12-6
10. arimaa_master 90-28
11. blue22 17-10
12. petitprince 11-6
13. mdk 15-13
14. omar 6-3
15. OLTI 5-5
16. Soter 25-7
17. Rabbit 2-0
18. UltraWeak 2-1
19. mistre 23-16
20. ntroncos 1-0
21. Tau 3-0
22. robinson 2-2
23. Adanac 6-10
24. woh 15-17
25. camelback 11-7
 
I'm not sure I like this yet, either.
« Last Edit: Nov 9th, 2007, 11:09am by The_Jeh »
The_Jeh
Re: Arimaa Top 25 COMPUTER Power Ranking Results
« Reply #9 on: Nov 9th, 2007, 12:00pm »

I guess the problem is always that assumptions have to be made. You assume players who are 1-0 or 2-0 on the list are weaker than what this rating says because you have access to knowledge the computer doesn't. You know they might have gotten lucky or might have lost other games not considered here. I, however, cannot maintain absolute objectivity by adding presumptions into the formula. And adding these presumptions always helps some things while hurting others.
 
I've tried several different schemes of adding fictitious games, such as the Anchor player, and also Genius/Idiot players who always win or always lose, but the results are always better in some respects and worse in others. The only way for me to achieve greater accuracy is to add more true games.
 
So that's what I'm going to do. Fritzlein, if you would be so kind, please e-mail me the spreadsheet of all rated games (HH, HB, and BB) played within the last 12 months, and a second list with only the last 6 months. I really won't know if it's feasible to calculate all that until I try. Actually, I'm thinking it's possible. If it can be done, the results should be perfectly acceptable. If there still are players you think should be lower or higher, you will have no evidence to point to that the computer won't have considered.
 
I am not necessarily saying that this should replace p8, though.
« Last Edit: Nov 9th, 2007, 12:55pm by The_Jeh »
Janzert
Re: Arimaa Top 25 COMPUTER Power Ranking Results
« Reply #10 on: Nov 9th, 2007, 2:17pm »

Let's say you have player A beating player C in 100 games and losing 50 games. At the same time player B beats player C in 2 games and loses 1 game.
 
While on the one hand you can say that, from the data available, players A and B both appear to be twice as good as C, you should also be able to say that you are much more confident that player A is twice as good as C than that player B is.
 
Janzert
« Last Edit: Nov 9th, 2007, 2:18pm by Janzert »
The_Jeh
Re: Arimaa Top 25 COMPUTER Power Ranking Results
« Reply #11 on: Nov 9th, 2007, 3:18pm »

Yes, but I cannot translate lack of confidence into a lower rating.
« Last Edit: Nov 9th, 2007, 3:20pm by The_Jeh »
Janzert
Re: Arimaa Top 25 COMPUTER Power Ranking Results
« Reply #12 on: Nov 9th, 2007, 4:36pm »

The way I've seen it done is to subtract the confidence interval from the apparent rating. Basically, the resulting rating says we believe this player's true rating to be at least this good, at whatever confidence level the interval uses.
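
As a rough illustration of the idea, the sketch below computes a conservative win fraction for the A-versus-C and B-versus-C example from earlier in the thread. It uses the lower end of a Wilson score interval, which is one common choice for win/loss data. That choice, the 95% level, and the function name are assumptions of this sketch, and it works on raw win fractions rather than on the ratings used in this thread.

```python
import math


def wilson_lower_bound(wins, losses, z=1.96):
    """Lower end of the Wilson score interval for the true win fraction.

    z = 1.96 corresponds to roughly 95% two-sided confidence. Using the
    Wilson interval here is an assumption of this sketch.
    """
    n = wins + losses
    if n == 0:
        return 0.0
    p = wins / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / (1 + z * z / n)


# Both players win two thirds of their games against C.
player_a = wilson_lower_bound(100, 50)
player_b = wilson_lower_bound(2, 1)
print(f"A: {player_a:.3f}  B: {player_b:.3f}")
```

Both players have the same observed win fraction, but A's conservative figure stays much closer to it because it rests on 150 games instead of 3.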
 
Janzert
« Last Edit: Nov 9th, 2007, 4:39pm by Janzert »
The_Jeh
Re: Arimaa Top 25 COMPUTER Power Ranking Results
« Reply #13 on: Nov 9th, 2007, 5:03pm »

I see. You want each player to be given a performance rating on each game played, a standard deviation calculated from these games, and then a t-model used to determine the confidence interval of their true rating?
 
I admit, it's getting a bit complicated for me. Right now, I am anxious to see the results from all the rated games of the past year, including bots. Anyone considering entering the WC, though he finds it hard to find humans to play, likely plays the bots several times. I know a reason why HB games aren't used for p8's for the WC - because playing a thousand games against weak bots will inflate one's rating. I know p8 attempts to correct this, but it does so imperfectly. That is a nonissue with this system. But we'll get sufficient quantity with bot games also considered, and we can benefit from being able to include the type of game most often played on the server. Sorry if I keep asking for more, Fritzlein, but I think I'm nearing the max of what I could ask.
« Last Edit: Nov 9th, 2007, 5:28pm by The_Jeh »
mistre
Re: Arimaa Top 25 COMPUTER Power Ranking Results
« Reply #14 on: Nov 9th, 2007, 5:51pm »

I am continuing to watch this topic with interest.  Thanks for all of your research, John!
 