Arimaa Forum (http://arimaa.com/arimaa/forum/cgi/YaBB.cgi)
Arimaa >> General Discussion >> HH ratings using the current rating system
(Message started by: omar on Sep 28th, 2008, 9:06pm)

Title: HH ratings using the current rating system
Post by omar on Sep 28th, 2008, 9:06pm
In the chatroom today there was some discussion about using HH ratings for the initial seeding of the 2009 WC tournament.

I modified one of my rating system programs to see what the ratings would be now if our current rating system was applied to only HH rated games from the very beginning. Here are the results:
 http://arimaa.com/arimaa/gameroom/xHHratings.txt

The input to the rating system was an array of rated game records sorted by the game end time. The only information used from the game records was the usernames of the two players and the result of the game. All players started out with a rating of 1300 and an RU of 120. The output of the rating system was an array of game records with the additional information of the rating and RU of the two players just before the game ended. Thus these ratings and the game result can be used to compute the ratings of the players after the game.
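
The processing described above can be sketched roughly as follows. This is only an illustration, not the linked program: it swaps in a plain Elo update with a fixed K factor (an assumption) in place of the RU-based update, and the `Game` fields and function names are hypothetical.

```python
from dataclasses import dataclass

START_RATING = 1300.0  # starting rating stated in the post
K = 32.0               # ASSUMED fixed K factor; the real system uses RU instead


@dataclass
class Game:
    end_time: int  # hypothetical field: when the game ended
    winner: str    # username of the winner
    loser: str     # username of the loser


def expected_score(r_a, r_b):
    """Standard Elo win expectancy of a player rated r_a against r_b."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


def rate_games(games):
    """Rate games in end-time order.

    Returns (annotated, ratings), where each annotated entry is
    (game, winner_rating_before, loser_rating_before).
    """
    ratings = {}
    annotated = []
    for g in sorted(games, key=lambda g: g.end_time):
        r_w = ratings.get(g.winner, START_RATING)
        r_l = ratings.get(g.loser, START_RATING)
        annotated.append((g, r_w, r_l))  # ratings just before the game
        delta = K * (1.0 - expected_score(r_w, r_l))
        ratings[g.winner] = r_w + delta
        ratings[g.loser] = r_l - delta
    return annotated, ratings
```

Because each output record carries the pre-game ratings, the post-game ratings can be recomputed from the record and the result alone, which is the property the post relies on.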

Here is the complete output of the rating system. The file is about 300K.
 http://arimaa.com/arimaa/gameroom/xHHgamesOut.txt

Here is the program for the rating system.
 http://arimaa.com/arimaa/gameroom/xHHratingSys.txt

Do these look accurate? The ratings are all shifted down by almost 500 points, but it's the rating differences that matter.

Title: Re: HH ratings using the current rating system
Post by Janzert on Sep 28th, 2008, 9:36pm
Two quick thoughts. First my impression is that they seem too compressed. It seems every game would bump someone up or down in ranking for many cases. Second while the overall order seems pretty good, some cases still seem out of whack (e.g. the first one that stands out is I'm surprised to see Ultraweak in the 4th rank).

Janzert

Title: Re: HH ratings using the current rating system
Post by Fritzlein on Sep 29th, 2008, 4:27am
My first impression is that it looks better than our current system, and not just because it vaults me up to #1.  :)  I think this would be a much better way to seed the 2009 preliminaries than via game room ratings.  Thanks for coding that up, Omar.

Title: Re: HH ratings using the current rating system
Post by Fritzlein on Sep 29th, 2008, 4:39am

on 09/28/08 at 21:36:49, Janzert wrote:
First my impression is that they seem too compressed.

Probably they are too compressed.  If everyone starts at the middle, it takes time for the range to spread out, and there probably haven't been enough HvH games to make it spread as much as it should.  But I'm evaluating from the standpoint of how to seed the 2009 World Championship preliminary round, and against the various attempts we made to seed the 2008 World Championship.  Compared to the rankings we were generating in advance of this year's tournament, these numbers are fine.


Quote:
Second while the overall order seems pretty good, some cases still seem out of whack (e.g. the first one that stands out is I'm surprised to see Ultraweak in the 4th rank).

Yes, but is it surprising because his results against humans have been poor, or because you're not used to seeing him ranked there?  We ought to at least look at who he has beaten and lost to before saying that this system puts Ultraweak too high.

Title: Re: HH ratings using the current rating system
Post by aaaa on Sep 29th, 2008, 6:54am
My code doesn't weight by time and is therefore very sensitive to recently played games. Here is what it currently outputs:

chessandgo      2600.050409
Fritzlein       2585.987356
RonWeasley      2439.366635
PMertens        2247.598538
UltraWeak       2173.314777
blue22  2149.491492
99of9   2087.50461
omar    2084.626293
The_Jeh 2073.961225
Adanac  2066.613984
Brendan 2034.590901
mdk     2031.38806
robinson        1994.640603
mistre  1986.727672
Belbo   1966.353365
clauchau        1958.606471
arimaa_master   1946.160642
jdb     1901.204228
naveed  1853.643173
Ryan_Cable      1852.65595
camelback       1832.60653
thorin  1830.300987
Arimabuff       1829.887659
megamau 1829.291908
Soter   1814.171184
Tuks    1764.446404
aaaa    1755.953876
Spunk   1741.708113
tize    1736.152959
petitprince     1735.154953
ttt     1734.983149
IceD    1733.076556
woh     1731.727335
OLTI    1729.138475
BlackKnight     1724.626262
Tanker_JD       1701.851147
DorianGaray     1684.271055
Rabbit  1677.039817
JacquesB        1651.817945
challenger      1651.537611
ArifSyed        1635.631114
xquezme 1628.855006
Arimanator      1628.79962
ntroncos        1626.580989
rajah   1625.977279
Grey    1622.135662
Asubfive        1622.009975
Raymond 1619.665292
Asturianuco     1618.93568
Guest5409       1618.93568
xepo    1618.93568
dethwing        1617.439131
siddiqna        1612.196436
Swynndla        1611.548227
KT2006  1603.676529
mouse   1603.648051
Groumpf 1603.442563
BilalQ  1602.101663
Jonathan        1599.685223
nauboone        1598.893112
Kraizy_Dave     1598.613488
boris_toplak    1597.675227
jonaspojken     1597.675227
acroninj        1597.675227
mightybyte      1590.426905
Gesuma  1590.426905
Yzaxtol 1590.426905
art     1590.426905
kurthyl 1590.426905
spela   1588.08104
Ump     1587.83263
GordonBlack     1587.015582
Krasnotron      1586.513217
Gorgapor        1586.446565
omarFast        1585.079585
adsyed01        1584.408103
YunK    1583.76829
Tau     1579.835461
KamiKazeKiwi3   1578.107526
sakano  1577.444485
qsasha  1573.38487
Stonkie 1572.249729
Adlai   1571.990022
gert7   1567.962863
MrObvious       1565.421163
kamikazeking    1564.560707
kerdamdam       1563.817418
aXiom   1563.314825
appalachia      1560.70592
knarl   1558.857397
DanTilkin       1558.489441
Ice     1557.30557
Magrathean      1555.322731
ChrisB  1553.036234
arishiki        1552.878834
travis  1552.731401
xabiron 1552.731401
Yaron   1552.731401
pikachamp       1552.731401
BLooodyANgel    1552.731401
brad    1552.731401
Orc     1552.731401
emeryaj 1552.731401
i_am_you        1552.731401
JCricket        1552.731401
marcgb  1552.731401
glitch  1552.731401
ZeroOne 1552.731401
Virgeist        1552.731401
Sir_Twit        1552.731401
klabe   1552.731401
Archonn 1552.731401
chessdiva27     1552.731401
Rancca  1552.731401
religion        1552.731401
thongrim        1552.731401
fwk     1550.465536
LAbiuso 1550.207266
Lautresault     1548.688062
archigavr       1548.230969
iamanigeeit     1548.106795
rusty   1547.438641
obiwan  1545.770567
nogard  1545.735379
Blyx    1544.549776
Chegorimaa      1544.142472
Garyth  1542.635557
agiau   1541.342743
nao_vou_por_ai  1540.092555
mikbuster       1537.51665
fritzlforpresident      1536.240457
smonroy 1535.021658
LifeBlade       1534.661545
purplebaron     1533.760401
DieInnereMelone 1530.087231
fdailey 1529.327187
snetnis 1529.051428
Gerenuk 1528.625537
willwould       1527.801089
ArimaaCap       1527.6587
Darkenrahl      1526.880896
K_Hayes 1523.674082
IdahoEv 1522.109868
Aquarius        1519.580426
Josh    1516.443184
drleper 1516.443184
misterscoundrel 1516.443184
MonkeyPilot     1516.443184
Helios  1516.443184
Lucky   1514.326946
richferrara     1513.926636
WagnerK 1513.702401
kontur  1512.204356
Yuri    1510.633684
PatoGuy 1508.330333
Yusei   1508.181625
craig   1505.1962
Agt     1504.900897
knta    1504.583038
adannada        1504.36136
jl_     1503.636087
botkiller       1500.185596
svchb   1499.478576
Juliana_frithie 1498.99674
kungfucraig     1498.121363
ytri    1496.402009
Peter   1495.89952
Evrimedont      1495.748463
CeeJay  1495.311556
mancity0987     1495.037402
zolli   1494.904518
MartinFuller    1494.430166
Pyrocyon        1494.324821
LiquidTester    1494.262587
Donal   1493.222569
napoleon9th     1493.196716
OrangUtan       1493.13079
Deadpool        1492.940757
dics    1492.805054
grey_0x2A       1491.832291
abeysn  1491.715327
Guest8405       1491.662951
amindehesh      1490.341958
Valueinvestor   1490.218209
godspal 1489.836162
heuertag        1489.36133
seanick 1488.903052
BlackShadow     1488.588184
A_C_Sandino     1487.432631
lyc123  1487.227472
Zeldarimaa      1487.109431
gunblob 1486.605541
jarrausi        1486.601469
Draxamus        1485.789083
gsyed   1484.807983
cloakski        1483.556816
JBadlands       1483.556816
froggirl        1483.556816
desu    1483.556816
Dean    1483.556816
arvindn 1482.493547
milo    1481.989504
grant74745      1481.84908
Vinvin  1481.5901
horse   1481.292261
Amtiskaw        1481.289848
FreddyFish      1481.275108
p4n1q   1481.17259
databass        1480.830666
DreamingDemon   1480.27958
Jester  1479.87472
Brick_Salad     1479.825587
dannyant        1479.761152
nbarriga        1479.057365
asyed94 1478.848229
Sevenviolets    1478.77416
Fleisch 1478.717873
Ming    1477.764396
Guest207        1477.336556
Talmun  1477.2834
TheGrandSage    1476.886043
gingermen       1475.90847
Legolas11       1475.702687
camperman       1475.410797
caber   1474.823749
LVBen   1474.670729
Scott   1474.660104
Renaissance     1474.27359
haizhi  1473.970972
Nevernever      1473.223296
pallab  1473.176538
Geheimnis       1472.388013
Guest4880       1472.372212
ishvahan        1472.3413
fhorozal        1472.261596
seanmcl 1472.229358
Spottedfeather  1471.931585
Juha    1471.607661
bleitner        1471.593852
bananaman       1471.570848
Alky    1470.935044
richard 1470.41986
Emaad   1470.285194
Annatar 1469.985199
kcyhm1991       1469.860093
Sameer  1469.83906
NS_Serg 1469.830374
proselyte       1468.861785
Henki   1468.758498
Clarphimous     1468.199533
Gdngms  1468.069755
emmanuel69      1467.487723
manwithfire     1467.363244
UruramTururam   1467.259336
TheMadHair      1466.996738
supernova       1466.382493
Miki    1466.239374
rushdy  1466.02455
narri   1466.020638
mawzsr  1466.019199
zarios  1465.665637
Waken   1465.226388
leo     1464.709753
AjedrezDude     1464.686398
gunaji  1463.501678
jephly  1463.480947
PickPocket      1463.280951
mooseye 1463.080651
sensaijin       1463.076462
hhornet 1462.987337
Moi     1462.82107
ghetom  1462.52202
Polyfractal     1462.232217
jshira  1461.964919
Wilsonia        1461.841716
Guest8212       1461.767513
Darkhorse       1461.706115
jazz153 1460.665209
scienceman      1460.544509
SimplePlayer    1460.437198
botbasher       1460.277477
Ytterbium       1459.894926
Oystein 1459.436594
mammutino       1458.657257
seepage87       1458.517559
fernobob        1458.014273
brownsugar726   1457.673437
jpages  1457.164578
Antti   1456.140139
Bezman  1455.808845
Pimm    1455.368831
Tore    1455.298988
chlydra 1455.290498
novacat 1455.056174
Heidissimo      1454.821205
wiggin  1454.721217
carolaina       1454.680896
Guest7222       1454.28813
fotland 1453.972905
potatoj316      1453.867006
antoniotheripper        1453.3386
Jaspa   1453.226031
taral   1453.158454
Sipi    1452.407724
tightwitjc      1451.676303
StormLord711    1451.491224
H_Bobbeltoff    1451.374237
dougk   1451.162156
LamCF_LamCF     1450.964696
bhujjy  1450.514737
ServiceDog      1450.188026
v_dhanasekar    1450.072362
haderlump       1449.87299
steve252        1448.738988
Rileyjal        1448.409426
ThePirate       1448.126745
Sylvain 1448.114807
Guest7904       1447.881652
froody  1447.875351
illz    1447.268599
miri    1447.268599
Kristijan       1447.268599
Guest383        1447.268599
ugaiW   1447.268599
RSA     1447.268599
ih8evilstuff    1447.268599
Keeps   1447.268599
robstar 1447.268599
Kobold  1447.268599
handsomestofall 1447.268599
Yold    1447.268599
Robert  1447.268599
chrismccoll     1447.268599
rocketman469    1447.268599
Elmo    1447.268599
anton_ou        1447.268599
amarantaursula  1447.268599
Rhapsody        1447.268599
sharky6000      1447.268599
Xantix  1447.268599
paulm   1447.268599
roule   1447.268599
Gridiron        1447.268599
goldenspiral    1447.268599
PaulJohn        1447.268599
JField  1447.268599
Winegar 1447.268599
grep    1447.268599
vma     1447.268599
Victerminator   1447.268599
solar_flare     1447.268599
novicehex       1445.806764
shortdog        1444.756074
Emma2   1444.503103
qwert   1444.452601
Emma    1444.413941
clavius 1444.105789
BenW    1443.996302
Ellis   1443.961572
zaidq   1443.904628
sleepywind      1443.300552
Arimaardvark    1443.15877
Condo3356       1442.646991
sam77   1441.829643
kietui  1441.447856
Mason11987      1441.328901
HEx     1440.332821
BullBombP1      1439.498015
kissl   1439.466671
6sense  1439.323
Akhenaten       1439.29408
Gugunja 1439.155096
Einarin 1439.116649
davidz  1438.462988
ImranG  1438.275256
TheCaptain      1436.149431
Matthias        1436.133928
Erezap  1435.482182
Karlo   1434.683319
B44     1434.231537
walleye 1434.221811
stilozu 1434.148678
unic    1433.722101
raimond 1433.56664
trevor  1432.831531
TipAndMe2       1432.595275
theHOG  1431.830532
Ethos   1430.473635
llauro  1430.217953
Hamzah  1429.859651
AmitChaturvedi  1429.848156
Chroniker       1429.743019
jinarimaa       1427.716148
Nate1729        1426.757608
habubi  1426.630789
Zigmar  1426.40481
Dreadpirate     1426.24862
thecreeep       1426.150127
hamzahq 1426.065161
Galad   1424.958276
whiteKnight     1424.698782
Leonidas        1424.318611
deepgrave       1423.460441
itaibn  1423.237163
Grochti 1421.681799
justsojazz      1420.545654
acheron 1419.96795
kraj    1419.766989
Shadowmusic     1419.563284
Amina   1419.499586
lyhy    1419.1389
Aamir   1418.859403
RedOaks 1418.777118
anti    1418.386079
dreamfish99     1418.10654
RainMan 1417.827833
Sophi05 1417.107251
Wendell 1416.712857
lihanzo 1416.632559
Snarboo 1416.116828
atb42   1415.325348
Slowstorm       1415.047722
Norman314       1414.851954
corsix_ 1414.663995
warren  1414.220369
Paul    1413.716742
neKcid  1413.523562
ghunum  1413.117985
Anar    1412.356679
coolits 1411.849836
MasterBlaster   1411.495526
Tachyon 1410.500247
Ganesha 1410.471718
spunky  1409.573095
kenntoft        1409.573095
thecrecarc      1409.573095
erika1987       1409.573095
d4v3    1409.145908
yarnalito       1408.992549
Dromar  1408.947342
Eileen  1406.564259
Fisma   1406.131782
SubZero 1406.123818
benba57 1405.777709
gunananda       1405.566862
BBcardsRI       1403.300345
keita   1402.34245
722caasi        1401.65667
netarrow        1401.008624
coachbudka      1400.264995
Retrovirus      1398.860055
dan724  1398.455246
Greytle 1398.006968
naveed4 1396.390553
Diana   1396.209151
Fazer   1393.225168
Kruschak        1390.312734
CalebBrown      1390.192622
Aaron   1389.276973
CrazyMan04      1387.864767
lizards101      1384.401161
junaid  1382.3585
matjaz  1381.690305
Monedero        1381.06432
Lyra    1381.06432
Lenuvas 1381.06432
prayer56        1379.657301
Imran   1375.407028
subs2000        1375.384196
rick    1369.417879
dtj     1367.979416
Yogui   1365.975999
frostlad        1365.70827
omega91 1362.928481
el_Triste       1361.730139
Sergey  1361.544425
RuthlessD       1359.158953
LUNO    1357.511543
lukenilpolso    1355.644541
lelievre        1354.018586
shas71  1353.614465
Jimmy_Newtron   1353.089525
monste9 1344.112897
Microbe 1340.865659
Beowulf 1340.300956
NumBeast        1338.965138
sean    1337.272865
ChristianDK     1328.065667
carlsquared     1327.150574
znjznj  1323.152931
soldier 1319.94634
themonk 1317.730798
vanloan 1314.856033
Sana    1314.53573
Thom    1311.598677
NIC1138 1311.487414
allBlax 1309.416735
jemicobel       1306.852284
pcpdams 1281.088938
gern    1277.870795
mentalsurge     1267.414785
MrBrain 1255.766678
silump5 1254.338506
casparix        1250.500116
Jnate   1246.336109
Keith   1242.503885
eric    1239.406612
Rabbitball      1209.300944
PhilomathBret   1184.918838
Gregorius       1138.154633
Calumet45       1136.216082
yuji    1102.200517

Title: Re: HH ratings using the current rating system
Post by Janzert on Sep 29th, 2008, 7:49am

on 09/29/08 at 04:39:45, Fritzlein wrote:
Yes, but is it surprising because his results against humans have been poor, or because you're not used to seeing him ranked there?  We ought to at least look at who he has beaten and lost to before saying that this system puts Ultraweak too high.


It is surprising to me because of how few games he has played (25). True, his only losses are 2 to Fritzlein, 2 to Chessandgo, 1 to RonWeasley and 1 to Omar. Had he forgone playing you and Chessandgo, would he be ranked first?

I think if we are going to switch systems it should at least be able to deal gracefully with people that have a low number of games. Something as simple as subtracting one standard deviation from the rating should take care of it.

Janzert

Title: Re: HH ratings using the current rating system
Post by Tuks on Sep 29th, 2008, 10:23am
I'm not sure about this...
I seem awfully high for how I play! ;)

Title: Re: HH ratings using the current rating system
Post by Fritzlein on Sep 29th, 2008, 12:57pm

on 09/29/08 at 06:54:56, aaaa wrote:
Fritzlein       2585.987356

That can't be right.  My rating should be at least 2585.987388!

Title: Re: HH ratings using the current rating system
Post by omar on Sep 30th, 2008, 12:05am

on 09/29/08 at 07:49:16, Janzert wrote:
Something as simple as subtracting one standard deviation from the rating should take care of it.

Maybe, but it might also introduce problems elsewhere. It is really hard to judge what effects an ad hoc change like this would have on the ratings. One really needs to implement it and run it through various tests. Some of the tests I had used were:
* drift - If you start out with calculated ratings equal to true ratings, you find that as games are played the calculated ratings drift around the true ratings. This gives a good idea of how accurate you can expect your ratings to ever be. Generally reducing the K factor used in rating calculations increases the accuracy of the ratings. I probably should have called this accuracy instead of drift.
* convergence - This measures how many games it takes on average for the ratings of new players to reach within the drift level of the true ratings. Reducing the K factor increases the number of games needed for the ratings to converge.
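
The two tests above can be sketched as a small simulation. This is a rough illustration under stated assumptions, not the test code actually used: it uses a plain fixed-K Elo update, random pairings, and RMS error as the accuracy measure, and it simplifies convergence to "everyone starts at the mean" rather than tracking individual new players. All function names are hypothetical.

```python
import random


def expected_score(r_a, r_b):
    """Standard Elo win expectancy of a player rated r_a against r_b."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


def simulate(true_ratings, start_ratings, k, n_games, rng):
    """Play n_games random pairings; return the RMS error after each game."""
    est = list(start_ratings)
    n = len(true_ratings)
    errors = []
    for _ in range(n_games):
        a, b = rng.sample(range(n), 2)
        # The outcome is drawn from the TRUE ratings...
        p_a = expected_score(true_ratings[a], true_ratings[b])
        score_a = 1.0 if rng.random() < p_a else 0.0
        # ...but the update only sees the ESTIMATED ratings.
        delta = k * (score_a - expected_score(est[a], est[b]))
        est[a] += delta
        est[b] -= delta
        sq = sum((e - t) ** 2 for e, t in zip(est, true_ratings))
        errors.append((sq / n) ** 0.5)
    return errors


def drift(true_ratings, k, n_games, seed=0):
    """Mean RMS error (second half of the run), starting at the true ratings."""
    rng = random.Random(seed)
    errors = simulate(true_ratings, true_ratings, k, n_games, rng)
    tail = errors[n_games // 2:]
    return sum(tail) / len(tail)


def convergence(true_ratings, k, n_games, seed=0):
    """Games until RMS error first reaches the drift level (None if never)."""
    level = drift(true_ratings, k, n_games, seed)
    mean = sum(true_ratings) / len(true_ratings)
    start = [mean] * len(true_ratings)
    errors = simulate(true_ratings, start, k, n_games, random.Random(seed + 1))
    for i, err in enumerate(errors):
        if err <= level:
            return i + 1
    return None
```

Running `drift` and `convergence` for a few values of `k` on the same list of true ratings is one way to see the trade-off described above: a smaller K should give a lower drift figure but a larger convergence count.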


Title: Re: HH ratings using the current rating system
Post by ChrisB on Sep 30th, 2008, 7:52am
Thanks Omar and aaaa.  It's interesting to see HH ratings and also the raw data.


on 09/28/08 at 21:06:35, omar wrote:
Do these look accurate?


Just one minor point .... the summary listing seems to show the ratings before each player's most recently played game.

Title: Re: HH ratings using the current rating system
Post by Fritzlein on Oct 1st, 2008, 12:53pm
Following up to Janzert's concern, for seeding a tournament it is very difficult to know what to do with players with a sparse game record.  The reason I recommended falling back on game-room ratings to seed the 2008 World Championships was that they were slightly more in line with my subjective judgment of the strength of players who had very few HvH games.  In principle I like seeding the human tournament based on human games only, but The_Jeh's system and my own seemed to give out-of-kilter results for players like soldier and silump who had hardly played any human games, and those only against each other.

Omar, I agree with you that adding in a hack to reduce the seeding of an inexperienced player can have unintended consequences for the rating system as a whole.  I think, however, that a hack might be entirely appropriate for the purpose of generating tournament seeds.  Your measures of drift and convergence are about long-term performance, not about newcomers.

Maybe we need to split into two different discussions here: one about a new-and-improved rating system for the game room, and another about seeding the 2009 World Championship.

Title: Re: HH ratings using the current rating system
Post by Janzert on Oct 1st, 2008, 5:38pm

on 09/30/08 at 00:05:09, omar wrote:
Maybe, but it might also introduce problems elsewhere. It is really hard to judge what effects an ad hoc change like this would have on the ratings. One really needs to implement it and run it through various tests. Some of the tests I had used were:
* drift - If you start out with calculated ratings equal to true ratings, you find that as games are played the calculated ratings drift around the true ratings. This gives a good idea of how accurate you can expect your ratings to ever be. Generally reducing the K factor used in rating calculations increases the accuracy of the ratings. I probably should have called this accuracy instead of drift.
* convergence - This measures how many games it takes on average for the ratings of new players to reach within the drift level of the true ratings. Reducing the K factor increases the number of games needed for the ratings to converge.


If I'm understanding you correctly, subtracting a standard deviation won't change the drift of a rating system at all. It will cause the convergence to take longer, but this is purposeful. This is because the reason for subtracting a standard deviation (or more generally standard deviation times some constant) is to change what the rating is saying.

A regular Elo rating and most other ratings are trying to give a best guess at a player's true rating. I believe it is much more useful, especially in the case of new players, to know that a player's rating is at least a certain level. So the purpose of subtracting a standard deviation is to show a lower confidence bound for the player's rating.
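
A minimal sketch of the idea, assuming each player has some standard deviation estimate available (how that figure would be obtained is not specified here, and the function names and numbers below are made up purely for illustration):

```python
def lower_bound(rating, std_dev, c=1.0):
    """Rating minus c standard deviations: a conservative strength estimate."""
    return rating - c * std_dev


def seed_order(players, c=1.0):
    """Sort (name, rating, std_dev) tuples by lower bound, best first."""
    return sorted(players, key=lambda p: lower_bound(p[1], p[2], c), reverse=True)


# Made-up example values, not taken from any real rating list.
players = [
    ("veteran", 2100.0, 30.0),    # many games, small uncertainty
    ("newcomer", 2150.0, 120.0),  # few games, large uncertainty
]

print([name for name, _, _ in seed_order(players)])
# prints ['veteran', 'newcomer']
```

With `c=0` the ordering falls back to the plain best-guess rating, so the constant controls how strongly a sparse game record is penalized.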

Janzert


