Arimaa Forum (http://arimaa.com/arimaa/forum/cgi/YaBB.cgi)
Arimaa >> General Discussion >> Reddit.com
(Message started by: Fritzlein on Aug 3rd, 2007, 11:18pm)

Title: Reddit.com
Post by Fritzlein on Aug 3rd, 2007, 11:18pm
According to Alexa, reddit.com is about three times as popular as metafilter.com.  However, it looks like the surge of new players might be of a similar scale to what happened when we were mentioned on Metafilter, rather than three times as large.

Title: Re: Reddit.com
Post by omar on Aug 4th, 2007, 12:45pm
The traffic seems to be due to this page:

http://reddit.com/info/2c45n/comments

Omar

Title: Re: Reddit.com
Post by omar on Aug 4th, 2007, 1:22pm
There is a discussion of the game complexity there and gives a link to a wikipedia article. The article says:

It's hard even to estimate the game-tree complexity, but for some games a reasonable lower bound can be given by raising the game's average branching factor to the power of the number of plies in an average game.

If I use the branching factor of about 17,000 that Janzert calculated here:
http://arimaa.janzert.com/bf_study/

and raise it to the 70 power, I get: 1.3x10^296 which is much higher than the 190 listed in the Wikipedia. I've updated the wikipedia page and added Janzert's page as a reference.









Title: Re: Reddit.com
Post by The_Jeh on Aug 4th, 2007, 3:54pm
Perhaps you can clarify for me. Is the 17000 branching factor the average number of choices for one player, or is it the average number of choices for a twofold move?

If an average game of Arimaa lasts 45 moves, then:
Half-move: 17000^90 = 10^380.74
Move: 17000^45 = 10^190.37 (and the Wiki is already right)

Of course, if you think an average game is 70 moves, then yes, it's a severe underestimation.

If two players play very passively, the gametree complexity could get almost infinitely larger as the game could go on almost endlessly, there being no 50-move rule as there is in chess. Do you only count meaningful games? Or do I misunderstand something?

Title: Re: Reddit.com
Post by aaaa on Aug 4th, 2007, 5:08pm
It's per half-move. 17000 for a play and a response would leave about 130 possibilities per player. That's even less than the number of ways to choose up to 4 out of 8 free pieces to move.

Title: Re: Reddit.com
Post by omar on Aug 4th, 2007, 7:14pm
Yes, 17,000 is the average number of unique new positions that can arise from a typical Arimaa position. This was computed from thousands of real games. It must have taken Janzert quite some time to run through all those, and I'm very thankful to him for having done this study; it answered a lot of questions.

I ran some queries on the games database to see what the total number of plies in rated games is really like. I think Fritzlein had ran some stats on this before, but I can't seem to find it at the moment.

For all rated games: average number of plies is 64.5 (44,856 games). Average number of moves would be half of this: 32.25.

When the rating of both players is greater than 1500 the average number of plies is 81.6 with 18747 games in the set. This is for rated games only. As the rating of both players increases the game length seems to increase but then levels off.

1500 81.6 (18747 games)
1600 85.6 (12865 games)
1700 88.7 (7784 games)
1800 91.1 (3688 games)
1900 96.0 (1313 games)
2000 91.1 (457 games)
2100 93.7 (128 games)
2200 89.8 (24 games)

From this it seems I was too conservative in using 70 as the ply count. A more realistic value for well played real games seems more like 90.

So 17000^90 gives: 5.5x10^380. Whoa, that's more than Go (10^360). OK I'll be more conservative and use 86; that gives: 6.5x10^363.

Humm, am I doing this right. Can others double check.

Title: Re: Reddit.com
Post by omar on Aug 4th, 2007, 7:54pm

on 08/04/07 at 15:54:01, The_Jeh wrote:
If an average game of Arimaa lasts 45 moves, then:
Half-move: 17000^90 = 10^380.74
Move: 17000^45 = 10^190.37 (and the Wiki is already right)


Well the description of how to compute the game-tree complexity says to raise it to the number of plies. The definition of a ply is one turn taken by one player:
http://en.wikipedia.org/wiki/Ply_%28game_theory%29

So I think raising it to 90 is right.

Title: Re: Reddit.com
Post by aaaa on Aug 4th, 2007, 8:01pm
The big problem here is that it looks like the number 17000 is an arithmetic mean and not a geometric one. Since the branch factor is clearly not constant judging from the same study, one can't just raise this number to the number of plies.
Additionally, I consider it a liability that such a large part of the games analyzed involves bots.

Title: Re: Reddit.com
Post by omar on Aug 4th, 2007, 8:20pm
Im noticing some errors in the Wikipedia Game complexity page. For Go it says the game length is 200 plies (actually the heading of the column says 'moves' and should be changed to say 'plies') but the game-tree complexity says: 10^360. If I assume 200 as the branching factor then the game-tree complexity should be: 200^200 = 10^460. So I looked up what Victor Allis said in his Thesis (I can't find it online anymore, but I had a saved copy). He used a branching factor of 250 and a ply count of 150, thus 250^150 = 4.9x10^359. So the game-tree complexity was right, but the ply count was wrong. I wonder how many more errors there are on that wiki page.


Title: Re: Reddit.com
Post by omar on Aug 4th, 2007, 8:46pm

on 08/04/07 at 20:01:28, aaaa wrote:
The big problem here is that it looks like the number 17000 is an arithmetic mean and not a geometric one. Since the branch factor is clearly not constant judging from the same study, one can't just raise this number to the number of plies.

I agree with you that the branching factor is not constant throughout the game, but the description for computing game-tree complexity says to use "the game's average branching factor". In Go also the branching factor starts at 361 and decreases as the game progresses, but Victor uses an average of 250.

Perhaps a more accurate way to estimate the game-tree complexity based on real game data would be to use the data from Janzert's frist graph and take the product of:

(average branching factor at ply count N) * (fraction of games that reached ply count N)

and then take the product of all these numbers for all ply counts where the fraction of games that reached that ply count is greater than some small number like 0.01.


Quote:
Additionally, I consider it a liability that such a large part of the games analyzed involves bots.

Janzert also looked at human vs human games and got an average branching factor of 16,844.

Title: Re: Reddit.com
Post by Janzert on Aug 4th, 2007, 8:47pm
Just to confirm, the 17000'ish number is an arithmetic mean.

I thought somewhere I had figured a geometric mean but I'm not finding it now.

Janzert

Title: Re: Reddit.com
Post by Fritzlein on Aug 4th, 2007, 10:14pm
Janzert, if you stored all your calculated data about the number of possible moves in each position, you should be able to make a more direct calculation of game tree complexity.  For each game calculate the the product of the branching factors for each of its moves; that's the estimated tree size for that game.  Then take the geometric mean of tree sizes across all the games.

Title: Re: Reddit.com
Post by The_Jeh on Aug 4th, 2007, 11:45pm
Why isn't Go's gametree complexity more like:

(361!)/((361-n)!) , where n is the average number of plies?
Using 300 as n, this would be 10^684.45
Using 200 as n, this would be 10^481.28
Using 150 as n, this would be 10^367.81

My reasoning is that the branching factor decreases by about 1 with each ply. A move that captures stones generally does not increase options in an intelligent way, so I assume it is negligible.

I realized a little after I posted that the square root of 17000 is only 130.38.  Please forgive my poor math skills. I just graduated from high school, and the most advanced math I have is elementary statistics and calculus I. I plan on majoring in mathematics at college, though, so please take every opportunity to teach me.

Title: Re: Reddit.com
Post by Fritzlein on Aug 5th, 2007, 10:20am

on 08/04/07 at 23:45:16, The_Jeh wrote:
Why isn't Go's gametree complexity more like:

(361!)/((361-n)!) , where n is the average number of plies?

That seems like an excellent estimate.  It is somewhat too high, since suicides and ko moves are illegal, but somewhat too low because captures re-open spaces.  Your estimate makes more sense to me than (average branching factor)^(average game length).

Title: Re: Reddit.com
Post by Janzert on Aug 5th, 2007, 2:28pm
Here are a few geometric means I got calculated, at least I hope I did it right. ;)
Type of gamesNumber of gamesNumber of movesArithmetic meanGeometric mean
All5524932094401641111411
HvH34012143211969013451
BvB56684036961512210560

One other thing to remember about all of the means I've given is that they do not include the initial setup moves. If they were included it would increase all the numbers a bit.

Janzert

Title: Re: Reddit.com
Post by Janzert on Aug 6th, 2007, 2:31pm

on 08/04/07 at 22:14:07, Fritzlein wrote:
Janzert, if you stored all your calculated data about the number of possible moves in each position, you should be able to make a more direct calculation of game tree complexity.  For each game calculate the the product of the branching factors for each of its moves; that's the estimated tree size for that game.  Then take the geometric mean of tree sizes across all the games.


The problem I've had doing this is that the "product of the branching factors" is often too large for standard floating point numbers (about 55% are larger than 10^300 and the largest is slightly larger than 10^11001). I managed to do it with an arbitrary precision library and slightly modified natural log and exp functions I found with google. I believe the results are accurate but it is very slow to calculate the geometric mean (around 13 hours for 1400 games). So if anyone has a good way to work with these large numbers I'd be glad to give them to you. :)

I did complete one calculation using all hvh games with a 1700+ rating for both players. But I realised about two thirds of the way through the run that it has two flaws that will cause the result to be lower than it should be. First I once again did not include the two setup moves and secondly games that did not come to a "natural" ending shouldn't have been used (e.g. resignations, time outs, etc.). Having said that here are the results:
Type of gamesNumber of gamesArithmetic meanGeometric meanGame Tree Complexity
HvH 1700+144317990133342.57x10^324

Janzert

1 This is game #53484; woh and mistre's 135 move game.

Title: Re: Reddit.com
Post by Fritzlein on Aug 6th, 2007, 3:27pm
Wow, thank you for doing that computation!  So the game tree complexity is definitely in the neighborhood of Go, i.e. 10^324 for Arimaa compared to 10^360 for Go.


on 08/06/07 at 14:31:01, Janzert wrote:
So if anyone has a good way to work with these large numbers[...]

Why do you need an arbitrary-precision library?  Can't you make this calculation fit into standard size numbers using logarithms?  Instead of multiplying all the branching numbers and getting 10^1100, take the logs of all the branching numbers and add them to get 1100.  After doing that for each game, you can take the arithmetic mean across games to get something like 324.3.  Then if you want, you can exponentiate to get something like 2.2 x 10^324.

I don't know, perhaps taking logarithms is as slow as multiplying, but at least you won't get overflow errors.

Title: Re: Reddit.com
Post by Janzert on Aug 6th, 2007, 9:26pm
Umm, well mostly I was doing it because I was way to focused on the log(ab) = log(a) + log(b) equivalancy for the overall mean and the speed of doing logs on 10^300+ numbers to think about applying the same principle for the first half of the problem.  ::)

Janzert

Title: Re: Reddit.com
Post by Janzert on Aug 7th, 2007, 12:01am
Thanks to the thunk in the head from Fritzlein here are a few different sets of numbers. All the sets below only include rated games that end with g, m, p or n. Also all the means and GTC numbers include the initial two setup turns.
Game typeNumber of gamesNumber of movesArithmetic mean movesGeometric meanGame tree complexity
All 339572388069 1860483 14344 2.10x10^292
HvH   2059   159019 1698526 166098.78x10^325
HvH 1700+   1038     91069 1495574 15910 4.31x10^368
HvB 275671903546 1894519 14426 1.57x10^287
BvB   4331   325504 1740562 12913 9.41x10^308

Janzert

Title: Re: Reddit.com
Post by Fritzlein on Aug 7th, 2007, 6:59am
Excellent.  Glad to provide a thunk.  Where did I read, "blunt minds swung with great force"?  Can we quote you on Wikipedia?  I guess the most representative number in my opinion is the HvH unlimited by rating.  Good work, Dude.

Title: Re: Reddit.com
Post by omar on Aug 7th, 2007, 11:42am
Thanks a lot for doing these calculations, Janzert; and thanks for your guidance Fritzlein. This helps to confirm from real game data that the game-tree complexity of Arimaa is close that of Go. Wow!! Another way to look at it is: Arimaa gives a googol times a googol more games to explore than Chess while still using the same board and material :-)

Janzert, if you can just document this method and the results on your web site, we could use that as a more direct reference for the GTC in the Wikipedia.

I think the question of what kind of games should be included in calculating the GTC from the available real games is something that needs further consideration. It is clear from Janzert's results that different types of games can give considerably different final results. Also if the GTC of other games such as Chess or Go were calculated using real games, it would be good if they used the same method and types of games so that the resulting numbers would be a more apples to apples comparison.

Im wondering what types of games others games have used. For example when a source says "the average number of plies in Chess is 80", what type of games were used in determining this number. Were "all rated games" used; were only "human vs human games" used; were only games between "best" players used (regardless of human or computer). As we've seen the final results can be significantly different based on the choice of games used.

If anyone can give a reference that mentions the types of games used in calculating the average number of plies or the branching factor, it would be useful to look at.

I think the set of games used to calculate the GTC should use rated games played by the best players of the game and the set of games should include at least 10,000 games. The set should not include games between a learning player and a non-learning player. All human players are considered learning players. All the current Arimaa bots are considered non-learning players. Thus, in our case the set of games would not include human vs bot games, but could include H-H or B-B games. This would eliminate all rated games where the human players have beat the bot repeatedly using the same technique.

What do others think about the type of games used to calculate the GTC.

Title: Re: Reddit.com
Post by Fritzlein on Aug 7th, 2007, 1:11pm
For an apples-to-apples comparison to chess, we probably have to include games ending in resignation or timeout, because these are recognized as valid endings in the chess world.

Title: Re: Reddit.com
Post by Janzert on Aug 8th, 2007, 11:18pm
I'd be happy to right something up, but it'll probably have to wait till the end of the month before I have time to do it. I should probably read up on the subject as well first. :D It appears that Victor Allis' Thesis "Searching for Solutions in Games and Artificial Intelligence" is the more or less definitive work on the subject any other pointers I should check out?

Janzert



Arimaa Forum » Powered by YaBB 1 Gold - SP 1.3.1!
YaBB © 2000-2003. All Rights Reserved.