Author |
Topic: 2013 Arimaa Challenge (Read 8056 times) |
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
2013 Arimaa Challenge
« on: Mar 11th, 2013, 5:31pm » |
|
The first screening games of 2013 have already been played, and the bots are off to a slow start. All four developers said there were not many improvements from last year, so most of the increase in performance should come from running on eight cores instead of four, but is it possible that they will have a lower performance than in 2012?

Year  Pairs  Decisive  Winner / Score / Perf   Loser / Score / Perf
----  -----  --------  ---------------------   --------------------
2007    12       2     bomb / 2 / 2087         Zombie / 0 / 1876
2008    16       7     bomb / 6 / 1918         sharp / 1 / 1576
2009    23       7     clueless / 5 / 1910     GnoBot / 2 / 1792
2010    25      11     marwin / 6 / 2065       clueless / 5 / 1960
2011    40      11     marwin / 6 / 2110       sharp / 5 / 2109
2012    33       7     briareus / 5 / 2232     marwin / 2 / 2128
2013    25       6     marwin / 4 / 2121       ziltoid / 2 / 2055
|
« Last Edit: Apr 1st, 2013, 4:20pm by Fritzlein » |
|
|
|
rbarreira
Forum Guru
Arimaa player #1621
Gender:
Posts: 605
|
|
Re: 2013 Arimaa Challenge
« Reply #1 on: Mar 11th, 2013, 6:20pm » |
|
on Mar 11th, 2013, 5:31pm, Fritzlein wrote:most of the increase in performance should come from running on eight cores instead of four |
| I think ziltoid2013 only sees a small increase in nodes per second vs briareus2012, and this is with a doubling of the number of threads (which hurts the search). So the hardware may be about as powerful as last year's, at least for my bot.
on Mar 11th, 2013, 5:31pm, Fritzlein wrote:but is it possible that they will have a lower performance than in 2012? |
| Maybe yes, if marwin and ziltoid are pretty similar to their 2012 versions and more people got used to their weaknesses.
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: 2013 Arimaa Challenge
« Reply #2 on: Mar 13th, 2013, 1:41pm » |
|
Congrats to Max on scoring a clean sweep! Marwin draws first blood with two victories this morning, and now has a calculable performance rating, but ziltoid is still scoreless. Together they are 2-8, counting my half-game as a victory for humanity.
|
« Last Edit: Mar 13th, 2013, 2:13pm by Fritzlein » |
|
|
|
tize
Forum Guru
Arimaa player #3121
Gender:
Posts: 118
|
|
Re: 2013 Arimaa Challenge
« Reply #3 on: Mar 13th, 2013, 2:03pm » |
|
on Mar 11th, 2013, 5:31pm, Fritzlein wrote:All four developers said there were not many improvements from last year |
| I don't think I said that (or maybe I did, it's hard to remember everything that one says). But I made a lot of changes (note the word changes here and not improvements), so it might be true for marwin that not many improvements were made. Of course many of them don't affect the rating much (or even measurably), like the new improved handling of repetitions. I have not made any measurements of the changes, but maybe I will.
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: 2013 Arimaa Challenge
« Reply #4 on: Mar 13th, 2013, 2:16pm » |
|
on Mar 13th, 2013, 2:03pm, tize wrote:I don't think I said that (or maybe I did, it's hard to remember everything that one says). But I made a lot of changes (note the word changes here and not improvements |
| Oh, I apologize for misquoting you. I wasn't paying close attention, and I just reported my general impression of developer chat, so I certainly forgot who said what. Perhaps all of your changes will indeed make a noticeable difference. By the way, did you see the same lack of improvement from 4 cores to 8 cores that Ricardo did? Maybe parallelization gets increasingly difficult as the number of cores increases.
|
« Last Edit: Mar 13th, 2013, 2:17pm by Fritzlein » |
|
|
|
rbarreira
Forum Guru
Arimaa player #1621
Gender:
Posts: 605
|
|
Re: 2013 Arimaa Challenge
« Reply #5 on: Mar 13th, 2013, 2:26pm » |
|
on Mar 13th, 2013, 2:16pm, Fritzlein wrote: By the way, did you see the same lack of improvement from 4 cores to 8 cores that Ricardo did? Maybe parallelization gets increasingly difficult as the number of cores increases. |
| I don't think my observations of performance on the 2013 challenge hardware have much to do with the general difficulty of parallelization (although it definitely plays a part too), because I have run my bot on 32-64 core machines with a nice speedup. This year's CPU is based on the AMD Bulldozer architecture, in which each pair of cores is contained in a module with common instruction fetchers/decoders, in addition to the shared L2/L3 cache. The latter was already common in other architectures. This means that, depending on the particular software, going from 4 threads to 8 threads on an 8-core CPU can easily result in a speedup well below 2x.
|
« Last Edit: Mar 13th, 2013, 2:37pm by rbarreira » |
|
|
|
tize
Forum Guru
Arimaa player #3121
Gender:
Posts: 118
|
|
Re: 2013 Arimaa Challenge
« Reply #6 on: Mar 13th, 2013, 2:28pm » |
|
No problem, it might also be that you were correct, if we talk about improvements. As for the 8 cores: I didn't do any testing to see how much faster he got (just tested that it worked and he used all 8). And it depends heavily on the position for marwin, since I only split at the root level and run the first move in "single core mode". This means that marwin switches between using 8 cores and 1 (slightly simplified...). But my guess is that marwin got 50% faster. (The number is scientifically pulled from thin air.)
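For readers unfamiliar with root splitting, here is a minimal sketch of the idea described above. It is not marwin's actual code: the nested-list tree, the names `negamax` and `root_split_search`, and the use of Python threads are all assumptions made for illustration. The first root move is searched alone to establish a bound, then the remaining root moves are handed to a pool of workers that all start from that bound.

```python
from concurrent.futures import ThreadPoolExecutor

INF = float("inf")

def negamax(node, alpha, beta):
    """Fail-soft alpha-beta over a toy tree (hypothetical representation):
    an int is a leaf score for the side to move, a list is an inner node."""
    if isinstance(node, int):
        return node
    best = -INF
    for child in node:
        best = max(best, -negamax(child, -beta, -alpha))
        alpha = max(alpha, best)
        if alpha >= beta:
            break  # beta cutoff: remaining children cannot matter
    return best

def root_split_search(root_moves, workers=8):
    """Search the first root move alone, then the rest in parallel."""
    # Phase 1 ("single core mode"): full-window search of the first move.
    first = -negamax(root_moves[0], -INF, INF)

    # Phase 2: every other root move only has to beat `first`.
    def search(move):
        return -negamax(move, -INF, -first)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        rest = list(pool.map(search, root_moves[1:]))

    scores = [first] + rest
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index, scores[best_index]

# Three root moves; each inner list holds the opponent's replies.
print(root_split_search([[3, 5], [2, 9], [6, 4]]))  # (2, 4)
```

Two caveats on the sketch: CPython threads do not run this kind of pure-Python work in parallel, so it only shows the control flow, and the workers never share bounds found during phase 2, which a real engine may well do.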
|
|
|
|
|
foggy
Forum Full Member
Arimaa player #6010
Gender:
Posts: 13
|
|
Re: 2013 Arimaa Challenge
« Reply #7 on: Mar 13th, 2013, 4:13pm » |
|
Actually, I don't think this year's HW is much of an improvement. There were 8 cores instead of 4, but a Bulldozer "core/unit" has at least 1.5 times less performance compared to Intel. I was expecting i7 architecture this year. According to Fritz measurements (I mean the chess program, not the Arimaa player/TD), which I suppose should be close to Arimaa bots, i7 is better than the 8-core Bulldozer (whose cores are similar to i7's: sharing the decoder and cache makes them similar to Intel hyperthreading).
|
« Last Edit: Mar 13th, 2013, 4:15pm by foggy » |
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: 2013 Arimaa Challenge
« Reply #8 on: Mar 16th, 2013, 1:21pm » |
|
We now have two sweeps of the bots, equaling last year's total. Ziltoid is finally on the board with a win and a calculable performance rating, but marathon wins by Nombril and novacat keep it down to a paltry 1659. Meanwhile Harren knocked marwin down to 1972. There's a long way to go yet in the screening, and the bots' performance ratings are quite likely to rise, but it already seems like a long shot for them to equal their excellent results from last year. Marwin remains 0.5 points ahead by virtue of winning the only decisive pair so far.
|
|
|
|
|
omar
Forum Guru
Arimaa player #2
Gender:
Posts: 1003
|
|
Re: 2013 Arimaa Challenge
« Reply #9 on: Mar 19th, 2013, 5:24am » |
|
on Mar 13th, 2013, 4:13pm, foggy wrote:Actually, I don't think this year's HW is much of an improvement. There were 8 cores instead of 4, but a Bulldozer "core/unit" has at least 1.5 times less performance compared to Intel. I was expecting i7 architecture this year. According to Fritz measurements (I mean the chess program, not the Arimaa player/TD), which I suppose should be close to Arimaa bots, i7 is better than the 8-core Bulldozer (whose cores are similar to i7's: sharing the decoder and cache makes them similar to Intel hyperthreading). |
| I would assume more cores would be better for the bots than slightly faster cores. But that's just a guess. If any of the bot developers have some benchmark program that I can use to gauge the hardware performance I would like to start using it. Maybe something that runs for about a minute on some fixed game position and prints out the number of nodes that were evaluated. It should auto detect the number of cores and memory and adjust itself to the hardware (perhaps using a launch script that checks the hardware and starts the benchmark program with the best parameters for the hardware). It would be good to start using something like this and keeping track of the hardware improvement year to year. Wish we had started doing this earlier. I can use one of the bots to do this, but just wanted to check if anyone already had a benchmark program they were using.
|
|
|
|
|
lightvector
Forum Guru
Arimaa player #2543
Gender:
Posts: 197
|
|
Re: 2013 Arimaa Challenge
« Reply #10 on: Mar 19th, 2013, 7:23am » |
|
Note that the number of nodes is probably *not* the right thing to measure. For a fixed search depth, the number of nodes will on average increase as the number of threads increases due to losses in the efficiency of the search as the number of threads goes up. For example, searching two branches of the tree in parallel will be worse than searching them sequentially if searching one of them first would have provided better alpha/beta bounds for the second, or even a beta cutoff so that the second branch need not have been searched at all. Instead, you probably want to measure the time taken to reach a given fixed depth. Although if a bot has some unsafe pruning heuristics and such that depend heavily on dynamically gathered information in the search, that might not exactly be right either. It might also differ slightly from the actual effective strength due to how a bot handles cases where the final depth is only partially searched, rather than fully searched. But probably these are second-order and not too big of a deal. As for whether 8 cores is better, I haven't looked in detail, but for sharp, above 3-4 cores, I recall that the loss becomes very noticeable, so that 8 cores gives far less than 8x the effective search power. Although some of that might be simply sharp being underoptimized - I did implement a threading framework that gives a lot of freedom to choose any parallelization policy, but have spent very little time tuning the policy so far. Probably other developers could provide better stats.
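The node-count effect can be seen on a toy tree. The sketch below is an illustration only, not code from sharp or any other bot: the nested-list tree and the names `alphabeta`, `sequential_root` and `independent_root` are made up for the example. It counts visited nodes when two root branches are searched one after the other, with the bound from the first passed to the second, and when each branch is searched with a fully open window, as a naive parallel split would do.

```python
INF = float("inf")

def alphabeta(node, alpha, beta, counter):
    """Fail-soft negamax alpha-beta; counter[0] tallies visited nodes.
    An int is a leaf score for the side to move, a list is an inner node."""
    counter[0] += 1
    if isinstance(node, int):
        return node
    best = -INF
    for child in node:
        best = max(best, -alphabeta(child, -beta, -alpha, counter))
        alpha = max(alpha, best)
        if alpha >= beta:
            break  # cutoff made possible by the incoming bound
    return best

def sequential_root(moves):
    """Search root moves in order, passing the best score so far as a bound."""
    counter = [0]
    alpha = -INF
    for move in moves:
        alpha = max(alpha, -alphabeta(move, -INF, -alpha, counter))
    return alpha, counter[0]

def independent_root(moves):
    """Model a naive parallel split: no branch sees another branch's bound."""
    counter = [0]
    best = max(-alphabeta(move, -INF, INF, counter) for move in moves)
    return best, counter[0]

moves = [[5, 6, 7], [1, 8, 9]]
print(sequential_root(moves))   # (5, 6)
print(independent_root(moves))  # (5, 8)
```

Both searches return the same score, but the independent version visits more nodes, so a raw nodes-per-second figure would make it look like the harder worker while it is in fact doing wasted work.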
|
« Last Edit: Mar 19th, 2013, 7:32am by lightvector » |
|
|
|
rbarreira
Forum Guru
Arimaa player #1621
Gender:
Posts: 605
|
|
Re: 2013 Arimaa Challenge
« Reply #11 on: Mar 19th, 2013, 7:38am » |
|
I don't have any benchmark script. I usually just try ziltoid on a position a few times and find the maximum achieved NPS. It can take quite a few tries due to parallel non-determinism, and some positions can be bad for this. In particular, if one move takes much longer to calculate than all others the bot might be using just one thread for quite a while. For most positions this is not the case. My feeling for the last few years is this: The 2010 hardware was exactly as powerful as the 2011 hardware (the X3360, AFAIK, is exactly like the Q9550 CPU except for being a server part). The 2012 hardware is about as powerful as the 2013 one as I said earlier (for my bot). So the remaining question is how much of a boost happened between 2011 and 2012. According to most of these benchmarks, it seems to be around 40%. Both are quad-core CPUs, so parallelization should be a non-issue in this particular comparison. My not-so-scientific guess is that, for my bot, from 2010 to 2013 the hardware got around 40% faster. But it's hard to give a concrete number without trying benchmarks again.
on Mar 19th, 2013, 7:23am, lightvector wrote:Instead, you probably want to measure the time taken to reach a given fixed depth. |
| I agree this is probably the best way. Either the minimum or the median time of several tries should be taken, to account for non-determinism.
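A rough sketch of what such a measurement could look like follows. Nothing here is an existing benchmark: `time_to_depth` and `dummy_search` are hypothetical names, and `dummy_search` is only a stand-in for calling a real bot on a fixed position. The harness runs the same fixed-depth search several times and reports the minimum and median wall-clock time.

```python
import statistics
import time

def time_to_depth(search, position, depth, threads, tries=5):
    """Time `search(position, depth, threads)` several times.

    `search` is any callable that returns once the requested depth is
    fully searched (an assumption; a real bot would need a wrapper).
    """
    samples = []
    for _ in range(tries):
        start = time.perf_counter()
        search(position, depth, threads)
        samples.append(time.perf_counter() - start)
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "samples": samples,
    }

def dummy_search(position, depth, threads):
    """Stand-in workload so the harness can be run without a bot."""
    total = 0
    for i in range(10_000 * depth):
        total += i
    return total
```

A launch script could pass `os.cpu_count()` as `threads` to adapt to the machine, and the same position and depth would have to be reused every year for the numbers to be comparable.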
|
« Last Edit: Mar 19th, 2013, 8:23am by rbarreira » |
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: 2013 Arimaa Challenge
« Reply #12 on: Mar 22nd, 2013, 10:29pm » |
|
In the general man vs. machine contest, there has been some tit-for-tat resulting in humanity staying well ahead of where it was last year. In the bot vs. bot contest, marwin has opened up a commanding 2.5-point lead by beating both RmznA and arimaa_master, each of whom turned around and beat ziltoid.
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: 2013 Arimaa Challenge
« Reply #13 on: Mar 25th, 2013, 6:50pm » |
|
Since last update, marwin went 0-2 while ziltoid was 3-1. This hasn't changed marwin's 2.5-point lead yet, but the incomplete pairs are now more favorable to ziltoid. Humanity, meanwhile, continues to score better overall than last year, although the bots are inching up from their dismal opening to currently weigh in at 2034 and 1892 respectively.
|
« Last Edit: Mar 25th, 2013, 6:59pm by Fritzlein » |
|
|
|
Boo
Forum Guru
Arimaa player #6466
Gender:
Posts: 118
|
|
Re: 2013 Arimaa Challenge
« Reply #14 on: Mar 26th, 2013, 10:35am » |
|
What if some players end up having played only 3 screening games? I think the results are calculated in a weird way. E.g. both aaaa and arimaa_master have played 3 games, 2 against ziltoid and 1 against marwin. Both won 1 game against ziltoid, and lost 2 other games. However, the score is 1-1 for aaaa, and 0-1 for arimaa_master. Why does the colour of a game have such a big impact on the final result? I think the same amount of points for marwin and ziltoid should be assigned in such a case.
|
|
|
|
|
|