Author |
Topic: 2010 Challenge Screening (Read 3709 times) |
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
2010 Challenge Screening
« on: Mar 13th, 2010, 9:55am » |
|
I'm going to start a thread specific to the screening, partly so we can dissect the game results and discuss any technical issues that arise, and partly so that I will be able to extend the following table and find it next year:

Year  Pairs  Decisive  Winner/Score  Loser/Score
----  -----  --------  ------------  -----------
2007  12     2         Bomb / 2      Zombie / 0
2008  16     7         Bomb / 6      Sharp / 1
2009  23     7         Clueless / 5  GnoBot / 2
|
« Last Edit: Mar 13th, 2010, 4:29pm by Fritzlein » |
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: 2010 Challenge Screening
« Reply #1 on: Mar 13th, 2010, 3:13pm » |
|
I just now tried to play a screening game, and I get the message:
Quote: Sorry the server is currently busy. bot
However, bot_clueless vs. Hippo is the only screening game at present. The other server should be available. What's up with that?
|
|
|
|
|
aaaa
Forum Guru
Arimaa player #958
Posts: 768
|
|
Re: 2010 Challenge Screening
« Reply #2 on: Mar 13th, 2010, 3:29pm » |
|
I got the same message and after I pointed this out to Omar, he fixed it for me.
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: 2010 Challenge Screening
« Reply #3 on: Mar 14th, 2010, 9:10pm » |
|
In honor of marwin's winning the first decisive pair of games of the 2010 screening, I have updated the table to add a row for this year. Also I have included performance ratings for each bot for each year over all screening games played (i.e. not just over the paired games).

Year  Pairs  Decisive  Winner / Score / Perf  Loser / Score / Perf
----  -----  --------  ---------------------  --------------------
2007  12     2         bomb / 2 / 2087        Zombie / 0 / 1876
2008  16     7         bomb / 6 / 1918        sharp / 1 / 1576
2009  23     7         clueless / 5 / 1910    GnoBot / 2 / 1792
2010  25     11        marwin / 6 / 2065      clueless / 5 / 1960

The performance rating is perhaps not very reliable. Bomb in 2008 and clueless in 2009 had similar performance ratings from the screening, but bomb went 0-9 in the Challenge whereas clueless went 2-7. We shouldn't read too much into marwin's impressive 4-3 start in the screening.

[EDIT] updated through game 139910 (including joe's game 138003; not including hanzack's games or quad's)
|
« Last Edit: Mar 31st, 2010, 7:05am by Fritzlein » |
|
|
|
RonWeasley
Forum Guru
Harry's friend (Arimaa player #441)
Posts: 882
|
|
Re: 2010 Challenge Screening
« Reply #4 on: Mar 15th, 2010, 4:45am » |
|
Without going into a lot of detail, I'm going to use a gut reaction and allow joe to replay the game against bot_marwin that he resigned by mistake on move 3. I'm not sure about the precedent this sets, but it seems like human confusion was the root cause. If this were a WC game I would rule that the result must stand, but the screening games are all about getting information about bots' playing ability against humans. Forcing this result to stand would be counter to that purpose.
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: 2010 Challenge Screening
« Reply #5 on: Mar 23rd, 2010, 8:38pm » |
|
In my week-long absence the race tightened up. Marwin now leads only 3-2 in decisive pairs, and has an unpaired loss to novacat. The screen, like the Computer Championship, could go down to the wire. I have updated the table in my previous post. The screening performance ratings of marwin and clueless are now 2036 and 1977 respectively, rather in line with my gut feeling about their true abilities under Challenge conditions.
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: 2010 Challenge Screening
« Reply #6 on: Mar 25th, 2010, 6:47am » |
|
Wow, clueless's win over novacat brings the match all square, 3-3 in the six decisive pairs. This is as exciting as the Computer Championship itself! I hope we have several more pairs completed before the month is over, starting with onigawara, 722caasi, camelback, Simon, The_Jeh, joe, and clauchau finishing the pairs they started. Marwin's performance rating in my table is still slightly higher than clueless's because of incomplete pairs slightly favoring marwin overall.
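For anyone tallying along at home, here is a minimal sketch of how the decisive-pair score could be counted. It assumes, as the posts in this thread imply but never state outright, that a pair is one human's two games (one against each bot) and that the pair is decisive only when one bot beats that human and the other bot loses to them. The function name, the data layout, and the names `bot_x`, `bot_y`, and `human_1` through `human_4` are all hypothetical, not real screening data.

```python
from collections import Counter


def tally_decisive_pairs(results, bots):
    """Count completed pairs and decisive-pair wins for two bots.

    results maps each human to a dict of bot -> winner, where the
    winner is the string "bot" or "human" (an assumed layout).
    """
    bot_a, bot_b = bots
    score = Counter({bot_a: 0, bot_b: 0})
    complete = 0
    for human, games in results.items():
        # Skip humans who have played only one of the two bots.
        if bot_a not in games or bot_b not in games:
            continue
        complete += 1
        a_won = games[bot_a] == "bot"
        b_won = games[bot_b] == "bot"
        # Decisive only if exactly one of the bots beat this human.
        if a_won and not b_won:
            score[bot_a] += 1
        elif b_won and not a_won:
            score[bot_b] += 1
    return complete, dict(score)


# Hypothetical example data, not actual screening results.
results = {
    "human_1": {"bot_x": "bot", "bot_y": "human"},    # decisive for bot_x
    "human_2": {"bot_x": "human", "bot_y": "human"},  # human beat both
    "human_3": {"bot_x": "bot"},                      # unpaired game
    "human_4": {"bot_x": "human", "bot_y": "bot"},    # decisive for bot_y
}
complete, score = tally_decisive_pairs(results, ("bot_x", "bot_y"))
print(complete, score)
```

Unpaired games are ignored in this tally, which is why a performance rating computed over all games can lean toward one bot even when the decisive-pair score is level.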
|
« Last Edit: Mar 25th, 2010, 6:53am by Fritzlein » |
|
|
|
camelback
Forum Guru
Arimaa perl monger
Posts: 144
|
|
Re: 2010 Challenge Screening
« Reply #7 on: Mar 25th, 2010, 12:43pm » |
|
I'm not going to play at the 2-minute time control anymore. It was tempting to play the new bots, but annoying to get dragged beyond 4 hours. In the end I had to leave while in a winning position, and I can't unrate the game. Maybe there should be different time control options in the future for bot screening.
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: 2010 Challenge Screening
« Reply #8 on: Mar 25th, 2010, 2:37pm » |
|
At one point I was in favor of different time controls, but I changed my mind. The screening time controls should be the same as the challenge time controls, and those should not change.

I like that Omar hasn't changed the rules of the Arimaa Challenge for several years now, including the time controls. People who try to meet AI challenges always have to face the "moving goalposts" syndrome, i.e. whenever they get close to humans, the match conditions change to make it more favorable to humans. I was upset when Omar changed the Arimaa Challenge to require beating three out of three defenders, since that is harder than just beating one defender. By the same token I don't want to have faster time controls now when humans don't mind and then change to slower time controls later when we need it. It's not fair to base the match conditions on what is most convenient for humans at the moment.

Besides, it turns out that we are having an awesome screening at two minutes per move. Since I last wrote, The_Jeh and onigawara completed pairs, moving the score to 4-4, still tied! I somehow feel we won't know until the very last day which bot will advance.
|
« Last Edit: Mar 25th, 2010, 2:38pm by Fritzlein » |
|
|
|
99of9
Forum Guru
Gnobby's creator (player #314)
Posts: 1413
|
|
Re: 2010 Challenge Screening
« Reply #9 on: Mar 26th, 2010, 1:37am » |
|
I can't see the link to play the screening bots anymore. Can someone help me?
|
|
|
|
|
99of9
Forum Guru
Gnobby's creator (player #314)
Posts: 1413
|
|
Re: 2010 Challenge Screening
« Reply #11 on: Mar 26th, 2010, 4:42am » |
|
Thanks Nombril. I hope I get some more free nights to take revenge on marwin.
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: 2010 Challenge Screening
« Reply #12 on: Mar 29th, 2010, 5:52am » |
|
How exciting! Rabbits' punchout of marwin gave clueless the lead for all of 83 minutes before Eltripas downed clueless to even the score at 5-5 again. Just two and a half days left in this thriller...
|
|
|
|
|
knarl
Forum Guru
Arimaa player #1648
Posts: 104
|
|
Re: 2010 Challenge Screening
« Reply #13 on: Mar 29th, 2010, 4:34pm » |
|
I have plans to try a reckless race game against marwin again, after almost winning (is there such a thing in Arimaa?) last time. I just need to find time so I can play both bots before the deadline. I watched marwin's defeat to rabbits yesterday. Talk about putting up a death struggle at the end!

Cheers,
knarl.
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: 2010 Challenge Screening
« Reply #14 on: Mar 29th, 2010, 4:48pm » |
|
on Mar 29th, 2010, 4:34pm, knarl wrote:
I have plans to try a reckless race game against marwin again, after almost winning (is there such a thing in arimaa? ) last time.

Heheh. Your game against marwin reminds me of something I once heard a chess player say: "He creamed me, but just barely!"
|
|
|
|
|
|