Author |
Topic: 2012 Arimaa Challenge (Read 7865 times) |
|
hyperpape
Forum Guru
Arimaa player #7113
Gender:
Posts: 80
|
|
Re: 2012 Arimaa Challenge
« Reply #30 on: Mar 20th, 2012, 11:25am » |
|
Btw: if bots have a lower variance in winning (they very consistently beat players below a certain mark, and very consistently lose to players above a different mark) isn't the probability distribution of the performance rating dependent on the strength of opponents in a way that a human's performance rating wouldn't be? I guess the very top bots might not show the same pattern, since they're a lot less one-dimensional than old Bomb et al.
|
« Last Edit: Mar 20th, 2012, 11:25am by hyperpape » |
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: 2012 Arimaa Challenge
« Reply #31 on: Mar 20th, 2012, 6:00pm » |
|
on Mar 20th, 2012, 11:25am, hyperpape wrote:Btw: if bots have a lower variance in winning (they very consistently beat players below a certain mark, and very consistently lose to players above a different mark) isn't the probability distribution of the performance rating dependent on the strength of opponents in a way that a human's performance rating wouldn't be? |
| The situation you outline is exactly what I believe. If players rated below the bot participate in the screening, the bot should be infallible and its performance rating should be biased upward. I think this happened in the 2007 screening. On the other hand, if players rated above the bot participate in the screening, the bot should be crushed and its performance rating should be biased downward. This was more like the 2008 screening. Bomb didn't change between 2007 and 2008, so something else must explain the 169-point drop in its performance rating between the two years. The only trouble with our theory is that it doesn't match the facts of the current year. Postulating a true strength of 2200 for both bots, there are few humans with a WHR above that level. Chessandgo, hanzack, and Nombril can't participate. Rabbits and 99of9 haven't started yet. Boo, Adanac, Hippo, and I have collectively won six and lost five, hardly a dominating score. Meanwhile, the supposedly over-matched rest of the field has won nine and lost twenty-two, much better than expected from their WHR ratings. Harren (4-0), Max (3-1), ocmiente (1-0), and aaaa (1-0) are all rated lower than the hypothetical strength of the bots. Their wins are what is keeping down the performance rating of the two bots. Quote:I guess the very top bots might not show the same pattern, since they're a lot less one-dimensional than old Bomb et al. |
| I don't know how to explain the failure of our theory at present, so I'll just file away the notion that it may well be wrong, and await further observations. Perhaps it is just random fluctuations due to a small data set and further games will spare me the inconvenience of changing my mind. (Also, it is a great temptation to start gloating too early that my prediction of performance rating looks closer than rbarriera's prediction. But it would still be quite possible for briareus to top 2300 in performance by the time the screening is over, so I had better keep my mouth shut.)
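For anyone who wants to play with the low-variance theory, here is a rough Python sketch of it. Everything in it is an assumption made for illustration: the thread does not say how the screening's performance rating is computed, so the sketch uses a plain Elo-style definition (the rating whose expected score against the field equals the actual score), models a "one-dimensional" bot as a logistic curve with a much steeper scale (100 instead of 400), and uses made-up opponent ratings. The names expected_score, performance_rating and inferred_rating are hypothetical.

```python
def expected_score(diff, scale=400.0):
    """Logistic win probability for a rating advantage of `diff` points."""
    return 1.0 / (1.0 + 10.0 ** (-diff / scale))


def performance_rating(total_score, opponents, lo=0.0, hi=5000.0):
    """Rating whose standard Elo expectation against `opponents` equals
    `total_score`, found by bisection (assumed definition, not the
    screening's official formula)."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if sum(expected_score(mid - r) for r in opponents) < total_score:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def inferred_rating(true_rating, opponents, scale):
    """Performance rating a player of `true_rating` would show on average
    if its real win curve had the given `scale` (smaller = lower variance)."""
    score = sum(expected_score(true_rating - r, scale) for r in opponents)
    return performance_rating(score, opponents)


TRUE_STRENGTH = 2200                # hypothetical bot strength
weaker_field = [2000, 2050, 2100]   # made-up opponent ratings
stronger_field = [2300, 2350, 2400]

for name, field in (("weaker field", weaker_field),
                    ("stronger field", stronger_field)):
    steep = inferred_rating(TRUE_STRENGTH, field, scale=100.0)
    normal = inferred_rating(TRUE_STRENGTH, field, scale=400.0)
    print(f"{name}: steep-curve bot {steep:.0f}, ordinary-curve player {normal:.0f}")
```

Under these assumptions the steep-curve bot shows an inflated rating against the weaker field and a deflated one against the stronger field, while a player with the ordinary curve comes out at 2200 either way. That is only the theory restated in code, of course, and says nothing about why this year's results don't fit it.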
|
« Last Edit: Mar 20th, 2012, 6:17pm by Fritzlein » |
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: 2012 Arimaa Challenge
« Reply #32 on: Mar 28th, 2012, 8:01am » |
|
The bots are on a twelve-game winning streak. With only a few days left in the screening, who will stand up for humanity and put silicon back in its place? (Or even just give it a try?)
|
|
|
|
|
tize
Forum Guru
Arimaa player #3121
Gender:
Posts: 118
|
|
Re: 2012 Arimaa Challenge
« Reply #33 on: Mar 30th, 2012, 2:07pm » |
|
The bots are still undefeated this week. Go silicon, go!
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: 2012 Arimaa Challenge
« Reply #34 on: Mar 31st, 2012, 10:12pm » |
|
Congratulations, Ricardo! Briareus won convincingly, although I must point out that the performance rating of 2232 falls shy of your "closer to 2300 than 2200" prediction. (I so seldom predict correctly that I have to make a fuss over it when I do.) The fact that marwin fell well short of 2200 despite edging out briareus in the Computer Championships makes me wonder whether marwin isn't as tough on humans or whether it was just random variation. The total of 33 completed pairs falls short of last year's 40, which is surprising given that bot advances have given new spice to the man vs. machine contest. Ah, well, now we can kick back and wait for the Arimaa Challenge games.
|
|
|
|
|
tize
Forum Guru
Arimaa player #3121
Gender:
Posts: 118
|
|
Re: 2012 Arimaa Challenge
« Reply #35 on: Apr 1st, 2012, 1:46am » |
|
Quote:The bots finish on a 14-1 run, with briareus going 8-0 to close out with an overall record of 33-8 and a performance of 2232. |
| Actually, if you just take briareus's games then it's even more impressive: it finished the screening with 12 straight victories! Congratulations rbarriera!
|
|
|
|
|
Hippo
Forum Guru
Arimaa player #4450
Gender:
Posts: 883
|
|
Re: 2012 Arimaa Challenge
« Reply #36 on: Apr 1st, 2012, 2:36am » |
|
Oh sorry, I have not finished the other pair. The only long enough time window I found was a time when I had a headache. Hmmm ... actually I could have played last night ... my bad. Fortunately this time the unfinished games do not favour the bot finishing second. Congrats Ricardo, good job Mattias.
|
« Last Edit: Apr 1st, 2012, 2:37am by Hippo » |
|
|
|
omar
Forum Guru
Arimaa player #2
Gender:
Posts: 1003
|
|
Re: 2012 Arimaa Challenge
« Reply #37 on: Apr 4th, 2012, 2:59pm » |
|
Congrats rbarriera. Interestingly, this is now the second time that the bot that placed second in the computer championship performed better against the humans. The challenge match games for round one have been scheduled. Please check the gameroom for your local times. If there is interest in commentating on these games, please post here. It would be interesting to have some bot developers commentate along with some top players.
|
|
|
|
|
Arimabuff
Forum Guru
Arimaa player #2764
Gender:
Posts: 589
|
|
Re: 2012 Arimaa Challenge
« Reply #38 on: Apr 4th, 2012, 11:40pm » |
|
on Apr 4th, 2012, 2:59pm, omar wrote:...If there is interest in commentating on these games, please post here. It would be interesting to have some bot developers commentate along with some top players. |
| I am neither but I'd still be interested in commentating on those games.
|
|
|
|
|
omar
Forum Guru
Arimaa player #2
Gender:
Posts: 1003
|
|
Re: 2012 Arimaa Challenge
« Reply #39 on: Apr 5th, 2012, 2:25pm » |
|
on Apr 4th, 2012, 11:40pm, Arimabuff wrote: I am neither but I'd still be interested in commentating on those games. |
| Great. Feel free to join in TeamSpeak. It's two minutes per move, so it'll be good to have a few people there.
|
|
|
|
|
Arimabuff
Forum Guru
Arimaa player #2764
Gender:
Posts: 589
|
|
Re: 2012 Arimaa Challenge
« Reply #40 on: Apr 12th, 2012, 9:50am » |
|
on Apr 5th, 2012, 2:25pm, omar wrote: Great. Feel free to join in TeamSpeak. It's two minutes per move, so it'll be good to have a few people there. |
| I am sorry I wasn't there last time. I'll do my best to be present for each game from now on, even though unexpected events sometimes get in the way.
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: 2012 Arimaa Challenge
« Reply #41 on: Apr 12th, 2012, 12:39pm » |
|
Thanks for this community service, Patrick!
|
|
|
|
|
Arimabuff
Forum Guru
Arimaa player #2764
Gender:
Posts: 589
|
|
Re: 2012 Arimaa Challenge
« Reply #42 on: Apr 14th, 2012, 9:54am » |
|
on Apr 12th, 2012, 12:39pm, Fritzlein wrote:Thanks for this community service, Patrick! |
| That's nice to hear, I mean read.
|
|
|
|
|
mistre
Forum Guru
Gender:
Posts: 553
|
|
Re: 2012 Arimaa Challenge
« Reply #43 on: Apr 16th, 2012, 9:59am » |
|
on Mar 13th, 2012, 3:41pm, mistre wrote:Prediction: Briareus wins screening and then shocks with 2 wins in Challenge. Of course neither are vs. Chessandgo... |
| Turns out I was right... This does not bode well for humans going forward. We might be in for several years of the computer champion beating 1 or 2 challengers. Let me see if I understand the challenge correctly. So for a bot to win, it has to beat all 3 challengers? So if a bot loses 2-1 to one challenger, but then destroys the other two 3-0 for a total record of 7-2, it still loses? Hardly seems fair...
|
« Last Edit: Apr 16th, 2012, 10:00am by mistre » |
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: 2012 Arimaa Challenge
« Reply #44 on: Apr 16th, 2012, 10:30am » |
|
on Apr 16th, 2012, 9:59am, mistre wrote:Let me see if I understand the challenge correctly. So for a bot to win, it has to beat all 3 challengers? So if a bot loses 2-1 to one challenger, but then destroys the other two 3-0 for a total record of 7-2, it still loses? |
| Correct. This means a bot that is even in skill with all three defenders has only a 1/8 chance of winning the Challenge in any given year. In order for a bot to have a 1/2 chance of winning the Challenge in any given year, it needs a 79% chance of winning each mini-match (assuming all three defenders are equal), which translates into a 71% chance of winning each game, which means being 154 Elo stronger than the defenders. On the other hand, the bots get to try every year. If we said the bot needed only to win two of the three mini-matches, so that a bot that was even in skill with all three defenders would have a 50% chance of winning the Challenge in any given year, there would be a high chance of a weaker bot winning by a fluke. A bot rated 100 Elo below the defenders would have a 36% chance of winning each game, a 29.5% chance of winning each mini-match, and an 18.5% chance of winning the Challenge. If that bot tried for five years, it would have a 64% chance of winning the Challenge some year. That is to say, a bot weaker than all the defenders would be a favorite to win the Challenge given multiple tries. It isn't obvious how to deal with uncertainty. Yes, it is unfair that a bot on a par with the best humans is an underdog to win, but it would also be unfair if a bot won the Challenge on a fluke and humans had no chance to win it back. It would be silly if a bot walked away with the prize, and a month later several humans were consistently winning the majority of games against that bot. When you balance out potential evils, the Challenge structure may not be as unfair as it first appears.
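The arithmetic above can be checked with a short Python sketch. It rests on assumptions not spelled out in the post: each mini-match is three independent games with no draws, the winner is whoever takes at least two, all three defenders are equally strong, and the standard logistic Elo formula applies. All function and variable names are hypothetical.

```python
import math


def elo_win_prob(diff):
    """Standard Elo expectation for a rating advantage of `diff` points."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


def at_least_two_of_three(p):
    """Chance of winning two or three of three independent trials."""
    return 3 * p**2 - 2 * p**3


def exactly_two_of_three(p):
    """Chance of winning exactly two of three independent trials."""
    return 3 * p**2 * (1 - p)


def game_prob_for_match_prob(target):
    """Per-game win probability that yields `target` per mini-match (bisection)."""
    lo, hi = 0.5, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if at_least_two_of_three(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def elo_diff(p):
    """Rating advantage implied by a per-game win probability `p`."""
    return 400.0 * math.log10(p / (1.0 - p))


# Current rule: the bot must win all three mini-matches.
even_bot = at_least_two_of_three(0.5) ** 3          # evenly matched bot
match_needed = 0.5 ** (1.0 / 3.0)                   # per mini-match, for a 1/2 chance
game_needed = game_prob_for_match_prob(match_needed)
edge_needed = elo_diff(game_needed)

# Hypothetical rule: two of three mini-matches suffice, bot 100 Elo weaker.
weak_game = elo_win_prob(-100)
weak_match = at_least_two_of_three(weak_game)
weak_exactly_two = exactly_two_of_three(weak_match)
weak_at_least_two = at_least_two_of_three(weak_match)
five_years = 1 - (1 - weak_exactly_two) ** 5

print(f"even bot wins Challenge: {even_bot:.3f}")
print(f"needed per mini-match: {match_needed:.3f}, per game: {game_needed:.3f}, "
      f"Elo edge: {edge_needed:.0f}")
print(f"weaker bot: game {weak_game:.3f}, mini-match {weak_match:.3f}, "
      f"exactly two {weak_exactly_two:.3f}, at least two {weak_at_least_two:.3f}")
print(f"some win in five tries (from the exactly-two figure): {five_years:.2f}")
```

Under these assumptions the sketch reproduces the 1/8, 79%, 71%, 154 Elo, 36% and 29.5% figures. The 18.5% (and the 64% over five years that follows from it) matches the chance of winning exactly two of the three mini-matches; counting a clean sweep as well gives roughly 21% per year, which only strengthens the point about a weaker bot winning by a fluke.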
|
« Last Edit: Apr 16th, 2012, 10:35am by Fritzlein » |
|
|
|
|