Author |
Topic: 2011 Arimaa Challenge (Read 9712 times) |
|
UruramTururam
Forum Guru
Arimaa player #2537
Gender:
Posts: 319
|
|
Re: 2011 Arimaa Challenge
« Reply #15 on: Mar 11th, 2011, 10:13am » |
|
on Mar 11th, 2011, 9:58am, Fritzlein wrote: One thing we'll always be better at than computers is making excuses. |
| I remember playing a computer version of Magic: the Gathering. After losing a game, the computer opponent always made a witty comment like: "Yeah, you won. Of course I did not draw enough lands..."
|
|
Caffa et bucella per attactionem corporum venit ad stomachum meum. BGG Arimaa badges - get your own one!
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: 2011 Arimaa Challenge
« Reply #16 on: Mar 11th, 2011, 10:33am » |
|
on Mar 11th, 2011, 10:13am, UruramTururam wrote: I remember playing a computer version of Magic: the Gathering. After losing a game, the computer opponent always made a witty comment like: "Yeah, you won. Of course I did not draw enough lands..." |
| Heheh, that's not the bot making an excuse; that's the developer making fun of humans (like me) who always blame their losses on luck.
|
« Last Edit: Mar 11th, 2011, 10:34am by Fritzlein » |
|
|
|
rbarreira
Forum Guru
Arimaa player #1621
Gender:
Posts: 605
|
|
Re: 2011 Arimaa Challenge
« Reply #18 on: Mar 11th, 2011, 12:24pm » |
|
Now that Tuks has lost against sharp, I think you can start saying it looks grim a bit more legitimately
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: 2011 Arimaa Challenge
« Reply #19 on: Mar 11th, 2011, 12:35pm » |
|
Quote: Yes, I know that performance rating is a very noisy metric when there have only been a few games played, but the bots' 5-1 record so far means they are performing at a rating of 2443! |
| Make that 6-1, for a performance of 2464. By a remarkable coincidence, my game room rating is 2464. In fairness, though, we haven't yet heard Tuks's reason for losing.
|
|
|
|
|
UruramTururam
Forum Guru
Arimaa player #2537
Gender:
Posts: 319
|
|
Re: 2011 Arimaa Challenge
« Reply #20 on: Mar 11th, 2011, 1:16pm » |
|
on Mar 11th, 2011, 12:35pm, Fritzlein wrote:we haven't yet heard Tuks's reason for losing. |
| Here it goes (taken from the game comment):
Quote: sharp is good. Good news for everyone though, sharp is open to bait and tackle shame i'm an idiot and after setting up the position i messed up the execution. also missed the false protection at the end and there bunch of other mistakes i don't care to reexamine
Tuks Fri 20:01 YLT |
|
|
|
|
|
|
Adanac
Forum Guru
Arimaa player #892
Gender:
Posts: 635
|
|
Re: 2011 Arimaa Challenge
« Reply #21 on: Mar 11th, 2011, 1:51pm » |
|
on Mar 11th, 2011, 12:35pm, Fritzlein wrote: Make that 6-1, for a performance of 2464. By a remarkable coincidence, my game room rating is 2464. In fairness, though, we haven't yet heard Tuks's reason for losing. |
| Bots are improving at the rate of about 400 points per year. So they'll be invincible by 2013
|
|
|
|
|
mistre
Forum Guru
Gender:
Posts: 553
|
|
Re: 2011 Arimaa Challenge
« Reply #22 on: Mar 11th, 2011, 3:42pm » |
|
on Mar 11th, 2011, 12:35pm, Fritzlein wrote: Make that 6-1, for a performance of 2464. By a remarkable coincidence, my game room rating is 2464. In fairness, though, we haven't yet heard Tuks's reason for losing. |
| What is even more scary is how badly Bot_Quad crushed Sharp!
|
|
|
|
|
chessandgo
Forum Guru
Arimaa player #1889
Gender:
Posts: 1244
|
|
Re: 2011 Arimaa Challenge
« Reply #23 on: Mar 11th, 2011, 4:19pm » |
|
on Mar 11th, 2011, 3:42pm, mistre wrote: What is even more scary is how badly Bot_Quad crushed Sharp! |
| Yeah this, that was an impressive game. I must admit I had pretty much given up on the whole institution of playing bots, but it seems they've made a big leap in playing strength this year. I'll get a shot at marwin tomorrow, hopefully. Also, it's kinda frightening to wonder about sharp with a better treatment of advanced rabbits (possibly next year...).
|
|
|
|
|
mistre
Forum Guru
Gender:
Posts: 553
|
|
Re: 2011 Arimaa Challenge
« Reply #24 on: Mar 11th, 2011, 4:26pm » |
|
Good game Jean vs Sharp, your superior goal attack understanding made the difference.
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: 2011 Arimaa Challenge
« Reply #25 on: Mar 11th, 2011, 4:54pm » |
|
A commanding victory, Jean, the more impressive for winning (at least as I perceive it) against sharp's strength rather than against sharp's weakness. If it turns out that people win against sharp by swarming and getting in a goal race and lose when they play a home game and take a camel hostage, I will have to re-adjust my expectations.

Now that sharp has a loss, I can give each bot an independent performance rating. I will maintain the table by editing this post rather than by re-posting every time I re-calculate.

For the record, I was expecting only modest improvement in marwin, i.e. performance in the 2050-2100 range, and sharp to perform perhaps worse against humans, say in the 1950-2050 range despite beating marwin head-to-head. It seems I already have to kiss that hope goodbye; now the question is simply how worried I need to be, and whether there is any easy pattern for winning against each. Please, everyone take a shot at these bots; surely they have exploitable weaknesses if only we can discover them.

Year  Pairs  Decisive  Winner / Score / Perf  Loser / Score / Perf
----  -----  --------  ---------------------  --------------------
2007  12     2         bomb / 2 / 2087        Zombie / 0 / 1876
2008  16     7         bomb / 6 / 1918        sharp / 1 / 1576
2009  23     7         clueless / 5 / 1910    GnoBot / 2 / 1792
2010  25     11        marwin / 6 / 2065      clueless / 5 / 1960
2011  39     11        sharp / 5 / 2102       marwin / 6 / 2110
|
« Last Edit: Mar 29th, 2011, 12:43am by Fritzlein » |
|
|
|
rbarreira
Forum Guru
Arimaa player #1621
Gender:
Posts: 605
|
|
Re: 2011 Arimaa Challenge
« Reply #26 on: Mar 11th, 2011, 4:59pm » |
|
How did sharp's performance rating go from 2464 to 2521 after it lost a game?
|
|
|
|
|
chessandgo
Forum Guru
Arimaa player #1889
Gender:
Posts: 1244
|
|
Re: 2011 Arimaa Challenge
« Reply #27 on: Mar 11th, 2011, 4:59pm » |
|
Thanks guys.
|
|
|
|
|
chessandgo
Forum Guru
Arimaa player #1889
Gender:
Posts: 1244
|
|
Re: 2011 Arimaa Challenge
« Reply #28 on: Mar 11th, 2011, 5:00pm » |
|
on Mar 11th, 2011, 4:59pm, rbarreira wrote:How did sharp's performance rating go from 2464 to 2521 after it lost a game? |
| Elo performance does that sometimes, when you lose to a high rated player after winning against several other players, I think (?).
|
|
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: 2011 Arimaa Challenge
« Reply #29 on: Mar 11th, 2011, 5:56pm » |
|
on Mar 11th, 2011, 4:59pm, rbarreira wrote:How did sharp's performance rating go from 2464 to 2521 after it lost a game? |
| on Mar 11th, 2011, 5:00pm, chessandgo wrote:Elo performance does that sometimes, when you lose to a high rated player after winning against several other players, I think (?). |
| There are two standard ways to calculate performance rating. One way has the disadvantage that a loss to a much-higher-rated player can help you and a win over a much-lower-rated player can hurt you. I don't like this feature, so I use the other way. I say performance is the rating that would have predicted the score you got.

The disadvantage of the way I like is that if you have no losses, your performance rating is plus-infinity, and if you have no wins, your performance rating is minus-infinity. This is because even a very high rating doesn't predict you will never lose. This was a problem when sharp had no losses, so I was lumping sharp and marwin together to get a combined performance of 2464. If I had split them apart, marwin's performance would have been 2230 and sharp's would have been plus-infinity. When sharp lost to chessandgo, its performance rating dropped from plus-infinity to only 2521.

That's weird, I know, but I like it better than the other disadvantage. For example, sharp just now beat 722caasi, rated 1804. In my calculation, that boosts sharp's performance from 2521 to 2526, a gain of five points. In the other calculation, beating 722caasi would have lowered sharp's performance from 2450 to 2409, a loss of forty-one points. Since I can't stand the illogic of losing points for a win, I tolerate the infinities.

In a footnote, to calculate these performance ratings, I am making no reference to WHR; I'm simply using the opponent's game room rating at the time of the game. I know game room ratings are inaccurate, but I put up with it. I will, however, draw the line at using hanzack's game room rating, should he participate in the screening phase. Instead I will use his performance from the World Championship, where his 4-4 record against his strong schedule would have been exactly predicted for someone with a rating of 2060.
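For anyone who wants to try the two calculations, here is a rough Python sketch, not the code actually used for the numbers above. It assumes the usual Elo expected-score curve with a 400-point scale, and it assumes that "the other way" is the common linear formula (average opponent rating plus 400 times wins minus losses, divided by games played); the post does not say which alternative was meant. The function names and the example ratings in the usage note are made up for illustration.

```python
import math


def expected_score(rating, opponent):
    """Elo expected score for `rating` against `opponent` (400-point scale, assumed)."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def predictive_performance(results):
    """Rating that would have predicted the actual total score.

    `results` is a list of (opponent_rating, score) pairs, score 1.0 for a
    win and 0.0 for a loss. Returns +inf with no losses, -inf with no wins.
    """
    if not results:
        raise ValueError("need at least one game")
    total = sum(score for _, score in results)
    if total >= len(results):
        return math.inf
    if total <= 0:
        return -math.inf
    # The predicted total rises with the rating, so bisection finds the match.
    lo, hi = -5000.0, 10000.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        predicted = sum(expected_score(mid, opp) for opp, _ in results)
        if predicted < total:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def linear_performance(results):
    """Assumed alternative: average opponent rating + 400 * (wins - losses) / games."""
    n = len(results)
    wins = sum(score for _, score in results)
    average = sum(opp for opp, _ in results) / n
    return average + 400.0 * (2.0 * wins - n) / n
```

With a hypothetical record of one win and one loss against 2400-rated opponents, both functions give 2400. Adding a win over an 1800-rated opponent nudges the predictive figure slightly above 2400, while the linear figure falls to about 2333, which is the "losing points for a win" effect described above.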
|
|
|
|
|
|