Arimaa Forum « 2014 Arimaa Challenge »


Arimaa Forum » Arimaa » Events (Moderator: supersamu) » 2014 Arimaa Challenge
   Author  Topic: 2014 Arimaa Challenge  (Read 8814 times)
Janzert
Forum Guru
*****



Arimaa player #247

   


Gender: male
Posts: 1016
Re: 2014 Arimaa Challenge
« Reply #15 on: Mar 13th, 2014, 8:14pm »

Sorry for the delay. I'm going to make the same ruling for human timeouts attributable to connection issues as for bot timeouts. Specifically, the game should be resumed if possible, and disregarded if it can't be completed by the time the screening ends.

If you need a game resumed, you can get in touch with either me or Omar.
 
Janzert
Fritzlein
Forum Guru
*****



Arimaa player #706

   
Email

Gender: male
Posts: 5928
Re: 2014 Arimaa Challenge
« Reply #16 on: Mar 15th, 2014, 1:53pm »

I have updated the results including kzb's resumed win over sharp but excluding SilverMitt's timeout win over sharp.  This leaves me even on my bets with browni because the top bot is setting a record, but the top bot isn't sharp.  Ziltoid leads sharp by 2-0 in completed pairs, and by 2382-2067 in performance rating.

Fritzlein
Forum Guru
*****



Arimaa player #706

   
Email

Gender: male
Posts: 5928
Re: 2014 Arimaa Challenge
« Reply #17 on: Mar 19th, 2014, 5:55pm »

Arimaa_master's win over sharp drops sharp's performance rating to a disappointing 2036, while ziltoid has kept on trucking to a stratospheric performance rating of 2455.  The small sample is obviously at work on both sides.  There have been no more decisive pairs completed, so ziltoid continues to lead 2-0.
« Last Edit: Mar 19th, 2014, 6:04pm by Fritzlein »

Fritzlein
Forum Guru
*****



Arimaa player #706

   
Email

Gender: male
Posts: 5928
Re: 2014 Arimaa Challenge
« Reply #18 on: Mar 20th, 2014, 10:42pm »

Since the last update, browni beat ziltoid, but ziltoid beat arimaa_master and sharp beat both aaaa and harvestsnow, so the bots collectively gained a bit of ground.  Ziltoid's lead stretches to 3-0 on the completed arimaa_master pair, but its lead in performance rating shrinks to 2403 vs. 2139.

Fritzlein
Forum Guru
*****



Arimaa player #706

   
Email

Gender: male
Posts: 5928
Re: 2014 Arimaa Challenge
« Reply #19 on: Mar 23rd, 2014, 10:40am »

on Mar 12th, 2014, 4:41pm, Ail wrote:

Theory #1:
People like winning. Winning was easier when the bots were easier to beat.
Thus fewer people felt like challenging the bots when they expected to be beaten.

Good theory, Ail.  It would be nice if we could at least match the 25 completed pairs that we had last year (currently we have 11 with eight days to go), but I'm afraid there will be a lot of "one and done" screening participants.  I'll bet people who lose their first screening game are much less likely to play a second than people who win their first.  What seems like a fun challenge can quickly turn into a chore without positive feedback.
 
Hat tip to arimaa_master for becoming the first player to complete all four screening games.  His final game, a victory over ziltoid, gives sharp its first point of the screening, narrowing ziltoid's lead to 3-1.  Ziltoid also leads in performance rating by 2347 to 2172, but there is plenty of time for that to change in the final week of screening!

browni3141
Forum Guru
*****



Arimaa player #7014

   


Gender: male
Posts: 384
Re: 2014 Arimaa Challenge
« Reply #20 on: Mar 23rd, 2014, 11:03am »

Has Omar considered a shorter time control, like 1m/move? I think a lot of people either can't or don't want to set aside such a large block of time.

Fritzlein
Forum Guru
*****



Arimaa player #706

   
Email

Gender: male
Posts: 5928
Re: 2014 Arimaa Challenge
« Reply #21 on: Mar 23rd, 2014, 3:04pm »

on Mar 23rd, 2014, 11:03am, browni3141 wrote:
Has Omar considered a shorter time control, like 1m/move? I think a lot of people either can't or don't want to set aside such a large block of time.

It was discussed in the past, but the argument that the Arimaa Challenge time controls are the ones that should govern the screening is the one that prevailed.  One concern is that bots may be better at different speeds, and we want the bot that is best at the Challenge speed.  But these days there is starting to be another issue: if you speed up the time control, then even fewer humans will be able to win.  Halving the time control probably adds 50 Elo or more to bot strength relative to humans, further demotivating people who get whacked and further shrinking the pool of folks who are likely to provide discrimination by beating one bot and losing to the other.
 
I do see the case for shorter time controls: more games equals more information.  In fact, I once proposed that we speed up the time controls temporarily, as long as humans are comfortably ahead, and only slow them down again when we are nearer to defeat.  That idea didn't fly because it creates the impression that we are willing to "move the goalposts", i.e. keep changing the rules of the Challenge so that we can be sure to keep winning.  For that reason alone, I expect any rule change will be a tough sell to Omar.  He would be happiest if we could get away with not making any more changes until the Challenge expires in 2020.
 
In the meantime, I hope we can inspire a few more people to take their best shot at winning a long, slow game.  Scoring even one win is an achievement to be proud of.  Do it now before our silicon overlords enslave us all!  :P

rbarreira
Forum Guru
*****



Arimaa player #1621

   


Gender: male
Posts: 605
Re: 2014 Arimaa Challenge
« Reply #22 on: Mar 23rd, 2014, 5:46pm »

I noticed that the precise moment the screening ends is not defined in the rules:
 
http://arimaa.com/arimaa/wc/2014/sch.html
 
http://arimaa.com/arimaa/challenge/2014/
 
It just says "March 31" without specifying a time or timezone for the games to start/end.
 
It might be worth it to clarify that before it becomes an issue.
« Last Edit: Mar 23rd, 2014, 5:46pm by rbarreira »
99of9
Forum Guru
*****




Gnobby's creator (player #314)

  toby_hudson  


Gender: male
Posts: 1413
Re: 2014 Arimaa Challenge
« Reply #23 on: Mar 23rd, 2014, 7:21pm »

on Mar 23rd, 2014, 3:04pm, Fritzlein wrote:
He would be happiest if we could get away with not making any more changes until the Challenge expires in 2020.

Me too, for the same reasons.
Ail
Forum Guru
*****




Rabbits can't push Rabbits!

   


Gender: male
Posts: 52
Re: 2014 Arimaa Challenge
« Reply #24 on: Mar 24th, 2014, 11:00am »

on Mar 23rd, 2014, 10:40am, Fritzlein wrote:

I'm afraid there will be a lot of "one and done" screening participants.  I'll bet people who lose their first screening game are much less likely to play a second than people who win their first.

I feel pretty much seen through now.

My record is something like 1:10 against 2012 Sharp on my phone, and that isn't even at its highest level: the bot uses about 5 seconds per move while I take as long as I feel like.
So I got smashed, as expected. And I really don't feel like getting smashed 3 more times.

I think that if I can't even put up a good fight against my phone, it's too unlikely that I can do so against better bots on better hardware.
browni3141
Forum Guru
*****



Arimaa player #7014

   


Gender: male
Posts: 384
Re: 2014 Arimaa Challenge
« Reply #25 on: Mar 24th, 2014, 12:13pm »

on Mar 23rd, 2014, 3:04pm, Fritzlein wrote:

Halving the time control probably adds 50 Elo or more to bot strength relative to humans, further demotivating people who get whacked and further shrinking the pool of folks who are likely to provide discrimination by beating one bot and losing to the other.

Wow, my own estimate was that a single doubling was worth about 150 points of strength relative to a bot getting the same time increase, at least for myself.
 
I agree with all the reasons why we shouldn't change the time control, but at the same time I think increasing participation is extremely important, especially as we are losing participation and accuracy in the screening in consecutive years.
How about having a reward for each pair completed? Then the problem is where the reward will come from...
Perhaps the reward can just be someone's time. Maybe some strong players can annotate all the games of completed pairs, and getting some free game help will be enough for more players to complete at least one pair.
 
Another suggestion, which can be implemented independently of the previous ones, is to allow players to complete more than two pairs. This would be a very minor rule change. I understand that Omar wouldn't want one player's performance being weighted too heavily, but I don't see how more games can hurt at this point. A cap of three or four pairs seems reasonable. I'm not sure how many people would want to do more anyway. Two is probably already plenty for most ;)
 
Also, I just remembered that I have a half typed response for this thread...
 
on Mar 24th, 2014, 11:00am, Ail wrote:

I feel pretty much seen through now.

My record is something like 1:10 against 2012 Sharp on my phone, and that isn't even at its highest level: the bot uses about 5 seconds per move while I take as long as I feel like.
So I got smashed, as expected. And I really don't feel like getting smashed 3 more times.

I think that if I can't even put up a good fight against my phone, it's too unlikely that I can do so against better bots on better hardware.

 
Although games between very close opponents should yield the most information, every pair completed is meaningful, Ail, so it would be really nice if you could play just one more game. If you play a second screening game, then I offer to annotate both of your games for you, and answer any questions you have about either game.
If you play another pair after that, I'll do the same for that pair also.

browni3141
Forum Guru
*****



Arimaa player #7014

   


Gender: male
Posts: 384
Re: 2014 Arimaa Challenge
« Reply #26 on: Mar 24th, 2014, 12:16pm »

on Mar 23rd, 2014, 5:46pm, rbarreira wrote:
I noticed that the precise moment the screening ends is not defined in the rules:
 
http://arimaa.com/arimaa/wc/2014/sch.html
 
http://arimaa.com/arimaa/challenge/2014/
 
It just says "March 31" without specifying a time or timezone for the games to start/end.
 
It might be worth it to clarify that before it becomes an issue.

It is on this page: http://arimaa.com/arimaa/challenge/2014/playBestBots.cgi
but it probably should be on those pages also.

Fritzlein
Forum Guru
*****



Arimaa player #706

   
Email

Gender: male
Posts: 5928
Re: 2014 Arimaa Challenge
« Reply #27 on: Mar 25th, 2014, 5:35pm »

on Mar 24th, 2014, 12:13pm, browni3141 wrote:
Wow, my own estimate was that a single doubling was worth about 150 points of strength relative to a bot getting the same time increase, at least for myself.

Hmmm...
With three doublings between CC and blitz, that would be a 450 point difference?  I admit that the CC bots are probably a bit overrated because the humans don't use their full time allotment, but the actual rating difference between a blitz and a CC bot of the same vintage on the server seems to be in the 150-200 point range on average, just from eyeballing it.  So my 50 points per doubling is probably a lower bound rather than an accurate guess, but not a ridiculously conservative lower bound.
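The arithmetic in this reply can be checked with a short sketch. It assumes, as the post states, exactly three doublings between Blitz and CC, and it takes the eyeballed 150-200 point server gap at face value; the function name `elo_per_doubling` is hypothetical and used only for illustration.

```python
def elo_per_doubling(total_gap: float, doublings: int) -> float:
    """Spread a total rating gap evenly over a number of time-control doublings."""
    if doublings <= 0:
        raise ValueError("doublings must be positive")
    return total_gap / doublings


DOUBLINGS_BLITZ_TO_CC = 3  # as stated in the post, not independently verified

# browni3141's estimate of 150 points per doubling implies this total gap.
implied_gap = 150 * DOUBLINGS_BLITZ_TO_CC

# The observed Blitz-to-CC gap was eyeballed at 150-200 points.
low = elo_per_doubling(150, DOUBLINGS_BLITZ_TO_CC)
high = elo_per_doubling(200, DOUBLINGS_BLITZ_TO_CC)

print(implied_gap)                    # total gap implied by 150 per doubling
print(round(low, 1), round(high, 1))  # per-doubling range implied by the observed gap
```

Under those assumptions the observed gap works out to somewhere between 50 and about 67 points per doubling, which is why 50 reads as a lower bound rather than a best guess.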

browni3141
Forum Guru
*****



Arimaa player #7014

   


Gender: male
Posts: 384
Re: 2014 Arimaa Challenge
« Reply #28 on: Mar 25th, 2014, 8:46pm »

on Mar 25th, 2014, 5:35pm, Fritzlein wrote:

Hmmm...
With three doublings between CC and blitz, that would be a 450 point difference?  I admit that the CC bots are probably a bit overrated because the humans don't use their full time allotment, but the actual rating difference between a blitz and a CC bot of the same vintage on the server seems to be in the 150-200 point range on average, just from eyeballing it.  So my 50 points per doubling is probably a lower bound rather than an accurate guess, but not a ridiculously conservative lower bound.

These win-rates seem reasonable:
Blitz: 50%
Fast: 70%
60s: 85%
CC: 93%
I would be interested in seeing more data, but for something like this, there are tons of variables which could affect the results.
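For comparison with the points-per-doubling figures discussed above, here is a rough sketch that converts the listed win rates into rating gaps. It assumes the standard logistic Elo expectation formula (which may not match the server's rating system exactly) and reads each win rate as one player's expected score against the same bot at that time control; `elo_diff_from_win_rate` is a hypothetical helper name.

```python
import math


def elo_diff_from_win_rate(p: float) -> float:
    """Rating gap implied by expected score p under the standard logistic Elo model."""
    if not 0.0 < p < 1.0:
        raise ValueError("win rate must be strictly between 0 and 1")
    return 400.0 * math.log10(p / (1.0 - p))


# Win rates as listed in the post, ordered from fastest to slowest time control.
win_rates = {"Blitz": 0.50, "Fast": 0.70, "60s": 0.85, "CC": 0.93}

gaps = {name: elo_diff_from_win_rate(p) for name, p in win_rates.items()}

for name, gap in gaps.items():
    print(f"{name}: {gap:.0f}")
```

Under this model the gaps come out near 0, 147, 301 and 449 points, so each step between adjacent time controls is roughly 150 points, in line with the 150-per-doubling estimate.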

Fritzlein
Forum Guru
*****



Arimaa player #706

   
Email

Gender: male
Posts: 5928
Re: 2014 Arimaa Challenge
« Reply #29 on: Mar 26th, 2014, 8:50pm »

Hat tip to Heyckie for becoming the second player to complete all four screening games, and to BlakeD, Braveheart, and BrendanM for individual wins.  The bots have slipped a bit to performance ratings of 2286 vs. 2112, and ziltoid's lead has opened back up to 4-1, so it is looking more likely that browni will win both of his bets with me.
