Arimaa Forum (http://arimaa.com/arimaa/forum/cgi/YaBB.cgi)
(Message started by: RonWeasley on Mar 12th, 2009, 4:46pm)

Title: 2009 Arimaa Challenge
Post by RonWeasley on Mar 12th, 2009, 4:46pm
The Arimaa Challenge begins soon with the finalists from the WCC vying for the right to face the best humans and try for a large monetary prize.

As this year's tournament director I am pleased to announce bot_clueless and bot_GnoBot are the challengers.  The best performer in the screening games will challenge the defenders in separate three game matches.

I am especially pleased and grateful to announce this year's defenders:  arimaa_master, chessandgo, and Fritzlein.  Playing a three game match at the long time control and with the pressure associated with defending the honor of humanity is a significant undertaking.  I thank each of these top players for taking on the task.

Please see the game room for links to the Challenge and enjoy rooting for your favorite bot or human.

Title: Re: 2009 Arimaa Challenge
Post by omar on Mar 13th, 2009, 7:23am
Thanks for the announcement, Ron, and thanks also for agreeing to be the TD for this year's events.

The screening period will begin Mar 14th and continue up to Mar 28th. All human players are encouraged to play the best bots of the computer championship during this time. These will be some tough games. Have fun :-)

http://arimaa.com/arimaa/challenge/2009/playBestBots.cgi

Also I would like to thank the challenge defenders for accepting the responsibility of defending the Arimaa challenge. I have full trust and confidence that you will do your level best as representatives for humanity. In the next few weeks the future of Arimaa will lie in your hands. I wish you the best of luck.


Title: Re: 2009 Arimaa Challenge
Post by Janzert on Mar 13th, 2009, 9:05am
Are there any backup defenders this year?

Janzert

Title: Re: 2009 Arimaa Challenge
Post by Tuks on Mar 13th, 2009, 9:38am
well, im glad we aren't going to lose any games this year :) with the best humans defending
;D

good luck to Gnobot and Clueless

Title: Re: 2009 Arimaa Challenge
Post by Fritzlein on Mar 13th, 2009, 2:42pm

on 03/13/09 at 09:05:42, Janzert wrote:
Are there any backup defenders this year?

Omar is the backup again this year.

Title: Re: 2009 Arimaa Challenge
Post by aaaa on Mar 14th, 2009, 2:15am
Can anyone establish what happened in my preliminary game against Clueless (http://arimaa.com/arimaa/gameroom/comments.cgi?gid=99880)? If Clueless itself is not to blame for the timeout, then I think it deserves the point. So consider this my virtual resignation of any hypothetical resumption of the game from the last position.

Title: Re: 2009 Arimaa Challenge
Post by Janzert on Mar 14th, 2009, 3:07am
From my quick glance at the logs it seems like Clueless thought for a second or two longer than the time it actually had left. Clueless' own log shows 122.95 seconds and the network log shows 124 seconds; the reserve was only 2 seconds at the start of the move.

Janzert

Title: Re: 2009 Arimaa Challenge
Post by Janzert on Mar 14th, 2009, 3:12am
I also just noticed the preliminary games are running with an unlimited reserve time control (2/2/100/0/8) instead of the 10 minute limit (2/2/100/10/8).

Janzert
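
For readers unfamiliar with the notation, the two time-control strings above can be compared with a small Python sketch. The field order assumed here (minutes per move, starting reserve in minutes, percent of unused move time carried into reserve, reserve cap in minutes with 0 meaning no cap, game limit in hours) is an interpretation consistent with the values quoted in this thread, not a statement of how the game server parses them. The names `TimeControl`, `parse_time_control`, and `next_reserve` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TimeControl:
    move_s: int                    # seconds granted per move
    reserve_s: int                 # starting reserve in seconds
    percent: int                   # percent of unused move time added to reserve
    reserve_cap_s: Optional[int]   # None means the reserve is unlimited
    game_limit_s: int              # overall game limit in seconds


def parse_time_control(text: str) -> TimeControl:
    """Parse a string such as '2/2/100/10/8' (assumed field order, see above)."""
    move_min, reserve_min, percent, cap_min, game_hours = (int(p) for p in text.split("/"))
    cap = None if cap_min == 0 else cap_min * 60
    return TimeControl(move_min * 60, reserve_min * 60, percent, cap, game_hours * 3600)


def next_reserve(tc: TimeControl, reserve_s: int, used_s: int) -> int:
    """Reserve after a move that took used_s seconds; a negative result means a timeout."""
    unused = tc.move_s - used_s
    if unused >= 0:
        new_reserve = reserve_s + unused * tc.percent // 100
    else:
        new_reserve = reserve_s + unused  # the overrun is paid out of the reserve
    if tc.reserve_cap_s is not None:
        new_reserve = min(new_reserve, tc.reserve_cap_s)
    return new_reserve
```

Under these assumptions, a 2 second reserve and a 124 second move on a 120 second clock leave a negative reserve, which matches the timeout described in the previous post.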

Title: Re: 2009 Arimaa Challenge
Post by Fritzlein on Mar 14th, 2009, 6:24am
To quote jdb from chat:


Quote:
there is no limit on both the move time and the reserve allowed
the time control logic will not handle this
it looks like clueless will always use a little reserve each move and eventually run out of time

Since the  2m/2m/100/0/8h time control has directly compromised at least one game outcome, and since all games were played under the incorrect time control, it seems to me that the fairest solution is to invalidate all games played so far and restart the screening period from scratch.

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 14th, 2009, 6:34am
I consent to a restart, it seems like the obvious thing to do.

Title: Re: 2009 Arimaa Challenge
Post by omar on Mar 14th, 2009, 7:22am
I made a mistake when setting up the time controls for the screening period games. The reserve limit should have been capped at 10 minutes but was set to unlimited. I have fixed it now. I also shifted the time frame for the screening period to start at Mar 14 16:00:00 2009 GMT and run until Mar 28 16:00:00 2009 GMT. So any games that were played will not be counted. Sorry about the goofup.

I didn't think this required approval from the TD since it wasn't a situation where there seemed to be any choice about how to correct it. Besides I've been bothering the TD too much lately :-)

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 14th, 2009, 5:27pm

on 03/14/09 at 07:22:20, omar wrote:
I didn't think this required approval from the TD since it wasn't a situation where there seemed to be any choice about how to correct it.

If the time control cap had not significantly affected either bot, nor the humans, I would have argued for the games to be counted.  But in this case there was the additional point that Clueless played dramatically differently, so all parties agreed to the restart.

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 14th, 2009, 5:29pm
Omar, please don't forget to run the postanalysis.sh script every day or so.  I notice that it hasn't run since Thursday (the last CC game).

Title: Re: 2009 Arimaa Challenge
Post by aaaa on Mar 15th, 2009, 12:15am
Clueless timed out on me again!

Title: Re: 2009 Arimaa Challenge
Post by omar on Mar 15th, 2009, 12:32pm

on 03/14/09 at 17:29:48, 99of9 wrote:
Omar, please don't forget to run the postanalysis.sh script every day or so.  I notice that it hasn't run since Thursday (the last CC game).

I ran it today around 12pm MST. I'll try to run it about this time every day. If you notice I didn't run it by 12:30 pm, ping me.

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 15th, 2009, 12:52pm

on 03/15/09 at 12:32:35, omar wrote:
I ran it today around 12pm MST. I'll try to run it about this time every day. If you notice I didn't run it by 12:30 pm, ping me.

Thanks omar.  You may not be able to run it at the same time every day because one of the bots may be playing.  This time it took 6 minutes - enough to completely mess up a bot if it were running.

Title: Re: 2009 Arimaa Challenge
Post by omar on Mar 15th, 2009, 2:35pm
That's right. I have to make sure nothing is running before running the script. Also I should disable the bots from starting while the script is running.

Title: Re: 2009 Arimaa Challenge
Post by omar on Mar 15th, 2009, 2:44pm
Jdb, can you look at the log files for aaaa vs clueless and see if clueless really crashed? The netLog file just shows that it left the game while it was still thinking.

Title: Re: 2009 Arimaa Challenge
Post by jdb on Mar 15th, 2009, 7:49pm

on 03/15/09 at 14:44:20, omar wrote:
Jdb, can you look at the log files for aaaa vs clueless and see if clueless really crashed? The netLog file just shows that it left the game while it was still thinking.


I took a look at the game.log file for clueless. I have no idea what happened. If clueless crashes, it is supposed to print out a stacktrace in the log. It didn't do this, but it still most likely crashed.

As far as I can tell the game should count.

Title: Re: 2009 Arimaa Challenge
Post by ChrisB on Mar 16th, 2009, 9:12pm

on 03/15/09 at 00:15:18, aaaa wrote:
Clueless timed out on me again!


Clueless timed out on me also... and it was UP a horse!

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 16th, 2009, 9:50pm

on 03/16/09 at 21:12:53, ChrisB wrote:
Clueless timed out on me also... and it was UP a horse!


This time it appears that clueless had 10s in reserve, together with a move time of 120s, giving it a total of 130s.

The game log says Total Elapsed Time: 129409, which presumably means 129.409 seconds... which seems to me to be cutting things too close, so was probably a lag timeout.  For comparison, gnobot is told to immediately cut the search when it has 15s left... and the actual gameroom regularly records 13s reserve.
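
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the margin reasoning, using the figures quoted in this post; `slack_s` and `search_budget_s` are hypothetical helper names and do not come from either bot.

```python
def slack_s(move_s: float, reserve_s: float, elapsed_s: float) -> float:
    """Seconds left on the clock when the bot stopped thinking (negative = overrun)."""
    return move_s + reserve_s - elapsed_s


def search_budget_s(move_s: float, reserve_s: float, safety_margin_s: float) -> float:
    """Longest search allowed if the bot must stop with safety_margin_s still on the clock."""
    return max(0.0, move_s + reserve_s - safety_margin_s)


# Figures quoted above: 120 s move time, 10 s reserve, 129.409 s elapsed.
clueless_slack = slack_s(120, 10, 129.409)   # roughly 0.6 s to cover all network lag

# A 15 s margin, as described for gnobot, would stop the search after 115 s.
budget_with_margin = search_budget_s(120, 10, 15)
```

With well under a second of slack, any delay between the bot and the game server is enough to lose on time, whereas a 15 second margin that regularly shows up as 13 seconds of recorded reserve suggests about 2 seconds of overhead per move.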

Title: Re: 2009 Arimaa Challenge
Post by aaaa on Mar 17th, 2009, 12:43pm
I ask that games 100123 (http://arimaa.com/arimaa/gameroom/comments.cgi?gid=100123), 100128 (http://arimaa.com/arimaa/gameroom/comments.cgi?gid=100128), 100135 (http://arimaa.com/arimaa/gameroom/comments.cgi?gid=100135) and 100141 (http://arimaa.com/arimaa/gameroom/comments.cgi?gid=100141) are disregarded from consideration on account of the fact that the bots in those games were sharing the resources of one server for a period of time.

Title: Re: 2009 Arimaa Challenge
Post by Fritzlein on Mar 17th, 2009, 1:02pm
Ouch, there's an extra explanation of why humans are doing so well in the screening.  :(  Thanks for looking up the game times, aaaa.  I'm glad there have only been two overlaps, so that only four of the eighteen games so far need to be thrown out.

Title: Re: 2009 Arimaa Challenge
Post by Arimabuff on Mar 17th, 2009, 2:12pm

on 03/17/09 at 13:02:44, Fritzlein wrote:
Ouch, there's an extra explanation of why humans are doing so well in the screening.  :(  Thanks for looking up the game times, aaaa.  I'm glad there have only been two overlaps, so that only four of the eighteen games so far need to be thrown out.

I think you're jumping the gun here; ChrisB is the only one who's likely to lose against the bot(s). aaaa, mdk and camelback have proven that they are strong enough to beat the bots as many times as necessary. So your pessimism is a little premature.

Title: Re: 2009 Arimaa Challenge
Post by Arimabuff on Mar 17th, 2009, 2:17pm
I hope that Omar will see to it that there are no more overlaps; otherwise these screenings are likely to be of little effect, as no one likely to lose against one bot and win against the other will have a chance to play.

Title: Re: 2009 Arimaa Challenge
Post by omar on Mar 17th, 2009, 3:08pm
I do have checks in the bot starting script to check the servers and pick the one which is not busy. I am not sure why it didn't work. I added some logging and will be testing this out in a bit.

I'll wait for the TD's decision before I mark the games as unrated.

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 17th, 2009, 3:23pm

on 03/17/09 at 15:08:31, omar wrote:
I do have checks in the bot starting script to check the servers and pick the one which is not busy. I am not sure why it didn't work. I added some logging and will be testing this out in a bit.

How do you check if it's busy?  50% of the time it's the human opponent's move, so it would look quiet unless you had an explicit lock file to show when a game started and ended.

Title: Re: 2009 Arimaa Challenge
Post by Fritzlein on Mar 17th, 2009, 3:46pm

on 03/17/09 at 15:23:34, 99of9 wrote:
How do you check if it's busy?  50% of the time it's the human opponent's move, so it would look quiet unless you had an explicit lock file to show when a game started and ended.

An inefficient but reliable system is to stick with the notion that the bots only play Gold on gold.arimaa.com and only play Silver on silver.arimaa.com.  Then the script could simply check for games in progress involving the two bots.  The downside is that if some human is "due" to play a color, and that color server is occupied, he can't play even if the other server is free.  But at least it has the convenience that the "lock" is already created.

Title: Re: 2009 Arimaa Challenge
Post by ChrisB on Mar 17th, 2009, 5:23pm

on 03/17/09 at 12:43:51, aaaa wrote:
I ask that games 100123 (http://arimaa.com/arimaa/gameroom/comments.cgi?gid=100123), 100128 (http://arimaa.com/arimaa/gameroom/comments.cgi?gid=100128), 100135 (http://arimaa.com/arimaa/gameroom/comments.cgi?gid=100135) and 100141 (http://arimaa.com/arimaa/gameroom/comments.cgi?gid=100141) are disregarded from consideration on account of the fact that the bots in those games were sharing the resources of one server for a period of time.


I agree, especially in the case of my game, 100123 (http://arimaa.com/arimaa/gameroom/comments.cgi?gid=100123), where Clueless timed out while ahead.  Marking these games as unrated seems appropriate also

Title: Re: 2009 Arimaa Challenge
Post by omar on Mar 17th, 2009, 6:33pm
OK, I was able to track down this problem and fix it. Hopefully it should not happen again. The output of the 'ps' command was formatted a little differently from what my script was expecting. When a bot is about to be started, the startup script runs a 'ps' command on the two servers and checks whether the 'bot' script is running on them. The bot interface script runs throughout the game even when the bot engine is not running, so it is a pretty good way to tell if a bot is running on the server. If both servers are free then the startup script runs the bot on the server that matches the side the bot will be playing; otherwise it starts it on the only available server. If both servers are busy it gives an appropriate error message.

Thanks for spotting this aaaa and jdb.
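
The selection logic described in this post can be sketched in Python as follows. This is not Omar's script: the function names are hypothetical, the `ps` output is passed in as plain text rather than collected from the servers, and the matching rule (any whitespace-separated token whose basename equals the script name) is one assumed way of avoiding a dependence on exact column layout.

```python
import posixpath

SERVERS = {"gold": "gold.arimaa.com", "silver": "silver.arimaa.com"}


def bot_script_running(ps_output: str, script_name: str = "bot") -> bool:
    """Return True if any token on any line of ps output names the bot script.

    Splitting on whitespace tolerates column changes between ps versions.
    Caveat: it would also match a user or argument literally called 'bot'.
    """
    for line in ps_output.splitlines():
        for token in line.split():
            if posixpath.basename(token) == script_name:
                return True
    return False


def choose_server(side: str, gold_busy: bool, silver_busy: bool) -> str:
    """Pick a server for a bot playing the given side ('gold' or 'silver')."""
    busy = {"gold": gold_busy, "silver": silver_busy}
    free = [name for name in ("gold", "silver") if not busy[name]]
    if not free:
        raise RuntimeError("both bot servers are busy")
    if side in free:
        return SERVERS[side]      # preferred: the server matching the bot's side
    return SERVERS[free[0]]       # otherwise the only server still available
```

A check like this is only as reliable as the text it parses, which is why a change in `ps` formatting could make a busy server look free.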

Title: Re: 2009 Arimaa Challenge
Post by RonWeasley on Mar 18th, 2009, 4:39am

on 03/17/09 at 15:08:31, omar wrote:
I do have checks in the bot starting script to check the servers and pick the one which is not busy. I am not sure why it didn't work. I added some logging and will be testing this out in a bit.

I'll wait for the TD's decision before I mark the games as unrated.

Clearly a server side error.  Unrate and discount those games.

Title: Re: 2009 Arimaa Challenge
Post by Fritzlein on Mar 18th, 2009, 5:09am

on 03/17/09 at 18:33:05, omar wrote:
The output of the 'ps' command was formatted a little differently from what my script was expecting.

Ah, that explains why it worked last year and not this year.  The logic was/is correct, but the darn 'ps' command changed.  I guess that sort of thing comes with the territory when you upgrade the server every year.

Title: Re: 2009 Arimaa Challenge
Post by omar on Mar 18th, 2009, 9:33am

on 03/18/09 at 04:39:26, RonWeasley wrote:
Clearly a server side error.  Unrate and discount those games.


Thanks for the confirmation Ned.

I've unrated those games and thus they will not count towards computing the bot scores.

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 19th, 2009, 3:23am

 POST at Thu Mar 19 06:35:09 2009
   maxwait = 300
   sid = 811417636
   what = gamestate
   lastchange = 28
   wait = 1
 RESPONSE at Thu Mar 19 06:42:32 2009

My reading of this log is that a network delay caused clueless' latest timeout against aaaa.

By the way, since it was not a problem with the server the bot was on, continuing the game from the final position is a possibility.  aaaa, am I right from our conversation about the very first timeout you had, that you view a continuation as the best solution, and a restart as unfair?  I just want to test whether your opinion is consistent in cases where you are both winning and losing when the clock breaks.

Title: Re: 2009 Arimaa Challenge
Post by omar on Mar 19th, 2009, 6:05pm
I don't know what to make of this:

http://gold.arimaa.com/~clueless/logs/14494.netLog

The log file shows that at 06:42:32 clueless received move 12w and had 270 seconds in reserve plus 120 seconds for the move. Clueless took 160 seconds and sent its move at 06:44:08 and immediately the gameserver replied that the game had timed out. The game should not have timed out. Clueless should have still had 230 seconds in reserve. Very strange, but clearly not the fault of clueless.

Ned can you let us know how we should proceed on this. Thanks.

Title: Re: 2009 Arimaa Challenge
Post by omar on Mar 19th, 2009, 6:34pm
In the 722caasi vs bot_clueless game I don't see any problem on the server side.

http://gold.arimaa.com/~clueless/logs/11160.netLog

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 19th, 2009, 6:43pm

Quote:
Very strange, but clearly not the fault of clueless.

I agree it's not clueless' fault, but I don't think it's very strange.


on 03/19/09 at 18:05:52, omar wrote:
The log file shows that at 06:42:32 clueless received move 12w and had 270 seconds in reserve plus 120 seconds for the move.

This message got to clueless at 06:42:32, but it was sent earlier (as soon as aaaa moved... most likely around 06:37).  This seems like a typical network problem to me.

Title: Re: 2009 Arimaa Challenge
Post by Janzert on Mar 19th, 2009, 7:20pm
Yes, it looks like the connection was broken while the bot interface was waiting for the move initially.

I believe the critical two lines are this:


Code:
 RESPONSE at Thu Mar 19 06:42:32 2009
=== http://arimaa.com/arimaa/java/ys/ms4/v5//bot1gs.cgi ====


This shows that the interface did not receive any response from the server while waiting the first time. It then issued another request and immediately gets the response back showing that it is clueless' turn.

In that response:


Code:
bused = 300


shows that 5 minutes have already passed since the beginning of the turn.

Janzert

P.S. One irony in all this is that I've tried to make OpFor fairly robust to these sorts of problems, but now I wonder if this isn't a mistake as it would let the server error pass unnoticed while OpFor would be shortchanged on its thinking time.
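
The time accounting behind this reading of the log can be sketched in Python. The figures are the ones quoted in the thread (120 seconds per move, 270 seconds of reserve, `bused = 300`), and `bused` is taken, as described above, to mean seconds already used in the turn when the response arrived. The function names and the 5 second threshold are hypothetical.

```python
def remaining_on_delivery(move_s: int, reserve_s: int, bused_s: int) -> int:
    """Seconds the bot really has left when the opponent's move finally reaches it."""
    return move_s + reserve_s - bused_s


def late_delivery(bused_s: int, threshold_s: int = 5) -> bool:
    """True if the move arrived suspiciously late and the event is worth logging."""
    return bused_s > threshold_s


# 120 + 270 = 390 s were nominally available, but 300 s had already elapsed.
left = remaining_on_delivery(120, 270, 300)   # 90 s actually left

if late_delivery(300):
    # A bot interface could record this so that the lost time does not go unnoticed.
    print(f"warning: {300} s of the turn elapsed before the move was received")
```

On these numbers the bot had 90 seconds rather than 390, so a search sized for the nominal clock would overrun even though the bot itself behaved correctly.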

Title: Re: 2009 Arimaa Challenge
Post by RonWeasley on Mar 20th, 2009, 3:59am

on 03/19/09 at 18:05:52, omar wrote:
I don't know what to make of this:

http://gold.arimaa.com/~clueless/logs/14494.netLog

The log file shows that at 06:42:32 clueless received move 12w and had 270 seconds in reserve plus 120 seconds for the move. Clueless took 160 seconds and sent its move at 06:44:08 and immediately the gameserver replied that the game had timed out. The game should not have timed out. Clueless should have still had 230 seconds in reserve. Very strange, but clearly not the fault of clueless.

Ned can you let us know how we should proceed on this. Thanks.


There's plenty of evidence that the server was not operating correctly for this game, so it should not be counted.  Added thanks to the people collecting data about this.

It's distressing that the server is acting up on clueless's games at this point.  Why now?  An easy TD directive would be to say, "Fix the server."  Omar, do you think this is fixable?

Title: Re: 2009 Arimaa Challenge
Post by RonWeasley on Mar 20th, 2009, 4:09am

on 03/19/09 at 18:43:20, 99of9 wrote:
This seems like a typical network problem to me.


If we believe this, we just keep playing and hope it doesn't happen too often.  Whenever the client can prove it has fulfilled all its obligations, it's the server's/network's fault and because bot games can't be continued (right Omar?), such a game must be discarded and replayed.

In our current situation, if Omar has no obvious action on the server, the only alternative is to continue playing.

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 20th, 2009, 4:24am

on 03/20/09 at 04:09:52, RonWeasley wrote:
because bot games can't be continued (right Omar?)

One of the Clueless vs Gnobot games was continued.

Title: Re: 2009 Arimaa Challenge
Post by RonWeasley on Mar 20th, 2009, 8:33am

on 03/20/09 at 04:24:56, 99of9 wrote:
One of the Clueless vs Gnobot games was continued.

That's a good point.  It's not obvious that a bot v human game can be continued.  If it can, I would prefer that a network interrupted game be continued at the position of interruption, with clocks set at the start of the move to allow the mover to start the search tree again as if no interruption occurred, mainly out of consideration for the bot.  There may be a scheduling issue with the human, in which case the game would be discarded if the human can't finish it.

So the ruling depends on what is technically feasible.  Fix the server side if possible.  Continue interrupted games if possible.  Discard interrupted games otherwise.

Title: Re: 2009 Arimaa Challenge
Post by omar on Mar 20th, 2009, 10:40am

on 03/19/09 at 18:43:20, 99of9 wrote:
I agree it's not clueless' fault, but I don't think it's very strange.

This message got to clueless at 06:42:32, but it was sent earlier (as soon as aaaa moved... most likely around 06:37).  This seems like a typical network problem to me.


You're right, the move was sent at 37:32 and got to clueless at 42:32. That explains it.

Title: Re: 2009 Arimaa Challenge
Post by omar on Mar 20th, 2009, 10:46am

on 03/19/09 at 19:20:05, Janzert wrote:
P.S. One irony in all this is that I've tried to make OpFor fairly robust to these sorts of problems, but now I wonder if this isn't a mistake as it would let the server error pass unnoticed while OpFor would be shortchanged on its thinking time.


I would suggest having OpFor log such situations so they do not go unnoticed.

Title: Re: 2009 Arimaa Challenge
Post by omar on Mar 20th, 2009, 10:49am

on 03/20/09 at 03:59:25, RonWeasley wrote:
It's distressing that the server is acting up on clueless's games at this point.  Why now?  An easy TD directive would be to say, "Fix the server."  Omar, do you think this is fixable?

I'll try but problems that occur intermittently are very hard to trace down.

Title: Re: 2009 Arimaa Challenge
Post by omar on Mar 20th, 2009, 11:09am

on 03/20/09 at 08:33:59, RonWeasley wrote:
So the ruling depends on what is technically feasible.  Fix the server side if possible.  Continue interrupted games if possible.  Discard interrupted games otherwise.


I think I can restore this game and continue it from the last position, but will have to coordinate the time with aaaa.

Title: Re: 2009 Arimaa Challenge
Post by aaaa on Mar 20th, 2009, 12:14pm
If there is no objection, I'd rather play the game afresh.

Title: Re: 2009 Arimaa Challenge
Post by Fritzlein on Mar 20th, 2009, 1:43pm

on 03/20/09 at 12:14:37, aaaa wrote:
If there is no objection, I'd rather play the game afresh.

I don't think we should force someone to resume a game from such a situation, but if we're not going to force a resumption, then we probably shouldn't allow one either.  It's the "two chances" theory that has already occurred in events this year.  If the human player is allowed to choose to resume or not, then the technical problem disadvantages the bot.  If we always disallow the game regardless of board position, then technical trouble is just a random event that is as likely to help the bot as to hurt the bot.

My suggestion isn't parallel to what happens when the human has technical difficulties.  In those cases we have been saying that the time loss for the human always stands.  But the difference seems to be that for the bot we can verify exactly what happened on the server, whereas with a remote human player we have no way to prove that the timeout was unintentional.

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 20th, 2009, 2:28pm

on 03/20/09 at 13:43:15, Fritzlein wrote:
If the human player is allowed to choose to resume or not, then the technical problem disadvantages the bot.

In this case it advantages the bot, because aaaa prefers to continue (or even immediately resign) when he is losing, but restart when he is winning.  Either advantage or disadvantage is equally problematic.

In principle I prefer continuations when it is an interruption to a functional game, but I don't know what to do if humans refuse (except to show them that they are distorting the system).  If that becomes a regular occurrence then I agree with Fritz that for consistency we would have to settle for always replaying.

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 20th, 2009, 2:49pm

on 03/20/09 at 13:43:15, Fritzlein wrote:
My suggestion isn't parallel to what happens when the human has technical difficulties.  In those cases we have been saying that the time loss for the human always stands.  But the difference seems to be that for the bot we can verify exactly what happened on the server, whereas with a remote human player we have no way to prove that the timeout was unintentional.

I agree that BvB timeouts and HvH timeouts are different, and should be treated differently.  I suppose that makes HvB a little unbalanced in favour of the bots if we give them a second chance upon network timeout, but never give the humans a second chance unless it's a server error.  But I don't think I mind this inconsistency too much, as it adds an incentive to omar to make sure the bot server has stable network connections.

Title: Re: 2009 Arimaa Challenge
Post by Janzert on Mar 20th, 2009, 4:00pm
I think it makes sense to discount games when the tournament organizer has complete control of the hardware involved as is the case here. If (or I hope, when) there are bot tournaments that allow bot author controlled hardware then I think the policy should be the same as it is in current HvH tournaments. In other words I don't think the distinction here is or should be HvH, HvB or BvB but rather who has control of the relevant hardware and if the system(s) under control of the tournament organizer can be shown to have been a cause of the problem.

Janzert

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 20th, 2009, 4:59pm
Good point Janzert.  When BvB is from the developer's hardware, it makes sense to treat it exactly the same as we currently do HvH.

Title: Re: 2009 Arimaa Challenge
Post by omar on Mar 20th, 2009, 8:14pm

on 03/20/09 at 12:14:37, aaaa wrote:
If there is no objection, I'd rather play the game afresh.

I'll wait to see what the TD decides.

Title: Re: 2009 Arimaa Challenge
Post by RonWeasley on Mar 21st, 2009, 10:08am

on 03/20/09 at 20:14:18, omar wrote:
I'll wait to see what the TD decides.

I decide to continue this game from the position at which the network error occurred.

It's getting hard for me to keep track of all the situations here.  Our first problem was playing at the wrong time control, so the problem persisted through the entire game.  Therefore those games were invalidated.  Next bots were playing on a server that had other processes going simultaneously.  In those cases we didn't report on the fraction of times the overlap occurred.  I assumed that information was not available so the only alternative was to discard all of those games.  Now there is a specific network error that happened at a detectable time in the game.  Everything was within specifications before this so the game sequence counts during this time.  The game should continue from this point if technically feasible.

In general, while the server/network side is working correctly, moves count.  Player preference is not an issue here.  I think this principle has been applied consistently this year.

TD

Title: Re: 2009 Arimaa Challenge
Post by omar on Mar 21st, 2009, 10:32am
Thanks for the decision Ron. Yes, this year has been plagued with all kinds of problems, but I think we've navigated through all of them pretty safely thanks to your guidance.

Aaaa, please let me know what time you would like to play the game and I will get it setup at that time. I will also leave you my phone number via a private message so if it seems I didn't see your message in time, you can call me.

Title: Re: 2009 Arimaa Challenge
Post by aaaa on Mar 21st, 2009, 11:43am
Apologies in advance for what can be perceived as a power play, but, although I know I will be stepping on the toes of the tournament director with this, I cannot in good conscience resume this game if the understanding is that this is supposed to be a fair assessment of Clueless's playing strength by matching it against mine.

In the time that has passed now, I have been in the position to continue to ponder the game in its last position, while Clueless obviously hasn't. I have taken the liberty to try to unrate the game, only for it to be denied as the result of an unfavorable adjudication by Bomb2005P2. Since I wasn't entirely sure whether the returned score was with respect to the timed-out player or the person doing the unrating (I may theoretically have missed some kind of killer tactic), I felt I needed to confirm my advantage in order to be in a better position to argue for a replay, so I wouldn't be accused of being self-serving. So, I let my own bot loose on the position as well.
Given the taint of these analyses, if I were to continue this game now, I would technically be cheating, to say nothing of the time advantage.

If there is to be no replay, then I will simply end my participation in the preliminary now.

Title: Re: 2009 Arimaa Challenge
Post by omar on Mar 21st, 2009, 3:22pm
That is a good point I had not even considered before. Since you know from the logs what move clueless was going to make, and since clueless is deterministic and will likely make the same move, you do have an advantage if the game is continued from the current position. I think this is a good reason for the TD to reconsider your appeal. From a technical perspective it is easier for me to unrate the current game and let you play a new one than it is to restore the timed out game, but I will follow the TD's decision and will not unrate the game unless instructed to do so.


Title: Re: 2009 Arimaa Challenge
Post by RonWeasley on Mar 21st, 2009, 3:28pm
No hard feelings here, aaaa, since you make reasonable points.  The game and its players all get better when we discuss these things.  I recognize that no ruling in this case works perfectly and there are merits and perils to different alternatives.  Nowhere in the rules does it say players can't disagree.

I encourage this debate to continue and its resolution be expressed in next year's rules.

Title: Re: 2009 Arimaa Challenge
Post by RonWeasley on Mar 21st, 2009, 3:36pm

on 03/21/09 at 15:22:31, omar wrote:
That is a good point I had not even considered before. Since you know from the logs what move clueless was going to make, and since clueless is deterministic and will likely make the same move, you do have an advantage if the game is continued from the current position. I think this is a good reason for the TD to reconsider your appeal. From a technical perspective it is easier for me to unrate the current game and let you play a new one than it is to restore the timed out game, but I will follow the TD's decision and will not unrate the game unless instructed to do so.

I'm still ruling that the game be continued.  This provides consistency to this year's tournaments and I have already ruled continuation previously.  While it's reasonable to change this policy in subsequent tournaments, I'm going to try to stay consistent for this year.  Again, the basis for the ruling is that legal moves do not get taken back unless there is a technical issue that can't be overcome.

Title: Re: 2009 Arimaa Challenge
Post by Fritzlein on Mar 21st, 2009, 3:37pm
Ron, you still haven't said what happens if aaaa doesn't resume the game from the timed out position.  I assume it doesn't count, and neither does aaaa's game with the same color against Gnobot, but it would be nice to hear that explicitly if that is your ruling.

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 22nd, 2009, 2:12am
More timeouts to sort out.

http://arimaa.com/arimaa/gameroom/comments.cgi?gid=100645

http://arimaa.com/arimaa/gameroom/comments.cgi?gid=100646

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 22nd, 2009, 3:35am
In Gnobot vs ChrisB, on the final move gnobot's own log cuts off sometime after 69.3s.  I can't tell how it died, but it looks similar to how clueless lost one of those times to aaaa (the one that jdb accepted as a loss).

One oddity is that both gnobot and clueless timed out within minutes of each other, but this could well be a coincidence, especially since clueless' logs look different.  They were both playing silver, but I think clueless was playing from gold.arimaa.com.

Title: Re: 2009 Arimaa Challenge
Post by Fritzlein on Mar 22nd, 2009, 6:36am
This is obviously a concern, not only for determining which bot qualifies for the Challenge, but also for the Challenge games themselves.  What do we do when a bot times out in the middle of a Challenge game?

Title: Re: 2009 Arimaa Challenge
Post by Arimabuff on Mar 22nd, 2009, 11:38am

on 03/22/09 at 06:36:04, Fritzlein wrote:
This is obviously a concern, not only for determining which bot qualifies for the Challenge, but also for the Challenge games themselves.  What do we do when a bot times out in the middle of a Challenge game?

That would look ugly; after all the challenge games are a showcase for Arimaa to the people who've never played it... I for one took a look at the challenge games before I even started to learn the rules. I wanted to see what a top level Arimaa game looked like...

Title: Re: 2009 Arimaa Challenge
Post by Fritzlein on Mar 22nd, 2009, 2:29pm

on 03/22/09 at 11:38:51, Arimabuff wrote:
I for one took a look at the challenge games before I even started to learn the rules. I wanted to see what a top level Arimaa game looked like...

I also looked at the 2004 Challenge games before I ever played on the server.  The Challenge is an obvious place for students of the game to start.

Title: Re: 2009 Arimaa Challenge
Post by RonWeasley on Mar 23rd, 2009, 4:31am

on 03/21/09 at 15:37:40, Fritzlein wrote:
Ron, you still haven't said what happens if aaaa doesn't resume the game from the timed out position.  I assume it doesn't count, and neither does aaaa's game with the same color against Gnobot, but it would be nice to hear that explicitly if that is your ruling.

Right.  aaaa's games would not count if the completed game is not played.

Title: Re: 2009 Arimaa Challenge
Post by RonWeasley on Mar 23rd, 2009, 4:35am

on 03/22/09 at 06:36:04, Fritzlein wrote:
This is obviously a concern, not only for determining which bot qualifies for the Challenge, but also for the Challenge games themselves.  What do we do when a bot times out in the middle of a Challenge game?

If this happens in a Challenge game and the server/network is at fault, the game must be continued from the position of the fault.

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 23rd, 2009, 6:19am
Joe tried to play a bot today, and got "bots not available".  Do you know what this is about omar?

Title: Re: 2009 Arimaa Challenge
Post by Arimabuff on Mar 23rd, 2009, 6:59am

on 03/23/09 at 06:19:40, 99of9 wrote:
Joe tried to play a bot today, and got "bots not available".  Do you know what this is about omar?

Omar took the bots off before the C&G/Fritz game; he must have forgotten to put them back on.

Title: Re: 2009 Arimaa Challenge
Post by Fritzlein on Mar 23rd, 2009, 7:31am

on 03/22/09 at 02:12:47, 99of9 wrote:
More timeouts to sort out.

http://arimaa.com/arimaa/gameroom/comments.cgi?gid=100645

http://arimaa.com/arimaa/gameroom/comments.cgi?gid=100646

Ron, have you ruled on these games yet?  Must they also be played out from the point of the timeout?

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 23rd, 2009, 1:04pm

on 03/23/09 at 07:31:25, Fritzlein wrote:
Ron, have you ruled on these games yet?  Must they also be played out from the point of the timeout?

I can't prove that the gnobot one is the server's fault, so I think it has to stand.  Jeff accepted one that looked the same as this earlier against aaaa.

Title: Re: 2009 Arimaa Challenge
Post by Janzert on Mar 23rd, 2009, 2:02pm
In game 100646 woh vs. clueless, the bot interface leaves the game while waiting for move 9g from woh. I really don't know what could cause this other than the interface script receiving a HUP or INT signal. Unfortunately I believe the script only reports the receipt of such a signal to stdio and not to any written log, so I can't confirm this. I seem to recall that possibly the manager script Omar uses does log everything received from the interface script so he may be able to check it there.
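
A minimal sketch of how an interface script could leave a trace when it is killed this way. This is not the actual bot interface code; the function names and log message here are hypothetical, and it assumes a Unix host where SIGHUP exists. The idea is only that the handler writes to a log file before exiting, rather than printing to the terminal.

```python
import logging
import signal
import sys


def make_logger(log_path):
    """Create a logger that appends timestamped lines to log_path."""
    logger = logging.getLogger("interface_sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path)
    handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
    logger.addHandler(handler)
    return logger


def make_handler(logger):
    """Build a signal handler that records the signal name, then exits."""
    def on_signal(signum, frame):
        name = signal.Signals(signum).name
        logger.info("received %s, leaving game", name)
        sys.exit(1)
    return on_signal


def install(log_path):
    """Register the logging handler for SIGHUP and SIGINT (Unix only)."""
    handler = make_handler(make_logger(log_path))
    for sig in (signal.SIGHUP, signal.SIGINT):
        signal.signal(sig, handler)
```

With something like `install("interface.log")` called at startup, a 4am kill would show up as a timestamped "received SIGHUP" line in the file, which is the confirmation that is missing here.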

From the net log and gnobot's own log it looks like in game 100645 that gnobot exited without giving a move back to the interface script. It seems like quite a coincidence that this occurred within 15 seconds of clueless' interface exiting though.

I also went back and looked at game 99969, the other timeout by Clueless against aaaa. This looks much the same as Gnobot's 100645 timeout, from the net log and clueless' own log it seems that the bot exits without sending a move to the interface.

Looking at the timestamp when the bot interface leaves the game for each of these timeouts we get this:

99969 Sun Mar 15 04:02:03 2009
100645 Sun Mar 22 04:02:19 2009
100646 Sun Mar 22 04:02:04 2009

To me the most likely explanation at this point for all of this is a cron job running at 4am on Sunday mornings that is sending a SIGHUP to the bot script.
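
The coincidence can be checked mechanically. The following is a rough sketch, not anything run on the server: it parses the three timestamps listed above and tests whether they share a weekday and sit within a minute of each other by time of day. The function names and the 60 second tolerance are assumptions chosen for illustration, and the comparison does not handle times that straddle midnight.

```python
from datetime import datetime

# Interface-exit timestamps as listed above (game id -> server time).
EXITS = {
    99969: "Sun Mar 15 04:02:03 2009",
    100645: "Sun Mar 22 04:02:19 2009",
    100646: "Sun Mar 22 04:02:04 2009",
}


def parse(stamp):
    # Matches the log format above; assumes an English/C locale for %a and %b.
    return datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")


def same_weekly_slot(stamps, tolerance_seconds=60):
    """True if all stamps share a weekday and a time of day within tolerance."""
    times = [parse(s) for s in stamps]
    weekdays = {t.weekday() for t in times}
    seconds = [t.hour * 3600 + t.minute * 60 + t.second for t in times]
    return len(weekdays) == 1 and max(seconds) - min(seconds) <= tolerance_seconds


if __name__ == "__main__":
    for gid, stamp in EXITS.items():
        print(gid, parse(stamp).strftime("%A %H:%M:%S"))
    print("same weekly slot:", same_weekly_slot(EXITS.values()))
```

All three exits land on a Sunday within 16 seconds of each other by clock time, which is what points toward a scheduled job rather than three independent crashes.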

Janzert

Title: Re: 2009 Arimaa Challenge
Post by Arimabuff on Mar 23rd, 2009, 2:08pm

on 03/23/09 at 13:04:01, 99of9 wrote:
I can't prove that the gnobot one is the server's fault, so I think it has to stand.  Jeff accepted one that looked the same as this earlier against aaaa.

I don't think we're still at this stage. Once Ron has ruled that the game should be continued, it's no longer aaaa's choice whether to play the game or not. If aaaa refuses to play then he forfeits it. Otherwise Omar may as well replace Ron by aaaa.

It doesn't make sense to have someone make a decision and then let the player choose whether he will go along with that decision. It's rather easy to set up. You start the game at a certain time after having given to aaaa the time and date (assuming he's unwilling to give one himself to Omar), if after the game starts aaaa doesn't join then he'll have forfeited it and lost that way.

Title: Re: 2009 Arimaa Challenge
Post by Arimabuff on Mar 23rd, 2009, 2:17pm

on 03/23/09 at 04:31:24, RonWeasley wrote:
Right.  aaaa's games would not count if the completed game is not played.

But isn't that rewarding aaaa for refusing your first ruling?

He refuses to continue the game and therefore he wins.

Somehow that doesn't smell right....

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 23rd, 2009, 2:45pm

on 03/23/09 at 14:02:09, Janzert wrote:
Looking at the timestamp when the bot interface leaves the game for each of these timeouts we get this:

99969 Sun Mar 15 04:02:03 2009
100645 Sun Mar 22 04:02:19 2009
100646 Sun Mar 22 04:02:04 2009

To me the most likely explanation at this point for all of this is a cron job running at 4am on Sunday mornings that is sending a SIGHUP to the bot script.

Oh wow, nice work Janzert.  That is too much evidence to be assumed a coincidence.

Patrick, the game against aaaa I was referring to is this one that Janzert is analysing from quite a while back (game 99969).  I'm not talking about the one he has been asked to continue.  (Although now he will probably have two games to continue!  My question is, does he get to pick and choose which one he continues, or is it randomly assigned.)

However, may I convey my sympathy to aaaa at this point.  He has been subjected to a ridiculous number of replays.  It seems this is partly a problem related to the time of day he prefers to play!

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 23rd, 2009, 3:24pm

on 03/23/09 at 14:02:09, Janzert wrote:
To me the most likely explanation at this point for all of this is a cron job running at 4am on Sunday mornings that is sending a SIGHUP to the bot script.

One minor positive... none of the CC timeouts were caused by this particular issue.

Title: Re: 2009 Arimaa Challenge
Post by RonWeasley on Mar 24th, 2009, 8:32am

on 03/23/09 at 07:31:25, Fritzlein wrote:
Ron, have you ruled on these games yet?  Must they also be played out from the point of the timeout?


From the information in this thread, I can't tell yet why these games timed out.  Did I miss it?  If we can pin it on the network/server, or if we give up without finding the cause, they will be played from the point of timeout.

I would like to appoint Janzert as the expert witness on this technical issue, due to his expertise and the fact that his bot is not one of this year's contenders.  If he accepts, I would like him to decide if the timeout cause can be found or we should abandon that search.  Other opinions are welcome.

There was mention that there are TWO games where clueless timed out against aaaa?  Same colors?  Both caused by network/server error at the point of timeout?  I had thought there was only one.  If there are two, the second game is not authorized as a screening game.  The earlier one takes precedence.  That game must be continued at the point of timeout.  If that game does not get finished then that and the similarly sided game vs GnoBot cannot count.  Sorry, aaaa.  You are somehow a lightning rod for these timeouts.

Title: Re: 2009 Arimaa Challenge
Post by omar on Mar 24th, 2009, 1:14pm
Janzert is right. I looked around on the server and there is a root level cron job which rotates the web server log and restarts the web server at 04:02; the same time when these games timed out.

Code:
99969 Sun Mar 15 04:02:03 2009    bot_clueless vs aaaa
100645 Sun Mar 22 04:02:19 2009    ChrisB vs bot_GnoBot
100646 Sun Mar 22 04:02:04 2009    woh vs bot_clueless

I should have temporarily disabled the cron job and run the rotation script manually. It has been so long since I set this up, I forgot about it. All these games need to be replayed. However it might get to be a timing problem for these players to inform me when they are available to continue the game and for me to also be available at that time to restore the game, especially when we have less than a week left for the screening period. Also it might become a fairness issue if I am able to coordinate a time with one player, but not another. It might be easier to allow the players to just replay these games from start. I would think that game 100358 aaaa vs bot_clueless should also be treated this way. As coordinator I would like to request the TD to take this circumstance into consideration.

As if this wasn't enough, there was also the problem that the day before yesterday (Sunday) I forgot to enable the CC bots after disabling them to run the post-analysis script for GnoBot. So we lost about 48 hours of time in the two weeks screening period. I would like to request that the screening period be extended by 2 days.

Toby even pinged me and mentioned that I probably forgot to enable the CC bots after the final WC game, but I told him that only the bots on the arimaa.com server were disabled during the final WC game. When he pinged me there was another problem on the server that was preventing players from being able to join the game after starting a bot. This was caused by a web server configuration change I made while experimenting with mod_perlite on the server.


Title: Re: 2009 Arimaa Challenge
Post by Janzert on Mar 24th, 2009, 1:19pm
[EDIT: Oops, you can ignore this. Looks like Omar already found the program causing it]

Sure I don't mind, although I think I probably have reached the limit of what can be found from the publicly accessible information.

To summarize: there are currently 4 bot timeouts that are still being counted for now (games 99969, 100358, 100645 and 100646). It is pretty clear from the net log that game 100358 timed out because of a network problem.

The other three games are less conclusive. Looking at the logs for games 99969 and 100645 it at first appears that the bot in each case has crashed mid-search and hence timed out. But then it is seen in game 100646 that the bot interface exits while waiting for the opponent's move from the game server. The only reason I can see from looking at the bot interface code that it will do this is if it receives a SIGHUP or SIGINT signal. Being that all three of these timeouts occurred at about the same time on a Sunday morning (4am EST) it seems most likely that they all have the same cause. Also none of the other games that completed successfully seem to have been played at that time regardless of day.

At this point I think the only way to confirm this is if the program Omar uses to start the bot interface, logs the stdout from the interface to a file, or if he can find a cron job or other process that is sending a signal at 4am to the bot interface to close.

Janzert

Title: Re: 2009 Arimaa Challenge
Post by woh on Mar 24th, 2009, 1:33pm

on 03/24/09 at 13:14:04, omar wrote:
However it might get to be a timing problem for these players to inform me when they are available to continue the game and for me to also be available at that time to restore the game, especially when we have less than a week left for the screening period. It might be easier to allow the players to just replay these games from start.


Omar, I could replay my game against clueless Friday between 3 and 4PM GMT, Saturday between 7 and 8AM or between 12 and 15PM GMT or Sunday between 7 and 8 AM GMT.

Clueless timed out against me on move 9. So I don't think it makes much difference just to replay this game from the start.

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 24th, 2009, 1:52pm
I don't think the games should be switched from "continue" to "replay" just because it's easier for the coordinator to organize.  To be brutally honest, since the coordinator was responsible for the faults, he has a responsibility to go to the extra trouble to rectify them as per the TD's ruling.  I don't mind if the screening period is extended because of this and the other delays.

Title: Re: 2009 Arimaa Challenge
Post by Janzert on Mar 24th, 2009, 2:05pm
I don't have strong feelings either way about whether the games should be continued from the current position or replayed from the start.

I do think that continuing BvB games is great as neither bot can gain anything by the interruption anyway (ignoring possible opening book changes or future learning bots). But once a human is involved I think it is impossible for a human to not benefit from the interruption. Even if the human should consciously try to avoid any further analysis of the final position there is a certain amount of subconscious review that will occur regardless.

Janzert

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 24th, 2009, 3:13pm

on 03/24/09 at 08:32:50, RonWeasley wrote:
There was mention that there are TWO games where clueless timed out against aaaa?  Same colors?  Both caused by network/server error at the point of timeout?  I had thought there was only one.  If there are two, the second game is not authorized as a screening game.  The earlier one takes precedence.  That game must be continued at the point of timeout.  If that game does not get finished then that and the similarly sided game vs GnoBot cannot count.  Sorry, aaaa.  You are somehow a lightning rod for these timeouts.

The first was quite a while ago.  At the time we could not detect any network or server error.  So although jdb did not think it was clueless' fault, he had no way to object to the result standing.  The tournament continued, and aaaa played another game(s) with the other colour.  That one also timed out, and was more obviously suspicious.  The second game is the one you ruled should be continued (and aaaa has decided not to).  Since then Janzert and omar have discovered the 4am thing, which explains the first timeout.  So now there are two unfinished games (of different colours).  Because they are of different colours, there is no need for you to void one to give the other precedence.

Title: Re: 2009 Arimaa Challenge
Post by Fritzlein on Mar 24th, 2009, 3:21pm
Extending the screening period for a few days is the easiest decision of the crop of decisions.  One could argue that if we didn't extend the time, then we would have fallen short of the commitment to have the bots available for two weeks.

It is trickier to decide which games to invalidate, because I understand that multiple independent issues have thrown games into question, but given the sheer number of issues, I think the presumption at the moment should be against the server and in favor of the bot.  That is to say, it seems odd at this point to accept any game in which the bot timed out as a valid result.  Which timed out games should be accepted and why?  Are there timeouts for which we can prove it was the bot's fault, or merely some timeouts where we can't prove it was the server's fault?

The technical difficulty of resuming a timed out game is a significant argument in my mind.  The humans who play the bots are doing the community a favor.  It is a much greater hassle to resume a game than to replay it from the start; I would hesitate to put that burden on the volunteers.  Incidentally I feel differently about the Challenge itself: I would prefer Challenge games to be resumed in similar circumstances, and would expect the Challenge defenders to go the extra mile to make the continuation happen, and would not feel bad forfeiting a defender who refused to resume.

Within the screening, however, the proposal to forfeit a player who doesn't continue a timed-out game seems to give a large bonus to a bot that was hit by a server problem, because the bot might get a victory simply on the grounds that the human player might be too busy to resume the game.  It makes more sense to me to nullify invalid games that aren't later rectified, as if those games had never happened.

Unfortunately, supposing that we nullify non-resumed games, I am very uncomfortable with the implied choice given to the human player.  Perhaps I would choose to resume a game I was winning and refuse to resume a game I was losing.  That would mean the server timeout seriously advantaged me as a human player.  I don't like the situation where the choice of the human player has such a large impact.  Therefore, as before, I believe that choice should be taken out of it, and all games which were interrupted by server error should be completely invalidated, and the only option the human player should have is the option to replay from scratch or not replay at all (i.e. the same option to participate or not participate that the human player started with).

I am suggesting a different rule for the screening than for the Challenge per se, but note what I am being consistent about: In neither situation should the human player have a choice.  In one case the resumption should be mandatory, and in the other case the invalidation of the game should be mandatory.

Unforeseen circumstances like those we now face force us to choose among the lesser of evils.  How to choose?  I submit that a couple of important principles are

1) Although both are harmful, throwing out a tainted result does less harm than including it.  Admittedly, throwing out any result superficially harms the player who lost or was losing, and superficially helps the player who won or was winning.  However, if we assume that the forces that taint results will strike at random, then throwing out tainted results is equally likely to hurt or help each side.

2) The way we resolve server errors should, as much as possible, not give "two chances" based on a decision of the players or of the tournament director.  Thus a given reason for invalidating a game must always result in an invalidation, and a given reason for resuming a game must always result in resumption.

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 24th, 2009, 3:47pm

on 03/24/09 at 15:21:51, Fritzlein wrote:
It is trickier to decide which games to invalidate, because I understand that multiple independent issues have thrown games into question, but given the sheer number of issues, I think the presumption at the moment should be against the server and in favor of the bot.  That is to say, it seems odd at this point to accept any game in which the bot timed out as a valid result.  Which timed out games should be accepted and why?  Are there timeouts for which we can prove it was the bot's fault, or merely some timeouts where we can't prove it was the server's fault?

POI: there are no longer any timeouts blamed on the bots.  Thanks mostly to Janzert, all of them have been clearly identified as either a network or server issue.

I'll comment on your other substantive arguments when I get a chance.

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 24th, 2009, 7:16pm

on 03/24/09 at 15:21:51, Fritzlein wrote:
The humans who play the bots are doing the community a favor.  It is a much greater hassle to resume a game than to replay it from the start;

That is not correct in all cases.  For example, ChrisB played 39 moves before the bot timed out.  I'd guess that the hassle of finding a suitable time is worth less than over 2.5 hours of wasted play.


Quote:
Incidentally I feel differently about the Challenge itself: I would prefer Challenge games to be resumed in similar circumstances, and would expect the Challenge defenders to go the extra mile to make the continuation happen, and would not feel bad forfeiting a defender who refused to resume.

I agree with you in regard to the Challenge.  Perhaps the bot should be given some bonus reserve time as compensation for the human pondering.  (Although human pondering should certainly be discouraged.)


Quote:
Within the screening, however, the proposal to forfeit a player who doesn't continue a timed-out game seems to give a large bonus to a bot that was hit by a server problem, because the bot might get a victory simply on the grounds that the human player might be too busy to resume the game.  It makes more sense to me to nullify invalid games that aren't later rectified, as if those games had never happened.

I would strongly disagree if these were counted as forfeits.


Quote:
Unfortunately, supposing that we nullify non-resumed games, I am very uncomfortable with the implied choice given to the human player.  Perhaps I would choose to resume a game I was winning and refuse to resume a game I was losing.

This is most important for aaaa as he is the only one with two timed out games.  (Although because of his concept of fairness, he may actually choose to reverse the favouritism and resume from the weaker position.)  This is why I suggested that he be given the resumed game at random.

For others I'm not sure that it matters much more than players who play against one bot then on the basis of the result choose not to play the other bot.  That introduces a similar kind of bias.

I'm sympathetic to your argument here, but this was not the decision of the TD earlier in the tournament, so this would be an explicit reversal if applied in 2009, and I'm not sure that is healthy.


Quote:
if we assume that the forces that taint results will strike at random, then throwing out tainted results is equally likely to hurt or help each side.

On average it will not affect one bot relative to the other, and that is what's important in the screening process.  But any interruption will help the humans relative to the bots.  Replays give the human a chance to learn from their mistakes, continuations give the human a chance to ponder.


Quote:
The way we resolve server errors should, as much as possible, not give "two chances" based on a decision of the players or of the tournament director.  Thus a given reason for invalidating a game must always result in an invalidation, and a given reason for resuming a game must always result in resumption.

Yes.  And this implies that players should try as hard as possible to fulfill what the TD thinks is fairest, rather than make their own judgement of fairness.  Although aaaa has had a very annoying schedule, his reason for not continuing is not related to being annoyed; he is basing it on his own opinion of fairness, which I think is an error [especially since his opinion of what is fair has turned 180 degrees since the chat on the 14th of March].   (But by the way, in this instance, failing to continue his games is actually worse for clueless than continuing them. Since he has beaten gnobot already, it can only help to give clueless a chance, even if it starts from an inferior position.)

Title: Re: 2009 Arimaa Challenge
Post by RonWeasley on Mar 25th, 2009, 7:44am
Thank you all for commenting.  I think there is enough information here to make rulings.

First, there have been enough technical problems on the network/server side that I am ruling that ALL timeouts in screening games, up to this point, are due to the network/server.

Second, considering the relative advantages of replaying or continuing the timed-out games, I rule that the timed-out games be continued from the position of timeout with clocks set as at the beginning of the bot's move.  I realize I have chosen the logistically difficult alternative, but I believe this to be the best representation of a fair game.  Bot play was not tainted before the timeout.  The human can ponder only at a single position and this advantage does not outweigh the importance to me that there are no takebacks or do-overs.  If a human player does not complete such a game, the circumstances being immaterial to this ruling, the game does not count and the same colored game against the other bot does not count.  Note that a not-completed game is neither a win nor a loss.  It is a non-game from the point of view of the screening process.  I realize a human may, in theory, decide to not continue, and have the game nullified, in an attempt to manipulate the screening process, but I don't believe that to be a risk this year.

Because of the logistical difficulty of the continuation ruling, the screening period is extended by seven (7) days to April 5.

[Edit: Corrected date of extension to April 5]

Title: Re: 2009 Arimaa Challenge
Post by Fritzlein on Mar 25th, 2009, 1:49pm

on 03/24/09 at 19:16:07, 99of9 wrote:
And this implies that players should try as hard as possible to fulfill what the TD thinks is fairest, rather than make their own judgement of fairness.

99of9, making my own judgment of fairness does not preclude me from respecting the decision of the Tournament Director.  This is not the first case in which I have disagreed with Ron's decision, and (should he be kind enough to continue to serve) it won't be the last.  Just because he is the Tournament Director rather than me does not mean that I can or should turn off my evaluative faculty.   Nevertheless, I do respect Ron's decisions, and have always abided by them, and will continue to do so in future events (hopefully many) where he is the Tournament Director.  I consider his ruling binding in this case for determining the outcome of the screening games.  I encourage all players to continue their timed out games if a time can be arranged with Omar.


on 03/21/09 at 11:43:50, aaaa wrote:
Apologies in advance for what can be perceived as a power play, but, although I know I will be stepping on the toes of the tournament director with this, I cannot in good conscience resume this game if the understanding is that this is supposed to be a fair assessment of Clueless's playing strength by matching it against mine.

While I agree with you, aaaa, that Ron's decision was not the fairest possible, I understand that reasonable people may differ about what is fair.  No solution is entirely fair, not even replaying every tainted game from the start as you and I would have preferred.  I encourage you to continue your participation in the screening in accordance with Ron's ruling.  Your games during the screening have been interesting and valuable.  It would be a further contribution if you would continue the ones that have been interrupted and that Ron has requested be continued.  (now including, apparently, your first game against clueless, in which you were behind when clueless timed out)  Please consider that perhaps, in this particular case, the perfect might be the enemy of the good.  Resuming your games will, I believe, be much better for the Arimaa community than not doing so.

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 25th, 2009, 2:47pm

on 03/25/09 at 13:49:41, Fritzlein wrote:
99of9, making my own judgment of fairness does not preclude me from respecting the decision of the Tournament Director.  This is not the first case in which I have disagreed with Ron's decision, and (should he be kind enough to continue to serve) it won't be the last.  Just because he is the Tournament Director rather than me does not mean that I can or should turn off my evaluative faculty.   Nevertheless, I do respect Ron's decisions, and have always abided by them, and will continue to do so in future events (hopefully many) where he is the Tournament Director.  I consider his ruling binding in this case for determining the outcome of the screening games.  I encourage all players to continue their timed out games if a time can be arranged with Omar.

I was specifically referring to players.  My concern is when players change their participation or actions based on their own concept of fairness, when it is not in accordance with the TD's concept of fairness.  Since the TD was chosen as the person the community trusted to make a tourney fair, this type of mutiny is not helpful overall.

Discussing or disagreeing with a decision is perfectly fine, and will probably help shape the rules for future years.

Title: Re: 2009 Arimaa Challenge
Post by omar on Mar 25th, 2009, 4:43pm
Players please send me a message through the Contact page with your preferred times for continuing the game.

Woh, I saw you posting with the preferred times. I will email you to set a time.

The silver server is scheduled to be terminated at the end of this month. So after that time there will only be one server to play on. Also the official challenge match games are scheduled to begin on April 5th, so I will have to stop the screening on April 4th.

Ron, from your ruling I am assuming that game 100123 (bot_clueless vs ChrisB) should also be continued. Please confirm.

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 25th, 2009, 8:10pm

on 03/25/09 at 16:43:49, omar wrote:
Ron, from your ruling I am assuming that game 100123 (bot_clueless vs ChrisB) should also be continued. Please confirm.

No, that had to be scrapped and restarted because both bots were playing on the same server.  Ron's phrase "bot play was not tainted before the timeout" is the key to deciding which should be continued and which should be scrapped.  The games that were already scrapped including this one were either due to bots running on the same server, or  under the wrong time control.

The game ChrisB needs to continue is 100645.

Title: Re: 2009 Arimaa Challenge
Post by ChrisB on Mar 25th, 2009, 9:42pm

on 03/25/09 at 20:10:54, 99of9 wrote:
The game ChrisB needs to continue is 100645.


Yes, Omar and I already corresponded about continuing 100645, and the likely time for continuing that game will be this Friday at 0400 GMT (11 pm Thursday for me).  I also mentioned to Omar that I'm not sure whether the March 25 TD decision brings game 100123 back into play.  I see arguments both for and against continuing that game, so whatever Ned decides will be fine with me.

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Mar 26th, 2009, 12:53am

on 03/25/09 at 21:42:49, ChrisB wrote:
 I also mentioned to Omar that I'm not sure whether the March 25 TD decision brings game 100123 back into play.

I very much doubt that that is what Ned meant, but am happy to wait for him to clarify.  Of course if that game came back into play, we might have to resurrect a whole bunch, some of which have already been replayed!

Title: Re: 2009 Arimaa Challenge
Post by RonWeasley on Mar 26th, 2009, 3:43am

on 03/25/09 at 16:43:49, omar wrote:
Ron, from your ruling I am assuming that game 100123 (bot_clueless vs ChrisB) should also be continued. Please confirm.


No, not this one where multiple bots were running on the same server.  This game is nullified.

Title: Re: 2009 Arimaa Challenge
Post by omar on Mar 26th, 2009, 1:49pm
Thanks for the clarification.

Title: Re: 2009 Arimaa Challenge
Post by RonWeasley on Mar 27th, 2009, 5:01am

on 03/25/09 at 16:43:49, omar wrote:
The silver server is scheduled to be terminated at the end of this month. So after that time there will only be one server to play on. Also the official challenge match games are scheduled to begin on April 5th, so I will have to stop the screening on April 4th.


Noted.  I approve the end of screening to be April 4.

Title: Re: 2009 Arimaa Challenge
Post by woh on Mar 28th, 2009, 2:12am
Omar, the link in the gameroom to the bot scores page isn't working. I think you need to leave out the 's' at the end.

Title: Re: 2009 Arimaa Challenge
Post by Arimabuff on Mar 28th, 2009, 9:13am

on 03/28/09 at 02:12:54, woh wrote:
Omar, the link in the gameroom to the bot scores page isn't working. I think you need to leave out the 's' at the end.


True, when I remove the "s" by hand the link works.

Title: Re: 2009 Arimaa Challenge
Post by aaaa on Mar 28th, 2009, 8:16pm
It's amazing to note that despite having made no less than 6(!) attempts to play a preliminary game, it now looks that I will have made no official contribution whatsoever! Still, considering the purposes of this screening, I think, even leaving it like that (and I'm not changing my mind here), I've done good if I do say so myself; with respect to giving the defenders games to study, six games, even including incomplete ones, should be gnarly, no? Especially since, as you'll see below, they unfolded quite diversely. Also, given the whole mess that has plagued my games, it's probably for the best that I won't be responsible for any differentiation of the two bots. Chalking that first infamous game against Clueless up to me getting back into Arimaa after a short hiatus, it's hard to identify the stronger (anti-human) bot of the two. If, however, GnoBot manages to qualify with exactly one point difference, it will still leave a bad taste in my mouth.

Since the multitude of my games seems to have caused a bit of a confusion here, here's a summary of them:

Game 99880 (http://arimaa.com/arimaa/gameroom/comments.cgi?gid=99880): After inflicting massive (what I would call fatal) material damage, Clueless times out, at first seemingly because it was being too cavalier managing its time; the game is later voided due to a wrong time control system being in use. I came close to resigning that game and the advantages I enjoyed in the other timed-out games come nowhere close to Clueless's in this one. So I reject the criticism that I'm being inconsistent; it's a fact of life that a later resumption of a game will advantage the human player over the computer, so only playing out a game that's clearly won by the latter should not be dubious.

Game 99969 (http://arimaa.com/arimaa/gameroom/comments.cgi?gid=99969): In a much more promising position for me, Clueless times out again. At first judged as a legitimate outcome, some time afterwards more timeouts point to an interfering cron job by the server as the culprit. Now ruled to be resumed.

Game 100048 (http://arimaa.com/arimaa/gameroom/comments.cgi?gid=100048): I win against GnoBot in an entertaining and (more importantly) legitimate game.

Game 100135 (http://arimaa.com/arimaa/gameroom/comments.cgi?gid=100135): I finally get a game against Clueless to reach its natural ending (with me successfully converting a horse frame), turns out the bot was sharing computer resources for some time. Game gets discounted (and unrated).

Game 100273 (http://arimaa.com/arimaa/gameroom/comments.cgi?gid=100273): This time, I'm the one having my horse framed, in a game against GnoBot. I still manage to win eventually. Counts.

Game 100358 (http://arimaa.com/arimaa/gameroom/comments.cgi?gid=100358): Clueless times out yet again, this time early on. It is ruled to be resumed due to a mysterious network lag. At the time, I enjoyed a reasonable advantage, which Bomb2005P2, I believe, evaluated as 2.97, no doubt due to the horse I'm framing. Aided by a substantial amount of (unfair) pondering, I'm somewhat more troubled by the position.

Tentatively, I could hypothesize that GnoBot is relatively strong in the opening (in no small part due to its book) but has trouble following through on any advantage accrued in that stage, whereas Clueless seems to be more of an all-rounder.

Title: Re: 2009 Arimaa Challenge
Post by Fritzlein on Mar 29th, 2009, 2:59pm
Aaaa, I'm sorry to hear that you are still declining to finish your clueless timeouts as requested by the TD, but thanks anyway for your six attempts at participation.  Although you haven't contributed to the score of the bots or helped choose between them, you have indeed shed some light on their strengths and weaknesses, which is one of the explicit purposes of the screening period.

Will you also decline to participate in any future screening period, because your decision to withdraw is not based on what is expedient at present, but rather based on the principle that the rules are inherently unfair?

Title: Re: 2009 Arimaa Challenge
Post by omar on Apr 3rd, 2009, 9:39am
Game 101769 between sforry and bot_clueless timed out today.

I think this game needs to be continued since a network problem caused the timeout.

http://gold.arimaa.com/~clueless/logs/3426.netLog

Title: Re: 2009 Arimaa Challenge
Post by RonWeasley on Apr 3rd, 2009, 9:50am

on 04/03/09 at 09:39:39, omar wrote:
Game 101769 between sforry and bot_clueless timed out today.

I think this game needs to be continued since a network problem caused the timeout.

http://gold.arimaa.com/~clueless/logs/3426.netLog

Yes.  Please continue the game.

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on Apr 4th, 2009, 4:45am
Congratulations in advance to Clueless/Jeff for winning the screening.  I'll try to give you a bigger challenge next year :).  After all the technical problems, I'm glad the bots ended up separated by a decent margin.  I'm also glad they both got a few wins in the end - the start of the screening was looking quite humiliating for both bots.

Title: Re: 2009 Arimaa Challenge
Post by Fritzlein on Apr 4th, 2009, 6:36am
Sadly, the screening period closed on another invalid game.  Woh's epic 144-move win over GnoBot, lasting six hours and thirty-nine minutes, can't count because woh was assigned the wrong color.  By my count that makes nine invalidated games for the tournament: woh's wrong-color game, four timeouts that weren't completed, and four games that overlapped on the same server.

Fortunately we had such enthusiastic participation that many games were completed and the bots were clearly discriminated.  We had fifty-one valid games, including two server timeouts that were restarted and played to completion, so we beat last year's participation in spite of all the issues.

2008 2009
---- ----
 19   21  Participants
 41   51  Valid games
 16   21  Valid Pairs
  7    7  Discriminating Pairs


Every participation statistic increased except for the number of discriminating pairs, i.e. the number of times a player beat one bot and lost to the other with the same color.  It may be that when the two bots are closer in strength, as they were this year, it is less likely for any pair of games to provide discrimination.  Last year bomb beat sharp 6-1 in the discriminating pairs; this year clueless beat GnoBot 5-2.
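
To make the definition of a discriminating pair concrete, here is a minimal Python sketch of how such pairs could be counted. The tuple layout, the function name, and the sample players are assumptions made purely for illustration; none of this is the actual scoring script or real game data.

```python
from collections import Counter

def discriminating_pairs(games, bot_a, bot_b):
    """Count discriminating pairs between two bots.

    games: iterable of (player, bot, human_color, human_won) tuples
           (a hypothetical layout, not the real game database format).
    Returns a Counter mapping each bot to the number of pairs it won.
    """
    # Index each human result by (player, color, bot).
    human_won = {(p, c, b): w for p, b, c, w in games}
    score = Counter()
    for (player, color, bot), won_a in human_won.items():
        if bot != bot_a:
            continue
        won_b = human_won.get((player, color, bot_b))
        if won_b is None or won_a == won_b:
            continue  # no same-color game vs. the other bot, or same result vs. both
        # The bot that beat the human takes the pair.
        score[bot_b if won_a else bot_a] += 1
    return score

# Invented sample data: only alice's pair discriminates.
games = [
    ("alice", "clueless", "gold", False),   # alice lost to clueless
    ("alice", "GnoBot", "gold", True),      # alice beat GnoBot with the same color
    ("bob", "clueless", "silver", True),    # bob beat both bots: no discrimination
    ("bob", "GnoBot", "silver", True),
    ("carol", "clueless", "gold", True),    # carol's colors differ: not a pair
    ("carol", "GnoBot", "silver", False),
]
result = discriminating_pairs(games, "clueless", "GnoBot")
print(result)
```

In this sketch a pair only counts when the same player met both bots with the same color and got different results, which matches the definition above.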

Clueless finished with a record of 15-10 in valid games, for a performance rating of 1941.  Clueless' performance was very consistent.  Other than splitting two games with woh (who is rated right near clueless' performance), Clueless defeated everyone rated 1910 or below while losing to everyone rated 2002 and above.  This seems to peg clueless' true strength (at this time control and hardware) in a rather narrow range.

GnoBot finished with a record of 12-14 in valid games, for a performance rating of 1791.  GnoBot's performance was much more erratic, with two wins against over-2000 players (The_Jeh and naveed) but also five losses to under-1900 players (Tuks, Simon x2, sforry, and joe).  One must wonder whether some of the holes in GnoBot's strategy could be plugged without undermining its strengths.

For comparison, during the 2008 screening, bomb's performance rating was 1918, and sharp's was 1576.  So this year the race was not only closer, the bar was higher.

All in all the screening period fully fulfilled its function.  The strengths and weaknesses of both bots were probed, giving the Challenge defenders much useful information.  The bot that would be easier for the defenders to defeat was eliminated.  This is precisely what is supposed to happen during the screening.

Title: Re: 2009 Arimaa Challenge
Post by aaaa on Apr 4th, 2009, 7:22am

on 04/04/09 at 06:36:36, Fritzlein wrote:
By my count that makes nine invalidated games for the tournament: woh's wrong-color game, four timeouts that weren't completed, and four games that overlapped on the same server.

You forgot about four games which didn't count because they used the wrong time control.

Title: Re: 2009 Arimaa Challenge
Post by omar on Apr 4th, 2009, 7:45am

on 04/03/09 at 09:50:58, RonWeasley wrote:
Yes.  Please continue the game.


I tried to continue this game, but ran into some technical problems and wasn't able to continue it. I had to void the game.

Title: Re: 2009 Arimaa Challenge
Post by Simon on Apr 4th, 2009, 8:43am
I definitely think the vulnerability to elephant blockades could be fixed without weakening GnoBot too much. It would probably be best to put in an asymmetric evaluation of blockade positions, since bots tend to be much worse than humans at exploiting a blockade of an enemy elephant.

That would seem to be the most obvious and significant hole for GnoBot. There seem to also be obvious but insignificant things like not recognizing elimination as a win condition, which probably doesn't have much effect (could it be turned into a GnoBot-bashing technique? I figure it probably already defends its rabbits enough that if you can win by elimination, you could probably win without it).
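
As a rough illustration of the asymmetric blockade evaluation idea, here is a small Python sketch. The function name and both weights are invented for the example, and detecting a blockade in the first place is assumed to happen elsewhere; this is not GnoBot's actual evaluation.

```python
def blockade_term(own_elephant_blockaded, enemy_elephant_blockaded,
                  penalty=300, bonus=100):
    """Score adjustment from the bot's point of view (hypothetical units).

    The penalty for having its own elephant blockaded is deliberately larger
    than the bonus for blockading the enemy elephant, on the assumption that
    a human exploits a blockade better than the bot does.
    """
    score = 0
    if own_elephant_blockaded:
        score -= penalty
    if enemy_elephant_blockaded:
        score += bonus
    return score

print(blockade_term(True, False))   # bot's elephant is blockaded
print(blockade_term(False, True))   # bot is blockading the enemy elephant
```

The only point of the sketch is that the two cases are not mirror images of each other; the right ratio between the weights would have to be found by testing.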

Now there's also more subtle stuff like the failure to detect Tuks' goal far enough in advance.  I don't think this is a specific problem of GnoBot's as opposed to current bots in general. Anyway, that would definitely be tougher to fix. One possible approach is to do a search, separate from the regular alpha-beta search, to look for forced wins or losses. This would take place before the regular search and avoid evaluation, except perhaps evaluation directly related to win conditions, to save time. Any positions found to be forced wins for either player would then be added to the regular transposition table with the evaluation set to the appropriate value for a win/loss. Still, that would probably only help a little bit. A more radical approach might be to look at game positions not by looking at the board as a whole, but at a more local level. The dynamics of a single quarter of the board, for example, ought to be a lot more tractable than the board as a whole, and if it could be detected that there is long term trouble in one quarter, that could perhaps be used to decide whether to send in reinforcements. This could be a lot more complicated than current bots of course.
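
The forced win/loss pre-search described above could look roughly like the following Python sketch. It uses a toy take-1-to-3-stones game rather than Arimaa, and every name in it (prove, toy_moves, toy_is_lost, the WIN/LOSS values) is an assumption made for illustration, not code from any existing bot.

```python
WIN, LOSS = 1_000_000, -1_000_000  # hypothetical scores for a proven win/loss

def prove(pos, depth, moves, is_lost, table):
    """Depth-limited search for forced wins/losses, with no heuristic evaluation.

    Returns WIN or LOSS from the point of view of the side to move if the
    result is proven within `depth` plies, otherwise None. Proven positions
    are stored in `table`, standing in for the regular transposition table.
    """
    if pos in table:
        return table[pos]
    if is_lost(pos):
        table[pos] = LOSS
        return LOSS
    if depth == 0:
        return None  # unknown: leave it to the regular search
    every_reply_wins = True
    for child in moves(pos):
        result = prove(child, depth - 1, moves, is_lost, table)
        if result == LOSS:           # opponent is lost after this move
            table[pos] = WIN
            return WIN
        if result != WIN:
            every_reply_wins = False
    if every_reply_wins:             # all moves lose (no legal moves also counts as lost)
        table[pos] = LOSS
        return LOSS
    return None

# Toy game: take 1 to 3 stones; whoever takes the last stone wins,
# so the side to move with 0 stones left has lost.
def toy_moves(n):
    return [n - k for k in (1, 2, 3) if n - k >= 0]

def toy_is_lost(n):
    return n == 0

table = {}
prove(5, 10, toy_moves, toy_is_lost, table)
print(table)
```

Only proven results go into the table, so the regular search would see them as ordinary win/loss entries and everything else is still searched and evaluated normally. A real Arimaa version would have to cope with four-step turns and a much larger branching factor, which is probably why this would only help a little.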

Title: Re: 2009 Arimaa Challenge
Post by Arimabuff on Apr 4th, 2009, 10:21am
I am glad that in spite of all the trouble we had this year the bot that won the screening is also the one most likely to pose problems to the defenders. With all due recognition to 99's wonderful job in developing Gnobot to such a refined level, I think this screening along with the WCC proved without the shadow of a doubt that Clueless is clearly ahead of it.

But it also proved that Gnobot is not far behind and that for next year Jdb will have his work cut out for him to keep ahead of the race.

In conclusion, I can't wait to see what game plans our worthy challenge defenders will come up with for our collective intellectual pleasure.

Let the games begin... :)

Title: Re: 2009 Arimaa Challenge
Post by Fritzlein on Apr 7th, 2009, 11:58am
Silicon 1 - Humanity 0.  This is the first time the bot has led in any Challenge!

Title: Re: 2009 Arimaa Challenge
Post by omar on Apr 7th, 2009, 3:37pm
Or how about:

Silicon 1 - Carbon 0   :D

arimaa_master mentioned after the game that he was experimenting a bit to see if maybe he could win by smothering bot_clueless :-) It kind of backfired when clueless turned it into a tactical game. I think he will do fine if he plays his usual style.

Title: Re: 2009 Arimaa Challenge
Post by Fritzlein on Apr 7th, 2009, 3:49pm
I agree that clueless' win was an aberration.  I expect the humans to sweep the remaining eight games.  But do you remember in the 2008 Postal Mixer when we all said OpFor was doing well just because we had been surprised?  And then OpFor went on to win games from unclear positions against wary opponents?

I was planning to give clueless a dog handicap in my first game, but that plan is officially shelved.  If I lose on Saturday, I will lose scared, not surprised.

Title: Re: 2009 Arimaa Challenge
Post by omar on Apr 10th, 2009, 3:50pm
Yes, going into a game too confidently and underestimating the opponent is always a good formula for losing. Perhaps my being overly cautious in the first 5 games of the first challenge match is what helped me win those. Then I experimented just a bit in the last three games only after getting familiar with Bomb's ability at that level.

If I were one of the three defenders this year, I would go into the games as if I am about to face the world champion. Winning the game should be first priority and experimenting should be postponed to the third game of the series if the first two games are won.

Title: Re: 2009 Arimaa Challenge
Post by RonWeasley on Apr 11th, 2009, 10:44am
In the first game of clueless vs. chessandgo, chessandgo did not log into the game site at the scheduled time.  The alternate defender, omar, was available.  We waited an hour and 10 minutes, and I ruled that omar would play that game in chessandgo's place.  This game became official on move 3.

It was not logistically impractical to reschedule the game instead, but with an alternate ready to play, I decided this substitution was best for the Match as a whole.  Thanks to omar for stepping in.

Title: Re: 2009 Arimaa Challenge
Post by omar on Apr 11th, 2009, 3:07pm
Thanks Ron for being available at a critical time.

Title: Re: 2009 Arimaa Challenge
Post by chessandgo on Apr 12th, 2009, 1:18am
I apologize to the team and spectators for missing this game. Thanks a lot Omar for playing it.

Jean

Title: Re: 2009 Arimaa Challenge
Post by omar on Apr 14th, 2009, 6:14am
No problem. That's what the backup is for :-)

Title: Re: 2009 Arimaa Challenge
Post by RonWeasley on Apr 27th, 2009, 8:38am
Let me officially congratulate arimaa_master, chessandgo, Fritzlein, and omar for a successful defense of the Arimaa Challenge.  The humans won all matches and even a handicap game.  Omar gets to keep his money for another year.

Let me congratulate jdb and bot_clueless for providing the strongest challenger so far.  clueless's play has started the community thinking seriously again about when a bot will win the Challenge.

Now that the Challenge matches are over, I would like to remind the community that this is a good time to discuss any tournament rules changes we would like to see in next year's Challenge cycle.

Finally, thank you all for your cooperation and patience during these unexpectedly eventful matches.  It is always an honor to serve as TD.  It doesn't always have to be me, but I am glad for the opportunity to contribute.  Active players who can't play the WC tournament might consider contacting Omar if they would like to volunteer and provide some more variety to the contest.

Title: Re: 2009 Arimaa Challenge
Post by Fritzlein on May 27th, 2009, 3:36pm
Thank you for serving as Tournament Director, Ron.  It is reassuring to know that when unexpected events happen, someone with Arimaa experience and steady judgment will rule on how to handle it.

I agree with you that it makes sense to suggest rule changes for next year.  For the Challenge match itself, I can only think of a clarification.  My assumption has been that if the substitute plays some games of a mini-match, his score is added to the score of the original player to determine whether or not the bot wins the mini-match.  For example, if the regular player loses one game and the sub wins two games, then the humans win that mini-match, right?

But what if more than one original player needs a substitution?  Can the alternate sub in on more than one mini-match?  If so, suppose two original players each win twice, and the alternate player loses twice, once for each substitution.  Is it the case that humanity doesn't lose either mini-match, even though one human lost twice?
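
To spell out the arithmetic of the two scenarios above, here is a minimal Python sketch. It assumes that results are pooled per mini-match regardless of which human played, and that the bot needs two wins out of three games to take a mini-match; the function name and the player labels are made up for the example.

```python
def bot_wins_mini_match(games, games_needed=2):
    """games: list of (human_player, human_won) for one three-game mini-match.

    Results are pooled per mini-match, no matter which human played
    (the assumption being asked about above, not an official rule).
    """
    bot_wins = sum(1 for _player, human_won in games if not human_won)
    return bot_wins >= games_needed

# Scenario 1: the regular player loses once, the substitute wins twice.
scenario_1 = [("regular", False), ("sub", True), ("sub", True)]

# Scenario 2: two original players each win twice, and the same alternate
# loses once in each of their mini-matches.
match_a = [("player_a", True), ("player_a", True), ("alternate", False)]
match_b = [("player_b", True), ("player_b", True), ("alternate", False)]

print(bot_wins_mini_match(scenario_1))
print(bot_wins_mini_match(match_a), bot_wins_mini_match(match_b))
```

Under these assumptions the bot wins none of the mini-matches in either scenario, even though the alternate lost twice overall in the second one.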

My main concerns, however, are for the screening period.  I think the current setup is too prone to manipulation.  Admittedly, we have not yet had anyone trying to throw the screening to one bot or the other, but the current rules still seem too easy to abuse.

I believe that most of the possibility for abuse would be eliminated by having the screening be invitation-only.  Omar should invite established members of the community whom he trusts to do the screening, and those people, in accepting the responsibility to be on the Screening Committee, should for their part commit to playing all four games and abiding by the decisions of the Tournament Director.

Having an invitation-only Screening Committee closes the biggest loopholes for abusing the screening process, namely that a developer might play under a pseudonym in order to throw games to his bot, or a Challenge Defender might play under a pseudonym to gain experience, or any member of the community might play under a pseudonym in order to get more than four games.  If the publication of Arimaa boxed sets boosts the popularity of our game anywhere near as much as I expect it to, then being able to rely on our tiny community where we all know each other will soon be a quaint memory.  There will be strong players we know nothing about, so when some newly created account comes in and loses to one bot while beating the other, we won't be able to be sure it is a duplicate account, but neither will we be able to be sure it is legitimate.  That circumstance would put us in a position that would be impossible to judge, but a Screening Committee would spare us from ever facing such a situation.

Oh, and we should also release the names of the Challenge defenders at the beginning of the Computer Championship, rather than at the beginning of the Screening.  To keep with the spirit of learning bots, we must allow bots to play differently against different opponents.  GnoBot must be able to base its play against me on specifically my games.  But for the screening period to serve its purpose of allowing humans to probe for weaknesses in the bots, we can't allow the bots to play one way during the screening and another way during the Challenge.  If the names of the defenders are known before bot development is frozen, a developer could evade the prohibition on playing differently from the screening to the Challenge by hiding behind the permission to play differently against different opponents.  If, however, the names of the defenders are not known until after development is frozen, this loophole disappears.  This is all to say that I understand why the names of the defenders have to be secret temporarily.  But once development is frozen, I see no point to the secrecy, and it has been a mild annoyance to me as a Challenge defender.

Title: Re: 2009 Arimaa Challenge
Post by 99of9 on May 27th, 2009, 7:12pm
Perhaps we need more specific rules on game continuation vs replay vs abandonment vs judgement.  We need to cover the following situations:

1) A human player's equipment or internet connectivity fails.
2) The server equipment, internet, or gameroom server fails.
3) Connectivity fails during the human's turn, but we cannot determine whether it was their internet or the gameroom's.
4) The bot is found to be running incorrectly on the server due to something that is not the developer's fault.
5) The bot is found to be running incorrectly on the server due to something that is the developer's fault.
6) There is some evidence that the bot is running incorrectly, but the cause cannot be determined.

1 and 5 seem easy (that party loses), but we probably need to consider the consequences of all the rest.

Are the rules different in screening and challenge?
Are the rules different for the computer championship (read all of the above as two computers)?

Is there a critical number of moves after which the rules change?

Does the broken game get added to the database that the bots can learn from?


