Arimaa Forum (http://arimaa.com/arimaa/forum/cgi/YaBB.cgi)
Arimaa >> Events >> 2015 WCC Round 3 sharp vs marwin timeout
(Message started by: Janzert on Mar 3rd, 2015, 2:38pm)

Title: 2015 WCC Round 3 sharp vs marwin timeout
Post by Janzert on Mar 3rd, 2015, 2:38pm
Earlier today the round 3 game of sharp vs marwin ended in a timeout for sharp. Omar asked me to look into the logs to see if I could tell what caused it.

For the impatient, or those who simply don't want to read all the boring details, here's the executive summary. ;) There appears to have been a series of communication problems between the bot server and the game server, eventually leading to the timeout. Given that both servers are under the control of the tournament organizer, this constitutes grounds for a game restart in the WCC. Omar plans to resume the game this evening.

Here is the sequence of events as I see it in the logs. (All times, except as otherwise noted, are given in EST as used in the bot log.)

When receiving move 29s at 10:58:21 the game server reported 17 seconds of gold's time already used (moveused was 17).

After playing move 31g at 11:02:46, it took 20 seconds for the send to the server to complete. A minute later, at 11:03:56, the game server responded to the request for 31s with a 408 error response. Unless the game server scripts explicitly return this error themselves, this should mean that the game server never received the request details from the bot server and timed the connection out. This looks like a network communication error.

After receiving the error response from the server, the AEI interface restarts and tries to contact the server again (@11:04:27), but receives the same 408 error back (@11:04:27) while getting the game state when rejoining the game.

The interface then begins to restart again for another try, but after starting the bot and before actually reconnecting to the game server, the interface and all bot-related processes shut down for an unknown reason. The server process logs show everything is gone by 11:05 (16:05 UTC server log time). The interface would normally try restarting after up to 5 errors. At first this was quite confusing to me, but after investigating a bit more I think I understand what happened. If there is an error when logging onto the bot account while starting up, the interface shuts down and only prints the error to stderr. Maybe this would show up in a bot server log somewhere, possibly one of the Apache logs? But quite likely it is simply lost.
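To illustrate the kind of behaviour that would have avoided the silent shutdown, here is a minimal sketch in Python. It is not the actual interface code: `run_with_restarts`, `LoginError`, and the `start_session` callable are hypothetical names, and the budget of 5 is my reading of "up to 5 errors". The idea is simply that a login failure during startup is counted and logged like any other error instead of ending the process.

```python
import logging

log = logging.getLogger("bot_interface")  # hypothetical logger name


class LoginError(Exception):
    """Hypothetical: raised when logging onto the bot account fails."""


def run_with_restarts(start_session, max_errors=5):
    """Call start_session() until it succeeds or max_errors failures occur.

    Returns the number of failures seen before the successful attempt.
    Re-raises the last exception once the error budget is used up.
    """
    errors = 0
    while True:
        try:
            # Login happens inside start_session, so a login failure is
            # handled by the same except clause as any later error.
            start_session()
            return errors
        except Exception as exc:
            errors += 1
            # Goes through logging rather than a bare write to stderr, so a
            # file handler can keep the record.
            log.error("session failed (%d/%d): %r", errors, max_errors, exc)
            if errors >= max_errors:
                raise
```

With something like this, two failed logins followed by a successful one would leave two log entries and a running session, rather than nothing.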

My interpretation of the overall situation, assuming nothing special is being done with 408 error responses, is that network issues prevented the bot server from communicating with the game server for a period of time. During the third restart it received another error while logging back into the game server and shut down completely. This would then meet the requirements for a restart in the WCC.
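For anyone wanting to line the bot log times (EST) up against the server process log (UTC), here is a small Python sketch. The event labels and the helper names `bot_log_time` and `timeline` are my own; the timestamps are the ones quoted above, and I am assuming a fixed UTC-5 offset, which matches 11:05 EST being 16:05 UTC.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the bot log uses a fixed UTC-5 offset (EST).
EST = timezone(timedelta(hours=-5), "EST")


def bot_log_time(hms):
    """Turn an HH:MM:SS bot log time on Mar 3rd, 2015 into an aware datetime."""
    h, m, s = (int(part) for part in hms.split(":"))
    return datetime(2015, 3, 3, h, m, s, tzinfo=EST)


# Timestamps as quoted in this post; the labels are only descriptive.
events = [
    ("received 29s", "10:58:21"),
    ("played 31g", "11:02:46"),
    ("408 while requesting 31s", "11:03:56"),
    ("408 while rejoining", "11:04:27"),
]


def timeline(events):
    """Return (label, UTC time string, seconds since previous event) rows."""
    rows = []
    prev = None
    for label, hms in events:
        t = bot_log_time(hms)
        gap = None if prev is None else int((t - prev).total_seconds())
        rows.append((label, t.astimezone(timezone.utc).strftime("%H:%M:%S"), gap))
        prev = t
    return rows


for row in timeline(events):
    print(row)
```

The gaps come out as 265, 70 and 31 seconds, so the whole sequence from playing 31g to the second 408 fits in under two minutes.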

Janzert

Title: Re: 2015 WCC Round 3 sharp vs marwin timeout
Post by quasar on Mar 3rd, 2015, 4:31pm
When will the game resume?

Title: Re: 2015 WCC Round 3 sharp vs marwin timeout
Post by omar on Mar 3rd, 2015, 8:20pm
Thanks for investigating this, Brian. I've just restored the game.

Title: Re: 2015 WCC Round 3 sharp vs marwin timeout
Post by lightvector on Mar 3rd, 2015, 9:19pm
Thanks, Janzert and Omar, for working this out.


