||
Title: Server instability? Post by rbarreira on Jul 30th, 2010, 6:42am Maybe it's just me, but both yesterday and today the server seemed to be down for a while (less than an hour). Was it just bad luck, or does it have something to do with the new server? I thought I'd post this in case you hadn't noticed it omar. You may want to take a look at the logs. |
||
Title: Re: Server instability? Post by Tuks on Jul 30th, 2010, 7:05am OK, it isn't just me. I thought my internet was doing strange things because I couldn't log in, but I could get to the login page. |
||
Title: Re: Server instability? Post by omar on Jul 30th, 2010, 9:28am rbarreira: I think it happened when you were playing Marwin2010CC. I had it set to use 4 cores and the load shot way up and made the server very slow. I've set the bots to use only 1 core now. |
||
Title: Re: Server instability? Post by rbarreira on Jul 30th, 2010, 9:41am I or my bot have not played Marwin2010CC for a long time. Today I haven't even played or watched any game. I just tend to keep checking the forum throughout the day, that's how I noticed it was down today. |
||
Title: Re: Server instability? Post by rbarreira on Jul 30th, 2010, 2:18pm I just saw another crash, this time partial. I was watching this game live: http://arimaa.com/arimaa/gameroom/comments.cgi?gid=150340 Suddenly onigawara loses on time, but then I find out the server is not responding. For a while, static pages worked fine but dynamic pages didn't (at least not the gameroom or forum). They gave a 500 Internal Server Error. Maybe his loss on time was actually caused by the server crashing. This is not the same kind of problem I saw earlier today and yesterday, but it does seem like the server is unstable. |
||
Title: Re: Server instability? Post by rbarreira on Aug 5th, 2010, 5:04pm The server went down again today, and apparently yesterday as well, as Fritz commented in this game: http://arimaa.com/arimaa/gameroom/comments.cgi?gid=150730 I hope the server problem gets fixed by September, before the Arimaa Festival. |
||
Title: Re: Server instability? Post by Fritzlein on Aug 5th, 2010, 9:03pm Yes, I experienced the 8/5 outage too, and e-mailed Omar about it. Seems there is some unpleasant debugging to be done. |
||
Title: Re: Server instability? Post by rabbits on Aug 14th, 2010, 10:27am There was another outage yesterday. I was playing when the server went down on the 5th as well as yesterday. Perhaps playing the 2010 bots still has something to do with it? |
||
Title: Re: Server instability? Post by rbarreira on Aug 14th, 2010, 10:30am on 08/14/10 at 10:27:12, rabbits wrote:
Clueless2010Fast seems to be running 6 times, if this page is accurate: http://arimaa.com/arimaa/bots/2010cc/clueless/bot_Clueless2010Fast/index.cgi arimaa_master noticed that a few days ago: http://arimaa.com/arimaa/gameroom/comments.cgi?gid=151257 |
||
Title: Re: Server instability? Post by Janzert on Aug 14th, 2010, 4:15pm A quick look at the logs of the processes currently running shows only two clueless (java) processes currently executing, which is correct for the two current blitz games against aaaa. The bot page can incorrectly show extra instances running, as a bot can leave a junk file behind in the event of a crash. [edit: Having said that though, I have noticed that I seem to reconnect opfor quite a bit more frequently for its postal games since the server switch.] Janzert |
||
Title: Re: Server instability? Post by jdb on Aug 14th, 2010, 11:44pm on 08/14/10 at 16:15:02, Janzert wrote:
Yes, this is the same for me too. Since the switch I have to reconnect clueless often. |
||
Title: Re: Server instability? Post by rbarreira on Aug 15th, 2010, 4:41am on 08/14/10 at 16:15:02, Janzert wrote:
Oh... the logs, almost forgot about those. Well, taking a look at the logs definitely shows a problem: http://arimaa.com/logs/20100815/09/3802 "Mem: 2047680k total, 2037080k used, 10600k free, 3952k buffers" "Swap: 4192956k total, 227344k used, 3965612k free, 1023928k cached" Clearly the server is running out of RAM, even though there are only two bots playing right now. I did a little experiment with Clueless2010CC and Marwin2010CC and they use, respectively, 33% and 15% of the memory. If the P1/P2/Fast/Blitz bots are anything like that, it wouldn't take that many bots to get the server in trouble... edit - This one's even worse, Clueless2010P2 using 52.7% of the server's RAM: http://gold.arimaa.com/logs/20100815/13/1601 These transposition tables probably need to be reduced, if there's a config option for that... |
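The two log lines quoted in the post above can be turned into percentages with a few lines of Python. This is only a sketch: it assumes the lines follow the `top`-style layout shown in the post (a label, then `<number>k <name>` pairs), and the helper name `parse_top_line` is made up for illustration.

```python
import re

def parse_top_line(line):
    """Split a top-style 'Mem:' or 'Swap:' line into (label, {name: kilobytes})."""
    label, _, rest = line.partition(":")
    fields = {}
    # Each field looks like '2047680k total'; capture the number and its name.
    for value, name in re.findall(r"(\d+)k\s+(\w+)", rest):
        fields[name] = int(value)
    return label.strip(), fields

# The two lines exactly as quoted from the server log.
mem_label, mem = parse_top_line(
    "Mem: 2047680k total, 2037080k used, 10600k free, 3952k buffers")
swap_label, swap = parse_top_line(
    "Swap: 4192956k total, 227344k used, 3965612k free, 1023928k cached")

mem_used_pct = 100.0 * mem["used"] / mem["total"]
swap_used_pct = 100.0 * swap["used"] / swap["total"]

print(f"{mem_label}: {mem_used_pct:.1f}% used, {mem['free'] // 1024} MB free")
print(f"{swap_label}: {swap_used_pct:.1f}% used")
```

With the quoted numbers this reports about 99.5% of RAM in use with roughly 10 MB free, and about 5.4% of swap in use.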
||
Title: Re: Server instability? Post by tize on Aug 15th, 2010, 2:38pm on 08/15/10 at 04:41:01, rbarreira wrote:
All variants of Marwin are the same when it comes to memory usage: they use 256MB + 32MB per thread for hash tables. So a serial version would use 288MB for hashes, even when it is just searching P1. I will remember this and add switches for the memory usage for next year. |
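The arithmetic in the post above is easy to check. The sketch below only encodes the formula as tize states it (256MB plus 32MB per thread); the function name and the comparison against the 2047680k total from the earlier log are illustrative assumptions, not anything taken from Marwin itself.

```python
BASE_HASH_MB = 256        # fixed hash table size stated in the post
PER_THREAD_HASH_MB = 32   # additional hash memory per search thread

def marwin_hash_mb(threads):
    """Hash table memory in MB for a given thread count, per the stated formula."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return BASE_HASH_MB + PER_THREAD_HASH_MB * threads

# Total RAM reported in the server log quoted earlier in the thread, in MB.
TOTAL_RAM_MB = 2047680 / 1024

for threads in (1, 4):
    used = marwin_hash_mb(threads)
    share = 100.0 * used / TOTAL_RAM_MB
    print(f"{threads} thread(s): {used} MB of hash tables, {share:.1f}% of RAM")
```

One thread gives 288 MB, which matches the figure in the post and is in the same ballpark as the 15% rbarreira measured for Marwin2010CC; four threads would give 384 MB.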
||
Title: Re: Server instability? Post by omar on Aug 17th, 2010, 2:12pm Since I had changed all the bots to use only one processor, I was not sure what was causing the problems we've been seeing. But after some experimentation it does seem that the server instability problems are caused by GnoBot2010 and Clueless2010 allocating huge chunks of system memory. GnoBot2010 seems to allocate about 1G of memory. But since only the P1 and P2 versions of GnoBot2010 are enabled and they usually move within seconds, the memory will be freed up soon. Clueless2010 was set to use a max of 2G of memory (the system has only 2G). I've been experimenting with lowering the max memory Clueless is allowed to use. It seems that if I set it to less than 0.9G, Clueless either fails to start up or crashes during the game with a message about not being able to allocate more memory. So I have set the max memory it can use to at least 1G. But since the CC, Fast and Blitz versions hold on to the memory while thinking, it starts to slow the system down and eventually leads to the system freezing, especially if other bots are also running at the same time. So, I have disabled the CC, Fast and Blitz versions of Clueless2010. The P1 and P2 versions are still active since they move much faster and hold the memory for a very short time. Hopefully the system should be more stable now. |
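The numbers in the post above suggest a simple budget check: add up what each running bot may hold and compare it to physical RAM. The sketch below is a rough illustration only. It uses the approximate 1G figures from the post and the 2047680k total from the earlier log; the function `fits_in_ram` and the optional `reserve_mb` parameter are hypothetical, and real usage will be higher once the operating system and web server are counted.

```python
# Total RAM from the server log quoted earlier in the thread, in MB.
TOTAL_RAM_MB = 2047680 / 1024

def fits_in_ram(footprints_mb, total_mb=TOTAL_RAM_MB, reserve_mb=0):
    """Return True if the summed bot footprints plus a reserve fit in RAM."""
    return sum(footprints_mb) + reserve_mb <= total_mb

GNOBOT_MB = 1024    # "about 1G" per the post
CLUELESS_MB = 1024  # max memory set to "at least 1G" per the post

print(fits_in_ram([CLUELESS_MB]))             # one Clueless instance alone
print(fits_in_ram([CLUELESS_MB, GNOBOT_MB]))  # Clueless and GnoBot together
```

A single 1G bot fits on paper, but any two of them together already exceed the 2G of physical memory, which is consistent with the freezes seen when several bots were thinking at the same time.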
||
Title: Re: Server instability? Post by jdb on Aug 17th, 2010, 3:20pm Omar, In order to make clueless use less memory, modify the clueless.cfg file. Something like this would do:
max_search_depth = 100
number_of_cores = 1
number_of_hash_table_entries = 1000000
This should make it use around 512 MB. There is no way to make it go lower than that. If you want multiple copies of bots running on a server with only 2 GB of memory, it is probably a good idea to let developers know if you want a reduced-memory version of a bot. Do not mess with the -Xmx Java option. That is the max memory Java allows clueless to use; it does not reduce the amount of memory clueless tries to use. Modifying the clueless.cfg file will reduce the amount of memory clueless tries to use. |
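For anyone scripting checks against such a file, the three settings above can be read with a small parser. This is a sketch under assumptions: it supposes clueless.cfg holds one `key = value` pair per line with integer values, which is inferred from the post and not from any Clueless documentation, and `parse_cfg` is a made-up helper name.

```python
def parse_cfg(text):
    """Parse 'key = value' lines into a dict of integers, skipping other lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue  # ignore blank lines and anything that is not key = value
        key, _, value = line.partition("=")
        settings[key.strip()] = int(value.strip())
    return settings

# The settings suggested in the post, assumed to sit one per line in the file.
EXAMPLE_CFG = """\
max_search_depth = 100
number_of_cores = 1
number_of_hash_table_entries = 1000000
"""

settings = parse_cfg(EXAMPLE_CFG)
print(settings["number_of_cores"], settings["number_of_hash_table_entries"])
```

A script like this could confirm that every bot on the server is set to one core and the reduced hash table size before games are started.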
||
Title: Re: Server instability? Post by omar on Aug 18th, 2010, 5:14pm Thanks for this info Jeff. I tried it out earlier today, but it seems that 512MB is still too big for the server to handle. I started a game against Clueless2010CC and the server load started creeping up. When the load got close to 3 I also started Clueless2010Fast, and the server hung and had to be rebooted. Next year I'll ask the developers if they can provide an option to control memory usage and allow the bot to play with as little as 50MB of memory, although this won't be required. |
||