Author |
Topic: Server instability? (Read 3530 times) |
|
rbarreira
Forum Guru
Arimaa player #1621
Gender:
Posts: 605
|
|
Server instability?
« on: Jul 30th, 2010, 6:42am » |
|
Maybe it's just me, but both yesterday and today the server seemed to be down for a while (less than an hour). Was it just bad luck, or does it have something to do with the new server? I thought I'd post this in case you hadn't noticed it, Omar. You may want to take a look at the logs.
|
|
|
|
|
Tuks
Forum Guru
Arimaa player #2626
Gender:
Posts: 203
|
|
Re: Server instability?
« Reply #1 on: Jul 30th, 2010, 7:05am » |
|
OK, it isn't just me. I thought my internet was doing strange things because I couldn't log in, even though I could get to the login page.
|
|
|
|
|
omar
Forum Guru
Arimaa player #2
Gender:
Posts: 1003
|
|
Re: Server instability?
« Reply #2 on: Jul 30th, 2010, 9:28am » |
|
rbarreira: I think it happened when you were playing Marwin2010CC. I had it set to use 4 cores and the load shot way up and made the server very slow. I've set the bots to use only 1 core now.
|
« Last Edit: Jul 30th, 2010, 9:29am by omar » |
|
|
|
rbarreira
Forum Guru
Arimaa player #1621
Gender:
Posts: 605
|
|
Re: Server instability?
« Reply #3 on: Jul 30th, 2010, 9:41am » |
|
Neither I nor my bot has played Marwin2010CC for a long time. Today I haven't even played or watched any game. I just tend to keep checking the forum throughout the day; that's how I noticed it was down today.
|
« Last Edit: Jul 30th, 2010, 9:42am by rbarreira » |
|
|
|
rbarreira
Forum Guru
Arimaa player #1621
Gender:
Posts: 605
|
|
Re: Server instability?
« Reply #4 on: Jul 30th, 2010, 2:18pm » |
|
I just saw another crash, this time partial. I was watching this game live:
http://arimaa.com/arimaa/gameroom/comments.cgi?gid=150340
Suddenly onigawara loses on time, but then I find out the server is not responding. For a while, static pages worked fine but dynamic pages didn't (at least not the gameroom or forum). They gave a 500 Internal Server Error. Maybe his loss on time was actually caused by the server crashing.
This is not the same kind of problem I saw earlier today and yesterday, but it does seem like the server is unstable.
|
« Last Edit: Jul 30th, 2010, 2:22pm by rbarreira » |
|
|
|
rbarreira
Forum Guru
Arimaa player #1621
Gender:
Posts: 605
|
|
Re: Server instability?
« Reply #5 on: Aug 5th, 2010, 5:04pm » |
|
The server went down again today, and apparently yesterday as well, as Fritz commented in this game:
http://arimaa.com/arimaa/gameroom/comments.cgi?gid=150730
I hope the server problem gets fixed by September, before the Arimaa Festival.
|
« Last Edit: Aug 5th, 2010, 5:04pm by rbarreira » |
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: Server instability?
« Reply #6 on: Aug 5th, 2010, 9:03pm » |
|
Yes, I experienced the 8/5 outage too, and e-mailed Omar about it. Seems there is some unpleasant debugging to be done.
|
|
|
|
|
rabbits
Forum Guru
Arimaa player #1337
Gender:
Posts: 108
|
|
Re: Server instability?
« Reply #7 on: Aug 14th, 2010, 10:27am » |
|
There was another outage yesterday. I was playing when the server went down on the 5th as well as yesterday. Perhaps playing the 2010 bots still has something to do with it?
|
|
|
|
|
Janzert
Forum Guru
Arimaa player #247
Gender:
Posts: 1016
|
|
Re: Server instability?
« Reply #9 on: Aug 14th, 2010, 4:15pm » |
|
A quick look at the logs of the processes currently running shows only two clueless (java) processes executing, which is correct for the two current blitz games against aaaa. The bot page can incorrectly show extra instances running, as a bot can leave a junk file behind in the event of a crash.
[edit: Having said that though, I have noticed that I seem to reconnect opfor quite a bit more frequently for its postal games since the server switch.]
Janzert
|
« Last Edit: Aug 14th, 2010, 4:29pm by Janzert » |
|
|
|
jdb
Forum Guru
Arimaa player #214
Gender:
Posts: 682
|
|
Re: Server instability?
« Reply #10 on: Aug 14th, 2010, 11:44pm » |
|
on Aug 14th, 2010, 4:15pm, Janzert wrote: [edit: Having said that though, I have noticed that I seem to reconnect opfor quite a bit more frequently for its postal games since the server switch.]
Yes, it is the same for me too. Since the switch I have to reconnect clueless often.
|
|
|
|
|
rbarreira
Forum Guru
Arimaa player #1621
Gender:
Posts: 605
|
|
Re: Server instability?
« Reply #11 on: Aug 15th, 2010, 4:41am » |
|
on Aug 14th, 2010, 4:15pm, Janzert wrote: A quick look at the logs of the processes currently running shows only two clueless (java) processes currently executing which is correct for the two current blitz games against aaaa.
Oh... the logs, almost forgot about those. Well, taking a look at the logs definitely shows a problem:
http://arimaa.com/logs/20100815/09/3802
"Mem: 2047680k total, 2037080k used, 10600k free, 3952k buffers"
"Swap: 4192956k total, 227344k used, 3965612k free, 1023928k cached"
Clearly the server is running out of RAM, even though there are only two bots playing right now. I did a little experiment with Clueless2010CC and Marwin2010CC and they use, respectively, 33% and 15% of the memory. If the P1/P2/Fast/Blitz bots are anything like that, it wouldn't take that many bots to get the server in trouble...
edit - This one's even worse, Clueless2010P2 using 52.7% of the server's RAM:
http://gold.arimaa.com/logs/20100815/13/1601
These transposition tables probably need to be reduced, if there's a config option for that...
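For anyone who wants to check the numbers, here is a minimal Python sketch that pulls the figures out of the two log lines quoted above. The helper name `parse_top_line` is made up for illustration. It also computes a second figure on the assumption that the `cached` value on the Swap line is page cache, as in the `top` layout of that era, so the memory actually held by processes may be lower than the raw "used" number suggests.

```python
import re

def parse_top_line(line):
    """Map each label on a top summary line (total, used, ...) to kilobytes."""
    return {label: int(kb) for kb, label in re.findall(r"(\d+)k (\w+)", line)}

# The two lines quoted from the server log.
mem = parse_top_line("Mem: 2047680k total, 2037080k used, 10600k free, 3952k buffers")
swap = parse_top_line("Swap: 4192956k total, 227344k used, 3965612k free, 1023928k cached")

# Raw share of RAM reported as used.
used_fraction = mem["used"] / mem["total"]

# Assumption: "cached" is page cache, so subtract it and the buffers
# to estimate what running processes are really holding.
app_kb = mem["used"] - mem["buffers"] - swap["cached"]

print(f"reported used: {used_fraction:.1%}")
print(f"used minus buffers/cache: {app_kb}k")
```

Either way, with swap already in use, the box has very little headroom for another large bot.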
|
« Last Edit: Aug 15th, 2010, 8:17am by rbarreira » |
|
|
|
tize
Forum Guru
Arimaa player #3121
Gender:
Posts: 118
|
|
Re: Server instability?
« Reply #12 on: Aug 15th, 2010, 2:38pm » |
|
on Aug 15th, 2010, 4:41am, rbarreira wrote: I did a little experiment with Clueless2010CC and Marwin2010CC and they use, respectively, 33% and 15% of the memory. If the P1/P2/Fast/Blitz bots are anything like that, it wouldn't take that many bots to get the server in trouble... ... These transposition tables probably need to be reduced, if there's a config option for that...
All variants of Marwin are the same when it comes to memory usage: they use 256MB + 32MB per thread for hash tables. So a serial version would use 288MB for hashes, even when it is just searching P1. I will remember this and add switches for the memory usage for next year.
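The arithmetic above can be written out as a small Python sketch. The function name `marwin_hash_mb` is hypothetical; only the 256 MB base and 32 MB per thread come from the post, and this counts hash tables only, not the rest of the process.

```python
def marwin_hash_mb(threads, base_mb=256, per_thread_mb=32):
    """Hash-table memory in MB, per the figures in the post:
    a fixed 256 MB plus 32 MB for each search thread."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return base_mb + per_thread_mb * threads

for threads in (1, 4):
    print(threads, "thread(s):", marwin_hash_mb(threads), "MB")
```

So dropping from 4 cores to 1 only saves 96 MB of hash memory; most of the footprint is the fixed part.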
|
|
|
|
|
omar
Forum Guru
Arimaa player #2
Gender:
Posts: 1003
|
|
Re: Server instability?
« Reply #13 on: Aug 17th, 2010, 2:12pm » |
|
Since I had changed all the bots to use only one processor, I was not sure what was causing the problems we've been seeing. But after some experimentation it does seem that the server instability problems are caused by GnoBot2010 and Clueless2010 allocating huge chunks of system memory.
GnoBot2010 seems to allocate about 1G of memory. But since only the P1 and P2 versions of GnoBot2010 are enabled and they usually move within seconds, the memory is freed up soon.
Clueless2010 was set to use a max of 2G of memory (the system has only 2G). I've been experimenting with lowering the max memory Clueless is allowed to use. It seems that if I set it to less than 0.9G, Clueless either fails to start up or crashes during the game with a message about not being able to allocate more memory. So I have set the max memory it can use to at least 1G. But since the CC, Fast and Blitz versions hold on to the memory while thinking, it starts to slow the system down and eventually leads to the system freezing, especially if other bots are also running at the same time.
So, I have disabled the CC, Fast and Blitz versions of Clueless2010. The P1 and P2 versions are still active since they move much faster and hold the memory for a very short time. Hopefully the system should be more stable now.
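As a rough illustration of why a few bots running together are enough to cause trouble, here is a back-of-the-envelope Python sketch. The figures are the approximate ones mentioned in this thread, the bot names are shortened, and `fits_in_ram` is a made-up helper. It ignores what the operating system and web server need, so the real limit is tighter.

```python
SERVER_RAM_MB = 2048  # the system has only 2G

# Approximate per-bot figures from the thread, in MB (rough estimates only).
BOT_MEMORY_MB = {
    "GnoBot2010": 1024,    # "about 1G"
    "Clueless2010": 1024,  # max memory after lowering it to 1G
    "Marwin2010": 288,     # 256 MB + 32 MB for one thread (hash tables only)
}

def fits_in_ram(running_bots, ram_mb=SERVER_RAM_MB):
    """Return (MB needed, whether that fits) for bots running at the same time."""
    needed = sum(BOT_MEMORY_MB[name] for name in running_bots)
    return needed, needed <= ram_mb

print(fits_in_ram(["Clueless2010", "Marwin2010"]))
print(fits_in_ram(["Clueless2010", "GnoBot2010", "Marwin2010"]))
```

With these numbers, one Clueless plus one GnoBot already equals the whole 2G before anything else is counted.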
|
|
|
|
|
jdb
Forum Guru
Arimaa player #214
Gender:
Posts: 682
|
|
Re: Server instability?
« Reply #14 on: Aug 17th, 2010, 3:20pm » |
|
Omar,
In order to make clueless use less memory, modify the clueless.cfg file. Something like this would do:
max_search_depth = 100
number_of_cores = 1
number_of_hash_table_entries = 1000000
This should make it use around 512 MB. There is no way to make it go less than that.
If you want multiple copies of bots running on a server with only 2 GB of memory, it is probably a good idea to let developers know if you want a reduced memory version of a bot.
Do not mess with the -Xmx java option. That is the max memory java allows clueless to use; it does not reduce the amount of memory clueless tries to use. Modifying the clueless.cfg file will reduce the amount of memory clueless tries to use.
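The difference between the two knobs can be shown with a toy Python model. This is not how the JVM or clueless is implemented; `run_bot` and `BotOutOfMemory` are invented names, and the 900 MB demand is only a guess based on the reported failures below 0.9G. The point is that the cap decides whether the bot fails, while the config decides how much it asks for.

```python
class BotOutOfMemory(Exception):
    """Stands in for the 'cannot allocate more memory' failure."""

def run_bot(wanted_mb, heap_cap_mb):
    """Return the MB actually used, or raise if the cap is below the demand.

    wanted_mb   - what the bot tries to allocate (set by its config file)
    heap_cap_mb - the ceiling imposed from outside (the -Xmx style limit)
    """
    if wanted_mb > heap_cap_mb:
        raise BotOutOfMemory(
            f"wants {wanted_mb} MB but heap is capped at {heap_cap_mb} MB"
        )
    return wanted_mb

# Lowering only the cap: the demand is unchanged, so the bot just fails.
try:
    run_bot(wanted_mb=900, heap_cap_mb=800)
except BotOutOfMemory as err:
    print("crash:", err)

# Lowering the demand via the config: the same cap is now plenty.
print("uses", run_bot(wanted_mb=512, heap_cap_mb=800), "MB")
```

In this model a generous cap costs nothing by itself, since only the demand determines how much memory ends up in use.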
|
« Last Edit: Aug 17th, 2010, 3:24pm by jdb » |
|
|
|
|