Author |
Topic: Freezing evaluation (Read 1515 times) |
|
haizhi
Forum Senior Member
Arimaa player #350
Posts: 45
|
|
Freezing evaluation
« on: Jun 26th, 2005, 12:33am » |
I am building the evaluation function now, and any discussion on this topic is welcome. My question is: how should we handle frozen pieces in the evaluation? The most complete static context I can think of is [FreezerLevel]*[FreezeeLevel]*[FreezerPosition]*[FreezeePosition]. The number of valid, unique combinations is roughly 15*32*4, which may be OK for TD(lambda) but is too big for manual tuning. My humble opinion is: 1) We can keep [FreezeePosition] and ignore [FreezerPosition]; it doesn't make a big difference. 2) If the freeze happens in the last 3 ranks, it is a big deal; if it happens on the rest of the board, the position doesn't make a big difference. 3) The levels of the freezer and freezee do matter, so we should keep them. So it ends up as 15*(4*3+1), which is much better. Another way is to take control of the nearest trap into consideration, but that may be too dynamic for an evaluation. And I think we should do something special for E-h and E-m, like the elephant blockade; they are so powerful that they are worth special treatment.
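For anyone checking the table sizes, here is a small Python sketch of the arithmetic. It rests on assumptions that the post does not spell out: that 15 is the number of (weaker, stronger) pairs among the six Arimaa piece types, that 4*3 means four left-right mirrored files times three ranks, and that +1 is one shared bucket for the rest of the board. The factors 32 and 4 are taken from the post as given.

```python
from itertools import combinations

# Arimaa piece types, weakest to strongest.
PIECES = ["rabbit", "cat", "dog", "horse", "camel", "elephant"]

# A piece can only be frozen by a strictly stronger enemy piece, so the
# valid (freezee, freezer) level pairs are the 2-element combinations.
level_pairs = list(combinations(PIECES, 2))

# Full context from the post: level pairs * 32 * 4.
full_table = len(level_pairs) * 32 * 4

# Reduced context: level pairs * (4 files * 3 ranks + 1 "elsewhere" bucket).
reduced_table = len(level_pairs) * (4 * 3 + 1)

print(len(level_pairs), full_table, reduced_table)
```

Under these assumptions the reduction is from 1920 weights to 195, which is roughly a tenfold cut in what has to be tuned.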
|
haizhi
Forum Senior Member
Arimaa player #350
Posts: 45
|
|
Re: Freezing evaluation
« Reply #1 on: Jun 26th, 2005, 12:47am » |
On second thought, the level of the freezer doesn't matter: who cares whether the hero that freezes a rabbit is a dog or an elf? Hmm, now it is 5*(4*3+1).
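A table of size 5*(4*3+1) could be indexed roughly as in the sketch below. Everything named here is hypothetical: `freeze_feature_index`, the mirroring of files onto four columns, and the `rank_from_edge` argument, which counts ranks from whichever edge "the last 3 ranks" refers to (the post leaves that open).

```python
# The five piece types that can be frozen (an elephant never is).
FREEZEE_LEVELS = ["rabbit", "cat", "dog", "horse", "camel"]

DEEP_RANKS = 3                           # "last 3 ranks"; tunable assumption
BUCKETS_PER_LEVEL = 4 * DEEP_RANKS + 1   # 12 positional buckets + 1 "elsewhere"
TABLE_SIZE = len(FREEZEE_LEVELS) * BUCKETS_PER_LEVEL


def freeze_feature_index(freezee: str, file: int, rank_from_edge: int) -> int:
    """Map a frozen piece to a slot in the weight table.

    file: 0..7 (a..h); rank_from_edge: 0..7, where 0 is the edge rank of
    the zone the post calls the last ranks.
    """
    level = FREEZEE_LEVELS.index(freezee)
    if rank_from_edge < DEEP_RANKS:
        mirrored_file = min(file, 7 - file)        # fold a..h onto 0..3
        bucket = rank_from_edge * 4 + mirrored_file
    else:
        bucket = 4 * DEEP_RANKS                    # single bucket for the rest
    return level * BUCKETS_PER_LEVEL + bucket


weights = [0.0] * TABLE_SIZE
penalty = weights[freeze_feature_index("rabbit", 2, 1)]
```

Keeping the zone depth in one constant makes it cheap to try a different number of ranks later; the table size changes with it.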
|
haizhi
Forum Senior Member
Arimaa player #350
Posts: 45
|
|
Re: Freezing evaluation
« Reply #2 on: Jun 26th, 2005, 3:41am » |
Maybe it should be the last 4 ranks?
|
99of9
Forum Guru
Gnobby's creator (player #314)
Posts: 1413
|
|
Re: Freezing evaluation
« Reply #3 on: Jun 26th, 2005, 4:53am » |
Gnobot has a few very special 2-piece correlation penalty situations, but only hand-coded ones that I knew about. Added to that, it has a few general relationships (e.g., if the opponent's silver camel is 1 square south of your gold elephant, you're doing well). Being frozen is just considered to worsen whatever your situation is already. I think control of the nearest trap is also important, but Gnobot doesn't include that.
|
haizhi
Forum Senior Member
Arimaa player #350
Posts: 45
|
|
Re: Freezing evaluation
« Reply #4 on: Jun 26th, 2005, 6:15am » |
Thank you for sharing your experience, 99. I just put the frozen feature into the program, and it doesn't seem to work. I must start TD(lambda) as soon as possible. I spent several days trying to manually tune the feature weights and didn't achieve much. The feature weights become more and more... Hopefully TD(lambda) is the savior.
|
99of9
Forum Guru
Gnobby's creator (player #314)
Posts: 1413
|
|
Re: Freezing evaluation
« Reply #5 on: Jun 26th, 2005, 6:29am » |
How many games do you expect Hazebot will have to play for TD(lambda) to work?
|
haizhi
Forum Senior Member
Arimaa player #350
Posts: 45
|
|
Re: Freezing evaluation
« Reply #6 on: Jun 26th, 2005, 3:58pm » |
I think to make it work I need hundreds of games at least, but I was hoping I could see some results in 10 games... The problem is that even if hundreds of games are affordable time-wise, if I want to change anything in the eval, I have to rerun it all again. That is scary. What if I find I have to put something more into the eval? When they used TD on Chinook, the program and its eval were already well finished.
|
haizhi
Forum Senior Member
Arimaa player #350
Posts: 45
|
|
Re: Freezing evaluation
« Reply #7 on: Jun 27th, 2005, 9:53pm » |
I just read the paper on using TD in a checkers program. Here are some conclusions: 1) Yes, it can be done just by playing the program against itself; no strong trainer is needed. 2) After about 5000 games the weights became stable. The learning speed depends somewhat on how many weights there are. Setting different lambda values may help to make it faster. 3) It is sensitive to search depth. If you use weights from 8-ply TD training in a 12-ply game, they do not work as well as weights trained with 12-ply search. This is not surprising, but still bad.
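For reference, a minimal sketch of what one game's worth of TD(lambda) updates looks like for a linear evaluation (score = sum of weight * feature). This is the plain textbook form with accumulating eligibility traces and no discounting, not necessarily the variant used in the checkers paper, and it ignores search entirely. The names `td_lambda_update`, `alpha` and `lam` are illustrative assumptions.

```python
def td_lambda_update(weights, feature_seq, final_reward, alpha=0.01, lam=0.7):
    """Apply online TD(lambda) over one game and return the new weights.

    weights:      list of floats, one per feature
    feature_seq:  list of feature vectors, one per position in the game
    final_reward: game result from the learner's point of view
    alpha:        step size; lam: trace decay (the lambda in TD(lambda))
    """
    weights = list(weights)              # do not modify the caller's list
    trace = [0.0] * len(weights)         # eligibility trace per weight

    def value(feats):
        return sum(w * x for w, x in zip(weights, feats))

    for t, feats in enumerate(feature_seq):
        current = value(feats)
        # Target is the next position's evaluation, or the result at game end.
        if t + 1 < len(feature_seq):
            target = value(feature_seq[t + 1])
        else:
            target = final_reward
        delta = target - current
        # For a linear eval, the gradient of the value is the feature vector.
        trace = [lam * e + x for e, x in zip(trace, feats)]
        weights = [w + alpha * delta * e for w, e in zip(weights, trace)]
    return weights


new_w = td_lambda_update([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], 1.0,
                         alpha=0.5, lam=0.5)
```

In the two-position example, the only prediction error occurs at the final position, and the first feature still receives half of the update through the trace.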
|
99of9
Forum Guru
Gnobby's creator (player #314)
Posts: 1413
|
|
Re: Freezing evaluation
« Reply #8 on: Jun 28th, 2005, 12:34am » |
Oh my... that is a lot of games every time you want to change your evaluation function! I'd suggest working very hard on getting all the features you want in your eval now!! Can TD be used based on the games in the database rather than self-play games? There are >15000 games in the database already.
|
haizhi
Forum Senior Member
Arimaa player #350
Posts: 45
|
|
Re: Freezing evaluation
« Reply #9 on: Jun 28th, 2005, 1:57am » |
on Jun 28th, 2005, 12:34am, 99of9 wrote: Oh my... that is a lot of games every time you want to change your evaluation function! I'd suggest working very hard on getting all the features you want in your eval now!! Can TD be used based on the games in the database rather than self-play games? There are >15000 games in the database already.
Currently I already have 400 feature weights. For half of them I don't have any clue what the proper value should be, or even whether it should be positive or negative, and I find that after adding some new features the performance of my bot becomes worse! Manually tuning feature weights is really a nightmare job for me. I think we can find a way to run TD(lambda) over the database, but I don't know how, and I doubt that doing so will save any time. I am wondering whether setting a big lambda value would make the weights update faster. Manually modifying some values during the process (say, after 20 games), based on their tendency, might help too.
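To get a feel for what a bigger lambda does, here is a tiny sketch under the usual textbook formulation with accumulating traces and no discounting (an assumption about the setup, not something from the thread). In that formulation, a prediction error at the current position is credited to the position k moves earlier with weight lambda**k; how far the weights move per update is normally set by a separate step size.

```python
def credit_weights(lam, steps):
    """Share of a prediction error credited to positions 0..steps-1 moves back."""
    return [lam ** k for k in range(steps)]


# Compare how far back credit reaches for a small and a large lambda.
for lam in (0.5, 0.9):
    shares = credit_weights(lam, 11)
    print(lam, [round(s, 3) for s in shares])
```

With a large lambda, positions many moves back still receive a noticeable share of each error; with a small one, the share fades within a few moves.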
|