Author |
Topic: Haizhi's Arimaa Project (Read 5364 times) |
|
99of9
Forum Guru
Gnobby's creator (player #314)
Gender:
Posts: 1413
|
|
Re: Haizhi's Arimaa Project
« Reply #15 on: Jun 10th, 2005, 12:49am » |
Quote Modify
|
on Jun 8th, 2005, 8:32am, jdb wrote:There are a few other types of moves that can be called second rate, even in the opening. Moves likes captures and freezing an enemy piece are usually worth investigating. Here is a list of types of moves one could prune/sort moves on: (in no particular order) a) does it capture a piece? b) does it suicide a piece? c) is it a goal threat? d) does it allow an enemy goal threat? e) does it advance a rabbit? f) does it advance an enemy rabbit? g) how does the move effect elephant mobility? h) does it place a major piece in a risky location? i) does it threaten an enemy major piece? j) does it threaten to capture an enemy piece? k) does it allow the enemy to threaten an enemy piece? l) does it send a minor piece off the back three ranks? m) reversible moves This list is by no means exhaustive. All of these items can be approximated relatively quickly and would help to focus the search on moves that are more likely to be reasonable. |
| n) does it reduce or increase the number of pieces guarding a home trap? o) does it reduce or increase the number of pieces attacking an enemy trap? (0 is normal, 1 non-elephant is usually bad, 2 is usually an interesting idea, 3 is usually good.) Quote: For example, in the opening sending a piece (other than the elephant) down the board on its own, could be flagged as highly questionable and given only brief consideration. |
| How does "brief consideration" work? I once tried putting something in where those lines were simply searched less deeply. The problem was, in a position where loss of a camel was forced (at depth 8 ), my bot would suicide rabbits... because given that suiciding a rabbit is a "bad thing", those lines were only searched to depth 6 say, and the evaluation was still better than a lost camel. I suppose I need some rule to say that when you reach a leaf node of a "bad" line, if the evaluation is not good, it is automatically very bad?
|
« Last Edit: Jun 10th, 2005, 12:50am by 99of9 » |
IP Logged |
|
|
|
99of9
Forum Guru
Gnobby's creator (player #314)
Gender:
Posts: 1413
|
|
Re: Haizhi's Arimaa Project
« Reply #16 on: Jun 10th, 2005, 3:02am » |
Quote Modify
|
And even if the evaluation is good at that leaf, you have to expand the leaf to the correct depth to make sure it's still good?
|
|
IP Logged |
|
|
|
jdb
Forum Guru
Arimaa player #214
Gender:
Posts: 682
|
|
Re: Haizhi's Arimaa Project
« Reply #17 on: Jun 10th, 2005, 8:34am » |
Quote Modify
|
Quote:How does "brief consideration" work? |
| This is what I am working on in clueless now, so these ideas are still under development. Take the current position and determine all possible moves. Now apply whatever ranking criteria (for example the list a thru o) Start searching with the best moves. This is implemented as a layer on top of the alpha/beta search. If you get a good move on the shallow search, search it deeper. This way a poor move doesn't get alot of time spent on it. True, it could miss some flashy sacrifice, but those types of moves don't seem to be too common in arimaa. A possible fix for suiciding rabbits: // Check for windmill // 1) Black elephant is touching trap // 2) >=2 black pieces touching trap // 3) =1 white defender touching trap // 4) >=1 white frozen 2 steps from trap If this condition exists, the white defender is probably committing suicide. Clueless penalizes this big time.
|
|
IP Logged |
|
|
|
99of9
Forum Guru
Gnobby's creator (player #314)
Gender:
Posts: 1413
|
|
Re: Haizhi's Arimaa Project
« Reply #18 on: Jun 10th, 2005, 9:25am » |
Quote Modify
|
on Jun 10th, 2005, 8:34am, jdb wrote:True, it could miss some flashy sacrifice, but those types of moves don't seem to be too common in arimaa. |
| I think the most important sacrifice bots need to know about is when to leave a framed piece. (Bots think that is a major material sacrifice... which it kind of is, but it has to be done - you just have to carefully choose when to do it.)
|
|
IP Logged |
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: Haizhi's Arimaa Project
« Reply #19 on: Jun 15th, 2005, 12:54pm » |
Quote Modify
|
Thanks for the explanation of forward pruning and its drawbacks, Haizhi. Also thanks for clarifying that your target is more like 16 steps than 16 moves. I am very interested to see how your fast bot performs relative to the slower bots with more built-in knowledge.
|
|
IP Logged |
|
|
|
RonWeasley
Forum Guru
Harry's friend (Arimaa player #441)
Gender:
Posts: 882
|
|
Re: Haizhi's Arimaa Project
« Reply #20 on: Jun 27th, 2005, 12:19pm » |
Quote Modify
|
In a separate thread, Arimanator writes about accepting trades when ahead and declining trades when behind. Muggles know about this, but it's not on jdb's list. Do any of the bots consider this?
|
|
IP Logged |
|
|
|
jdb
Forum Guru
Arimaa player #214
Gender:
Posts: 682
|
|
Re: Haizhi's Arimaa Project
« Reply #21 on: Jun 27th, 2005, 1:20pm » |
Quote Modify
|
Quote:...accepting trades when ahead and declining trades when behind... |
| I think there is more to it than that. Once the board clears, the number of pieces becomes more important and their actual rank less so. Trading down too many pieces could give the weaker side a chance to slip a rabbit through. I would be interested to hear what others think is the correct strategy for a material advantage.
|
|
IP Logged |
|
|
|
haizhi
Forum Senior Member
Arimaa player #350
Gender:
Posts: 45
|
|
Re: Haizhi's Arimaa Project
« Reply #22 on: Jun 27th, 2005, 8:51pm » |
Quote Modify
|
That is the "dynamic piece value" idea. I think Clauchau brought this idea long time ago, but so far nobody try much on it. I doubt even Deepblue had that feature, it is much harder than looks like to implement and tune.
|
|
IP Logged |
|
|
|
99of9
Forum Guru
Gnobby's creator (player #314)
Gender:
Posts: 1413
|
|
Re: Haizhi's Arimaa Project
« Reply #23 on: Jun 27th, 2005, 9:24pm » |
Quote Modify
|
on Jun 27th, 2005, 8:51pm, haizhi wrote:That is the "dynamic piece value" idea. I think Clauchau brought this idea long time ago, but so far nobody try much on it. I doubt even Deepblue had that feature, it is much harder than looks like to implement and tune. |
| From memory dynamic piece value was about the fact that when MH is traded with MH, horses now take on the role of camel in all evaulation. Similarly, if your opponent has lost HHDD, your D's are worth exactly the same as your H's. That kind of stuff - anything based on the fact that an arimaa piece's power is only relative to the power of the other pieces on the board. I agree that is difficult to implement. However if I understand right, Jeff is talking about something different, which is much easier to implement: Consider the trade: get an M, lose CC. Should you do it? The answer is: Sometimes. At the start of the game, getting a camel is definitely a good thing. A camel is thus nominally "worth" more than C+C. BUT Just say these were the only pieces left on the board, apart from the elephants, and say 2 rabbits each. Now the exchange is much more dubious, and I'd say usually the wrong thing to do. The reason is that C+C has the advantage that it is 2 pieces not one, and can play a role in 2 different regions of the board. To put this term into an evaluation function is easy. To get it exactly right may be a little harder . Basically you just add a term f(N) which is a function of the number of pieces this player has on the board. As long as the gradient of this term is positive, and decreasing with increasing N - I think you'll be in the right ballpark. For example: f(N) = -A/N where A is a positive constant and N is the number of pieces you have on the board. positive gradient = better to have more pieces decreasing gradient = this matters less when lots of pieces are on the board, and more when less pieces are on it Given that you will use TD(lamda), an extra parameter won't hurt you, so perhaps raise N to the power of P in that equation, and let us know what the best values of A and P turn out to be .
|
|
IP Logged |
|
|
|
haizhi
Forum Senior Member
Arimaa player #350
Gender:
Posts: 45
|
|
Re: Haizhi's Arimaa Project
« Reply #24 on: Jun 27th, 2005, 10:56pm » |
Quote Modify
|
Good idea. "-A/N" is ok for TD, but I am afraid we cannot get the best "P" value, because it is not a feature weight at all. If we have to set it down, what is your best guess, 99? I need to think more about your equation, it is just seems too simple to work well.
|
|
IP Logged |
|
|
|
99of9
Forum Guru
Gnobby's creator (player #314)
Gender:
Posts: 1413
|
|
Re: Haizhi's Arimaa Project
« Reply #25 on: Jun 28th, 2005, 12:26am » |
Quote Modify
|
on Jun 27th, 2005, 10:56pm, haizhi wrote:"-A/N" is ok for TD, but I am afraid we cannot get the best "P" value, because it is not a feature weight at all. If we have to set it down, what is your best guess, 99? |
| Ohh... so weights aren't just parameters, they literally have to be a linear prefactor in TD(l)?? My best guess is P=1.0, but I don't really have a strong sense - it could be anywhere from 0.1 to 4.0 that works best. Quote:I need to think more about your equation, it is just seems too simple to work well. |
| Haha, don't be scared of simplicity! Another simple function you could consider is f(N) = A log_b(N) which also obeys the increasing with decreasing gradient rule. This one returns (-infinity) at N=0... so naturally knows that the game is over . Use b~2.
|
« Last Edit: Jun 28th, 2005, 12:30am by 99of9 » |
IP Logged |
|
|
|
haizhi
Forum Senior Member
Arimaa player #350
Gender:
Posts: 45
|
|
Re: Haizhi's Arimaa Project
« Reply #26 on: Jun 28th, 2005, 2:11am » |
Quote Modify
|
on Jun 28th, 2005, 12:26am, 99of9 wrote: Ohh... so weights aren't just parameters, they literally have to be a linear prefactor in TD(l)?? My best guess is P=1.0, but I don't really have a strong sense - it could be anywhere from 0.1 to 4.0 that works best. |
| Yap. The "P" value is not something can be meatured in diffrent positions, so TD() cannot update its value based on evaluation score. I think I have a problem of a strong tendency to make every design "complete". To me, the above idea "natually" leads to a 16*16 matrix, which is a little too big... That is why I have 400 feature weights and the number is increasing. Comparately Jonathan's Chinook only has 23 features. I got to study thinking simply....
|
|
IP Logged |
|
|
|
haizhi
Forum Senior Member
Arimaa player #350
Gender:
Posts: 45
|
|
Re: Haizhi's Arimaa Project
« Reply #27 on: Jun 28th, 2005, 2:26am » |
Quote Modify
|
Maybe I did not make the TD() clear, I don't what mislead you. To use TD, a feature must can be meatured in any given board positions, for example: in position (A) there is a rabbit at rank 7, if there is a fearture called "rabbit at rank 7", the number of that featured will be marked one, and the TD() will adjust the weight of this feature based on the eva score of the postion (A). Of couse we can have features much more complicate. Clearly the "N" value is also a legal feature. About the "P" value, it doesn't has anything to do with the board positions, so in TD() it will be treated as a constant.
|
|
IP Logged |
|
|
|
99of9
Forum Guru
Gnobby's creator (player #314)
Gender:
Posts: 1413
|
|
Re: Haizhi's Arimaa Project
« Reply #28 on: Jun 28th, 2005, 2:49am » |
Quote Modify
|
I think I'll just have to take your word for it. In my thinking P is like a weight, I'm not claiming it's like a feature. Weights are multiplied by the number of times a feature occurs to get a score. P, instead of multiplying, is simply the power to which the number of features is raised. Both are constants for a given evaluation. My question comes down to whether TD() is a linear model only, or allows fitting of a non-linear model such as a power. For that matter -A/N is already nonlinear in N, because it is inverse. Are you sure you can deal with this, even when P is set to be a constant? In this case the feature is simply "white piece". N is the number of white pieces. Usually a linear model would then say that the eval for these features is: E = wN where w is the weight of this feature. I understand that you could turn it into a linear model by making 16*16 features. The first called "one white piece and 16 black pieces", where the number of occurrences of this feature is always either 0 or 1. But as you say, that overcomplicates things by adding hundreds of tuneable constants. If non-linear TD has not yet been invented, I suggest that this should be the centrepiece of your arimaa-beating bot . I'm sure that would win you a masters degree.
|
|
IP Logged |
|
|
|
haizhi
Forum Senior Member
Arimaa player #350
Gender:
Posts: 45
|
|
Re: Haizhi's Arimaa Project
« Reply #29 on: Jun 28th, 2005, 3:28am » |
Quote Modify
|
You are right, I was wrong, the number "A" is also not a feature and doesn't suit TD(). The probelm is not linear or not, is can this "feature" be measured in a given position. Give my any board position I can tell you how many rabbits in rank 7, but I cannot tell how many "P" or "A" occured in this position, they don't change based on the postions. The only thing here can be use as a feature is "N", and we can directly set 16 or 16*16(consider both side) weights accordingly, and use TD() to tune these weights.
|
|
IP Logged |
|
|
|
|