Author |
Topic: Changing your mind and managing your time (Read 2917 times) |
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Changing your mind and managing your time
« on: Jan 12th, 2006, 12:09pm » |
Quote Modify
|
In the comments of game 23257, 99of9 posted the evaluation according to Gnobot of the position before move 20b: Quote: 2 0.3 -5046 e6-f6 e7-f7 3 0.4 -4992 e6-e5 f5-f4 e5-e6 4 0.4 -4976 e6-e5 f5-f4 e5-e6 g4-g3 5 0.5 -5021 e6-e5 f5-f4 g4-g3 f4-g4 h3-h2 6 0.6 -5083 e6-e5 d6-e6 f5-f6 f6-f7 g4-g3 g5-g4 7 1.0 -5243 e6-e5 f5-f4 e5-e6 g4-g3 c4-d4 b4-c4 c1-d1 8 38.9 -8028 e6-f6 d6-e6 f5-f4 f6-f7 h3-g3 g3-f3 f3-g3 f4-f3 9 41.9 -8034 e6-f6 d6-e6 f5-f4 f6-f7 h3-g3 g3-f3 f3-g3 f4-f3 b6-b7 10 45.9 -8028 e6-f6 d6-e6 f5-f4 f6-f7 h3-g3 g3-f3 f3-g3 f4-f3 d8-e8 e8-d8 11 77.1 -8016 e6-f6 d6-e6 f5-f4 f6-f7 h3-g3 g3-f3 f3-g3 f4-f3 e6-e5 e5-e4 e4-f4 12 263.4 -7931 e6-f6 e7-f7 f5-f4 f6-g6 h3-g3 g3-f3 f3-g3 f4-f3 f7-g7 g6-h6 h6-g6 h5-h6 At the moment, 8 ply takes 38.9 seconds on my laptop. Note that 7 ply only takes 1.0 second. This is because it has to totally change plan when it sees your goal threat, and all the move ordering help provided by searching 7 ply is totally useless. |
| I have no experience coding a bot, but I'm interested in the effects of a bot changing its mind. My understanding is that good move ordering is critical to the speed of alpha-beta searching, so most programs use iterative deepening. Each level of search costs some time to complete, but saves even more time on the next level of search by providing good move suggestions. In this case, is Gnobot's 7-step search literally "totally useless", i.e. the 8-step search would have taken exactly as long to complete from scratch? Might the 7-step search have been somewhat useful? Is it possible that it was actively detrimental, i.e. it provided move suggestions which were worse than random moves, and thus slowed down the 8-step search? Since there is a huge time loss associated with a bot changing its mind at a new level of search depth, I would think that bot developers would want to prevent their bots from switching plans, for example by accepting a more time-consuming static evaluation that somehow anticipates what is to come. Yet the article at http://supertech.lcs.mit.edu/~heinz/dt/node50.html (already referenced in a couple of other threads) suggests that the increase in playing strength by searching an extra step is directly proportional to the probability of the bot changing its mind at that step. As a consequence, the more likely a bot is to change its mind at a certain depth, the more important it is to search that depth. A practical application might be this: If Gnobot changes its mind on step 8 (from what it thought on step 7) 50% of the time, but changes its mind on step 9 (from what it thought on step 8) only 10% of the time, then deepening to 9 steps provides only 1/5 of the stength boost that deepening to 8 steps did. Therefore (perhaps) for time management purposes Gnobot should always complete the 8-step search unless it is out of time, and always forgo the 9-step search unless it has a full reserve, so that it can have enough reserve time for 8-step searches later. In general, different bots might have different good breakpoints of search depth. For example, if I remember correctly, bot_haizhi implemented 3-step static goal and trap evaluation, so completing a 5-step or a 9-step search might be an important breakpoint for bot_haizhi's time management. A second question I have is whether iterative deepening in Arimaa should be applied only at the breakpoints where it is likely for a bot to change its mind. If there is a fair probability that Gnobot's 7-step search will indeed be totally useless in searching to the eighth step, might it pay to instead do iterative deepening hitting only depths of 4n, i.e. do a 4-step search, then an 8-step search, then a 12-step search? If that turns out to be too radical, then one could at least consider that two steps are necessary for any capture, and therefore do iterative deepening only at depths of 2n. I apologize if these thoughts are way off base. If they have any merit at all, though, I'm curious what real developers think about them.
|
|
IP Logged |
|
|
|
Ryan_Cable
Forum Guru
Arimaa player #951
Gender:
Posts: 138
|
|
Re: Changing your mind and managing your time
« Reply #1 on: Jan 12th, 2006, 10:47pm » |
Quote Modify
|
Interesting, I always wondered why the fixed ply bots would have order of magnitude differences in calculation time for positions that looked qualitatively similar. It seems to me that the whole point of searching deeper is to find a better move, so I am not sure how you could improve a bot by reducing its likelihood to switch lines. The only thing I can think of is if the bot is repeatedly oscillating between lines, but that seems unlikely to cause nearly as large a delay as searching a completely new line. Has anyone done any empirical research into how useful the non 4n evaluations are? If the 4n+1 search causes a change, how likely is it to actually be a better move (I guess this would require a human opinion of the moves). If the 4n+1 search causes a change, how likely is it that the search will keep the new move at 4n+4; how likely that it will go back to the old move? Could there be any value in creating different but related evaluations for each of the step levels, or would this ruin the move orderings produced by the evaluations? PS If Gnobby saw the goal threat why didn’t it block it? Did it just run out of time and move randomly? Does Gnobby have any static goal evaluation? Would it be of any value to run static goal evaluations at the very beginning of the move and see that Fritzlein has a goal threat and Gnobby doesn’t? Then you could have some sort of mode that prioritizes goal defense moves in the search. PPS Is everyone just using standard Alpha-Beta with heuristics and iterative deepening, or does anyone use something more advanced like http://en.wikipedia.org/wiki/MTD-f ?
|
|
IP Logged |
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Gender:
Posts: 5928
|
|
Re: Changing your mind and managing your time
« Reply #2 on: Jan 13th, 2006, 12:10pm » |
Quote Modify
|
on Jan 12th, 2006, 10:47pm, Ryan_Cable wrote:PS If Gnobby saw the goal threat why didn’t it block it? Did it just run out of time and move randomly? Does Gnobby have any static goal evaluation? |
| If I remember correctly, Gnobot ran its time down to 15 seconds, i.e. it did run out of time and give the best move it had discovered to that point. I'm guessing that, given the server it was running on, and the load at the time, the 95 seconds weren't enough to complete an 8-step search or else it would certainly have blocked the goal. I'm also guessing that Gnobot has no static goal evaluation whatsoever, not even 1-step, or else it would have seen the problem in the 7-step search.
|
« Last Edit: Jan 13th, 2006, 2:13pm by Fritzlein » |
IP Logged |
|
|
|
ntroncos
Forum Junior Member
Weiser's trainer
Gender:
Posts: 9
|
|
Re: Changing your mind and managing your time
« Reply #3 on: Jan 24th, 2006, 9:13am » |
Quote Modify
|
on Jan 12th, 2006, 10:47pm, Ryan_Cable wrote: PPS Is everyone just using standard Alpha-Beta with heuristics and iterative deepening, or does anyone use something more advanced like http://en.wikipedia.org/wiki/MTD-f ? |
| Last year i tried to attend that issue in weiser. We used an incomplete search function based on the hipotesis that much of the search is useless. Until know we have no out preformed the complete search bots (alpha-beta search). In each step we putted a fixed search apart from the incomplete search. weiser thinking: alpha-beta search 2ply depth (very fast), if he wins or looses he move ocordingly. if niether he does the incomplete search. Then he makes theincomplete search for aslong as he has time. The fixed search helped so that he didnt loose stupidly, but the use of an incomplete search funtion exposes the posibility that he never sees that he will loose in the next ply. (or that in some bizzar way its good to through his elephant in a trap ) The conclution of our develpoment were that even though alpha-beta search does aparent unusefull search it does fill in the evaluated move table. I think that a combined alpha-beta search for the first plys and the an incomplete search to go higher in depth would be better. Only incomplete search (at our present state of develpoment) never could out-rank a complete alpha-beta bot.
|
|
IP Logged |
|
|
|
Ryan_Cable
Forum Guru
Arimaa player #951
Gender:
Posts: 138
|
|
Re: Changing your mind and managing your time
« Reply #4 on: Jan 24th, 2006, 7:03pm » |
Quote Modify
|
on Jan 24th, 2006, 9:13am, ntroncos wrote:weiser thinking: alpha-beta search 2ply depth (very fast), if he wins or looses he move ocordingly. if niether he does the incomplete search. Then he makes theincomplete search for aslong as he has time. |
| This strikes me as a very good idea if one is going to do substantial pruning. The one downside it has is that it creates an additional horizon effect. Deep Blue did the first few ply of the search in software on a general purpose computer and then did the rest in specially designed hardware that had a less sophisticated evaluation function. At one point in its development, the hardware didn’t include castling, which meant that if castling appeared particularly useful to its opponent, it would try to play a line that pushed the castling move beyond the software horizon, expecting it to be impossible to do later. Of course, you have an advantage in that you are specifically pruning moves that are likely to be bad, whereas castling was left out because it was hard to do in hardware. PS I hope we will be seeing more of you and weiser around in the future.
|
|
IP Logged |
|
|
|
fotland
Forum Guru
Arimaa player #211
Gender:
Posts: 216
|
|
Re: Changing your mind and managing your time
« Reply #5 on: Jan 24th, 2006, 11:56pm » |
Quote Modify
|
Bomb does depth first alpha-beta with iterative deepening (1 step at a time) and transposition table (4 million entries, w victims). It has 6 step static trap evaluation and approximate 16 step static goal evaluation (3 steps exact, otherwise lower bound). It has a lot of selective deepening, and a quiescence search, so the deepest lines are often 4 to 8 steps deeper than the full width search. It uses null move pruning, and prunes redundant passes and 2 step reverse moves. It uses 2 killer moves and the history heuristic for move ordering. It typically searches 12 steps full width, and 15 to 18 steps in the PV including extensions and quiescence (plus typically 3 steps in the static evaluation). It usually find forced goals 16 to 20 steps away. The evaluation and move generation are very fast, using bitboards.
|
|
IP Logged |
|
|
|
99of9
Forum Guru
Gnobby's creator (player #314)
Gender:
Posts: 1413
|
|
Re: Changing your mind and managing your time
« Reply #6 on: Jan 5th, 2009, 5:21pm » |
Quote Modify
|
on Jan 12th, 2006, 12:09pm, Fritzlein wrote:A practical application might be this: If Gnobot changes its mind on step 8 (from what it thought on step 7) 50% of the time, but changes its mind on step 9 (from what it thought on step 8) only 10% of the time, then deepening to 9 steps provides only 1/5 of the stength boost that deepening to 8 steps did. Therefore (perhaps) for time management purposes Gnobot should always complete the 8-step search unless it is out of time, and always forgo the 9-step search unless it has a full reserve, so that it can have enough reserve time for 8-step searches later. |
| There is nothing new under the sun! I just ran across this old thread and am amazed that I did not contribute to it, perhaps I didn't even see it. Fritz is of course right, and I subsequently implemented this exact time control feature (I'm not sure if I came to the conclusion independently or failed to acknowledge this helpful thread). In fact, a few weeks ago, I was discussing my time control with Fritz in the chatroom, and had to convince him of the value of this very idea! Quote:I apologize if these thoughts are way off base. If they have any merit at all, though, I'm curious what real developers think about them. |
| Now you know ;-). However it is now starting to get a bit redundant, the 4n stuff wears off once you get static trap extensions in. There is no typical search profile anymore, but here's an example of one of my test positions: r - r - - r - r - r r - c h d r E c r h H R e M m R - H C R R R R - - - - - D C - R - - - - - - - - R - - - - - - - - - - - - - initial key 8809538553158063007 Depth Elapsed Knps Eval Main Line ----- -------- -------- ------ ------------------------------------------ 2 0.3 1.1 -7772 d5-c5 c5-c4 c6-c5 3 0.3 10.8 -6594 e7-d7 e6-e7 d7-d8 e7-d7 4 0.3 58.4 -6594 e7-d7 e6-e7 d7-d8 e7-d7 5 0.3 108.3 -10078 d5-c5 c5-c4 c6-c5 g4-g3 g7-g8 6 0.4 162.8 -12041 d5-c5 c5-c4 c6-c5 g4-g3 g7-g8 a8-a7 7 0.5 233.5 -12552 d5-c5 c5-d5 c6-c5 g4-g3 g7-g8 b7-a7 c7-b7 8 0.7 318.6 -13116 d5-c5 c5-c4 c6-c5 g4-g3 g7-g8 a8-a7 c7-c6 c5-d5 9 1.1 374.7 -9429 d5-c5 c5-c4 c6-c5 g4-g3 g5-g4 g6-g5 g5-g6 a8-a7 f5-g5 10 1.6 419.0 -8160 d5-c5 c5-c4 c6-c5 g4-g3 a8-a7 d6-c6 c5-d5 c6-d6 b6-c6 a6-b6 11 5.7 414.7 -7641 d5-c5 c5-c4 c6-c5 h4-h3 a8-a7 d6-c6 c5-d5 c6-d6 b6-c6 a6-b6 d5-d4 e5-d5 12 21.0 563.3 -5458 b3-b4 b5-c5 a5-b5 a6-a5 b4-b3 b5-b4 b6-b5 b7-b6 a5-a6 b5-a5 b6-b5 a6-b6 13 129.7 708.3 -10160 b3-a3 a4-b4 b4-c4 c4-d4 b5-b4 b6-b5 a5-a4 b5-b6 d5-c5 c5-c4 c6-c5 h4-h3 g7-g8 Total: time 129.7 nodes 91903914 evals 16519192 (18.0%) beta 912581 ( 1.0%) null 21384127 (23.3%) Applying this move: Rb3w Ra4e Rb4e Rc4e
|
« Last Edit: Jan 5th, 2009, 5:22pm by 99of9 » |
IP Logged |
|
|
|
99of9
Forum Guru
Gnobby's creator (player #314)
Gender:
Posts: 1413
|
|
Re: Changing your mind and managing your time
« Reply #7 on: Jan 5th, 2009, 6:07pm » |
Quote Modify
|
On the other hand, the 4n problem still does show its head in some positions. I reran the exact position you posted, and here are my current results: r r r r - r - r - - r - r - - - - h - c c - - - - - - - - d M R - H E - - - h - R R m e - - - H - C R C D R R - R - D - - - R - initial key 4439579325424901324 Depth Elapsed Knps Eval Main Line ----- -------- -------- ------ ------------------------------------------ 2 0.3 2.8 -24898 e6-f6 f6-g6 3 0.3 36.7 -24998 e6-f6 e7-f7 f6-g6 4 0.4 318.7 -23952 e6-e5 f5-f4 f8-g8 g4-g3 5 0.5 484.4 -25178 e6-e5 f5-f4 f8-g8 g4-g3 h5-h6 6 0.7 582.2 -26354 e6-e5 f8-g8 f5-f4 g4-g3 h5-h6 h6-g6 7 1.1 589.5 -27392 e6-e5 f8-g8 f5-f4 g4-g3 h5-h6 g5-g4 g4-g5 g3-g4 8 4.2 528.4 -27847 f8-g8 e6-f6 e7-f7 f5-f4 h3-g3 g3-f3 f3-g3 f4-f3 9 10.4 507.9 -27323 e6-f6 e7-e6 f5-f4 f6-g6 h3-g3 g3-f3 f3-e3 f4-f3 f8-f7 10 15.2 535.5 -25469 e6-e5 f5-f4 g4-g3 f8-g8 g5-g4 g2-h2 g3-g2 g4-g3 e7-f7 e5-e6 11 37.4 541.5 -24924 e6-e5 f5-f4 f8-g8 g4-g3 g5-g4 g4-g5 g3-g4 h3-g3 e7-f7 e5-e4 e4-e3 12 315.2 918.9 -24835 e6-e5 f8-g8 f5-f4 g4-g3 g5-g4 g2-h2 g3-g2 g4-g3 e7-f7 d6-e6 e5-e4 e4-e3 13 1165.4 974.1 -25447 e6-e5 f8-g8 f5-f4 g4-g3 g5-g4 g2-h2 g3-g2 g4-g3 e7-f7 e5-e4 e4-e3 e3-f3 g3-g4 Total: time 1165.4 nodes 1135288779 evals 203153473 (17.9%) beta 4469561 ( 0.4%) null 245830473 (21.7%) Applying this move: ce6s rf8e df5s hg4s The sad thing is that this was done on a dual core pc using the search improvements and the better eval, and now it takes longer to finish 12 ply!!!
|
« Last Edit: Jan 5th, 2009, 6:09pm by 99of9 » |
IP Logged |
|
|
|
Janzert
Forum Guru
Arimaa player #247
Gender:
Posts: 1016
|
|
Re: Changing your mind and managing your time
« Reply #8 on: Jan 5th, 2009, 9:03pm » |
Quote Modify
|
But presumably it's finding a much(?) better move in that 12 ply, at least that's the hope of course. Janzert
|
|
IP Logged |
|
|
|
99of9
Forum Guru
Gnobby's creator (player #314)
Gender:
Posts: 1413
|
|
Re: Changing your mind and managing your time
« Reply #9 on: Jan 5th, 2009, 9:10pm » |
Quote Modify
|
Yes, I hope so. It's playing a riskier line, but if it gets away with it, then it will not lose the dog.
|
|
IP Logged |
|
|
|
jdb
Forum Guru
Arimaa player #214
Gender:
Posts: 682
|
|
Re: Changing your mind and managing your time
« Reply #10 on: Jan 5th, 2009, 11:07pm » |
Quote Modify
|
I am not sure I understand the idea being presented here, so this could be missing the point, but here goes. Clueless will not move until it has completed an 8 step search. (Except if its going to lose on time!) Typically this only takes a couple seconds, but if it takes a few minutes so be it. Its just way too dangerous to move without looking that far ahead. Stopping the search early, if there is likely not enough time left to complete the next iteration, is questionable in my opinion. Lets say it just finished an 8 ply search and there is only 5 seconds until it is time to move. Might as well just stop searching and save the 5 seconds. But at ply 9, disaster and the move's score drops big time. If a move's score is going to drop huge on the next ply, the search of that move will complete very quickly, typically in under a second. Now extra time will be needed to see if a new move can be found. So even if there is no hope of completing the 9 ply search, in effect that 5 seconds gives an extra ply of security on the move that is going to be played.
|
|
IP Logged |
|
|
|
Oystein
Forum Full Member
Arimaa player #3272
Gender:
Posts: 21
|
|
Re: Changing your mind and managing your time
« Reply #11 on: Jan 6th, 2009, 12:29am » |
Quote Modify
|
Just a little observation: on Jan 5th, 2009, 6:07pm, 99of9 wrote:Depth Elapsed Knps Eval Main Line ----- -------- -------- ------ ------------------------------------------ 7 1.1 589.5 -27392 e6-e5 f8-g8 f5-f4 g4-g3 ... ... 10 15.2 535.5 -25469 e6-e5 f5-f4 g4-g3 f8-g8 ... 11 37.4 541.5 -24924 e6-e5 f5-f4 f8-g8 g4-g3 ... 12 315.2 918.9 -24835 e6-e5 f8-g8 f5-f4 g4-g3 ... |
| Same first move, but different order of the steps. Normally this would not happen because of the transposition table. But here the position after 4 steps is apparently overwritten, and also the position after 3 steps for the 12 step search. Have you tried to change the Zobrist keys (or do you use something else to get hash values) to see the effect of the search time?
|
|
IP Logged |
|
|
|
Oystein
Forum Full Member
Arimaa player #3272
Gender:
Posts: 21
|
|
Re: Changing your mind and managing your time
« Reply #12 on: Jan 6th, 2009, 1:13am » |
Quote Modify
|
on Jan 5th, 2009, 11:07pm, jdb wrote:Stopping the search early, if there is likely not enough time left to complete the next iteration, is questionable in my opinion. |
| I have hardly any experience in bot writing, but I give my opinion anyway. I think it is best to strive to never do an incomplete iteration. Quote: Lets say it just finished an 8 ply search and there is only 5 seconds until it is time to move. Might as well just stop searching and save the 5 seconds. But at ply 9, disaster and the move's score drops big time. If a move's score is going to drop huge on the next ply, the search of that move will complete very quickly, typically in under a second. Now extra time will be needed to see if a new move can be found. |
| But because there are little time left, it is quite likely that we dont find the best substitute move. It is hard to make good use of an incomplete iteration. Quote:So even if there is no hope of completing the 9 ply search, in effect that 5 seconds gives an extra ply of security on the move that is going to be played. |
| The 5 seconds is better spent later when it is enough time to complete a iteration. Then it is possible to both discover a disaster and to find a good substitute move.
|
|
IP Logged |
|
|
|
99of9
Forum Guru
Gnobby's creator (player #314)
Gender:
Posts: 1413
|
|
Re: Changing your mind and managing your time
« Reply #13 on: Jan 6th, 2009, 1:20am » |
Quote Modify
|
on Jan 6th, 2009, 12:29am, Oystein wrote:Just a little observation: Same first move, but different order of the steps. Normally this would not happen because of the transposition table. But here the position after 4 steps is apparently overwritten, and also the position after 3 steps for the 12 step search. Have you tried to change the Zobrist keys (or do you use something else to get hash values) to see the effect of the search time? |
| Thanks for picking that up Oystein, that is unusual. I wonder if it has anything to do with the parallelization returning irregularly. I'll check out your Zobrist suggestion too.
|
|
IP Logged |
|
|
|
99of9
Forum Guru
Gnobby's creator (player #314)
Gender:
Posts: 1413
|
|
Re: Changing your mind and managing your time
« Reply #14 on: Jan 6th, 2009, 1:25am » |
Quote Modify
|
on Jan 6th, 2009, 1:13am, Oystein wrote:It is hard to make good use of an incomplete iteration. |
| That's not quite true, Jeff is right that even incomplete iterations can be very useful. Unfortunately Gnobot can only make use of complete iterations, hence the time control. If it were able to use incomplete ones, then I agree with Jeff it would probably just be better to use the whole time.
|
|
IP Logged |
|
|
|
|