Welcome, Guest. Please Login or Register.
May 19th, 2024, 7:08am

Home Home Help Help Search Search Members Members Login Login Register Register
Arimaa Forum « Changing your mind and managing your time »


   Arimaa Forum
   Arimaa
   Bot Development
(Moderator: supersamu)
   Changing your mind and managing your time
« Previous topic | Next topic »
Pages: 1 2  Reply Reply Notify of replies Notify of replies Send Topic Send Topic Print Print
   Author  Topic: Changing your mind and managing your time  (Read 2793 times)
Fritzlein
Forum Guru
*****



Arimaa player #706

   
Email

Gender: male
Posts: 5928
Changing your mind and managing your time
« on: Jan 12th, 2006, 12:09pm »
Quote Quote Modify Modify

In the comments of game 23257, 99of9 posted the evaluation according to Gnobot of the position before move 20b:
Quote:

2 0.3 -5046 e6-f6 e7-f7  
3 0.4 -4992 e6-e5 f5-f4 e5-e6  
4 0.4 -4976 e6-e5 f5-f4 e5-e6 g4-g3  
5 0.5 -5021 e6-e5 f5-f4 g4-g3 f4-g4 h3-h2  
6 0.6 -5083 e6-e5 d6-e6 f5-f6 f6-f7 g4-g3 g5-g4  
7 1.0 -5243 e6-e5 f5-f4 e5-e6 g4-g3 c4-d4 b4-c4 c1-d1  
8 38.9 -8028 e6-f6 d6-e6 f5-f4 f6-f7 h3-g3 g3-f3 f3-g3 f4-f3  
9 41.9 -8034 e6-f6 d6-e6 f5-f4 f6-f7 h3-g3 g3-f3 f3-g3 f4-f3 b6-b7  
10 45.9 -8028 e6-f6 d6-e6 f5-f4 f6-f7 h3-g3 g3-f3 f3-g3 f4-f3 d8-e8 e8-d8  
11 77.1 -8016 e6-f6 d6-e6 f5-f4 f6-f7 h3-g3 g3-f3 f3-g3 f4-f3 e6-e5 e5-e4 e4-f4  
12 263.4 -7931 e6-f6 e7-f7 f5-f4 f6-g6 h3-g3 g3-f3 f3-g3 f4-f3 f7-g7 g6-h6 h6-g6 h5-h6  
 
At the moment, 8 ply takes 38.9 seconds on my laptop. Note that 7 ply only takes 1.0 second.  
 
This is because it has to totally change plan when it sees your goal threat, and all the move ordering help provided by searching 7 ply is totally useless.

I have no experience coding a bot, but I'm interested in the effects of a bot changing its mind.  My understanding is that good move ordering is critical to the speed of alpha-beta searching, so most programs use iterative deepening.  Each level of search costs some time to complete, but saves even more time on the next level of search by providing good move suggestions.  
 
In this case, is Gnobot's 7-step search literally "totally useless", i.e. the 8-step search would have taken exactly as long to complete from scratch?  Might the 7-step search have been somewhat useful?  Is it possible that it was actively detrimental, i.e. it provided move suggestions which were worse than random moves, and thus slowed down the 8-step search?
 
Since there is a huge time loss associated with a bot changing its mind at a new level of search depth, I would think that bot developers would want to prevent their bots from switching plans, for example by accepting a more time-consuming static evaluation that somehow anticipates what is to come.  Yet the article at
http://supertech.lcs.mit.edu/~heinz/dt/node50.html (already referenced in a couple of other threads) suggests that the increase in playing strength by searching an extra step is directly proportional to the probability of the bot changing its mind at that step.  As a consequence, the more likely a bot is to change its mind at a certain depth, the more important it is to search that depth.
 
A practical application might be this: If Gnobot changes its mind on step 8 (from what it thought on step 7) 50% of the time, but changes its mind on step 9 (from what it thought on step 8) only 10% of the time, then deepening to 9 steps provides only 1/5 of the stength boost that deepening to 8 steps did.  Therefore (perhaps) for time management purposes Gnobot should always complete the 8-step search unless it is out of time, and always forgo the 9-step search unless it has a full reserve, so that it can have enough reserve time for 8-step searches later.
 
In general, different bots might have different good breakpoints of search depth.  For example, if I remember correctly, bot_haizhi implemented 3-step static goal and trap evaluation, so completing a 5-step or a 9-step search might be an important breakpoint for bot_haizhi's time management.
 
A second question I have is whether iterative deepening in Arimaa should be applied only at the breakpoints where it is likely for a bot to change its mind.  If there is a fair probability that Gnobot's 7-step search will indeed be totally useless in searching to the eighth step, might it pay to instead do iterative deepening hitting only depths of 4n, i.e. do a 4-step search, then an 8-step search, then a 12-step search?  If that turns out to be too radical, then one could at least consider that two steps are necessary for any capture, and therefore do iterative deepening only at depths of 2n.
 
I apologize if these thoughts are way off base.  If they have any merit at all, though, I'm curious what real developers think about them.
IP Logged

Ryan_Cable
Forum Guru
*****



Arimaa player #951

   


Gender: male
Posts: 138
Re: Changing your mind and managing your time
« Reply #1 on: Jan 12th, 2006, 10:47pm »
Quote Quote Modify Modify

Interesting, I always wondered why the fixed ply bots would have order of magnitude differences in calculation time for positions that looked qualitatively similar.
 
It seems to me that the whole point of searching deeper is to find a better move, so I am not sure how you could improve a bot by reducing its likelihood to switch lines.  The only thing I can think of is if the bot is repeatedly oscillating between lines, but that seems unlikely to cause nearly as large a delay as searching a completely new line.
 
Has anyone done any empirical research into how useful the non 4n evaluations are?  If the 4n+1 search causes a change, how likely is it to actually be a better move (I guess this would require a human opinion of the moves).  If the 4n+1 search causes a change, how likely is it that the search will keep the new move at 4n+4; how likely that it will go back to the old move?  Could there be any value in creating different but related evaluations for each of the step levels, or would this ruin the move orderings produced by the evaluations?
 
PS  If Gnobby saw the goal threat why didn’t it block it?  Did it just run out of time and move randomly?  Does Gnobby have any static goal evaluation?  Would it be of any value to run static goal evaluations at the very beginning of the move and see that Fritzlein has a goal threat and Gnobby doesn’t?  Then you could have some sort of mode that prioritizes goal defense moves in the search.
 
PPS  Is everyone just using standard Alpha-Beta with heuristics and iterative deepening, or does anyone use something more advanced like http://en.wikipedia.org/wiki/MTD-f ?
IP Logged
Fritzlein
Forum Guru
*****



Arimaa player #706

   
Email

Gender: male
Posts: 5928
Re: Changing your mind and managing your time
« Reply #2 on: Jan 13th, 2006, 12:10pm »
Quote Quote Modify Modify

on Jan 12th, 2006, 10:47pm, Ryan_Cable wrote:
PS  If Gnobby saw the goal threat why didn’t it block it?  Did it just run out of time and move randomly?  Does Gnobby have any static goal evaluation?

 
If I remember correctly, Gnobot ran its time down to 15 seconds, i.e. it did run out of time and give the best move it had discovered to that point.  I'm guessing that, given the server it was running on, and the load at the time, the 95 seconds weren't enough to complete an 8-step search or else it would certainly have blocked the goal.  I'm also guessing that Gnobot has no static goal evaluation whatsoever, not even 1-step, or else it would have seen the problem in the 7-step search.  
« Last Edit: Jan 13th, 2006, 2:13pm by Fritzlein » IP Logged

ntroncos
Forum Junior Member
**



Weiser's trainer

   
WWW

Gender: male
Posts: 9
Re: Changing your mind and managing your time
« Reply #3 on: Jan 24th, 2006, 9:13am »
Quote Quote Modify Modify

on Jan 12th, 2006, 10:47pm, Ryan_Cable wrote:

PPS  Is everyone just using standard Alpha-Beta with heuristics and iterative deepening, or does anyone use something more advanced like http://en.wikipedia.org/wiki/MTD-f ?

 
Last year i tried to attend that issue in weiser.  We used an incomplete search function based on the hipotesis that much of the search is useless. Until know we have no out preformed the complete search bots (alpha-beta search).
 
In each step we putted a fixed search apart from the incomplete search.
 
weiser thinking:
alpha-beta search 2ply depth (very fast), if he wins or looses he move ocordingly. if niether he does the incomplete search.
Then he makes theincomplete search for aslong as he has time.
 
The fixed search helped so that he didnt loose stupidly, but the use of an incomplete search funtion exposes the posibility that he never sees that he will loose in the next ply. (or that in some bizzar way its good to through his elephant in a trap Grin)
 
The conclution of our develpoment were that even though alpha-beta search does aparent unusefull search it does fill in the evaluated move table. I think that a combined alpha-beta search for the first plys and the an incomplete search to go higher in depth would be better. Only incomplete search (at our present state of develpoment) never could out-rank a complete alpha-beta bot.
IP Logged
Ryan_Cable
Forum Guru
*****



Arimaa player #951

   


Gender: male
Posts: 138
Re: Changing your mind and managing your time
« Reply #4 on: Jan 24th, 2006, 7:03pm »
Quote Quote Modify Modify

on Jan 24th, 2006, 9:13am, ntroncos wrote:
weiser thinking:
alpha-beta search 2ply depth (very fast), if he wins or looses he move ocordingly. if niether he does the incomplete search.
Then he makes theincomplete search for aslong as he has time.

This strikes me as a very good idea if one is going to do substantial pruning.  The one downside it has is that it creates an additional horizon effect.
 
Deep Blue did the first few ply of the search in software on a general purpose computer and then did the rest in specially designed hardware that had a less sophisticated evaluation function.  At one point in its development, the hardware didn’t include castling, which meant that if castling appeared particularly useful to its opponent, it would try to play a line that pushed the castling move beyond the software horizon, expecting it to be impossible to do later.  Of course, you have an advantage in that you are specifically pruning moves that are likely to be bad, whereas castling was left out because it was hard to do in hardware.
 
PS I hope we will be seeing more of you and weiser around in the future.
IP Logged
fotland
Forum Guru
*****



Arimaa player #211

   


Gender: male
Posts: 216
Re: Changing your mind and managing your time
« Reply #5 on: Jan 24th, 2006, 11:56pm »
Quote Quote Modify Modify

Bomb does depth first alpha-beta with iterative deepening (1 step at a time) and transposition table (4 million entries, w victims).  It has 6 step static trap evaluation and approximate 16 step static goal evaluation (3 steps exact, otherwise lower bound).  It has a lot of selective deepening, and a quiescence search, so the deepest lines are often 4 to 8 steps deeper than the full width search.  It uses null move pruning, and prunes redundant passes and 2 step reverse moves.  It uses 2 killer moves and the history heuristic for move ordering.  It typically searches 12 steps full width, and 15 to 18 steps in the PV including extensions and quiescence (plus typically 3 steps in the static evaluation).  It usually find forced goals 16 to 20 steps away.  The evaluation and move generation are very fast, using bitboards.
IP Logged
99of9
Forum Guru
*****




Gnobby's creator (player #314)

  toby_hudson  


Gender: male
Posts: 1413
Re: Changing your mind and managing your time
« Reply #6 on: Jan 5th, 2009, 5:21pm »
Quote Quote Modify Modify

on Jan 12th, 2006, 12:09pm, Fritzlein wrote:
A practical application might be this: If Gnobot changes its mind on step 8 (from what it thought on step 7) 50% of the time, but changes its mind on step 9 (from what it thought on step 8) only 10% of the time, then deepening to 9 steps provides only 1/5 of the stength boost that deepening to 8 steps did.  Therefore (perhaps) for time management purposes Gnobot should always complete the 8-step search unless it is out of time, and always forgo the 9-step search unless it has a full reserve, so that it can have enough reserve time for 8-step searches later.

There is nothing new under the sun!  I just ran across this old thread and am amazed that I did not contribute to it, perhaps I didn't even see it.  Fritz is of course right, and I subsequently implemented this exact time control feature (I'm not sure if I came to the conclusion independently or failed to acknowledge this helpful thread).
 
In fact, a few weeks ago, I was discussing my time control with Fritz in the chatroom, and had to convince him of the value of this very idea!
 
Quote:
I apologize if these thoughts are way off base.  If they have any merit at all, though, I'm curious what real developers think about them.

 
Now you know ;-).
 
However it is now starting to get a bit redundant, the 4n stuff wears off once you get static trap extensions in.  There is no typical search profile anymore, but here's an example of one of my test positions:
 

 r - r - - r - r
 - r r - c h d r
 E c r h H R e M
 m R - H C R R R
 R - - - - - D C
 - R - - - - - -
 - - R - - - - -
 - - - - - - - -
initial key 8809538553158063007
 
Depth   Elapsed      Knps    Eval  Main Line
-----  --------  --------  ------  ------------------------------------------
    2       0.3       1.1   -7772  d5-c5 c5-c4 c6-c5  
    3       0.3      10.8   -6594  e7-d7 e6-e7 d7-d8 e7-d7  
    4       0.3      58.4   -6594  e7-d7 e6-e7 d7-d8 e7-d7  
    5       0.3     108.3  -10078  d5-c5 c5-c4 c6-c5 g4-g3 g7-g8  
    6       0.4     162.8  -12041  d5-c5 c5-c4 c6-c5 g4-g3 g7-g8 a8-a7  
    7       0.5     233.5  -12552  d5-c5 c5-d5 c6-c5 g4-g3 g7-g8 b7-a7 c7-b7  
    8       0.7     318.6  -13116  d5-c5 c5-c4 c6-c5 g4-g3 g7-g8 a8-a7 c7-c6 c5-d5  
    9       1.1     374.7   -9429  d5-c5 c5-c4 c6-c5 g4-g3 g5-g4 g6-g5 g5-g6 a8-a7 f5-g5  
   10       1.6     419.0   -8160  d5-c5 c5-c4 c6-c5 g4-g3 a8-a7 d6-c6 c5-d5 c6-d6 b6-c6 a6-b6  
   11       5.7     414.7   -7641  d5-c5 c5-c4 c6-c5 h4-h3 a8-a7 d6-c6 c5-d5 c6-d6 b6-c6 a6-b6 d5-d4 e5-d5  
   12      21.0     563.3   -5458  b3-b4 b5-c5 a5-b5 a6-a5 b4-b3 b5-b4 b6-b5 b7-b6 a5-a6 b5-a5 b6-b5 a6-b6  
   13     129.7     708.3  -10160  b3-a3 a4-b4 b4-c4 c4-d4 b5-b4 b6-b5 a5-a4 b5-b6 d5-c5 c5-c4 c6-c5 h4-h3 g7-g8  
Total: time 129.7  nodes 91903914  evals 16519192 (18.0%) beta 912581 ( 1.0%) null 21384127 (23.3%)
 
Applying this move: Rb3w Ra4e Rb4e Rc4e
« Last Edit: Jan 5th, 2009, 5:22pm by 99of9 » IP Logged
99of9
Forum Guru
*****




Gnobby's creator (player #314)

  toby_hudson  


Gender: male
Posts: 1413
Re: Changing your mind and managing your time
« Reply #7 on: Jan 5th, 2009, 6:07pm »
Quote Quote Modify Modify

On the other hand, the 4n problem still does show its head in some positions.  I reran the exact position you posted, and here are my current results:
 

 r r r r - r - r
 - - r - r - - -
 - h - c c - - -
 - - - - - d M R
 - H E - - - h -
 R R m e - - - H
 - C R C D R R -
 R - D - - - R -
initial key 4439579325424901324
 
Depth   Elapsed      Knps    Eval  Main Line
-----  --------  --------  ------  ------------------------------------------
    2       0.3       2.8  -24898  e6-f6 f6-g6  
    3       0.3      36.7  -24998  e6-f6 e7-f7 f6-g6  
    4       0.4     318.7  -23952  e6-e5 f5-f4 f8-g8 g4-g3  
    5       0.5     484.4  -25178  e6-e5 f5-f4 f8-g8 g4-g3 h5-h6  
    6       0.7     582.2  -26354  e6-e5 f8-g8 f5-f4 g4-g3 h5-h6 h6-g6  
    7       1.1     589.5  -27392  e6-e5 f8-g8 f5-f4 g4-g3 h5-h6 g5-g4 g4-g5 g3-g4  
    8       4.2     528.4  -27847  f8-g8 e6-f6 e7-f7 f5-f4 h3-g3 g3-f3 f3-g3 f4-f3  
    9      10.4     507.9  -27323  e6-f6 e7-e6 f5-f4 f6-g6 h3-g3 g3-f3 f3-e3 f4-f3 f8-f7  
   10      15.2     535.5  -25469  e6-e5 f5-f4 g4-g3 f8-g8 g5-g4 g2-h2 g3-g2 g4-g3 e7-f7 e5-e6  
   11      37.4     541.5  -24924  e6-e5 f5-f4 f8-g8 g4-g3 g5-g4 g4-g5 g3-g4 h3-g3 e7-f7 e5-e4 e4-e3  
   12     315.2     918.9  -24835  e6-e5 f8-g8 f5-f4 g4-g3 g5-g4 g2-h2 g3-g2 g4-g3 e7-f7 d6-e6 e5-e4 e4-e3  
   13    1165.4     974.1  -25447  e6-e5 f8-g8 f5-f4 g4-g3 g5-g4 g2-h2 g3-g2 g4-g3 e7-f7 e5-e4 e4-e3 e3-f3 g3-g4  
Total: time 1165.4  nodes 1135288779  evals 203153473 (17.9%) beta 4469561 ( 0.4%) null 245830473 (21.7%)
 
Applying this move: ce6s rf8e df5s hg4s

 
The sad thing is that this was done on a dual core pc using the search improvements and the better eval, and now it takes longer to finish 12 ply!!!
« Last Edit: Jan 5th, 2009, 6:09pm by 99of9 » IP Logged
Janzert
Forum Guru
*****



Arimaa player #247

   


Gender: male
Posts: 1016
Re: Changing your mind and managing your time
« Reply #8 on: Jan 5th, 2009, 9:03pm »
Quote Quote Modify Modify

But presumably it's finding a much(?) better move in that 12 ply, at least that's the hope of course.
 
Janzert
IP Logged
99of9
Forum Guru
*****




Gnobby's creator (player #314)

  toby_hudson  


Gender: male
Posts: 1413
Re: Changing your mind and managing your time
« Reply #9 on: Jan 5th, 2009, 9:10pm »
Quote Quote Modify Modify

Yes, I hope so.  It's playing a riskier line, but if it gets away with it, then it will not lose the dog.
IP Logged
jdb
Forum Guru
*****



Arimaa player #214

   


Gender: male
Posts: 682
Re: Changing your mind and managing your time
« Reply #10 on: Jan 5th, 2009, 11:07pm »
Quote Quote Modify Modify

I am not sure I understand the idea being presented here, so this could be missing the point, but here goes.
 
Clueless will not move until it has completed an 8 step search. (Except if its going to lose on time!) Typically this only takes a couple seconds, but if it takes a few minutes so be it. Its just way too dangerous to move without looking that far ahead.
 
Stopping the search early, if there is likely not enough time left to complete the next iteration, is questionable in my opinion. Lets say it just finished an 8 ply search and there is only 5 seconds until it is time to move. Might as well just stop searching and save the 5 seconds. But at ply 9, disaster and the move's score drops big time. If a move's score is going to drop huge on the next ply, the search of that move will complete very quickly, typically in under a second. Now extra time will be needed to see if a new move can be found.
 
So even if there is no hope of completing the 9 ply search, in effect that 5 seconds gives an extra ply of security on the move that is going to be played.
IP Logged
Oystein
Forum Full Member
***



Arimaa player #3272

   


Gender: male
Posts: 21
Re: Changing your mind and managing your time
« Reply #11 on: Jan 6th, 2009, 12:29am »
Quote Quote Modify Modify

Just a little observation:
 
on Jan 5th, 2009, 6:07pm, 99of9 wrote:
Depth   Elapsed      Knps    Eval  Main Line
-----  --------  --------  ------  ------------------------------------------
    7       1.1     589.5  -27392  e6-e5 f8-g8 f5-f4 g4-g3 ...
...
   10      15.2     535.5  -25469  e6-e5 f5-f4 g4-g3 f8-g8 ...
   11      37.4     541.5  -24924  e6-e5 f5-f4 f8-g8 g4-g3 ...
   12     315.2     918.9  -24835  e6-e5 f8-g8 f5-f4 g4-g3 ...


Same first move, but different order of the steps. Normally this would not happen because of the transposition table. But here the position after 4 steps  is apparently overwritten, and also the position after 3 steps for the 12 step search.
 
Have you tried to change the Zobrist keys (or do you use something else to get hash values) to see the effect of the search time?
 
IP Logged
Oystein
Forum Full Member
***



Arimaa player #3272

   


Gender: male
Posts: 21
Re: Changing your mind and managing your time
« Reply #12 on: Jan 6th, 2009, 1:13am »
Quote Quote Modify Modify

on Jan 5th, 2009, 11:07pm, jdb wrote:
Stopping the search early, if there is likely not enough time left to complete the next iteration, is questionable in my opinion.
I have hardly any experience in bot writing, but I give my opinion anyway.
I think it is best to strive to never do an incomplete iteration.
 
Quote:
Lets say it just finished an 8 ply search and there is only 5 seconds until it is time to move. Might as well just stop searching and save the 5 seconds. But at ply 9, disaster and the move's score drops big time. If a move's score is going to drop huge on the next ply, the search of that move will complete very quickly, typically in under a second. Now extra time will be needed to see if a new move can be found.
But because there are little time left, it is quite likely that we dont find the best substitute move. It is hard to make good use of an incomplete iteration.
 
Quote:
So even if there is no hope of completing the 9 ply search, in effect that 5 seconds gives an extra ply of security on the move that is going to be played.
The 5 seconds is better spent later when it is enough time to complete a iteration. Then it is possible to both discover a disaster and to find a good substitute move.
IP Logged
99of9
Forum Guru
*****




Gnobby's creator (player #314)

  toby_hudson  


Gender: male
Posts: 1413
Re: Changing your mind and managing your time
« Reply #13 on: Jan 6th, 2009, 1:20am »
Quote Quote Modify Modify

on Jan 6th, 2009, 12:29am, Oystein wrote:
Just a little observation:
 
Same first move, but different order of the steps. Normally this would not happen because of the transposition table. But here the position after 4 steps  is apparently overwritten, and also the position after 3 steps for the 12 step search.
 
Have you tried to change the Zobrist keys (or do you use something else to get hash values) to see the effect of the search time?

Thanks for picking that up Oystein, that is unusual.  I wonder if it has anything to do with the parallelization returning irregularly.  I'll check out your Zobrist suggestion too.
IP Logged
99of9
Forum Guru
*****




Gnobby's creator (player #314)

  toby_hudson  


Gender: male
Posts: 1413
Re: Changing your mind and managing your time
« Reply #14 on: Jan 6th, 2009, 1:25am »
Quote Quote Modify Modify

on Jan 6th, 2009, 1:13am, Oystein wrote:
It is hard to make good use of an incomplete iteration.

That's not quite true, Jeff is right that even incomplete iterations can be very useful.  Unfortunately Gnobot can only make use of complete iterations, hence the time control.  If it were able to use incomplete ones, then I agree with Jeff it would probably just be better to use the whole time.
IP Logged
Pages: 1 2  Reply Reply Notify of replies Notify of replies Send Topic Send Topic Print Print

« Previous topic | Next topic »

Arimaa Forum » Powered by YaBB 1 Gold - SP 1.3.1!
YaBB © 2000-2003. All Rights Reserved.