Arimaa Forum - Print Page


    
      
        Arimaa Forum
        (http://arimaa.com/arimaa/forum/cgi/YaBB.cgi)
      

        Arimaa >> Bot Development >> Board Representation
        
(Message started by: doublep on Oct 16^th, 2005, 4:09pm)

Title: Board Representation
Post by doublep on Oct 16^th, 2005, 4:09pm

Discussion here seems to concentrate on important things, like search algorithms and evaluation. Why not talk about something more to the ground?

I wonder which board representation is more popular among Arimaa bot authors. I'm aware of three standard schemes: a collection of bitmaps, a two-dimensional array or a single-dimension array (or there could be not one but a few arrays.) Which one do you use and why? Maybe you use something completely different? Do you prefer several arrays of primitive types or an array of objects?

Aami-ra uses two single-dimension arrays. The main array stores actual board situation: each element (of type `char') contains a code for a piece, an empty square or an off-board code. Second array (also `char') contains auxiliary data that allows to quickly determine if a piece is frozen and how many neighbors of different kinds it has. We prefer single-dimension arrays since they are more efficient (you only pass one index around) and actually easier to work with, once you get used to. Arrays of objects are usually less efficient too, because CPU memory caching suffers. Anyway, in our case the object would have been just two `char's if we had used them.

Both arrays are updated incrementally. Move/step undoing is performed using a change stack: whenever a value in either of the two arrays changes, Aami-ra stores the old value together with its address. Undoing then simply consists of copying the old values back.

Title: Re: Board Representation
Post by nbarriga on Oct 16^th, 2005, 4:44pm

I never gave it too much thought. I just assumed everybody used 64-bit bitmaps.

It's memory efficient and you just use fast bitwise operations.

I thought using arrays would be much slower, but your solution of using a second array keeping auxiliar data seems like a clever solution.

Title: Re: Board Representation
Post by jdb on Oct 16^th, 2005, 6:14pm

Clueless uses 64 bit bitmaps.

The board is stored as an array of 12 bit maps plus an int for the number of steps used in the game so far.

Last year's move representation was a 4 step move compressed into a 64 bit long.

This year a move is stored in the same format as the board. The makemove routine xor's the current position with the move to get the new position. Nothing is done incrementally.

Title: Re: Board Representation
Post by Fritzlein on Oct 16^th, 2005, 9:42pm

I suppose I should let Fotland speak for himself, but he seems not to be active at the moment. I seem to recall him telling me that Bomb uses bitmaps for speed.

Title: Re: Board Representation
Post by 99of9 on Oct 17^th, 2005, 12:22am

Gnobby uses the bitmap representation used by Don in the sample bot (because I don't know enough to better it).

Title: Re: Board Representation
Post by doublep on Oct 17^th, 2005, 3:42pm

We use arrays since they seem to easier to work with in tactical reading (we have unfinished 4 steps goal lookahead and directed search for trapping moves intended to improve move ordering.) Basically, each board square is described with just two values, which you don't even need to extract from a bitmask. This approach may be more efficient in some cases, for instance the check if a piece is frozen is a single comparison. Additionally, our GNU Go experience might have played its role :)

Title: Re: Board Representation
Post by PMertens on Oct 17^th, 2005, 4:33pm

my first bot used VB (no comment, it was a first try ever) arrays, but after I read the sample bot I learned C and used bitmaps ...

suddenly everything became faster (and move generation as well as evaluation much easier)

Title: Re: Board Representation
Post by doublep on Oct 18^th, 2005, 2:40pm

on 10/17/05 at 16:33:20, PMertens wrote:

Well, in this case the speedup is of course because of changing the language. No matter how you do it in C, it will be faster than VB. Actually, C and C++ are you best languages if you want something to be _real_ fast.

Title: Re: Board Representation
Post by ntroncos on Nov 13^th, 2005, 2:16pm

It seems that 64 bitmap is the common usage. By the way this is quite good and they will benifit from 64bit procesor. With 32bit Procesors we need 2 (normaly) instructions to handle the bitmap, with the new 64bit this is reduced to 1 (normaly), hence we see a speed enhancement.

Normaly what you should try is to avoid is a representation where you store to much zeros, bitmap in this case. But the tradeof of this representation is that we have a vision of the board as a whole, be sides the space overhead for each zero is only 1bit.

Never the less other other representation should be explored, maybe there is a whole new, more eficient representation waiting to be implemented ;).

Title: Re: Board Representation
Post by Swynndla on May 17^th, 2006, 6:59am

Can someone help me understand bitboards? ... and how the Sample C bot implements it?

Title: Re: Board Representation
Post by jdb on May 17^th, 2006, 8:26am

http://en.wikipedia.org/wiki/Bitboard is a decent description.

The sample bot also defines some Arimaa specific bitboards, like trap squares and squares adjacent to trap squares etc.

Title: Re: Board Representation
Post by unic on May 17^th, 2006, 9:29am

Fairy represents the board as an array[100]. I find it easier to understand and to write evaluation for than a bitboard - might be slower in execution, but quicker in development for me.

Title: Re: Board Representation
Post by nbarriga on May 17^th, 2006, 10:12pm

on 05/17/06 at 09:29:42, unic wrote:

Fairy represents the board as an array[100]. I find it easier to understand and to write evaluation for than a bitboard - might be slower in execution, but quicker in development for me.

You should rethink that, i did some tests a while ago, and it really is a LOT slower.

Maybe you should write a couple of functions to manipulate the bitboard, so you write the bitwise operations only once.

Title: Re: Board Representation
Post by Swynndla on May 18^th, 2006, 2:42am

Is it clear cut? ... I didn't think it was when reading the wikipedia article and also this:
http://www.cis.uab.edu/hyatt/boardrep.html

It seemed to me that the two links are saying that bitboards may be better, but not by much (if both methods are done properly)? Although there may be a significant difference with a 64 bit processor?

I know next to nothing about both methods, and so I don't know what I'm talking about ... I'm interested in you're opinions.

From what I've read so far, bitboards seem more logical to me, but if I use them, it may slow my bot development down quite a bit while I get to grips with bitboards. At the moment, I'm just wondering if I spent that time on developing an effective pruning strategy instead, that I'd have a faster bot in the end. I'm wondering this as both articles warn that bitboards take a long time to get to grips with, and that any bitboard code must be well-commented in order to develop more code and to debug it too ... I'm not sure that the sample C bot is all that well-commented?

Title: Re: Board Representation
Post by 99of9 on May 18^th, 2006, 3:08am

I learnt bitboards from the sample code. I don't think they're *that* hard to get to grips with.

Title: Re: Board Representation
Post by BlackKnight on May 18^th, 2006, 3:37am

I think it also helps a lot to write down the following:

// The squares on the board (looking from Gold's side) have the indices
// (within the bitstring):
// 0 1 2 3 4 5 6 7
// 8 9 10 11 12 13 14 15
// 16 17 18 19 20 21 22 23
// 24 25 26 27 28 29 30 31
// 32 33 34 35 36 37 38 39
// 40 41 42 43 44 45 46 47
// 48 49 50 51 52 53 54 55
// 56 57 58 59 60 61 62 63

// And the bitstring is numbered 63 62 61 ... 2 1 0.

Then you can easily get the following constants:

#define A_FILE 0x0101010101010101ULL
#define B_FILE 0x0202020202020202ULL
#define C_FILE 0x0404040404040404ULL
#define D_FILE 0x0808080808080808ULL
#define E_FILE 0x1010101010101010ULL
#define F_FILE 0x2020202020202020ULL
#define G_FILE 0x4040404040404040ULL
#define H_FILE 0x8080808080808080ULL
#define RANK_1 0xff00000000000000ULL
#define RANK_2 0x00ff000000000000ULL
#define RANK_3 0x0000ff0000000000ULL
#define RANK_4 0x000000ff00000000ULL
#define RANK_5 0x00000000ff000000ULL
#define RANK_6 0x0000000000ff0000ULL
#define RANK_7 0x000000000000ff00ULL
#define RANK_8 0x00000000000000ffULL

#define LEFT_SIDE 0x0f0f0f0f0f0f0f0fULL // a, b, c, d
#define RIGHT_SIDE 0xf0f0f0f0f0f0f0f0ULL // e, f, g, h

#define TRAPS 0x0000240000240000ULL
#define TRAPGUARDS 0x00245a24245a2400ULL

#define RING0 0x0000001818000000ULL // same as small center
#define RING1 0x00003c24243c0000ULL
#define RING2 0x007e424242427e00ULL
#define RING3 0xff818181818181ffULL // edges

Title: Re: Board Representation
Post by nbarriga on May 19^th, 2006, 11:52pm

on 05/18/06 at 02:42:31, Swynndla wrote:

I'm not sure that the sample C bot is all that well-commented?

Well, with ntroncos we may contribute a bit on that. We are giving his students an assignment, consisting on understanding the sample C bot, and commenting using doxygen style. I think in about a month we may publish here the best commented program of the class.

Title: Re: Board Representation
Post by Swynndla on May 20^th, 2006, 6:49am

I'm getting a lot of help ... thanks to all your posts!

on 05/19/06 at 23:52:43, nbarriga wrote:

Nice! ... I look forward to it :) What course are the students studying?

BlackKnight, thanks for those lines ... why is the board numbered from a8 instead of a1?

Title: Re: Board Representation
Post by nbarriga on May 20^th, 2006, 11:26am

on 05/20/06 at 06:49:21, Swynndla wrote:

Nice! ... I look forward to it :) What course are the students studying?

It's a second year course called Programming Workshop, where they learn programming good practices, and some tools. first they are introduced to the all the dark corners of C :) , then some tools like gdb, autotools, doxygen.

Title: Re: Board Representation
Post by nbarriga on May 25^th, 2006, 9:06pm

I just made a simple test to find out if 64 bit CPUs are worth it. I used bot_SampleC compiled using gcc 4.1.0 with -O4, searching 10 steps, on the following position:

5w
+-----------------+
8| r r r r r r r |
7| h d c m h |
6| X c d |
5| e E r |
4| |
3| X X |
2| H D C M R C D H |
1| R R R R R R R |
+-----------------+
a b c d e f g h

(sorry for the ugly board, i just copy and paste the matchMove file)

One PC, an athlonXP 2600+ clocked at 1900Mhz with 512kb L2 cache, i compiled with -march=athlon-xp. The other, a Sempron64 2800+ clocked at 1600Mhz with 256kb L2 cache, compiled with -march=athlon64

Results from 3 consecutive runs:
XP:
real 0m13.628s
user 0m13.461s
sys 0m0.152s

real 0m11.492s
user 0m11.353s
sys 0m0.116s

real 0m11.222s
user 0m11.025s
sys 0m0.184s
------------------------------------
and Sempron64:
real 0m7.916s
user 0m7.792s
sys 0m0.048s

real 0m8.431s
user 0m8.217s
sys 0m0.052s

real 0m9.581s
user 0m9.425s
sys 0m0.056s
------------------------------

If would be nice if someone not using bitboards could send me his bot compiled for the two different architectures to run the same test.

Title: Re: Board Representation
Post by chessandgo on May 25^th, 2006, 9:26pm

Is there any translation from chinese to english of this post ? :)

Title: Re: Board Representation
Post by jdb on May 25^th, 2006, 9:32pm

Checking through my notes, here are some results of a perft using clueless's makemove/undo. They are all pure bitboard methods.

This is on an AMD64 2.2Ghz

The java and c versions are as close to the same code as I could make them.

java32 3719 Knps
java64 13791 Knps
gcc64 23065 Knps

Title: Re: Board Representation
Post by unic on May 25^th, 2006, 9:35pm

I offered in the chat earlier today to upload a non-bitboard sample bot in C (i.e. a somewhat cleaned-up and simplified version of Fairy). I was hoping to get around to doing that during the weekend... I'll post once I have it done.

Title: Re: Board Representation
Post by nbarriga on May 25^th, 2006, 11:15pm

on 05/25/06 at 21:26:16, chessandgo wrote:

Is there any translation from chinese to english of this post ? :)

Yes, the athlon XP averaged about 12 seconds, while the sempron 64 averaged about 8.5 seconds on analizing 10 steps from the move i gave them.

Title: Re: Board Representation
Post by Swynndla on May 27^th, 2006, 8:10pm

nbarriga, I'm looking forward to your results of testing a version of Fairy on the 32 and 64 bit cpu's :)

Title: Re: Board Representation
Post by nbarriga on May 28^th, 2006, 1:05pm

Ok, i run Fairy_light, with the same position as the earlier test:

XP:
real 0m44.878s
user 0m43.911s
sys 0m0.824s

real 0m44.707s
user 0m43.791s
sys 0m0.748s

real 0m45.078s
user 0m43.819s
sys 0m0.792s

Sempron64:
real 0m42.045s
user 0m40.383s
sys 0m0.420s

real 0m43.482s
user 0m41.447s
sys 0m0.524s

real 0m42.849s
user 0m41.307s
sys 0m0.476s

Title: Re: Board Representation
Post by nbarriga on May 28^th, 2006, 1:09pm

If someone has a bot that uses many threads or processes, i can test it in an athlon 64 dual-core i have access to.

Title: Re: Board Representation
Post by Swynndla on May 28^th, 2006, 9:23pm

So this is hard proof of the theory that 64 bit cpu's are free speed upgrades for bitboards, while not doing a lot for non-bitboards! ...

http://www.cis.uab.edu/hyatt/bitmaps.html ...
Quote:

for new 64 bit architectures such as the Intel Merced processor, bitmaps make a great deal of sense, because bitmaps take maximum advantage of internal processor register and bus size. For example, most of the normal bitmap operations (AND, OR, XOR, shift, and so forth) take at least two instructions on a 32 bit machine, one for each half of the 64 bit bitmap. On a 64 bit architecture, with no clock speed change at all, these operations run twice as fast, because they now require only one instruction to perform them. That is a performance gain that can not be overlooked, because it is completely free.

and http://www.cis.uab.edu/hyatt/boardrep.html ...
Quote:

It will see immediate improvement on 64-bit machines while the offset programs will not, because they have no need for 64-bit integers.

Title: Re: Board Representation
Post by fotland on Aug 14^th, 2006, 10:34pm

bomb uses bitboads. bmap is a 64-bit bitmap class with many member functions for and, or, set-bit, find-first-bit, etc.

Here is the code:

// the basic core data of a board, must be copied to new board - change DATASIZE when add to this
bmap p[NUMBITMAPS]; // locations of pieces, by PIECE NAMES
bmap frozen; // bit set if that piece is frozen
char pieces[64]; // piece on each point, 0 is upper left (black side of board )
char strongest[2][64]; // strongest piece next to each square, by color (or EMPTY)
unsigned int hashindex, hashkey1; // transposition table values
int material; // material balance, 1000 per pawn, positive for gold
int bestmateval; // best pawn matestep value
char bestmateloc; // best mate location
char tm; // side to move (0-white, 1-black)
char phase; // phase of the the move, 0-3
char wtotmat, btotmat; // white and black total material, 1, 2, 3, 5, 9, 11
unsigned char pullsquare, pullpiece; // last move's pulling square (empty square where piece was)
// and piece doing the pulling - unless pullpiece is empty
unsigned char pushsquare, pushpiece; // last move pushed square (empty point where piece was)
// and enemy piece that was pushed -
// unless pushpiece is empty - next step is forced
unsigned char pawncount[2]; // how many pawns left by color

Title: Re: Board Representation
Post by Fritzlein on Aug 15^th, 2006, 2:08pm

on 08/14/06 at 22:34:50, fotland wrote:

char wtotmat, btotmat; // white and black total material, 1, 2, 3, 5, 9, 11

If you ever get motivated to work on Bomb some more, consider changing the material values to 1, 1.5, 2, 3, 5, 10. We've had a lot of experience with uneven trades since the old days, and it turns out camels and horses are not worth as many little pieces as we used to think. (Yes, I realize that many other parameters would have to change as well to correspond, but Bomb's overvaluation of camels and horses is actually an exploitable bug. Indeed, overvaluing the camel may have caused Bomb its one loss in the 2006 CC.)

Title: Re: Board Representation
Post by clauchau on Feb 2^nd, 2008, 9:58am

I'm going to store and compare lossless representations of positions instead of colliding hash values, so that
1) checking for games lost by repetition is not done wrong, and
2) I'm sure that the simple bots I have playing over a million games don't ever overlook moves by wrongly believing they've already been evaluated.

How few bits as possible would you need to code every possible Arimaa position (without needing too much time to code it)? I've a simple solution that goes down to using 153 bits (or 154 if I also need the active side).

Title: Re: Board Representation
Post by Fritzlein on Feb 2^nd, 2008, 6:31pm

OK, you got me thinking about it for a few minutes, and I couldn't beat your answer. I did a Huffman Code on the pieces, listed in order of the squares. Bit one is empty/occupied. Bit two is Silver/Gold. Bit 3 is rabbit/piece. Bit 4 is minor/major piece. Etc. The full code is

0 empty
100 silver rabbit
10100 silver cat
10101 silver dog
10110 silver horse
101110 silver camel
101111 silver elephant
110 gold rabbit
11100 gold cat
11101 gold dog
11110 gold horse
111110 gold camel
111111 gold elephant

My description length when all pieces are on the board is 32*1 + 16*3 + 12*5 + 4*6 = 164 bits, plus one for side to move. Of course it is shorter after captures, but it's eleven bits longer than yours worst case. >:(

Now you explain your 153-bit solution. It had better be more complicated than mine! :P

Title: Re: Board Representation
Post by jdb on Feb 2^nd, 2008, 6:39pm

on 02/02/08 at 09:58:22, clauchau wrote:

If 64 bit zobrist hashing is used, there is a better chance of being killed by a falling coconut while playing arimaa, than having an incorrect hash collision.

Title: Re: Board Representation
Post by lightvector on Feb 3^rd, 2008, 10:34am

I don't know whether this is considered easy to code, but I have a practical way to achieve 146 bits (147 with side to move, 149 if you want to specify 1-4 steps as well).

First, compute a unique index for each possible combination of squares occupied on the board. There are 64C32 + 64C31 + ... +64C0 possible combinations for occupancy (ignoring traps, etc).

Adding all this up and taking the log base 2, we find we can store this index in 64 bits.

To represent what pieces are on the board, we can write a string such as RcEreMrrRrHDrDRC, specifying sequentially the pieces that are in each subsequent occupied position. Each string will be exactly 32 characters long, and will contain exactly 8 r's, 8 R's, 2 c's, 2 C's, and so on. If there are fewer than 32 occupied positions on the board, then we simply fill in the rest of the string past that point however we wish (this is wasteful, so there is still room to shave off a few more bits without resorting to Arimaa-specific properties like traps or the fact that there can never be more than 4 rabbits that have reached the goal line). We then compute a unique index for each possible 32 character string. The total number of possibilities is 32!/8!/8!/2!/2!/2!/2!/2!/2!, and taking the log base 2, we find we can store this index in 82 bits. This gives a grand total of 146 bits.

I hope I didn't get anything wrong above.

I will post how to calculate the indices shortly. It's not actually that bad.

Title: Re: Board Representation
Post by lightvector on Feb 3^rd, 2008, 11:14am

How do we compute the required indices? Here is one way (there may be better ways). This requires 64 bit unsigned integer math.

To compute the index for occupied board positions, first count how many there are. We have a list matching number of positions to the desired combination.

32 Positions: 64C32
31 Positions: 64C31
30 Positions: 64C30
...

To get an index, we sum all the values of the previous, then add the specific index for the combination we want. For example there are 30 occupied positions, then we want the case 64C30. So our index is 64C32 + 64C31 + (the unique index for combinations of 30 elements from 64).

--------

To find the index for combinations of C squares from N, we number the board squares 0-N. Then we consider the smallest square chosen, x. Removing that square, we recursively have a choice of C-1 squares out of the remaining N-x-1 subsequent squares.

Again, to get an index in the case we want, we sum all the values of the previous, then recursively add the unique index for the combination we want. Say in our example x = 3, C=30, N = 64. Then, our list of the possible cases looks like:

x = 0: 63C29
x = 1: 62C29
x = 2: 61C29
x = 3: 60C29
...

and our index is 63C29+62C29+61C29 +(the unique index for a choice 29 squares from 60).

Recursively proceed until we have gone through each occupied square (recursively reaching a case of the form xC0). This gives us our final unique index.

-------

If we want to go from the index back to the position, we simply following the same procedure, but make the choice at each step that would give us the index we want. For example, in the initial counting of board squares, if our index X is such that 64C32+64C31...+64C10 <= X < 64C32+64C31...+64C10+64C9, then we know that there are 9 squares occupied, and to find out which squares, we continue into the case 64C9.

All of this can be made relatively fast by precomputing all the required values xCy up to 64C32 and storing them in an array.

Calculating the index for the piece string can be done very similarly (but will require up to 82 bit math).

Somewhat complicated, and probably a pain to code, but definitely feasible. Of course, the extra effort required to implement something like this is probably not worth the tiny reduction in size, but it is certainly fun to see how it can be done.

My explanation is probably overly confusing, but I hope it is good enough.

Title: Re: Board Representation
Post by Janzert on Feb 3^rd, 2008, 11:50am

Just a note since it was mentioned storing how many steps are left. If you have a mid-move position then you also need to know the type of the last piece to move and the square it moved from. Otherwise you can't correctly deal with pushes and pulls.

Janzert

Title: Re: Board Representation
Post by 99of9 on Feb 3^rd, 2008, 5:51pm

I can get 160 bits quite easily (for the board configuration, not the move status).

Lightvector, you've given us a good bound to shoot for! Your first 64 bits are overly complex though... you can fully indicate occupancy by storing a 64 bit checkerboard with 1 for occupied and 0 for unoccupied, just as Fritz suggested for his first bit. No need to calculate anything.

Title: Re: Board Representation
Post by 99of9 on Feb 3^rd, 2008, 8:02pm

Using a slightly complex, but human-decodeable scheme, I can get 156. I'll also be fascinated to see the 153!

Title: Re: Board Representation
Post by lightvector on Feb 3^rd, 2008, 11:33pm

on 02/03/08 at 11:50:07, Janzert wrote:

It might depend on your search algorithm though. In my bot, I generate and search pushpulls first as 2-step moves, and never end in an intermediate position. But yes, if you have to store that info, it will increase the size.

on 02/03/08 at 17:51:15, 99of9 wrote:

Haha! I suppose I should have check for easier ways first. =).

Title: Re: Board Representation
Post by clauchau on Feb 4^th, 2008, 6:19am

Lightvector, I'm delighted to see your optimal solution for the piece string. Fascinating.

Poor me, I miscalculated my solution (while trying to sleep and the next day making my groceries :-/ ). I need 155 bits indeed, not 153. Forgive me. Here is what I had in mind.

The idea is to start with the 164-bits solution and simply drop 9 redundant bits!

For each board square, one bit tells whether it is empty or occupied => 64 bits.

For each occupied square except the possible 32nd one, one bit tells whether it is occupied by Gold or by Silver => 31 bits. If there are 32 occupied squares, it means 16 are by Gold and 16 are by Silver, so you always know the 32nd color is the one missing from the 31 first squares.

For each square occupied by Gold except the possible 16th one, one bit tells whether it is by rabbit or not => 15 bits. If there are 16 squares occupied by Gold, it means 8 are by rabbits and 8 are by non-rabbits, so you always know about the 16th according to how many rabbits are already counted for in the 15 first squares occupied by Gold.

For each square occupied by a non-rabbit Gold piece except the possible 8th, two bits tell whether it is 00 = by a cat, 01 = by a dog, 10 = by a horse, or 11 = by a camel or an elephant, not distinguishing camels from elephants => 14 bits. Same reasoning as above in case there are 8 non-rabbit Gold pieces - there should be two 00, two 01, two 10 and two 11, so the 8th one is the one missing.

For the first square occupied by a Gold camel or elephant, one bit tells whether it is a camel or an elephant => 1 bit. If there are two such squares, the 2nd is always of the other species.

Do the same for Silver, and you need 64+31+(15+14+1)x2 = 155 bits.

99of9, what's your own solution to get to 160?

Title: Re: Board Representation
Post by 99of9 on Feb 4^th, 2008, 6:41am

Here is my 160 bit solution:
64 occupied/not
32 gold/not (as clauchau correctly notes, I should only use 31)
For each colour, we now have up to 16 squares. A location amongst 16 squares can be denoted with 4 bits. So listing the locations for all of the 8 majour pieces takes 8*4 bits. Times 2 for the two colours = 64.

The rest of the occupied squares (of known colour) must be rabbits.

64+32+64 = 160 (or 159 if you use clauchau's trick)

Title: Re: Board Representation
Post by 99of9 on Feb 4^th, 2008, 6:43am

Using clauchau's trick, my complicated method comes down to 153!

64 occupied / not
31 gold / not

From now on everything must be doubled for both colours:
15 rabbit / not

Now we have 8 squares on which our majour pieces must be placed. An address amongst 8 requires 3 bits. Specify the addresses of the elephants and camels: 3 bits * 2 pieces = 6.

Now we have 6 squares to place 2C 2D and 2H. Divide these 6 into two sets of 3.

Are both cats in the same set of 3? 1 bit.
Are both dogs in the same set of 3? 1 bit.

a) If both answers were YES:
Are cats both in first set? 1 bit. The answer to this question tells us whether we have {c,c,h}{d,d,h} or {d,d,h}{c,c,h}. Either way, we just need to find the horse in each set of 3.
A location amongst 3 spots requires 2 bits. Do this for both sets of 3. 4 bits.
(total required under option a is 5 bits)

b) If both answers were NO:
Now we know we have {c,d,h} in both sets, which we have to sort out. Do the following for each set:
Specify the location of the cat. 2 bits.
Now, out of the D/H, is the D first? 1 bit.
(total required under option b is 6 bits)

c) If one answer was YES, and the other NO (wlog cats were the YES):
We have a similar situation to answer (a) except dogs have been switched for horses. We know we either have:
{c,c,d}{h,h,d} or {h,h,d}{c,c,d}
(figuring this out requires 5 bits, similar to answer (a)).

So, at worst, we use:
64 + 31 + (15 + 6 + 2 + 6) * 2 = 153

Title: Re: Board Representation
Post by clauchau on Feb 4^th, 2008, 7:13am

Edit : Ouch, I used ruby to compute it, and forgot ^ is xor, not exponentiation, which is **. So what I describe below doesn't miraculously use only 122 bits but 149 :-X

Ah yes, poor poor me, I had really managed 153 bits. Instead of 14 bits for distinguishing Gold cats from dogs from horses from camel-elephant, first use 7 to distinguish cats-dogs from horses-camel-elephant (the possible 8th is the missing one), then 3 to distinguish cats from dogs (the possible 4th is the missing one) and 3 to distinguish horses from camel-elephant.

Combining lightvector arithmetics of combinations with this removal of 11 bits, and packing it all together we can get down to 149 bits :

There are A = 64C32 + 64C31 + ... + 64C0 = 10139684107326071075 ways to choose 0 to 32 occupied squares among 64.

There are B = 31C16 + 31C15 + ... + 31C0 = 1374282019 ways to choose 0 to 16 Gold pieces among 31.

There are C = 15C8 + 15C7 + ... + 15C0 = 22816 ways to choose 0 to 8 rabbits among 15 Gold pieces.

There are D = 7C4 + 7C3 + ... + 7C0 = 99 ways to choose 0 to 4 cats-dogs among 7 non-rabbit Gold pieces.

There are E = 3C2 + 3C1 + 3C0 = 7 ways to choose 0 to 2 cats among 3 Gold cats or dogs and 0 to 2 horses among 3 Gold horses or elephant or camel.

There are F = 2 ways to choose 0 to 1 camel among 1 Gold camel or elephant.

So we only need G = A*B*(C*D*E*E*F)^2 = 682813102718264674657879442836350524710195200 ~ 2^148.94 ~ 10^44.83

The drawback is you need arithmetics on 149-bits integers though.

Title: Re: Board Representation
Post by 99of9 on Feb 4^th, 2008, 7:35am

Nice! When computers go to 128 bits, someone might start trying this...

Title: Re: Board Representation
Post by clauchau on Feb 4^th, 2008, 8:26am

Sorry 99of9, I was highly dreaming for a while there, it seems there really are more than 2^128 different positions !

Title: Re: Board Representation
Post by lightvector on Feb 4^th, 2008, 8:47am

I suspect that something in the above calculations is wrong. It should not be possible to get to 122 bits simply by piece combinations.

Take the subcase where there are 32 pieces on the board. Then, the number of unique positions is given by:

64!/32!/8!/8!/2!/2!/2!/2!/2!/2! = 4.63 * 10^42 = 2^141.7

Any lossless encoding, regardless of how well it compresses, must distinguish each of these positions, and therefore must use at least 141.7 bits on average. And this doesn't even account for the cases where there are 31, 30, 29, etc. pieces. Taking into account things like traps, it could be possible to reduce it a bit, but the true optimal value should be close to 140 bits, if not slightly greater.

Edit: This was addressed just prior to my post.

Title: Re: Board Representation
Post by clauchau on Feb 4^th, 2008, 10:36am

Well spotted, lightvector.

So I'll use 20 bytes = 160 bits to store each position that has already occured during the game.

Now I realize I can use fewer bits when comparing a move with the other moves already generated for a given position, since most of the board is the same within the distance of a move (1 to 4 steps by the same player). Ideally I would like to compare the steps themselves modulo permutations and equivalent paths, but it seems a bit complicated to me for the moment.

I have a simple solution with 58 bits, i.e. 58 bits are enough to code every unique positions 1 move away from a known given position, whatever that fixed position is. Any astute improvement?

Title: Re: Board Representation
Post by leo on Feb 4^th, 2008, 3:06pm

on 02/04/08 at 10:36:52, clauchau wrote:

I have a simple solution with 58 bits, i.e. 58 bits are enough to code every unique positions 1 move away from a known given position, whatever that fixed position is. Any astute improvement?

I probably miss your point, but moves can be stored in 36 bits:

For each step:
- 6 bits for the index of the origin square
- 2 bits for the direction
- 1 bit telling whether the nearest trap was activated

(6+2+1)*4 = 9*4 = 36

But you are probably looking for a more convenient way to compute board state difference? Please tell us more!

EDIT: One can even use piece index on 5 bits instead of square index, since we start from an already analysed position. Which reduces the data to 32 bits. But I keep feeling that this is not what you're after. Do you search a system that allows generating the resulting full state of the board by a simple operation out of the previous state and the difference?

Title: Re: Board Representation
Post by clauchau on Feb 5^th, 2008, 6:40am

Leo, it's a good idea but I am not looking after a fast and compact way to code moves so that it's simple and fast to compose a board state with the code to get a new board state. For now listing steps is good enough for me for that purpose.

I need a fast equivalence test for comparing the million moves I generate for the same position : If I show you two moves with different steps, can you tell me if they result in getting the same position? If they do, they are equivalent and I ignore one of them.

So I would like a way to recode moves that would yield the same code if the moves are equivalent. Then I would only have to check equality of a few bytes. That's what hash/zobrist codes do on board states, except I also want to be sure the codes are different when the moves are not equivalent.

Your code is a good start, but it doesn't say how to give the same code to equivalent moves with different steps.

One solution is to compare the state boards. We could use the schemes we have designed here to deal with only 160 bits or so.

With the help of lightvector's combinatorial wizardry and your reduction from 6 to 5 bits, I now claim there is a scheme that only need 31 bits. Something at the complication level of 99of9's and lightvector's ideas, but reasonable enough, especially because it's now only 31 bits.

(By the way, the best solution would be to radically avoid any comparison by generating only one move for each set of equivalent moves. The move that represents his equivalence set could be minimal with respect to something. Every time I generate a move, I would only check whether it is minimal among all valid permutations of its steps and other null-effect changes on its paths. Anyway, it's a bit complicated to design right now.)

Title: Re: Board Representation
Post by clauchau on Feb 5^th, 2008, 7:52am

on 02/02/08 at 18:39:45, jdb wrote:

If 64 bit zobrist hashing is used, there is a better chance of being killed by a falling coconut while playing arimaa, than having an incorrect hash collision.

I realize you are right. I calculated that during a million games between a random mover and a piece-count-optimizing mover (57 turns on average and 14600 unique best moves on average for each turn), there is a risk of 1 over 3000 to have a collision if 64 bits are used.

I have been using 32 bit zobrist hashing, that's where I was wrong.

Anyway, I think I'll now use an exact 31-bit coding scheme for the move generator. I'm crazy about being as tight and exact as possible.