Arimaa Forum (http://arimaa.com/arimaa/forum/cgi/YaBB.cgi)
Arimaa >> Bot Development >> Asymmetrical static evaluation?
(Message started by: IdahoEv on Nov 28th, 2006, 5:33pm)

Title: Asymmetrical static evaluation?
Post by IdahoEv on Nov 28th, 2006, 5:33pm
Question for those who know more about game AI programming than I do.

Are static evaluators always symmetrical with respect to which player's turn it is?   That is, for a given board position, will they return the same evaluation score if it is white's turn as opposed to if it is black's turn?

It occurs to me that many, maybe even most, positions are of different relative merit depending on side-to-move.   Consider the position of Rc4 Ec3 ed3.    This is an immediately horrible position for gold if it is silver's turn; gold will lose its elephant and the game in three steps.  But if it is gold's turn to move, it's not so bad.

You could find out this fact by searching another ply, of course, but if your evaluator takes side-to-move into account you can save yourself that ply of search depth.  

Just curious if and how other programmers take that into account.  
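
A minimal sketch of the idea in the question above, assuming a hypothetical `Position` record that already knows the value of each side's pieces that can be captured within one turn. None of these names or numbers come from an actual bot. The same board gets a different score depending on `side_to_move`:

```python
from dataclasses import dataclass

GOLD, SILVER = "gold", "silver"


@dataclass(frozen=True)
class Position:
    material: int        # gold material minus silver material (hypothetical units)
    gold_hanging: int    # value of gold pieces silver could capture in one turn
    silver_hanging: int  # value of silver pieces gold could capture in one turn
    side_to_move: str


def evaluate(pos: Position) -> int:
    """Static score from gold's point of view, aware of whose turn it is."""
    score = pos.material
    if pos.side_to_move == SILVER:
        # Silver moves next, so gold's hanging pieces are about to be lost.
        score -= pos.gold_hanging
    else:
        # Gold moves next and can cash in on silver's hanging pieces.
        score += pos.silver_hanging
    return score
```

With gold's elephant hanging, as in the Rc4 Ec3 ed3 example, this sketch lowers the score only when silver is to move.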

Title: Re: Asymmetrical static evaluation?
Post by nbarriga on Nov 28th, 2006, 7:28pm
I don't think I fully understand the question, but as far as I know, even the sample_c bot takes that into consideration.

Title: Re: Asymmetrical static evaluation?
Post by Fritzlein on Nov 28th, 2006, 10:40pm
I have read about chess programs that suffer from an even-ply/odd-ply problem.  They always think they are doing better after searching to an odd depth, because they have made an extra move, than they think they are doing after searching to an even depth.  This implies that their static evaluation doesn't take side to move into account.  I don't know what, if anything, chess programmers have done to counteract this.  Maybe if they have quiescence search, and are searching deep enough, it just doesn't matter.
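
The odd/even effect described above can be illustrated with a toy model. This is only a sketch under stated assumptions: every move is assumed to improve the mover's static score by the same hypothetical amount `gain_per_move`, and `tempo_bonus` is an illustrative correction credited to the side to move, not something taken from a particular chess program.

```python
def leaf_score(depth, gain_per_move, tempo_bonus=0.0):
    """Score of a leaf `depth` plies below the root, from the root player's view."""
    root_moves = (depth + 1) // 2   # plies played by the root player
    opponent_moves = depth // 2     # plies played by the opponent
    static = gain_per_move * (root_moves - opponent_moves)
    # After an even number of plies it is the root player's turn again.
    root_to_move = depth % 2 == 0
    return static + (tempo_bonus if root_to_move else -tempo_bonus)


for depth in (3, 4):
    print(depth, leaf_score(depth, 10), leaf_score(depth, 10, tempo_bonus=5))
```

Without the bonus, the odd-depth search looks better by one full move's gain. In this toy model a side-to-move bonus of half that gain makes the two depths agree.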

Title: Re: Asymmetrical static evaluation?
Post by ddyer on Jan 9th, 2007, 12:00pm
The short answer is yes, evaluators are symmetric for current-player and next-player.  I'm not aware of any result that says it has to be that way, but anything else tends to lead to perverse choices.

Suppose you wanted to score "piece en prise" as a minus unless it is your move.  Your evaluator would think that you can improve your position by making any move, even those that do not change the actual situation.

Because of this effect, searches generally require that everything be evaluated to the same search depth, so that merely kicking the can down the road doesn't affect the result.  But, if everything is evaluated to the same depth, and that depth does not include the "piece en prise" penalty, then effectively the penalty doesn't exist at all.

Title: Re: Asymmetrical static evaluation?
Post by IdahoEv on Jan 19th, 2007, 2:15pm
You had me, and then you lost me again.

Clearly, if your eval was asymmetric and had a penalty for piece en prise, you would need to search all branches to the same depth.  

But it would seem that searching to the same depth would be important anyway, because in most cases a player will be able to improve his position (as far as eval is concerned) by moving.

So if you're searching to a consistent depth anyway, why wouldn't checking piece en prise with knowledge of side-to-move be a good idea?

Title: Re: Asymmetrical static evaluation?
Post by ddyer on Jan 19th, 2007, 3:51pm
Suppose you were set up for a queen exchange, but had the piece-en-prise penalty for only one side.  That would seriously skew your evaluation of the situation.

Title: Re: Asymmetrical static evaluation?
Post by NIC1138 on Mar 2nd, 2007, 10:48pm
I don't know if I really got the question, but I know that Go is an asymmetrical game... As most of you may know, white (I think) tends to win by a small margin... Wouldn't this be the only situation where the parity of the ply perhaps requires a different attitude?

What about Arimaa: do our esteemed masters detect any advantage in being the first or the second to play? What are the statistics in the "pro" games? Do we even have enough games to make a statistically relevant measurement? :)

Title: Re: Asymmetrical static evaluation?
Post by Fritzlein on Mar 3rd, 2007, 9:12am
Yes, I think you are asking a slightly different question than IdahoEv was, but we have looked at that one a little bit too.

If you look only at human vs. bot games, then Gold has a tiny advantage in winning percentage, but not enough to be statistically significant.

If you look only at bot vs. bot games, the advantage of Gold is greater, but still not statistically significant.

If you look only at human vs. human games, there is a statistically significant advantage for Silver, a much larger gap than the measured advantage for the other two cases.  I can't explain it, but that's what the numbers say.

My own opinion is that objectively Gold has a tiny advantage, but it is too small to worry about.  However, the statistics make me keep an open mind that the second setup is worth more than the first move.

Title: Re: Asymmetrical static evaluation?
Post by JacquesB on Mar 4th, 2007, 2:17pm
Just two digressions and some questions:

Reason A. I would say gold gets the advantage of initiative. If there were some key points to be taken, gold could take them first, but that doesn't sound very likely. In fact, against correct defense, I am still not sure what attacking strategies work, if any.

Reason B. But if Arimaa were solved (it's impossible, I know), I can see a reason why silver could have an advantage: the first move. Gold deploys its pieces blindly while silver answers gold's deployment. Since the game is not solved, it is pointless to make both players deploy their pieces in eight turns each, but that would give silver a smaller advantage. A deployment of silver's pieces with the current ruleset, optimized to beat gold, could be the reason why silver could win a perfect game.

For me, reason A sounds very small and B even smaller. But I would like stronger players to answer: Is there something to be won in the first moves that justifies a race in which gold could have an advantage? When playing silver, do you deploy your pieces to suit your favorite strategic ideas or are you answering gold's position? (Except for the "two elephants in the same column" issue. BTW, is that really important? I recently played that as gold and was not able to win anything from silver's supposed weakness.)

Title: Re: Asymmetrical static evaluation?
Post by chessandgo on Mar 4th, 2007, 3:02pm
Jacques :

I'm not aware of any way for silver to take advantage of your B-reason when gold chooses one of the most usual setups (99of9-like setups), so I'd say only advantage A is relevant (and indeed it is: whatever the right strategy might be, rabbit-pulling or activated pieces and attacking, or ..., better to have an extra move to do it than to be a move down :)). So I think gold has an advantage, but we don't play well enough for it to be relevant for a game result ... even for 10000 games results :)

As for the elephants on the same column, that's not a big problem, but it makes silver lose a few more steps, roughly.

Please be confident in attacking strategies :) and enjoy your games ;)

Title: Re: Asymmetrical static evaluation?
Post by NIC1138 on Mar 6th, 2007, 10:28pm
Watch out, people, let's not start looking for reasons for the asymmetry before we actually detect it! :)

I'm not very sure about how to estimate the error of the measurement, but from this article

http://en.wikipedia.org/wiki/Checking_if_a_coin_is_fair

it looks like 10000 games would be enough to detect a 1% imbalance with a 95.45% level of confidence!... It seems 1600 games would be enough to give us an estimate with 2.5% error.

From the file  

http://arimaa.com/arimaa/download/gameData/ratedgames.tgz

It seems there are only  2353 human-human rated  games, of which only 1658 went to "the last consequences". 928 were won by black,  730 by white. Not a very short margin if you ask me!!...

The current imbalance would be a 56% chance of black winning, against a 44% of white... Looks like it's detected!

At a 4-sigma confidence margin, this would mean an error of E = 4/(2*sqrt(1658)) = 4.91%. So the probability of white winning would lie inside the interval 39.117% to 48.941%.

Can anyone with more statistical training and feeling less sleepy do the rigorous calculations for us? ;D  (peer-review is paramount)
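
The arithmetic above can be checked with a few lines of Python. This is a sketch of the same back-of-the-envelope formula used in the post (the worst-case standard error of a proportion, 1/(2*sqrt(n)), scaled by the number of sigmas); the function name `margin_of_error` is made up for illustration.

```python
import math


def margin_of_error(n_games, sigmas):
    """Worst-case (p = 0.5) margin of error for a proportion from n_games trials."""
    return sigmas / (2 * math.sqrt(n_games))


black_wins, white_wins = 928, 730
games = black_wins + white_wins            # 1658 decided games
p_white = white_wins / games               # observed share of white wins
error = margin_of_error(games, sigmas=4)   # 4-sigma margin
low, high = p_white - error, p_white + error

print(f"p_white = {p_white:.4f}, E = {error:.4f}, interval = [{low:.5f}, {high:.5f}]")
```

The same function gives a 1% margin for 10000 games and a 2.5% margin for 1600 games at 2 sigma, matching the figures quoted from the article.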


Now, another theory on the causes for the imbalance... Not a systematic one!... It could be that most of the time the game starts with a stronger player inviting a weaker player, and giving him the gold position!  ::)


EDIT
Only now I saw that this topic (the way I put it!) is being discussed somewhere else!! :)  http://arimaa.com/arimaa/forum/cgi/YaBB.cgi?board=talk;action=display;num=1133407311;start=15


Title: Re: Asymmetrical static evaluation?
Post by Fritzlein on Mar 7th, 2007, 9:54am
You might also find this thread interesting:
http://arimaa.com/arimaa/forum/cgi/YaBB.cgi?board=talk;action=display;num=1163650023;start=0#0

Title: Re: Asymmetrical static evaluation?
Post by JacquesB on Mar 7th, 2007, 1:28pm
Hi Nicolau

From the proportions in your post (928 out of 1658), you must select ridiculously high confidence levels (0.999999) to include 1/2 in the confidence interval.

You can use the calculator at:

http://www.causascientia.org/math_stat/ProportionCI.html

The real value of p (the unknown probability of silver winning, estimated from 928 wins in 1658 games) lies between:

lower bound   upper bound   confidence
0.499694      0.618489      0.999999
0.528145      0.590862      0.99
0.535700      0.583441      0.95

The result is biased at the 99% confidence level, but not at 99.9999%.
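
For readers without access to the calculator, the 95% and 99% rows can be approximated with the textbook normal-approximation (Wald) interval. This is a swapped-in technique, not necessarily the method the linked calculator uses, so the bounds differ slightly in the later decimals, and the approximation should not be trusted for the extreme 0.999999 row. The function name `wald_interval` is an illustrative assumption.

```python
import math
from statistics import NormalDist


def wald_interval(successes, n, confidence):
    """Normal-approximation confidence interval for a binomial proportion."""
    p = successes / n
    # Two-sided interval: put (1 - confidence) / 2 of the mass in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


for level in (0.95, 0.99):
    low, high = wald_interval(928, 1658, level)
    print(f"{level}: {low:.4f} .. {high:.4f}")
```

At both levels the whole interval sits above 0.5, which agrees with the table.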

Title: Re: Asymmetrical static evaluation?
Post by Fritzlein on Mar 7th, 2007, 4:56pm

on 03/06/07 at 22:28:56, NIC1138 wrote:
It seems there are only  2353 human-human rated  games, of which only 1658 went to "the last consequences". 928 were won by black,  730 by white. Not a very short margin if you ask me!!...
[...]
Now, another theory on the causes for the imbalance... Not a systematic one!... It could be that most of the time the game starts with a stronger player inviting a weaker player, and giving him the gold position!

Yes, the raw numbers are not to be trusted, because often the stronger player will invite, giving himself the Silver pieces.  In 2478 rated HvH games in my database, the Silver player was rated an average of 34.7 rating points higher than the Gold player.  It is beside the point to consider the confidence interval of a statistic that is known to be biased in the first place.

In the other threads where this question was discussed, I accounted for possible color imbalance due to invitations by only counting pairs of games where the two players involved played with reversed colors.  JacquesB, I would be interested to hear your comments on the confidence of the statistics I gave here: http://arimaa.com/arimaa/forum/cgi/YaBB.cgi?board=talk;action=display;num=1163650023;start=0#0
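
The pairing filter described above might look like the following sketch. The record layout `(gold_player, silver_player, winner)` and the rule of matching equal numbers of games in each color orientation are assumptions for illustration; the exact procedure behind the linked statistics is not given in this thread.

```python
from collections import defaultdict


def paired_color_results(games):
    """Count wins by color using only games that have a reversed-color counterpart.

    `games` is an iterable of (gold_player, silver_player, winner) tuples,
    where winner is "gold" or "silver".
    """
    by_orientation = defaultdict(list)
    for gold, silver, winner in games:
        by_orientation[(gold, silver)].append(winner)

    wins = {"gold": 0, "silver": 0}
    for (gold, silver), winners in by_orientation.items():
        if gold > silver:
            continue  # visit each unordered pair of players only once
        reverse = by_orientation.get((silver, gold), [])
        k = min(len(winners), len(reverse))  # games that can be matched up
        for winner in winners[:k] + reverse[:k]:
            wins[winner] += 1
    return wins


sample = [
    ("ann", "bob", "silver"),
    ("bob", "ann", "silver"),
    ("ann", "bob", "gold"),    # no second reversed game to match, so ignored
    ("cat", "dan", "gold"),    # never played with reversed colors, so ignored
]
print(paired_color_results(sample))
```

Because each player in a counted pair holds each color equally often, a rating gap between the two players no longer favors one color.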

Title: Re: Asymmetrical static evaluation?
Post by IdahoEv on Mar 7th, 2007, 6:16pm

on 03/07/07 at 16:56:30, Fritzlein wrote:
often the stronger player will invite, giving himself the Silver pieces.


Which reminds me ... A possible gameroom feature I could work up on the dev server: an invite option with side set to "random".   Neither player knows which side he will play until after the invite is accepted.    I know I would use it for almost all of my invites; I hate the stress of deciding which side to offer.

Title: Re: Asymmetrical static evaluation?
Post by woh on Mar 8th, 2007, 11:24am

on 03/07/07 at 18:16:25, IdahoEv wrote:
Which reminds me ... A possible gameroom feature I could work up on the dev server: an invite option with side set to "random".


Great idea!
I would make use of such an option.

Title: Re: Asymmetrical static evaluation?
Post by 99of9 on Mar 8th, 2007, 4:23pm
So would I.

Title: Re: Asymmetrical static evaluation?
Post by NIC1138 on Mar 8th, 2007, 5:15pm

on 03/07/07 at 13:28:36, JacquesB wrote:
http://www.causascientia.org/math_stat/ProportionCI.html


Nice site, thanks!! ;)

Title: Re: Asymmetrical static evaluation?
Post by NIC1138 on Mar 8th, 2007, 5:34pm

on 03/07/07 at 16:56:30, Fritzlein wrote:
It is beside the point to consider the confidence interval of a statistic that is known to be biased in the first place.


Not quite so, if you allow me a moment of pedantic license!... ;D What happens is that we will measure with accuracy the probability of silver winning "in a usual gameroom match". The usual gameroom match has a stronger silver player...

As you yourself hinted in another thread, what we would really like is to measure with accuracy the probability of silver (or gold) winning given the difference of the player ratings... As you said, if there is an imbalance, a (little) stronger gold player would have a 50% chance to beat a (little) weaker silver player.

Even worse, what we would appreciate is to predict the winner given the ratings of each one, not only the difference!... It could be, for example, that for new players there is no imbalance, and the imbalance shows up for better players...

But unfortunately, the dimensional curse pokes a needle in our butt for each new variable we want to consider!... The less naïveté, the more games we need to play! :-/

Borrowing a concept from particle physics, I would say we need to enhance the luminosity of our Arimaa player collider!  8)

Title: Re: Asymmetrical static evaluation?
Post by JacquesB on Mar 9th, 2007, 12:14pm
I just computed the intervals from the data in Nicolau's post. Knowing that it is a common procedure that the stronger player takes silver (which I didn't), that was clearly an error. This is a good example of wrong sampling or selection bias.

Nic's answer sounds justified; it's not pedantic.

Furthermore, reading the thread pointed to by Fritzlein (your method of taking the samples in pairs sounds ok), I see the bias is not measurable. Also, what PMertens says, that it depends on the user's style, is true. In Go, weak players don't really understand how important one move is. Therefore, they are balanced with a smaller komi (= compensation for playing first) than strong players. If the advantage exists, it is very possible that weak players miss it completely.

Title: Re: Asymmetrical static evaluation?
Post by Fritzlein on Mar 10th, 2007, 12:16pm
I agree, JacquesB, that an advantage which is important at a high level of play may not be noticeable at a lower level.  

In keeping with NIC's (non-pedantic) point, it really depends on what we are trying to measure.  Maybe for 1500-rated players it doesn't matter who moves first, and for 1800-rated players Silver has an advantage, while for 2100-rated players Gold has an advantage.  Even if we measure something about the best players who exist today, what does it prove about Arimaa per se?  At a still higher level, the advantage may swing again.

This is much different from chess where tests seem to show an advantage of about 50 rating points for white at all levels of play from beginner to grandmaster.  Maybe that advantage disappears for both random play and for perfect play, but it is remarkable how constant it is in between.

When we have a large enough game database, we might be able to prove that neither side has an appreciable advantage at any level of play.  Until then, it's anybody's guess why Silver apparently has a statistically significant advantage in rated human vs. human games.  I'm leaning towards it being a fluke, but maybe my methodology is flawed in some important way we haven't seen yet.

Title: Re: Asymmetrical static evaluation?
Post by NIC1138 on Mar 28th, 2007, 12:10am

on 03/10/07 at 12:16:22, Fritzlein wrote:
This is much different from chess where tests seem to show an advantage of about 50 rating points for white at all levels of play from beginner to grandmaster.  Maybe that advantage disappears for both random play and for perfect play, but it is remarkable how constant it is in between.

It sure is a remarkable fact... Do you have any references about this? I can't say which is more interesting: an advantage that changes with rating, or one that stays static! :)  As a researcher I actually have an answer: the static advantage, since it's easier to detect! ::)

I do believe we will stumble into an advantage in the future... No reason. Should we start a bet, to run in parallel to the Arimaa Challenge? ;)

Title: Re: Asymmetrical static evaluation?
Post by aaaa on Apr 22nd, 2007, 7:27pm
Based on my last game against bot_Clueless2006Fast I ponder another kind of asymmetrical evaluation I could imagine making sense. In that game the bot managed to immobilize my elephant and subsequently went to extreme lengths maintaining the blockade. Problem was that it needed its own elephant for this, making it in effect worse than useless. Trying to rotate it out was clearly infeasible in that position especially given its lack of strategic insight. Since, however, one would like to prevent a bot from becoming itself a victim of an elephant blockade, it would make sense to evaluate a position differently depending on which side the bot is.

Title: Re: Asymmetrical static evaluation?
Post by NIC1138 on Apr 22nd, 2007, 9:46pm

on 04/22/07 at 19:27:21, aaaa wrote:
Since, however, one would like to prevent a bot from becoming itself a victim of an elephant blockade, it would make sense to evaluate a position differently depending on which side the bot is.

Where does the asymmetry come in?... There is no contradiction between making a blockade and escaping one. You just need to set the point at which you would prefer one to the other...

You mean that the program must have a form of "strategy register" where it decides what to do in the long term, and then acts accordingly? Because that is something I've been thinking about lately, and I wonder who might have done something like this already. Does anybody have any references?....

Title: Re: Asymmetrical static evaluation?
Post by aaaa on Apr 22nd, 2007, 10:15pm
What I'm saying here is that, for a bot, being blockaded is more of a disadvantage than doing the blockading is an advantage. That's the asymmetry here.
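
One way to picture this kind of asymmetry is an evaluation term whose penalty and bonus have different magnitudes. The sketch below is purely illustrative: the weights and the function name `blockade_term` are made-up assumptions, not values from any existing bot.

```python
# Hypothetical weights: being blockaded hurts the bot more than
# blockading the opponent helps it.
BLOCKADED_PENALTY = 300
BLOCKADING_BONUS = 100


def blockade_term(bot_elephant_blockaded, opponent_elephant_blockaded):
    """Blockade contribution to the score, from the bot's own point of view."""
    score = 0
    if bot_elephant_blockaded:
        score -= BLOCKADED_PENALTY
    if opponent_elephant_blockaded:
        score += BLOCKADING_BONUS
    return score
```

With equal weights the term would be equal and opposite for the two sides; making the penalty larger than the bonus is what breaks that symmetry.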

Title: Re: Asymmetrical static evaluation?
Post by 99of9 on Apr 22nd, 2007, 11:38pm
You are right aaaa.

I have not put that into my bot, but I can see that it would sometimes be useful.

Title: Re: Asymmetrical static evaluation?
Post by camelback on Apr 23rd, 2007, 11:52pm

on 04/22/07 at 19:27:21, aaaa wrote:
Based on my last game against bot_Clueless2006Fast I ponder another kind of asymmetrical evaluation I could imagine making sense. In that game the bot managed to immobilize my elephant and subsequently went to extreme lengths maintaining the blockade. Problem was that it needed its own elephant for this, making it in effect worse than useless. Trying to rotate it out was clearly infeasible in that position especially given its lack of strategic insight. Since, however, one would like to prevent a bot from becoming itself a victim of an elephant blockade, it would make sense to evaluate a position differently depending on which side the bot is.



aaaa, here is another example of a bot trying to maintain a useless blockade.
Gnobby maintained the blockade using all the pieces to the end and to my surprise lost the game :D

http://arimaa.com/arimaa/games/jsShowGame.cgi?gid=50422&s=w

Title: Re: Asymmetrical static evaluation?
Post by ddyer on May 17th, 2007, 12:55pm
This thread is mixing discussions of two very different things.  The original question is whether the evaluation FUNCTION should be the same for both sides.  The other question, which took over the later stages of the thread, is whether the VALUE of the function initially favors silver or gold.

Title: Re: Asymmetrical static evaluation?
Post by Fritzlein on May 17th, 2007, 2:02pm

on 05/17/07 at 12:55:15, ddyer wrote:
This thread is mixing discussions of two very different things.

Actually three different things: whether the side to move affects static evaluation, which side has the advantage from the starting position, and whether the evaluations for the two sides are equal and opposite.

On the third point (which aaaa raised), I recall that the chess software Crafty used an asymmetrical evaluation, at least at some point in its evolution.  In order to give Crafty a better chance of beating humans, Hyatt made Crafty hate locked pawns.  If the pawn structure was static, Crafty could think that it was losing both sides of the position.

For a computer which is able to beat humans in most positions, it is an interesting concept to avoid the few positions that give humans an edge.  On the other hand, I'm not sure how useful that concept is for Arimaa at present.  Right now there are many strategic features that bots don't grasp well, so I would like any of those features to be present no matter which side I am playing.  For example, against a bot I would like to

* Give my horse hostage to its elephant or get its horse hostage with my elephant
* Frame its camel or get my camel framed
* Get its elephant blockaded (almost always) or allow my elephant to be blockaded (in some circumstances)
* Get in a race where we are each capturing as fast as possible (bots don't weigh material properly against remote goal threats)
* Pull out the bot's rabbits (almost always) or advance my own rabbits (in some circumstances)

In short, many strategic features of Arimaa are double-edged.  My advantage over bots comes from recognizing the importance of each feature, and how it changes my long-term objectives.  You can't get around that by programming a bot to avoid all strategic features.  Instead it has to learn how to play differently (and well) in each circumstance.


Title: Re: Asymmetrical static evaluation?
Post by aaaa on May 17th, 2007, 7:24pm
To go back to the originating issue brought up by IdahoEv: The idea of according "en-prise penalties" can be intriguing at first, but you have to realize that complications set in if you are in a situation where different pieces can be captured in one move. Then you need your engine to figure out how many of them can be captured in one move. If it's just one, then instead of the sum the maximum of the en-prise penalties should be added to the evaluation score and if it's more, then it gets even more complicated. All in all, it just doesn't seem to be worth it given the concept of quiescence search (although in that case you might be able to find a use for asymmetric evaluation for the purpose of distinguishing stable and unstable positions).
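
The sum-versus-maximum point can be sketched as follows. This is a simplification under a loudly stated assumption: it treats any `max_captures_per_turn` of the hanging pieces as capturable together, which is exactly the part the post says gets complicated in practice. The function name and piece values are hypothetical.

```python
def en_prise_penalty(hanging_values, max_captures_per_turn):
    """Penalty for hanging pieces, limited by how many can fall in one turn.

    With max_captures_per_turn == 1 this is the maximum of the values,
    not their sum, because the opponent can only take one piece.
    """
    most_valuable_first = sorted(hanging_values, reverse=True)
    return sum(most_valuable_first[:max_captures_per_turn])


hanging = [3, 5, 2]  # hypothetical values of three pieces that are en prise
print(en_prise_penalty(hanging, 1), en_prise_penalty(hanging, 2), sum(hanging))
```

Summing all three values would overstate the threat whenever the opponent cannot actually collect every piece in a single turn.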

Title: Re: Asymmetrical static evaluation?
Post by IdahoEv on Jun 11th, 2007, 7:31pm
I'm glad to see someone bring it back to the original issue.

It occurs to me that Arimaa frequently goes through many turns of both players alternating en prise or goal threats while one player attempts to build up two threats at once.   If, say, gold has a threat on the board, silver may be able to delay it without preventing gold from re-establishing it, and this alternation may proceed 5 or 6 turns until silver either breaks the threat entirely, establishes a matching threat, or gold establishes two threats at once.

It seems to me that the kind of state that would lead to this play would be impossible to evaluate correctly if you did not take side to move into account in your evaluation function.    


