Arimaa Forum « Empirically derived material evaluators Part II »


(Moderator: supersamu)
   Author  Topic: Empirically derived material evaluators Part II  (Read 11744 times)
99of9
Forum Guru
*****




Gnobby's creator (player #314)

  toby_hudson  


Gender: male
Posts: 1413
Re: Empirically derived material evaluators Part I
« Reply #30 on: Apr 22nd, 2008, 2:15am »

on Nov 16th, 2006, 9:56pm, Fritzlein wrote:
Which of the systems is the IdahoEv System?  My vote would be for the optimized linearAB to take a place next to the "official" FAME and DAPE scores on Janzert's material calculator.  Or maybe you aren't done yet?

 
I'd recommend against putting your name to this method just yet :).  Reasons below.
 
on Nov 16th, 2006, 4:43pm, IdahoEv wrote:

LinearAB
 
The optimized values:
A=1.241 +/- 0.004
B=1.316 +/- 0.002
 
I'm continually surprised at how well this evaluator has performed.   I originally implemented it as a toss-off algorithm just to test my optimizer.   In particular, the fact that it simply values all rabbits at 1.0 points seems incredibly naive to me, as I believe that the 8th rabbit lost should be much more expensive than the 1st rabbit lost.
 
The empirical optimizer seems to disagree.  What can I say.

 
I just discovered a fairly serious problem with linearAB (and everything based on it).  Sometimes it will refuse to kill free enemy pieces!!
 
Say you have a full army and your opponent has d8r.  The collapsed notation is:
004228-000108 linearAB_score = 12.71
 
Now you have the opportunity to kill the dog, leaving the board as:
000088-000008 linearAB_score = 9.93
 
Because it reduces your score (!) you will not make the capture.  Or if the sides were reversed, this could result in deliberate suicides.
 
The problem is related to the fact that levels collapse down, but the value of each level remains the same.  Perhaps collapsing up would be better, but this might cause other problems.
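
The downward collapse behind the strings above can be sketched roughly as follows. This is a reconstruction inferred only from the collapsed strings quoted in this thread, not IdahoEv's actual code: the function name `collapse_down`, the dict-based army format, and the merge rule (adjacent piece types merge when both are held by the same single side) are all assumptions. It does not compute the linearAB score, since the formula is not given here.

```python
# Hypothetical sketch of "downward" level collapse, inferred from the
# collapsed strings in this thread. Names and data format are assumptions.

NON_RABBITS = "EMHDC"  # non-rabbit piece types, strongest first

FULL_ARMY = {"E": 1, "M": 1, "H": 2, "D": 2, "C": 2, "R": 8}


def collapse_down(gold, silver):
    """Return a string like '004228-000108' for two armies.

    gold, silver: dicts mapping piece letters to counts (missing = 0).
    Output: five non-rabbit levels plus a rabbit count per side, with the
    occupied levels packed toward the rabbit (right-hand) end.
    """
    groups = []  # [gold_count, silver_count] per level, strongest first
    for piece in NON_RABBITS:
        g, s = gold.get(piece, 0), silver.get(piece, 0)
        if g == 0 and s == 0:
            continue  # nobody owns this type, so it is not a level
        if groups:
            prev_g, prev_s = groups[-1]
            # Assumed merge rule: this type and the previous level are both
            # held by gold only, or both by silver only, so no enemy piece
            # tells them apart.
            if (s == 0 and prev_s == 0) or (g == 0 and prev_g == 0):
                groups[-1] = [prev_g + g, prev_s + s]
                continue
        groups.append([g, s])

    pad = [0] * (len(NON_RABBITS) - len(groups))  # empty levels at the top
    gold_levels = pad + [g for g, _ in groups] + [gold.get("R", 0)]
    silver_levels = pad + [s for _, s in groups] + [silver.get("R", 0)]
    return "".join(map(str, gold_levels)) + "-" + "".join(map(str, silver_levels))


print(collapse_down(FULL_ARMY, {"D": 1, "R": 8}))  # 004228-000108
print(collapse_down(FULL_ARMY, {"R": 8}))          # 000088-000008
```

Under this reading, capturing the dog moves gold's four strongest pieces from level 3 down to level 5, which fits the explanation that the levels collapse down while the value of each level stays the same.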
 
PS. Under this eval, Arimaabuff's CR_only handicap is actually more valuable than an R_only handicap!!!
« Last Edit: Apr 22nd, 2008, 2:22am by 99of9 »
aaaa
Forum Guru
*****



Arimaa player #958

   


Posts: 768
Re: Empirically derived material evaluators Part I
« Reply #31 on: Apr 22nd, 2008, 8:49am »

Best then to just forgo any collapsing at all with this system and take the lack of ability to recognize equivalent states for granted.
IdahoEv
Forum Guru
*****



Arimaa player #1753

   


Gender: male
Posts: 405
Re: Empirically derived material evaluators Part I
« Reply #32 on: Apr 22nd, 2008, 1:03pm »

on Apr 22nd, 2008, 2:15am, 99of9 wrote:

I just discovered a fairly serious problem with linearAB (and everything based on it).  Sometimes it will refuse to kill free enemy pieces!!
 
Say you have a full army and your opponent has d8r, the collapsed notation is:
004228-000108 linearAB_score = 12.71
 
Now you have the opportunity to kill the dog, leaving the board as:
000088-000008 linearAB_score = 9.93

 
That's definitely a very interesting point.  
 
However, trying it out I am unable to construct any such cases (where capturing reduces your score) when both sides still possess their elephant.   And even after a lost elephant, it looks like that sort of collapse fault only occurs in extreme corner cases where one side has essentially no material left.
 
The primary goal of a material eval function, for me, is bot development, and I don't think cases like this one occur in competitive games.
 
99of9
Re: Empirically derived material evaluators Part I
« Reply #33 on: Apr 22nd, 2008, 8:34pm »

on Apr 22nd, 2008, 1:03pm, IdahoEv wrote:
However, trying it out I am unable to construct any such cases (where capturing reduces your score) when both sides still possess their elephant.   And even after a lost elephant, it looks like that sort of collapse fault only occurs in extreme corner cases where one side has essentially no material left.

 
When you have a full army and your opponent has ed8r the score is  
[013228-010108] score 10.56
 
When you kill his dog it becomes:
[000178-000108] score 8.69
 
Now I agree these anti-capture situations are only for quite unbalanced armies (or very strange trades), but the issue of collapsing may cause other more subtle problems even in more normal situations.
« Last Edit: Apr 22nd, 2008, 8:37pm by 99of9 »
99of9
Re: Empirically derived material evaluators Part I
« Reply #34 on: Apr 22nd, 2008, 8:48pm »

on Apr 22nd, 2008, 8:34pm, 99of9 wrote:
the issue of collapsing may cause other more subtle problems even in more normal situations.

 
Here's one (which according to linearAB is almost balanced):
EMHH4R vs ed8r (silver ahead by 0.34)
EMHH4R vs e8r (silver ahead by 0.28)
 
In this case gold may prefer some positional advantage worth more than 0.06 rather than killing the opponent's second strongest piece!!
« Last Edit: Apr 22nd, 2008, 8:48pm by 99of9 »
IdahoEv
Re: Empirically derived material evaluators Part I
« Reply #35 on: Apr 23rd, 2008, 4:05am »

Okay!  You've convinced me.  :)
 
I still believe that understanding level collapse at some level is important, but it is clear that simple downward collapse will not suffice.
99of9
Re: Empirically derived material evaluators Part I
« Reply #36 on: Apr 23rd, 2008, 5:39am »

on Apr 23rd, 2008, 4:05am, IdahoEv wrote:
I still believe that understanding level collapse at some level is important, but it is clear that simple downward collapse will not suffice.

Yes, I agree with that.  DAPE accounts for it automatically because values only depend on the number of stronger and the number of equal pieces.  From memory, FAME collapses up in a fancy way?
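
To illustrate why an evaluator built on "number of stronger and number of equal pieces" needs no explicit collapse step, here is a small sketch. It is not the DAPE formula: it only computes, for each non-rabbit piece type, the pair (stronger enemy pieces, equal enemy pieces). The name `relative_signature`, the dict format, and the reading of "stronger" and "equal" as counts of enemy non-rabbit pieces are assumptions.

```python
# Hypothetical illustration only; this is not the DAPE evaluator itself.

STRENGTH = {"E": 5, "M": 4, "H": 3, "D": 2, "C": 1}  # rabbits left out

FULL_ARMY = {"E": 1, "M": 1, "H": 2, "D": 2, "C": 2, "R": 8}


def relative_signature(own, enemy):
    """Map each non-rabbit type held in `own` to
    (number of stronger enemy pieces, number of equal enemy pieces)."""
    signature = {}
    for piece, rank in STRENGTH.items():
        if own.get(piece, 0) == 0:
            continue  # this side has no piece of this type
        stronger = sum(
            count
            for other, count in enemy.items()
            if other in STRENGTH and STRENGTH[other] > rank
        )
        equal = enemy.get(piece, 0)
        signature[piece] = (stronger, equal)
    return signature


print(relative_signature(FULL_ARMY, {"D": 1, "R": 8}))
# {'E': (0, 0), 'M': (0, 0), 'H': (0, 0), 'D': (0, 1), 'C': (1, 0)}
```

Against d8r, the elephant, camel and horses all get the same pair (0, 0), so any value that depends only on these counts treats them as equivalent without renumbering levels.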
Fritzlein
Forum Guru
*****



Arimaa player #706

   

Gender: male
Posts: 5928
Re: Empirically derived material evaluators Part I
« Reply #37 on: Apr 23rd, 2008, 8:09am »

on Apr 23rd, 2008, 5:39am, 99of9 wrote:
From memory, FAME collapses up in a fancy way?

Yes, FAME collapses the pieces up, and the rabbit value is divided by the amount of enemy material, so it goes up too as pieces are captured.  One thing that FAME does generally right (although probably not accurately) is consider the same absolute material advantage to be worth more as the board empties out.  One thing that FAME does generally wrong (and DAPE does right?) is that it fails to consider each piece relative to all opposing pieces, and looks only at the piece it "lines up against".
« Last Edit: Apr 23rd, 2008, 8:16am by Fritzlein »

99of9
Re: Empirically derived material evaluators Part I
« Reply #38 on: Apr 23rd, 2008, 8:42am »

on Apr 23rd, 2008, 8:09am, Fritzlein wrote:
One thing that FAME does generally right (although probably not accurately) is consider the same absolute material advantage to be worth more as the board empties out.

That is only good if the pieces that are emptying are higher in value than the place where the inequality is, or if the imbalance includes an imbalance in the number of pieces.  FAME and DAPE do that right.  If the pieces that are emptying are lower than the place where the inequality is, and the inequality is an equal-numbered trade (say M for H), I believe the same absolute material advantage decreases in value, but both DAPE and FAME keep it constant.
 
Quote:
One thing that FAME does generally wrong (and DAPE does right?) is that it fails to consider each piece relative to all opposing pieces, and looks only at the piece it "lines up against".

Yes, DAPE does this, but everything depends on the word "consider"... I'm sure there are better and worse ways to consider them.