Arimaa Forum - Empirically derived material evaluators Part II

Topic: Empirically derived material evaluators Part II (Read 12107 times)

99of9
Forum Guru

Gnobby's creator (player #314)

Posts: 1413

Re: Empirically derived material evaluators Part I
« Reply #30 on: Apr 22nd, 2008, 2:15am »

on Nov 16th, 2006, 9:56pm, Fritzlein wrote:

Which of the systems is the IdahoEv System? My vote would be for the optimized linearAB to take a place next to the "official" FAME and DAPE scores on Janzert's material calculator. Or maybe you aren't done yet?

I'd recommend against putting your name to this method just yet. Reasons below.

on Nov 16th, 2006, 4:43pm, IdahoEv wrote:

LinearAB

The optimized values:
A=1.241 +/- 0.004
B=1.316 +/- 0.002

I'm continually surprised at how well this evaluator has performed. I originally implemented it as a toss-off algorithm just to test my optimizer. In particular, the fact that it simply values all rabbits at 1.0 points seems incredibly naive to me, as I believe that the 8th rabbit lost should be much more expensive than the 1st rabbit lost.

The empirical optimizer seems to disagree. What can I say.

I just discovered a fairly serious problem with linearAB (and everything based on it). Sometimes it will refuse to kill free enemy pieces!!

Say you have a full army and your opponent has d8r (a dog and eight rabbits). The collapsed notation is:
004228-000108 linearAB_score = 12.71

Now you have the opportunity to kill the dog, leaving the board as:
000088-000008 linearAB_score = 9.93

Because it reduces your score (!) you will not make the capture. Or if the sides were reversed, this could result in deliberate suicides.

The problem is related to the fact that levels collapse down, but the value of each level remains the same. Perhaps collapsing up would be better, but this might cause other problems.
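To make the notation concrete, here is a minimal Python sketch of a downward collapse. It is a reconstruction inferred only from the collapsed strings quoted in this thread, not IdahoEv's actual code. The function name `collapse_down`, the dictionary input format, and the merge rule (adjacent officer types are merged when only one side owns them) are all assumptions. The six digits are assumed to run from the elephant level down to rabbits. The linearAB scoring formula is not stated in this thread, so no scores are computed.

```python
# Hypothetical reconstruction of "collapsing down"; not the original code.
OFFICERS = "CDHME"  # officer letters from weakest to strongest (assumed)

FULL_ARMY = {"E": 1, "M": 1, "H": 2, "D": 2, "C": 2, "R": 8}


def collapse_down(gold, silver):
    """Return the collapsed notation 'GGGGGG-SSSSSS' for two armies.

    gold and silver map a piece letter (E, M, H, D, C, R) to a count.
    """
    levels = []  # [gold_count, silver_count] per level, weakest level first
    for piece in OFFICERS:
        g, s = gold.get(piece, 0), silver.get(piece, 0)
        if g == 0 and s == 0:
            continue  # nobody owns this piece type, so it needs no level
        prev = levels[-1] if levels else None
        # Assumed rule: merge with the level below when both this piece type
        # and that level are owned by the same single side, since no enemy
        # piece can tell them apart.
        mergeable = prev is not None and (
            (s == 0 and prev[1] == 0) or (g == 0 and prev[0] == 0)
        )
        if mergeable:
            prev[0] += g
            prev[1] += s
        else:
            levels.append([g, s])

    gold_digits = [0] * 6
    silver_digits = [0] * 6
    gold_digits[5] = gold.get("R", 0)      # rabbits always keep the last slot
    silver_digits[5] = silver.get("R", 0)
    for i, (g, s) in enumerate(levels):    # fill officer slots from the bottom
        gold_digits[4 - i] = g
        silver_digits[4 - i] = s
    return "".join(map(str, gold_digits)) + "-" + "".join(map(str, silver_digits))


print(collapse_down(FULL_ARMY, {"D": 1, "R": 8}))
print(collapse_down(FULL_ARMY, {"R": 8}))
```

Under these assumptions, the two calls reproduce the strings 004228-000108 and 000088-000008 given above. Once the dog is gone, all eight officers fall to the lowest officer level, which is consistent with the explanation that the levels collapse down while each level keeps its value.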

PS. Under this eval, Arimaabuff's CR_only handicap is actually more valuable than an R_only handicap!!!

« Last Edit: Apr 22nd, 2008, 2:22am by 99of9 »

aaaa
Forum Guru

Arimaa player #958

Posts: 768

Re: Empirically derived material evaluators Part I
« Reply #31 on: Apr 22nd, 2008, 8:49am »

Best, then, to forgo collapsing altogether with this system and accept that it cannot recognize equivalent states.


IdahoEv
Forum Guru

Arimaa player #1753

Gender: male

Posts: 405

Re: Empirically derived material evaluators Part I
« Reply #32 on: Apr 22nd, 2008, 1:03pm »

on Apr 22nd, 2008, 2:15am, 99of9 wrote:

That's definitely a very interesting point.

However, trying it out I am unable to construct any such cases (where capturing reduces your score) when both sides still possess their elephant. And even after a lost elephant, it looks like that sort of collapse fault only occurs in extreme corner cases where one side has essentially no material left.

The primary goal of a material eval function, for me, is bot development, and I don't think cases like this one occur in competitive games.


99of9
Forum Guru

Gnobby's creator (player #314)

Posts: 1413

Re: Empirically derived material evaluators Part I
« Reply #33 on: Apr 22nd, 2008, 8:34pm »

on Apr 22nd, 2008, 1:03pm, IdahoEv wrote:

However, trying it out I am unable to construct any such cases (where capturing reduces your score) when both sides still possess their elephant. And even after a lost elephant, it looks like that sort of collapse fault only occurs in extreme corner cases where one side has essentially no material left.

When you have a full army and your opponent has ed8r, the score is:
[013228-010108] score 10.56

when you kill his dog it becomes:
[000178-000108] score 8.69

Now I agree these anti-capture situations arise only with quite unbalanced armies (or very strange trades), but the issue of collapsing may cause other, more subtle problems even in more normal situations.

« Last Edit: Apr 22nd, 2008, 8:37pm by 99of9 »

99of9
Forum Guru

Gnobby's creator (player #314)

Posts: 1413

Re: Empirically derived material evaluators Part I
« Reply #34 on: Apr 22nd, 2008, 8:48pm »

on Apr 22nd, 2008, 8:34pm, 99of9 wrote:

the issue of collapsing may cause other more subtle problems even in more normal situations.

Here's one (which according to linearAB is almost balanced):
EMHH4R vs ed8r (silver ahead by 0.34)
EMHH4R vs e8r (silver ahead by 0.28)

In this case gold may prefer some positional advantage worth more than 0.06 rather than killing the opponent's second strongest piece!!
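The arithmetic behind the three examples in this thread can be laid out as a small Python sketch. The scores are copied from the posts and rewritten from the capturing side's point of view, so "silver ahead by 0.34" becomes -0.34 for gold. The greedy decision rule in `prefers_capture` and both function names are assumptions for illustration, not how any particular bot decides.

```python
# Scores quoted in this thread, from the capturing side's point of view
# (positive means the capturing side is ahead). Each entry is (before, after).
QUOTED_CASES = {
    "full army vs d8r, dog captured": (12.71, 9.93),
    "full army vs ed8r, dog captured": (10.56, 8.69),
    "EMHH4R vs ed8r, dog captured": (-0.34, -0.28),
}


def capture_gain(before, after):
    """Change in material score caused by the capture, rounded to 2 places."""
    return round(after - before, 2)


def prefers_capture(before, after, alternative_gain=0.0):
    """Assumed greedy rule: capture only if it gains at least as much as
    the best alternative (for example, a positional improvement)."""
    return capture_gain(before, after) >= alternative_gain


for label, (before, after) in QUOTED_CASES.items():
    print(label, capture_gain(before, after), prefers_capture(before, after))
```

In the first two cases the gain is negative, so a purely greedy evaluator declines a free capture. In the third the gain is only 0.06, so any alternative valued above that would be chosen over the capture.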

« Last Edit: Apr 22nd, 2008, 8:48pm by 99of9 »

IdahoEv
Forum Guru

Arimaa player #1753

Gender: male

Posts: 405

Re: Empirically derived material evaluators Part I
« Reply #35 on: Apr 23rd, 2008, 4:05am »

Okay! You've convinced me.

I still believe that understanding level collapse at some level is important, but it is clear that simple downward collapse will not suffice.


99of9
Forum Guru

Gnobby's creator (player #314)

Posts: 1413

Re: Empirically derived material evaluators Part I
« Reply #36 on: Apr 23rd, 2008, 5:39am »

on Apr 23rd, 2008, 4:05am, IdahoEv wrote:

I still believe that understanding level collapse at some level is important, but it is clear that simple downward collapse will not suffice.

Yes, I agree with that. DAPE accounts for it automatically because values only depend on the number of stronger and the number of equal pieces. FAME collapses up in a fancy way from memory?
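A short Python sketch of why inputs of that kind need no explicit collapse step. It computes only the two counts mentioned, assuming "stronger" and "equal" refer to opposing pieces; it does not implement the DAPE formula itself, which is not given in this thread. The `STRENGTH` table and the name `enemy_counts` are illustrative.

```python
# Piece strengths, rabbit weakest (standard Arimaa ordering).
STRENGTH = {"R": 0, "C": 1, "D": 2, "H": 3, "M": 4, "E": 5}


def enemy_counts(piece, enemy):
    """Return (stronger, equal): how many enemy pieces outrank the given
    piece and how many match it. enemy maps piece letters to counts."""
    mine = STRENGTH[piece]
    stronger = sum(n for p, n in enemy.items() if STRENGTH[p] > mine)
    equal = sum(n for p, n in enemy.items() if STRENGTH[p] == mine)
    return stronger, equal


# Opponent holds d8r, as in the earlier example.
d8r = {"D": 1, "R": 8}
for piece in "EMHDC":
    print(piece, enemy_counts(piece, d8r))
```

Against d8r the elephant, camel, and horse all get the same pair (0, 0), so any value built only from these counts treats them identically without relabelling levels.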


Fritzlein
Forum Guru

Arimaa player #706

Posts: 5928

Re: Empirically derived material evaluators Part I
« Reply #37 on: Apr 23rd, 2008, 8:09am »

on Apr 23rd, 2008, 5:39am, 99of9 wrote:

FAME collapses up in a fancy way from memory?

Yes, FAME collapses the pieces up, and the rabbit value is divided by the amount of enemy material, so it goes up too as pieces are captured. One thing that FAME does generally right (although probably not accurately) is consider the same absolute material advantage to be worth more as the board empties out. One thing that FAME does generally wrong (and DAPE does right?) is fail to consider each piece relative to all opposing pieces, looking instead only at the piece it "lines up against".

« Last Edit: Apr 23rd, 2008, 8:16am by Fritzlein »

99of9
Forum Guru

Gnobby's creator (player #314)

Posts: 1413

Re: Empirically derived material evaluators Part I
« Reply #38 on: Apr 23rd, 2008, 8:42am »

on Apr 23rd, 2008, 8:09am, Fritzlein wrote:

One thing that FAME does generally right (although probably not accurately) is consider the same absolute material advantage to be worth more as the board empties out.

That is only good if the pieces that are emptying are higher in value than the place where the inequality is, or if the imbalance includes an imbalance in the number of pieces. FAME and DAPE do that right. If the pieces that are emptying are lower than the place where the inequality is, and the inequality is an equal numbered trade (say M for H), I believe the same absolute material advantage decreases in value, but both DAPE and FAME keep it constant.

Quote:

One thing that FAME does generally wrong (and DAPE does right?) is fail to consider each piece relative to all opposing pieces, looking instead only at the piece it "lines up against".

Yes, DAPE does this, but everything depends on the word "consider"... I'm sure there are better and worse ways to consider them.
