Author |
Topic: Empirically derived material evaluators Part II (Read 11744 times) |
|
99of9
Forum Guru
Gnobby's creator (player #314)
Posts: 1413
|
|
Re: Empirically derived material evaluators Part I
« Reply #30 on: Apr 22nd, 2008, 2:15am » |
on Nov 16th, 2006, 9:56pm, Fritzlein wrote: Which of the systems is the IdahoEv System? My vote would be for the optimized linearAB to take a place next to the "official" FAME and DAPE scores on Janzert's material calculator. Or maybe you aren't done yet? |
| I'd recommend against putting your name to this method just yet. Reasons below.
on Nov 16th, 2006, 4:43pm, IdahoEv wrote:
LinearAB
The optimized values:
A = 1.241 +/- 0.004
B = 1.316 +/- 0.002
I'm continually surprised at how well this evaluator has performed. I originally implemented it as a toss-off algorithm just to test my optimizer. In particular, the fact that it simply values all rabbits at 1.0 points seems incredibly naive to me, as I believe that the 8th rabbit lost should be much more expensive than the 1st rabbit lost. The empirical optimizer seems to disagree. What can I say. |
| I just discovered a fairly serious problem with linearAB (and everything based on it). Sometimes it will refuse to kill free enemy pieces!!
Say you have a full army and your opponent has d8r. The collapsed notation is:
004228-000108   linearAB_score = 12.71
Now you have the opportunity to kill the dog, leaving the board as:
000088-000008   linearAB_score = 9.93
Because the capture reduces your score (!), you will not make it. Or, if the sides were reversed, this could result in deliberate suicides.
The problem is related to the fact that levels collapse down, but the value of each level remains the same. Perhaps collapsing up would be better, but this might cause other problems.
PS. Under this eval, Arimaabuff's CR_only handicap is actually more valuable than an R_only handicap!!!
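The two scores above can be reproduced with a short Python sketch. It assumes a value scheme that is not spelled out in this post and is inferred only from the quoted numbers: a rabbit is worth 1.0, the lowest collapsed officer level is worth A, and each level above it is worth B times the level below. The function names `level_values` and `score_collapsed` are made up for illustration.

```python
A = 1.241  # optimized value quoted in the post
B = 1.316  # optimized value quoted in the post


def level_values(a=A, b=B):
    """Per-piece value of the six slots in collapsed notation, strongest first.

    Assumption (inferred from the quoted scores, not stated in the thread):
    rabbit = 1, lowest officer level = a, each higher level = b * level below.
    """
    officers = [a * b ** k for k in range(5)]  # weakest officer level first
    return officers[::-1] + [1.0]              # strongest first, rabbits last


def score_collapsed(notation, a=A, b=B):
    """Score a position written like '004228-000108' (gold minus silver)."""
    gold, silver = notation.split("-")
    values = level_values(a, b)

    def total(counts):
        # Each character is the number of pieces sitting in that slot.
        return sum(int(c) * v for c, v in zip(counts, values))

    return total(gold) - total(silver)


before = score_collapsed("004228-000108")  # full army vs d8r
after = score_collapsed("000088-000008")   # same position after the dog is captured
print(f"{before:.2f} {after:.2f}")
```

Under that assumed scheme the capture lowers gold's score, because every gold officer drops to the bottom officer slot while the per-slot values stay fixed.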
|
« Last Edit: Apr 22nd, 2008, 2:22am by 99of9 » |
|
aaaa
Forum Guru
Arimaa player #958
Posts: 768
|
|
Re: Empirically derived material evaluators Part I
« Reply #31 on: Apr 22nd, 2008, 8:49am » |
Best, then, to forgo collapsing altogether with this system and accept that it will not recognize equivalent states.
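A minimal sketch of what scoring without any collapsing might look like, assuming each piece type keeps a fixed value (rabbit 1, cat A, and each stronger type B times the one below). That value table is an assumption for illustration, not something given in the thread, and `material` and `score_uncollapsed` are hypothetical names.

```python
A, B = 1.241, 1.316  # optimized values quoted earlier in the thread

# Fixed value per piece type; never re-ranked as pieces leave the board.
PIECE_VALUE = {
    "R": 1.0,
    "C": A,
    "D": A * B,
    "H": A * B ** 2,
    "M": A * B ** 3,
    "E": A * B ** 4,
}


def material(army):
    """Total value of an army given as {piece_letter: count}."""
    return sum(PIECE_VALUE[piece] * count for piece, count in army.items())


def score_uncollapsed(gold, silver):
    """Gold's material minus silver's, with no level collapsing."""
    return material(gold) - material(silver)


full = {"E": 1, "M": 1, "H": 2, "D": 2, "C": 2, "R": 8}
before = score_uncollapsed(full, {"D": 1, "R": 8})  # full army vs d8r
after = score_uncollapsed(full, {"R": 8})           # after capturing the dog
print(after > before)
```

With fixed values a capture can only raise the capturing side's score. The cost is the one named above: a lone elephant against rabbits and a lone cat against rabbits are functionally the same, yet they receive different scores.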
|
|
|
IdahoEv
Forum Guru
Arimaa player #1753
Posts: 405
|
|
Re: Empirically derived material evaluators Part I
« Reply #32 on: Apr 22nd, 2008, 1:03pm » |
on Apr 22nd, 2008, 2:15am, 99of9 wrote:
I just discovered a fairly serious problem with linearAB (and everything based on it). Sometimes it will refuse to kill free enemy pieces!!
Say you have a full army and your opponent has d8r, the collapsed notation is:
004228-000108   linearAB_score = 12.71
Now you have the opportunity to kill the dog, leaving the board as:
000088-000008   linearAB_score = 9.93 |
| That's definitely a very interesting point. However, trying it out I am unable to construct any such cases (where capturing reduces your score) when both sides still possess their elephant. And even after a lost elephant, it looks like that sort of collapse fault only occurs in extreme corner cases where one side has essentially no material left. The primary goal of a material eval function, for me, is bot development, and I don't think cases like this one occur in competitive games.
|
|
|
99of9
Forum Guru
Gnobby's creator (player #314)
Posts: 1413
|
|
Re: Empirically derived material evaluators Part I
« Reply #33 on: Apr 22nd, 2008, 8:34pm » |
on Apr 22nd, 2008, 1:03pm, IdahoEv wrote: However, trying it out I am unable to construct any such cases (where capturing reduces your score) when both sides still possess their elephant. And even after a lost elephant, it looks like that sort of collapse fault only occurs in extreme corner cases where one side has essentially no material left. |
| When you have a full army and your opponent has ed8r, the score is:
[013228-010108]   score 10.56
When you kill his dog it becomes:
[000178-000108]   score 8.69
Now I agree these anti-capture situations are only for quite unbalanced armies (or very strange trades), but the issue of collapsing may cause other more subtle problems even in more normal situations.
|
« Last Edit: Apr 22nd, 2008, 8:37pm by 99of9 » |
|
99of9
Forum Guru
Gnobby's creator (player #314)
Posts: 1413
|
|
Re: Empirically derived material evaluators Part I
« Reply #34 on: Apr 22nd, 2008, 8:48pm » |
on Apr 22nd, 2008, 8:34pm, 99of9 wrote: the issue of collapsing may cause other more subtle problems even in more normal situations. |
| Here's one (which according to linearAB is almost balanced):
EMHH4R vs ed8r (silver ahead by 0.34)
EMHH4R vs e8r (silver ahead by 0.28)
In this case gold may prefer some positional advantage worth more than 0.06 rather than killing the opponent's second strongest piece!!
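The downward collapse itself can be sketched in Python, starting from the uncollapsed armies. Two things here are assumptions rather than anything confirmed in the thread: the merge rule (adjacent occupied piece types share a level when both are held by the same single side) and the value scheme (rabbit 1, lowest officer level A, each level above worth B times the one below). Both were chosen because they reproduce the scores quoted in this thread. The names `collapse_down` and `linear_ab` are hypothetical.

```python
A, B = 1.241, 1.316  # optimized values quoted earlier in the thread
OFFICERS = "CDHME"   # weakest to strongest; rabbits are scored separately


def collapse_down(gold, silver):
    """Map each occupied officer type to a collapsed level (1 = lowest)."""
    levels = {}
    level = 0
    prev_owner = None
    for piece in OFFICERS:
        g, s = gold.get(piece, 0), silver.get(piece, 0)
        if g == 0 and s == 0:
            continue  # nobody holds this type, so it takes up no level
        owner = "both" if (g and s) else ("gold" if g else "silver")
        # Merge with the previous occupied type only when the same single
        # side owns both, so no enemy piece can tell them apart.
        if not (owner == prev_owner and owner != "both"):
            level += 1
        levels[piece] = level
        prev_owner = owner
    return levels


def linear_ab(gold, silver):
    """Gold minus silver under the assumed linearAB value scheme."""
    levels = collapse_down(gold, silver)

    def total(army):
        officers = sum(
            n * A * B ** (levels[p] - 1)
            for p, n in army.items()
            if p != "R" and n
        )
        return officers + army.get("R", 0)

    return total(gold) - total(silver)


gold = {"E": 1, "M": 1, "H": 2, "R": 4}
with_dog = linear_ab(gold, {"E": 1, "D": 1, "R": 8})  # EMHH4R vs ed8r
without_dog = linear_ab(gold, {"E": 1, "R": 8})       # EMHH4R vs e8r
print(f"{with_dog:.2f} {without_dog:.2f}")
```

Negative values mean silver is ahead. Under these assumptions the sketch puts the swing from capturing the dog at roughly 0.06, in line with the figures above.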
|
« Last Edit: Apr 22nd, 2008, 8:48pm by 99of9 » |
|
IdahoEv
Forum Guru
Arimaa player #1753
Posts: 405
|
|
Re: Empirically derived material evaluators Part I
« Reply #35 on: Apr 23rd, 2008, 4:05am » |
Okay! You've convinced me. I still believe that understanding level collapse at some level is important, but it is clear that simple downward collapse will not suffice.
|
|
|
99of9
Forum Guru
Gnobby's creator (player #314)
Posts: 1413
|
|
Re: Empirically derived material evaluators Part I
« Reply #36 on: Apr 23rd, 2008, 5:39am » |
on Apr 23rd, 2008, 4:05am, IdahoEv wrote:I still believe that understanding level collapse at some level is important, but it is clear that simple downward collapse will not suffice. |
| Yes, I agree with that. DAPE accounts for it automatically because values only depend on the number of stronger and the number of equal pieces. FAME collapses up in a fancy way from memory?
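To illustrate why an evaluator built on those two counts needs no explicit collapse step, here is a small Python sketch that computes, for each piece type, how many enemy pieces are stronger and how many are equal. It shows only the inputs described above, not the DAPE formula itself, and `signatures` is a hypothetical helper name.

```python
STRENGTH = {"R": 0, "C": 1, "D": 2, "H": 3, "M": 4, "E": 5}


def signatures(army, enemy):
    """For each piece type in `army`: (enemy pieces stronger, enemy pieces equal)."""
    out = {}
    for piece, count in army.items():
        if count == 0:
            continue
        stronger = sum(
            n for other, n in enemy.items() if STRENGTH[other] > STRENGTH[piece]
        )
        equal = enemy.get(piece, 0)
        out[piece] = (stronger, equal)
    return out


full = {"E": 1, "M": 1, "H": 2, "D": 2, "C": 2, "R": 8}
sig = signatures(full, {"E": 1, "R": 8})  # full army vs e8r
for piece in "EMHDCR":
    print(piece, sig[piece])
```

Against e8r, the camel, horses, dogs and cats all get the same pair of counts, so any value that depends only on these counts treats them alike without re-ranking anything.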
|
|
|
Fritzlein
Forum Guru
Arimaa player #706
Posts: 5928
|
|
Re: Empirically derived material evaluators Part I
« Reply #37 on: Apr 23rd, 2008, 8:09am » |
on Apr 23rd, 2008, 5:39am, 99of9 wrote:FAME collapses up in a fancy way from memory? |
| Yes, FAME collapses the pieces up, and the rabbit value is divided by the amount of enemy material, so it goes up too as pieces are captured.
One thing that FAME does generally right (although probably not accurately) is consider the same absolute material advantage to be worth more as the board empties out.
One thing that FAME does generally wrong (and DAPE does right?) is fail to consider each piece relative to all opposing pieces instead of only the piece it "lines up against".
|
« Last Edit: Apr 23rd, 2008, 8:16am by Fritzlein » |
|
99of9
Forum Guru
Gnobby's creator (player #314)
Posts: 1413
|
|
Re: Empirically derived material evaluators Part I
« Reply #38 on: Apr 23rd, 2008, 8:42am » |
on Apr 23rd, 2008, 8:09am, Fritzlein wrote:One thing that FAME does generally right (although probably not accurately) is consider the same absolute material advantage to be worth more as the board empties out. |
| That is only good if the pieces that are emptying are higher in value than the place where the inequality is, or if the imbalance includes an imbalance in the number of pieces. FAME and DAPE do that right.
If the pieces that are emptying are lower than the place where the inequality is, and the inequality is an equal-numbered trade (say M for H), I believe the same absolute material advantage decreases in value, but both DAPE and FAME keep it constant.
Quote: One thing that FAME does generally wrong (and DAPE does right?) is fail to consider each piece relative to all opposing pieces instead of only the piece it "lines up against". |
| Yes, DAPE does this, but everything depends on the word "consider"... I'm sure there are better and worse ways to consider them.
|
|
|
|