Arimaa Forum (http://arimaa.com/arimaa/forum/cgi/YaBB.cgi)
Arimaa >> Events >> WCCC Protest
(Message started by: jdb on Jan 23rd, 2006, 6:29pm)

Title: WCCC Protest
Post by jdb on Jan 23rd, 2006, 6:29pm
I sent this email to Omar. It was suggested I post a copy of the email in the forum.


Quote:
Hi Omar,

As Clueless's manager, I wish to lodge a protest concerning the pairings. I think it is unfair for Clueless to have to play again in round 6. Clueless has played Bomb twice and Aamira once, while Aamira and Bomb have not played at all.

I don't know how to contact the tournament director, so if you could contact him for me, it would be much appreciated.

I can go into more detail regarding why I feel the pairings are unfair, if requested.

Thanks for your time

Jeff

Title: Re: WCCC Protest
Post by Fritzlein on Jan 24th, 2006, 11:04am
Now that Clueless has beaten Aamira in round six, the protest gains even more force.  Aamira's claim to the seventh-round bye rests on having a higher pretournament rating (as per the tournament rules), whereas Clueless' claim to the bye rests on Bomb vs. Aamira being a first-time matchup as opposed to Bomb vs. Clueless being a third-time matchup.

The non-repeat matchup should always take precedence in assigning byes between players with an equal number of byes, regardless of W-L record.  This time, however, W-L record is also tied at 3-2!  We should not let a broken pairing algorithm subvert the intent of the pairing scheme.  JDB, I hope Don rules in your favor, at least for round seven.

Title: Re: WCCC Protest
Post by doublep on Jan 24th, 2006, 2:08pm
Just to note, I don't mind a rule change even right in the middle of the tournament.  Aamira was already too lucky to have two games against Loc, and now it has lost both games to Clueless, so it just doesn't deserve the second place.  Let it lose to Bomb :)

Title: Re: WCCC Protest
Post by Ryan_Cable on Jan 24th, 2006, 4:12pm
Thank you for this very sporting statement doublep.  I am very happy to be able to avoid having the tournament marred by a messy dispute.  With the concession that Clueless deserves second place unless Aamira beats Bomb, I think it becomes a moot point which bot plays Bomb tonight and which bot plays Bomb in the morning.

Also, I think you should be quite proud of a third place finish from a bot that wasn’t even in the last CC.  And a look through their game records shows Clueless was much luckier to get both of its wins than is apparent from the tournament games.

Title: Re: WCCC Protest
Post by omar on Jan 24th, 2006, 4:42pm
Sorry I did not see Jeff's posting or email until this afternoon.

I had sent Jeff the following reply by email:


Quote:
Hi Jeff,

I'm sorry I did not see this message until Tuesday afternoon.

However the pairing is done using the Floating Triple Elimination program we developed last summer. It is completely deterministic and only the color assignment is sometimes done randomly. The tournament director who is David Levy of the ICGA would only be consulted if there is a dispute about the outcome of a game. He is not involved with the pairing or color assignments.

Since we have started the tournament using the FTE program for doing the pairing and color assignment, we should stick with it for the remainder of the tournament even if it sometimes does not do the optimal assignments. However, we should discuss this in the forum so that we can improve the program for future tournaments.

Omar


Jeff replied back that since we are using a relatively untested, newly developed, pairing system we should consider intervention since the situation appears to be clearly unfair.

I have decided to pause the next round until we resolve this issue. I will consult David Levy for his decision on whether or not we should intervene in this situation. I would also like to invite the Arimaa community to post their opinion on this issue. I will refer David to this thread when consulting him.

My own opinion now is that I think we should intervene and the round 7 game should be:
 Bomb vs Aamira


Title: Re: WCCC Protest
Post by 99of9 on Jan 24th, 2006, 5:39pm
The question is, which bot deserves the next bye?

The tournament records so far:

Clueless:
beat Aamira
bye
beat Gnobot
lost Bomb
lost Bomb
beat Aamira

Aamira:
lost Clueless
beat Loc
beat Loc
beat Gnobot
bye
lost Clueless

I've crossed out the results that are the same for both bots (disregarding which colour they played).

If we didn't know how good Loc and Bomb were, this record would basically come down to "Clueless wins head to head but loses to the other bot it plays, and vice versa for Aamira".  Because bot results  are often not transitive (ie you often find A beats B, B beats C, and C beats A), then both records would be basically equivalent.

In Aamira's favour:
* higher pretournament rating
* gets bye according to tourney rules
* according to pretournament ratings we would think that loc is a harder opponent than bomb

In Clueless' favour:
* head to head wins are better discriminators and hence more valuable than wins against others
* we think from the tournament results that bomb is a harder opponent than loc

I think it is such an unclear situation that if they both lose their last game against bomb, the 2nd place prizemoney should be split.

Title: Re: WCCC Protest
Post by fotland on Jan 24th, 2006, 11:20pm
I'm happy with whatever Omar decides.  I would ignore the pre-tournament ratings, since the bot's ratings move around so much depending on who  is playing them, and since Bomb was broken.

Title: Re: WCCC Protest
Post by Ryan_Cable on Jan 25th, 2006, 12:48am

on 01/24/06 at 17:39:33, 99of9 wrote:
The question is, which bot deserves the next bye? ...

If we didn't know how good Loc and Bomb were, this record would basically come down to "Clueless wins head to head but loses to the other bot it plays, and vice versa for Aamira".  Because bot results  are often not transitive (ie you often find A beats B, B beats C, and C beats A), then both records would be basically equivalent.

In Aamira's favour:
* higher pretournament rating
* gets bye according to tourney rules
* according to pretournament ratings we would think that loc is a harder opponent than bomb

In Clueless' favour:
* head to head wins are better discriminators and hence more valuable than wins against others
* we think from the tournament results that bomb is a harder opponent than loc

I think it is such an unclear situation that if they both lose their last game against bomb, the 2nd place prizemoney should be split.

This is one of many reasons why I prefer to go with the letter of the rules when they are unambiguous.  Once we bring in our belief that Loc was the weakest bot in the tournament and would have been crushed by Clueless to claim that Clueless has been treated unfairly, the tournament starts to be reduced to a beauty contest.

In the WC, we had upsets and near upsets in games with 300 to 500 point favorites.  Neither first, second, nor third went to Fritzlein or 99of9, though even the most pessimistic gave them 75%+ to win.  In this tournament, we had the rather surprising double upset of Aamira by Clueless.  Even though I think Bomb is 98%+ to go undefeated, I wouldn’t want to just give Bomb first place today.

Still, Clueless manifestly outperformed Aamira despite unlucky pairings, and based on doublep’s statement above, it doesn’t seem like there is any dispute left to resolve.

Title: Re: WCCC Protest
Post by RonWeasley on Jan 25th, 2006, 8:41am
We should be very reluctant to deviate from the agreed tournament rules since the precedent can have an effect on the reputation of Arimaa and future sponsorship of the Arimaa Challenge.  We need to assure the muggle world that our promises will be kept so that the motivation to develop bots remains strong.

However, here we have a case where the interested parties, and most spectators, agree that the current rules, without intervention, are not serving their intended purpose.  Current rules do allow an appeal to the tournament director.  At this point, abiding by his decision also falls within the current rules and we can proceed.  As his decision will set a precedent, which may or may not be followed in the future, we should make note of what was important in making the decision, such as, if the change is made, the fact that all participants agreed to the change.  The point is to protect the integrity of the tournament administration.

Title: Re: WCCC Protest
Post by jdb on Jan 25th, 2006, 8:51am
99of9 said:

Quote:
... Because bot results  are often not transitive (ie you often find A beats B, B beats C, and C beats A), ...


Upon further reflection, I think this is the root of the problem.

One could even argue that the pairings are unfair to Aamira. Consider what happens if Aamira is able to beat Bomb?

Title: Re: WCCC Protest
Post by Fritzlein on Jan 25th, 2006, 11:22am

on 01/24/06 at 17:39:33, 99of9 wrote:
The question is, which bot deserves the next bye?
I don't think that this is the correct question.  Yes, it is a consideration which bot is most deserving, but that is of less importance than which pairing is best.


Quote:
If we didn't know how good Loc and Bomb were, this record would basically come down to "Clueless wins head to head but loses to the other bot it plays, and vice versa for Aamira".  Because bot results  are often not transitive (ie you often find A beats B, B beats C, and C beats A), then both records would be basically equivalent.
The records aren't quite equivalent, because we know (within the tournament) that Bomb has beaten Loc.  But again, it isn't of primary importance which of the two bots has the better record.  Even if Aamira had a better won-loss record, the next match should be Bomb-Aamira.

The fact that domination isn't transitive is a strong argument in favor of avoiding repeat matchups whenever possible.  If there is some kind of fluke that determines the outcome of games between two particular bots, it shouldn't be exaggerated by playing over and over.  Imagine, for example, that Bomb had lost its game with  Loc, and that Bomb could be beaten easily by Aamira.  Imagine that Clueless was the strongest bot except that it was losing every game to Bomb due to some quirk.  That would make an even stronger case against Bomb playing Clueless for a third time before playing against Aamira once.

The idea "first decide who deserves a bye, second look for the best pairing among the others" is an artifact of my dumb algorithm, which is sadly part of the rules.  A much more desirable view is "Find the best pairing (including who gets the bye) among all available pairings".  Yes, I wouldn't let any player get two byes more than another, but beyond that avoiding repeat matchups is of much greater importance than deciding who is deserving of a bye.

Whatever Levy rules (and I do see the merit in sticking with a bad algorithm because it's the rule in force), I will argue strenuously that future floating elimination be done on the basis of "best pairing" rather than on the basis of "most deserving of a bye".
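
As a rough illustration of the "best pairing" view for the three-bot situation in this thread, here is a minimal Python sketch. It is not the FTE program; the function name `best_pairing` is hypothetical, the first candidate wins any remaining tie (an assumption), and Bomb's bye count is assumed because the thread does not state it. The meeting counts and the other bye counts come from the records posted above.

```python
def best_pairing(players, byes, meetings):
    """Pick (bye, pair) for three players, minimising repeat matchups."""
    best = None
    for bye in players:
        others = [p for p in players if p != bye]
        # Never let one player get two byes more than another.
        if byes[bye] + 1 - min(byes[p] for p in others) >= 2:
            continue
        repeats = meetings.get(frozenset(others), 0)
        # Keep the candidate whose remaining pair has met the fewest times.
        if best is None or repeats < best[0]:
            best = (repeats, bye, tuple(others))
    return best[1], best[2]

# Meetings after round six, taken from the posted records.
meetings = {
    frozenset({"Bomb", "Clueless"}): 2,
    frozenset({"Clueless", "Aamira"}): 2,
    frozenset({"Bomb", "Aamira"}): 0,
}
# Clueless and Aamira each had one bye; Bomb's count is an assumption.
byes = {"Bomb": 1, "Clueless": 1, "Aamira": 1}

bye, pair = best_pairing(["Bomb", "Clueless", "Aamira"], byes, meetings)
print(bye, pair)  # Clueless ('Bomb', 'Aamira')
```

Under these assumptions the sketch gives Clueless the bye and pairs Bomb with Aamira, the only pairing that is not a repeat.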


Title: Re: WCCC Protest
Post by acheron on Jan 25th, 2006, 3:33pm

Quote:
If there is some kind of fluke that determines the outcome of games between two particular bots, it shouldn't be exaggerated by playing over and over.


This is the key point that I believe should determine the outcome in this matter.  Pairings should be selected to provide the largest possible crossover, thereby demonstrating which bots are effective in a general way, rather than showing a narrowly targeted effectiveness that lets them defeat a single computer opponent.  Such single-target effectiveness does not demonstrate any greater skill that would make a bot effective against human players, and as such I'd rather see its impact minimized in pairings.

Title: Re: WCCC Protest
Post by omar on Jan 25th, 2006, 4:43pm
Thanks everyone for providing feedback on this issue.

I just received a reply from David Levy stating:


Quote:
I agree that Bomb should play Aamira in round 7.

I feel that the algorithm is wrong - ties should be broken in favour of (i.e. give the next bye to) the program with the most losses so far (not the fewest) and further ties broken in favour of the lowest pre-tournament rating.


That's not quite the reason I was expecting, but anyways, it settles the issue.

I have manually changed the pairing for round 7 to be bot_Bomb vs bot_Aamira with bot_Clueless getting the bye. I've scheduled the game for tomorrow morning since there would not be much notice time if the game was started this evening.

I'd like to thank David for giving a quick and definite decision which resolved this issue.

Title: Re: WCCC Protest
Post by Ryan_Cable on Jan 25th, 2006, 7:44pm
Wow, I am just stunned! :-(  I hope this is enough to scare us into writing up ironclad rules, and following them to the letter next year.  If we had followed David Levy’s system in the WC, robinson would have had to win seven games if he went undefeated:

Round 5
Adanac - bye
robinson - PMertens
Fritzlein - 99of9

Round 6
Fritzlein/99of9 - bye
robinson - Adanac

Round 7
robinson - Fritzlein/99of9

But worse still, if in Round 5 PMertens beat robinson and Fritzlein beat 99of9, robinson would have to face PMertens again.

Round 6
Fritzlein - Adanac
robinson - PMertens

Round 7
Adanac/Fritzlein - PMertens/robinson

Perhaps the most horrible thing is that Adanac would be able to win the Championship with only 5 wins as a reward for being the lowest rated loser.

Oh well, we are going to change the pairing algorithm before the next WC anyway.  I am just glad we can come away from this dispute without any hurt feelings.

Title: Re: WCCC Protest
Post by Fritzlein on Jan 26th, 2006, 12:49pm
Good points, Ryan.  It worked very well in the WC to give the bye to the undefeated player rather than the lowest loser.  When I argued for the most deserving player to get the first bye, it was because I was afraid of the most deserving player never getting a bye while others did, which would be a bit silly.

Currently my attitude is that repeated matchups are a great source of unfairness, and need to be eliminated whenever possible, even if that means a more deserving player might need an extra win to get through.  However, I'm not sure I would still like it if it meant that one player needed two wins more than another to win it all.  I wonder whether my latest proposal is vulnerable to this problem, i.e. whether considering W-L record for the bye only after considering eliminating repeat pairings could also mean that one player has a path to victory with two wins fewer than another player.

I totally agree that it would be good to hash this out more to make sure our rules are ironclad before the next elimination tournament.  I guess we'll just have to run lots of sample scenarios, and/or try to create counter-examples in which a given system of rules is unfair.

Title: Re: WCCC Protest
Post by omar on Jan 26th, 2006, 11:08pm

on 01/25/06 at 19:44:05, Ryan_Cable wrote:
Wow, I am just stunned! :-(  I hope this is enough to scare us into writing up ironclad rules, and following them to the letter next year.


We do have very well defined rules for the tournament format this year also. It's just that when we came up with the rules we focused mainly on the ability of the format to find the best player and did not pay attention to other details like repeated pairings. This tournament happened to highlight the need to avoid repeated pairings. We should certainly rethink our tournament format to incorporate what we've learned from this tournament. However I am not sure if I agree with David's proposal. I don't think that it would avoid repeated pairings in all cases and my gut feel says that it probably would not perform as well in selecting the best player. But of course we need to run more simulations to know for sure.

Title: Re: WCCC Protest
Post by Ryan_Cable on Jan 27th, 2006, 1:43am
It was widely assumed that second place went to the last player eliminated, but for next year, I think it is important to spell out exactly how second place is determined.  And since I think any pairing system will have at least the possibility of two players leaving the tournament with the same record, we should at least consider some better system of deciding second place than just order of elimination.

My example with the WC had a repeat pairing, but it was in a round where any pairing would result in a repeat.  The fact that Adanac would have been able to win the tournament with a 5-1 record, but robinson would still have had an opponent to face after going 6-0 strikes me as almost certain to be less likely to pick the True Champion than our system where you won the tournament iff you won 6 games.  Maybe there exists some number of players and number of eliminations where David’s system works best, but I would be quite stunned if it was best in general.  I am sure this will get a lot of discussion and research before the next WC.

However, the thing I find really scary is that David made a change to the tournament pairing based on an opinion that went against the letter of the rules, against the spirit in which the pairing algorithm was developed, and to some degree against the reasoning with which the FTE was created and selected.  In this case, it wasn’t all that important since jdb’s protest was not disputed by doublep, but it is a bad precedent.  In the future, I think you should select someone from within the Arimaa community as tournament director if possible, so that he is more likely to be familiar with the process in which our unique tournament structure was developed and how it is intended to work.  I think Fritzlein would make an especially good choice for the CC director.

PS Clueless’s miracle win over Bomb means that, regardless of the final outcome, the tournament we conducted will now be isomorphic to a tournament where the rules were followed to the letter.  Also, the fact that a bot as dominant as Bomb failed to go undefeated drives home how important it is to have a tournament structure focused on picking the True Champion, which FTE seems to be doing very well.

Title: Re: WCCC Protest
Post by Fritzlein on Jan 27th, 2006, 10:53am
As usual, I agree with much of what Ryan says.  We are lucky that DoubleP is so sporting.  We are lucky that Clueless won round 8, so that if the order of Round 7 and 8 had been reversed, it wouldn't have made any difference.  We are lucky that Clueless is unambiguously the second-strongest bot.

Ryan has also convinced me that we might want to consider a different way of ranking than order of elimination when the eliminated players finish with a tied record.  One possible tiebreaker used in swiss-paired chess tournaments is the sum of the wins of the opponents played.  If Clueless and Aamira had ended tied on a 3-3 record, then the tiebreaker would have been:

Aamira's opponents had 12 wins
3 Clueless
0 Loc
0 Loc
0 Gnobot
3 Clueless
6 Bomb

whereas Clueless' opponents had 24 wins
3 Aamira
0 Gnobot
6 Bomb
6 Bomb
3 Aamira
6 Bomb

That's clearly a tougher schedule for Clueless, and a good argument for Clueless to have gotten second place regardless of game order.
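
The arithmetic behind this tiebreaker can be sketched in a few lines of Python. The helper name `opponents_wins` is hypothetical; the win totals and schedules are the ones listed in this post for the hypothetical case where both bots finish 3-3.

```python
def opponents_wins(opponents, wins):
    """Sum the final win totals of every opponent faced (repeats count each time)."""
    return sum(wins[o] for o in opponents)

# Final win totals assumed in the post above.
wins = {"Bomb": 6, "Clueless": 3, "Aamira": 3, "Loc": 0, "Gnobot": 0}

# Opponents faced, in the order listed above (byes are skipped).
schedule = {
    "Aamira": ["Clueless", "Loc", "Loc", "Gnobot", "Clueless", "Bomb"],
    "Clueless": ["Aamira", "Gnobot", "Bomb", "Bomb", "Aamira", "Bomb"],
}

scores = {bot: opponents_wins(opps, wins) for bot, opps in schedule.items()}
print(scores)  # {'Aamira': 12, 'Clueless': 24}
```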

In the WC, if Adanac hadn't beaten Robinson once, then Adanac and PMertens would have had the same record of 4-2.  Order of elimination would have given PMertens second place, but the strength of schedule would have been:

PMertens' opponents had 16 wins
0-2 Grey0x2A
4-2 Adanac
1-2 Jdb
2-2 RyanCable
3-2 99of9
6-0 Robinson

Adanac's opponents had 22 wins
(forfeit) Megamau
4-2 PMertens
3-2 99of9
6-0 Robinson
3-2 Fritzlein
6-0 Robinson

I recall now that this tiebreaker usually gives someone zero points for a win by forfeit, which makes sense since a forfeit is as weak as the schedule can get, but anyway Adanac clearly would have had the tougher schedule, so this would have been another instance where order of elimination would have been a poor way to distinguish between second and third.

A second tiebreaker is then usually the number of wins among opponents defeated.  Third tiebreaker could be resurrected from last year's CC: fewest moves in all won games.  Fourth tiebreaker could be most moves in all lost games?

On one point, however, I'm not sure I can agree with Ryan:  I might not make a good tournament director for the 2007 Computer Championships.  Who's to say I won't have a vested interest next year?  ;-)

Title: Re: WCCC Protest
Post by 99of9 on Jan 27th, 2006, 4:56pm
If the places are determined primarily by for-and-against record, there is now a strong disincentive to getting the early round byes.  This may provide an incentive to enter the tournament with a lower ranking.

In all of this discussion, we need to remember some important requirements:

1) No incentive to lose a game at any point.
2) No incentive to enter the tournament with a lower measured-rating.
3) Gives the highest real-rated player the highest chance of winning (assuming transitivity).
4) Gives the second highest real-rated player the second highest chance of winning (assuming transitivity)... etc down the order monotonically.
5) Gives the highest real-rated non-winner the highest chance of taking second place.
6) Gives similar real-rated players similar chances of winning (but as per rule 2, can slightly favour the one with the highest measured-rating).

... more coming later, i'm getting called away.  Can anyone else think of others?  I haven't yet formulated one about repeat pairings - but I think it's not as self-evident as these others.

Title: Re: WCCC Protest
Post by Fritzlein on Jan 27th, 2006, 10:16pm

on 01/27/06 at 16:56:01, 99of9 wrote:
If the places are determined primarily by for-and-against record, there is now a strong disincentive to getting the early round byes.
Yes.  Also (although this may be an instance of the tail wagging the dog) swiss-paired chess tournaments usually use sliding pairing rather than folding pairing.  If they used folding pairing, the higher seeds would usually get worse tiebreaker points from playing weaker opponents.  Adanac had a tougher road than PMertens partly by losing later, but also partly by being a lower seed when folding pairing was in use.  In a swiss tournament where both players lose and win in the same rounds, sliding pairing is necessary to give the higher seed the better tiebreaker.


Quote:
3) Gives the highest real-rated player the highest chance of winning (assuming transitivity).

I don't think we should assume transitivity, for two reasons.  First, I think non-transitive situations are fairly common among bots, and not rare among humans.  We won't get a good measure of performance if we make assumptions that don't hold.

Second, and more importantly, I think that fairness is the primary principle here.  Wanting the best player to have the best chance of winning is an instance of fairness, but in looking out for the best player, we shouldn't unequally crush the chances of weaker players.

One instance of what I am talking about is the way Loc got hosed in round 3 to have to play Aamira again instead of having a shot at Clueless.  The fact that Clueless had already beaten Aamira and Aamira had already beaten Loc doesn't change my mind about that, because I don't trust transitivity.  Loc might well have had better winning chances against Clueless' defensive style of play than against Aamira's kamikaze assault.  This has negligible effect on determining the strongest player, but we still ought to be fair to Loc.

Here's a more involved example, but not too farfetched.  Suppose there are three bots A, B and C.  (Think of Aamira, Bomb, and Clueless if you must ...)  Suppose A beats B 60% of the time, B beats C 70% of the time, and C beats A 70% of the time.  In an elimination situation, each bot really, really needs the other two to play.  Bot B needs lots of C vs. A games, so that C will eliminate A, removing the threat to B.  Similarly A needs lots of B vs. C games so C will be eliminated and A's chances improve.  Whichever matchup doesn't happen (by assignment of byes or otherwise) hurts the third bot.

Now suppose that A, B, and C are in a double-elimination with bot D, which loses 80% of the time to each of them.  In tenfold round-robin, the expected finishing order (and wins) would be B(19) C(18) A(17) D(6) so B-C-A-D is by definition an ordering of the bots from best to worst.  We want B to have the best chance of winning the tournament.  Let's assume they are seeded in true order, and that the first two rounds have B beats D and C beats A, then B beats C in the winner's bracket while A beats D in the loser's bracket.  D is eliminated, B is 2-0 and A and C are both 1-1.
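
The expected win totals in the tenfold round-robin can be checked with a short Python sketch. The names `win_prob` and `expected_wins` are hypothetical; the probabilities are the ones given in this example.

```python
# Probability that the first bot of each pair beats the second.
p = {
    ("A", "B"): 0.6, ("B", "C"): 0.7, ("C", "A"): 0.7,
    ("A", "D"): 0.8, ("B", "D"): 0.8, ("C", "D"): 0.8,
}

def win_prob(x, y):
    """Chance that x beats y, looking the pair up in either order."""
    return p[(x, y)] if (x, y) in p else 1 - p[(y, x)]

def expected_wins(bots, games_per_pair=10):
    """Expected wins for each bot when every pair plays games_per_pair games."""
    return {
        x: round(sum(games_per_pair * win_prob(x, y) for y in bots if y != x), 1)
        for x in bots
    }

totals = expected_wins(["A", "B", "C", "D"])
ranking = sorted(totals, key=totals.get, reverse=True)
print(ranking)  # ['B', 'C', 'A', 'D']
```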

Now if we are trying to give the best bot the maximum chance of winning, the first bye should go to the undefeated bot, which would cause A vs. C for the second time.  Yes, this helps give the best bot the best chance of winning: the odds of tournament victory are then B(82.9%) C(6.3%) A(10.8%), but notice that C, the second-best bot, has been hosed because B vs. A doesn't happen before a repeat of C vs. A.

If we elevate the importance of avoiding repeat matchups, the first bye will go to C so that B vs. A can happen for the first time.  If there needs to be another bye because A wins, then C vs. B  and C vs. A have each happened once, and also B and A would each be 2-1, so pretournament rating could break the ties and dictate C vs. A as the next game.  By this pairing scheme the odds of tournament victory become B(73%) C(16.2%) A(10.8%).
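
Both sets of odds can be reproduced with a small Python sketch. It hard-codes the two pairing sequences described in this example rather than implementing a general pairing algorithm, and the function names are hypothetical.

```python
# Head-to-head win probabilities from the example.
A_B, B_C, C_A = 0.6, 0.7, 0.7  # A beats B, B beats C, C beats A

def twice(p_win):
    """Chance that a one-loss bot beats an unbeaten bot twice in a row."""
    return p_win ** 2

def bye_to_undefeated():
    # B (2-0) sits out; A and C (both 1-1) meet for the second time.
    c = C_A * twice(1 - B_C)        # C eliminates A, then must beat B twice
    a = (1 - C_A) * twice(A_B)      # A eliminates C, then must beat B twice
    return {"A": a, "B": 1 - a - c, "C": c}

def bye_to_avoid_repeat():
    # C sits out; B and A meet for the first time.
    # If B wins, A is out and C must beat the unbeaten B twice.
    c_if_b_wins = twice(1 - B_C)
    # If A wins, all three have one loss. B takes the next bye, C plays A,
    # and the winner meets B in a single deciding game.
    c_if_a_wins = C_A * (1 - B_C)
    a_if_a_wins = (1 - C_A) * A_B
    a = A_B * a_if_a_wins
    c = (1 - A_B) * c_if_b_wins + A_B * c_if_a_wins
    return {"A": a, "B": 1 - a - c, "C": c}

first = bye_to_undefeated()     # roughly B 82.9%, C 6.3%, A 10.8%
second = bye_to_avoid_repeat()  # roughly B 73.0%, C 16.2%, A 10.8%
print(first, second)
```

A's chances are the same either way; the choice of bye only moves probability between B and C.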

In my mind the latter pairing scheme is manifestly more fair, and this is the sort of scenario where we need keep non-transitivity in mind to avoid potential unfairness.

That said, I don't think that there is no transitivity whatsoever.  A round-robin is all you have to fall back on to define playing strength when there is no transitivity.  Yet I do believe somewhat in transitivity, i.e. that stronger players tend to be stronger against any opponent, in spite of certain quirks of style that affect particular matchups.  I would prefer more games among top bots to more games that are mismatches in an attempt to fill out a round-robin.

We need to keep generating scenarios.  Now that we have a couple of examples of unfairness when we don't avoid repeat matchups, I think we should try to generate examples where it is unfair if we do avoid repeat matchups.  We need to anticipate problems that haven't happened yet.

Title: Re: WCCC Protest
Post by fotland on Jan 28th, 2006, 12:42am
Why not combine round robin with triple elimination?  Do a single round robin, then eliminate all bots with 3 or more losses, then another single round robin, then another elimination of all with 3 or more losses (total), etc.  This eliminates the problem with assigning byes, and still lets you spend more playing time with the stronger bots playing.
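
A minimal Python sketch of this proposal, under some assumptions not stated in the post: `play(a, b)` is a hypothetical callback that returns the winner of one game, losses carry across round robins, and if every remaining bot reaches the loss limit in the same round robin the sketch simply stops and returns the bots with the fewest losses.

```python
from itertools import combinations

def shrinking_round_robin(players, play, max_losses=3):
    """Repeat single round robins, dropping bots once they reach max_losses."""
    losses = {p: 0 for p in players}
    alive = list(players)
    games = 0
    while len(alive) > 1:
        # One single round robin among the bots still in the event.
        for a, b in combinations(alive, 2):
            winner = play(a, b)
            loser = b if winner == a else a
            losses[loser] += 1
            games += 1
        survivors = [p for p in alive if losses[p] < max_losses]
        if not survivors:
            # Assumption: unresolved tie, report the bots with fewest losses.
            fewest = min(losses[p] for p in alive)
            return [p for p in alive if losses[p] == fewest], games
        alive = survivors
    return alive, games

# Toy run: five bots labelled by strength, the stronger bot always wins.
winners, games = shrinking_round_robin([5, 4, 3, 2, 1], max)
print(winners, games)  # [5] 14
```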

Title: Re: WCCC Protest
Post by omar on Jan 28th, 2006, 8:22am

on 01/27/06 at 01:43:19, Ryan_Cable wrote:
However, the thing I find really scary is that David made a change to the tournament pairing based on an opinion that went against the letter of the rules, against the spirit in which the pairing algorithm was developed, and to some degree against the reasoning with which the FTE was created and selected.  In this case, it wasn’t all that important since jdb’s protest was not disputed by doublep, but it is a bad precedent.  In the future, I think you should select someone from within the Arimaa community as tournament director if possible, so that he is more likely to be familiar with the process in which our unique tournament structure was developed and how it is intended to work.  I think Fritzlein would make an especially good choice for the CC director.


Yes, it scared me also :-) I'm sure David, being the president of the ICGA, is quite busy and also was not familiar with the reasoning behind the current tournament rules. For next year's events I think it will be much better if we follow Ryan's suggestion and have active members of the Arimaa community serve as arbiters.

Title: Re: WCCC Protest
Post by omar on Jan 28th, 2006, 8:55am

on 01/28/06 at 00:42:47, fotland wrote:
Why not combine round robin with triple elimination?  Do a single round robin, then eliminate all bots with 3 or more losses, then another single round robin, then another elimination of all with 3 or more losses (total), etc.  This eliminates the problem with assigning byes, and still lets you spend more playing time with the stronger bots playing.


We were trying to reduce the number of rounds as much as possible while trying to maximize the chances of the true highest rated player winning the tournament.

Our tournament was not really designed to recognize second place. I had assumed that the program which lost in the final round would be most deserving of second place. But I realize now that this is not always the case. I think trying to have a format recognize second place while still being as good and efficient at finding the best player will be difficult and will complicate the otherwise clean and simple rules of the current FxE formats. For next year I am inclined to eliminate second place and instead award a participation prize to all the contestants that entered and did not win first place.

Title: Re: WCCC Protest
Post by jdb on Jan 28th, 2006, 9:09am
I would be in favour of a proposal along the lines fotland suggested.

Having extra games in a bot tourny is not the same as having extra games in a human event.

FxE does not address non-transitive results. Round robin does.

In round robin, every opponent has the same "strength of schedule" which is not the case in some other formats.

His proposal also preserves the drama of a championship game.


Title: Re: WCCC Protest
Post by Fritzlein on Jan 28th, 2006, 10:56am
I like Fotland's proposal when the number of players is small, because it is fair and the number of games won't be that much larger than FXE tournaments.  For our five-player triple elimination, the first round would probably have had only one extra game, (probably Loc or Gnobot getting a fourth loss.)  The second round would also have had only one extra game at most (perhaps Aamira getting a fourth loss), and then it would have been head to head the rest of the way.  Two extra games in a tournament that was already 13 games long is a small price to pay for having no issues of fairness in pairing.

Already with 8 bots, however, there would start to be a significant number of extra games.  FTE with 8 takes 21 to 23 games, but shrinking round robin triple elimination (SSRTE) will take 28 games even if a miracle decides things in the first round robin, and more likely will take 32 or more games, for an increase of 50% over FTE.
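
The game counts quoted here follow from simple counting, sketched below in Python. It assumes every game produces exactly one loss (no draws) and that byes produce none; the function names are hypothetical.

```python
def elimination_games(n, losses_to_eliminate=3):
    """(min, max) games in an n-player elimination event.

    Every eliminated player takes exactly losses_to_eliminate losses,
    and the winner takes between 0 and losses_to_eliminate - 1.
    """
    base = (n - 1) * losses_to_eliminate
    return base, base + losses_to_eliminate - 1

def round_robin_games(n):
    """Games in a single round robin among n players."""
    return n * (n - 1) // 2

print(elimination_games(8))   # (21, 23)
print(round_robin_games(8))   # 28
print(elimination_games(5))   # (12, 14)
```

With 8 bots the first round robin alone already costs 28 games, more than the 23-game maximum for triple elimination, which is the gap described above.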

Maybe there could simply be a cutoff below which it is SSRTE and above which it falls back to FTE.  This makes sense for another reason: the larger the tournament, the less of an issue repeated pairings are.

Title: Re: WCCC Protest
Post by omar on Feb 3rd, 2006, 10:29pm

on 01/28/06 at 10:56:15, Fritzlein wrote:
Maybe there could simply be a cutoff below which it is SSRTE and above which it falls back to FTE.  This makes sense for another reason: the larger the tournament, the less of an issue repeated pairings are.


Interesting idea. For example a 16 player tournament could start out as FTE and when the number of players is perhaps 4 or less it switches to SSRTE. When it switches to SSRTE would the remaining players start with a clean loss record, or would the previous loss record be preserved?

Title: Re: WCCC Protest
Post by Fritzlein on Feb 4th, 2006, 7:53am
My intuition is definitely to carry all losses forward in any elimination situation.

My hunch too is that a FXE, if the pairing algorithm is trying really hard to avoid repeat matchups, will look similar to a shrinking round robin when the number of players is small.

Title: Re: WCCC Protest
Post by omar on Sep 20th, 2006, 6:31am
For this year's WCC games I am considering using single elimination while the number of players is more than 16, then switching to floating double elimination (FDE) while the number of players is 16 or fewer but more than 4, and finally switching to round robin triple elimination (RRTE) as described by David Fotland earlier in this thread. Losses are carried forward when the format is changed.

We may even consider using this for the WC games.

Some things I like about this format are:

* It allows for a large number of initial players without significantly increasing the number of rounds.
* Eliminates the pairing issues when the number of players is low.
* Better handles non-transitivity.
* Same format could be used for both the WCC and WC.

If others have a good feeling about this format we will just go with it. Unfortunately I did not have much time to experiment with tournament formats this year and I don't think anyone else has taken the initiative to experiment either. So with very little time left before the start of this year's events, we need to decide quickly on what format to use this year.


Title: Re: WCCC Protest
Post by Fritzlein on Sep 20th, 2006, 10:18am
Yes, we need to decide fairly quickly on the format for this year's events, particularly the World Championship.  I doubt there will be more than 16 bots entered in the Computer Championship, but for the World Championship it is a critical issue what to do with a large field, because I expect an open tournament to attract 24 or so this year, and conceivably a full 32.

The idea of starting with single elimination to save rounds and then switching to floating double elimination once it gets under 16 players doesn't seem to achieve its objective.  With 32 players, one round of single elimination cuts it to 16, after which FDE takes a maximum of eight more rounds to complete (as it took last year), for a total maximum of nine rounds.  But a full 32-player FDE also takes a maximum of nine rounds to complete, so wiping out half the field in the first round hasn't accomplished anything.  To put it another way, allowing the first-round losers to play on doesn't lengthen the tournament at that point.  Either way the price for doubling the size of the field is one extra round.

In any case, I'm somewhat skeptical of adopting the same format for the World Championship as for the Computer Championship, because the two have different constraints.  In the World Championship, the number of games doesn't matter as much as the number of rounds, because in the WC any number of games can be played in the same week.  Furthermore, it is particularly important to play an elimination format in the World Championship, to avoid possible collusion and to avoid dropouts who have no chance of being champion but are expected to play more games.

In the Computer Championship, the number of games is the limiting factor, not the number of rounds, because the hardware only allows one game at a time.  Also, round-robin formats make more sense, because the computers aren't going to throw games, drop out, or play less intensely when they are out of contention.

The idea of starting single elimination and allowing extra eliminations as the tourney progresses doesn't necessarily save rounds, but it does save games, so that would be an idea to consider for the Computer Championship.  One thought would be to give each contestant an extra life after every two victories.

For the Computer Championship, a round-robin would make a great deal of sense, were it not for the large number of games required, including a large number of mismatches when we would prefer more games to discriminate between the top contenders.  If there are N contestants, then round robin formats take O(N^2) games, whereas elimination formats take O(N) games.
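
To put rough numbers on the O(N^2) versus O(N) comparison, here is a small sketch. It compares a single full round robin with the worst case for triple elimination; the function names are hypothetical, and it assumes every game has exactly one loser.

```python
def round_robin_games(n):
    """One full round robin: n * (n - 1) / 2 games, quadratic in n."""
    return n * (n - 1) // 2


def max_elimination_games(n, lives=3):
    """Worst case for elimination after `lives` losses, linear in n.

    n - 1 players collect `lives` losses each, and the champion
    collects at most lives - 1.
    """
    return lives * (n - 1) + (lives - 1)


for n in (4, 8, 16, 32):
    rr = round_robin_games(n)
    te = max_elimination_games(n)
    print(f"{n:>2} players: round robin {rr:>3}, triple elimination at most {te:>2}")
```

Under these assumptions a single round robin is actually cheaper for 4 players (6 games against at most 11), roughly comparable at 8, and far more expensive by 32 (496 against at most 95).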

I'm definitely still willing to kick around the pros and cons of various formats, but at the moment I would lean toward floating triple elimination again for the Computer Championship.  The problems last year arose because the pairing algorithm didn't sufficiently prioritize avoiding repeat matchups.  It seems to me that if we raise the priority of avoiding repeat matchups, then we get all the advantages of FTE without the disadvantage.  Round-robin formats remain slightly fairer, but take significantly more games, and I think the tradeoff is probably worth it.

Finally, if you start with FXE, then switching to round-robin when the number of players is small is nearly superfluous, because an FXE format, with proper aversion to repeat pairings, essentially turns into an XRR tournament towards the end anyway.

Title: Re: WCCC Protest
Post by Fritzlein on Sep 20th, 2006, 11:01am
By the way, one format change I would suggest independent of the pairing algorithm is to have all games be played at a time control of 1:30/9:00/90/0/6/5, in all rounds of the World Championship, Computer Championship, and Arimaa Challenge.  Since all of these games are spectator games, having the five-minute limit per move will keep them rolling along.  Furthermore, banking only 90% of unused time on a move is an additional slight incentive to pace moves regularly, rather than making some fast and some slow.
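
For anyone unfamiliar with the notation, here is a rough sketch of how a time control string like this could be read. The post itself only confirms the 90 seconds per move, the 90% banking, and the five-minute cap on a single move; the meanings given to the second, fourth and fifth fields (starting reserve, reserve cap with 0 meaning no cap, game length limit in hours) are assumptions, as are all of the names in the code.

```python
from dataclasses import dataclass


@dataclass
class TimeControl:
    move_seconds: int         # time granted for each move
    reserve_seconds: int      # starting reserve (assumed meaning)
    bank_percent: int         # share of unused move time added to the reserve
    reserve_cap_seconds: int  # assumed: cap on the reserve, 0 means no cap
    game_limit_hours: int     # assumed: overall limit on game length
    max_move_minutes: int     # upper limit on the time for any single move


def to_seconds(text):
    """Convert 'M:SS' or a bare number of minutes into seconds."""
    minutes, _, seconds = text.partition(":")
    return int(minutes) * 60 + int(seconds or 0)


def parse_time_control(spec):
    move, reserve, percent, cap, game, turn = spec.split("/")
    return TimeControl(to_seconds(move), to_seconds(reserve), int(percent),
                       to_seconds(cap), int(game), int(turn))


def reserve_after_move(tc, reserve, used_seconds):
    """Bank a percentage of unused time, or draw down the reserve on a slow move."""
    if used_seconds <= tc.move_seconds:
        reserve += (tc.move_seconds - used_seconds) * tc.bank_percent // 100
    else:
        reserve -= used_seconds - tc.move_seconds
    if tc.reserve_cap_seconds:
        reserve = min(reserve, tc.reserve_cap_seconds)
    return reserve


tc = parse_time_control("1:30/9:00/90/0/6/5")
print(tc.move_seconds)                                 # 90
print(reserve_after_move(tc, tc.reserve_seconds, 30))  # 594
```

In this model a 30-second move leaves 60 seconds unused, of which 54 are banked, while a two-minute move takes 30 seconds out of the reserve.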

For the Computer Championships, a little less time per move may allow more games to be squeezed into the same number of days, an important consideration with rented hardware.  Varying the time control also affects computers less than humans, so it should hardly affect the outcome.

For the World Championship, if it might go nine rounds this year instead of eight, there's no need to make the players who survive that long play extra-long games in the final rounds.  This time control keeps the total time commitment in check a little bit.

For the Arimaa Challenge, the faster time control favors the computers, but who is scared of the computers this year?  If they start to look threatening in the future we can slow it back down to two minutes a move.  For the present year, however, you would actually be doing the humans a favor to speed it up to 90 seconds a move, because they would get less bored during the games.

Mostly the faster time control is in service of making Arimaa a better spectator sport, but it seems to have lots of side benefits as well.


