Arimaa Forum « WCCC Protest »


(Moderator: supersamu)
   Author  Topic: WCCC Protest  (Read 4673 times)
omar
Forum Guru
*****



Arimaa player #2

   


Gender: male
Posts: 1003
Re: WCCC Protest
« Reply #15 on: Jan 26th, 2006, 11:08pm »

on Jan 25th, 2006, 7:44pm, Ryan_Cable wrote:
Wow, I am just stunned! :(  I hope this is enough to scare us into writing up ironclad rules, and following them to the letter next year.

 
We do have very well defined rules for the tournament format this year also. It's just that when we came up with the rules we focused mainly on the ability of the format to find the best player and did not pay attention to other details like repeated pairings. This tournament happened to highlight the need to avoid repeated pairings. We should certainly rethink our tournament format to incorporate what we've learned from this tournament. However, I am not sure I agree with David's proposal. I don't think that it would avoid repeated pairings in all cases, and my gut feeling says that it probably would not perform as well in selecting the best player. But of course we need to run more simulations to know for sure.
IP Logged
Ryan_Cable
Forum Guru
*****



Arimaa player #951

   


Gender: male
Posts: 138
Re: WCCC Protest
« Reply #16 on: Jan 27th, 2006, 1:43am »

It was widely assumed that second place went to the last player eliminated, but for next year, I think it is important to spell out exactly how second place is determined.  And since I think any pairing system will have at least the possibility of two players leaving the tournament with the same record, we should at least consider some better system of deciding second place than just order of elimination.
 
My example with the WC had a repeat pairing, but it was in a round where any pairing would result in a repeat.  The fact that Adanac would have been able to win the tournament with a 5-1 record, but robinson would still have had an opponent to face after going 6-0 strikes me as almost certain to be less likely to pick the True Champion than our system where you won the tournament iff you won 6 games.  Maybe there exists some number of players and number of eliminations where David’s system works best, but I would be quite stunned if it was best in general.  I am sure this will get a lot of discussion and research before the next WC.
 
However, the thing I find really scary is that David made a change to the tournament pairing based on an opinion that went against the letter of the rules, against the spirit in which the pairing algorithm was developed, and to some degree against the reasoning with which the FTE was created and selected.  In this case, it wasn’t all that important since jdb’s protest was not disputed by doublep, but it is a bad precedent.  In the future, I think you should select someone from within the Arimaa community as tournament director if possible, so that he is more likely to be familiar with the process in which our unique tournament structure was developed and how it is intended to work.  I think Fritzlein would make an especially good choice for the CC director.
 
PS Clueless’s miracle win over Bomb means that, regardless of the final outcome, the tournament we conducted will now be isomorphic to a tournament where the rules were followed to the letter.  Also, the fact that a bot as dominant as Bomb failed to go undefeated drives home how important it is to have a tournament structure focused on picking the True Champion, which FTE seems to be doing very well.
IP Logged
Fritzlein
Forum Guru
*****



Arimaa player #706

   
Email

Gender: male
Posts: 5928
Re: WCCC Protest
« Reply #17 on: Jan 27th, 2006, 10:53am »

As usual, I agree with much of what Ryan says.  We are lucky that DoubleP is so sporting.  We are lucky that Clueless won round 8, so that if the order of Round 7 and 8 had been reversed, it wouldn't have made any difference.  We are lucky that Clueless is unambiguously the second-strongest bot.
 
Ryan has also convinced me that we might want to consider a different way of ranking than order of elimination when the eliminated players finish with a tied record.  One possible tiebreaker used in swiss-paired chess tournaments is the sum of the wins of the opponents played.  If Clueless and Aamira had ended tied on a 3-3 record, then the tiebreaker would have been:
 
Aamira's opponents had 12 wins
3 Clueless
0 Loc
0 Loc
0 Gnobot
3 Clueless
6 Bomb
 
whereas Clueless' opponents had 24 wins
3 Aamira
0 Gnobot
6 Bomb
6 Bomb
3 Aamira
6 Bomb
 
That's clearly a tougher schedule for Clueless, and a good argument for Clueless to have gotten second place regardless of game order.
 
In the WC, if Adanac hadn't beaten Robinson once, then Adanac and PMertens would have had the same record of 4-2.  Order of elimination would have given PMertens second place, but the strength of schedule would have been:
 
PMertens' opponents had 16 wins
0-2 Grey0x2A
4-2 Adanac
1-2 Jdb
2-2 RyanCable
3-2 99of9
6-0 Robinson
 
Adanac's opponents had 22 wins
(forfeit) Megamau
4-2 PMertens
3-2 99of9
6-0 Robinson
3-2 Fritzlein
6-0 Robinson
 
I recall now that this tiebreaker usually gives someone zero points for a win by forfeit, which makes sense since a forfeit is as weak as the schedule can get, but anyway Adanac clearly would have had the tougher schedule, so this would have been another instance where order of elimination would have been a poor way to distinguish between second and third.
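To make the arithmetic above easy to re-check, here is a minimal Python sketch of the sum-of-opponents'-wins tiebreaker. The function name and the data layout are made up for illustration; the numbers are copied from the four lists above, and a forfeit is scored as zero as suggested.

```python
from typing import Optional

def opponents_wins(schedule: list[tuple[str, Optional[int]]]) -> int:
    """Sum the final win totals of every opponent faced.

    A repeated opponent counts once per game played.
    A win total of None marks a forfeit, which is scored as zero.
    """
    return sum(wins or 0 for _, wins in schedule)

# Hypothetical tie at 3-3 in the Computer Championship.
aamira = [("Clueless", 3), ("Loc", 0), ("Loc", 0),
          ("Gnobot", 0), ("Clueless", 3), ("Bomb", 6)]
clueless = [("Aamira", 3), ("Gnobot", 0), ("Bomb", 6),
            ("Bomb", 6), ("Aamira", 3), ("Bomb", 6)]

# Hypothetical tie at 4-2 in the World Championship.
pmertens = [("Grey0x2A", 0), ("Adanac", 4), ("Jdb", 1),
            ("RyanCable", 2), ("99of9", 3), ("Robinson", 6)]
adanac = [("Megamau", None), ("PMertens", 4), ("99of9", 3),
          ("Robinson", 6), ("Fritzlein", 3), ("Robinson", 6)]

for name, sched in [("Aamira", aamira), ("Clueless", clueless),
                    ("PMertens", pmertens), ("Adanac", adanac)]:
    print(name, opponents_wins(sched))
```

The higher total wins the tiebreak, so Clueless and Adanac come out ahead in their respective ties.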
 
A second tiebreaker is then usually the number of wins among opponents defeated.  Third tiebreaker could be resurrected from last year's CC: fewest moves in all won games.  Fourth tiebreaker could be most moves in all lost games?
 
On one point, however, I'm not sure I can agree with Ryan:  I might not make a good tournament director for the 2007 Computer Championships.  Who's to say I won't have a vested interest next year?  ;)
« Last Edit: Jan 27th, 2006, 10:57am by Fritzlein » IP Logged

99of9
Forum Guru
*****




Gnobby's creator (player #314)

  toby_hudson  


Gender: male
Posts: 1413
Re: WCCC Protest
« Reply #18 on: Jan 27th, 2006, 4:56pm »

If the places are determined primarily by for-and-against record, there is now a strong disincentive to getting the early round byes.  This may provide an incentive to enter the tournament with a lower ranking.
 
In all of this discussion, we need to remember some important requirements:
 
1) No incentive to lose a game at any point.
2) No incentive to enter the tournament with a lower measured-rating.
3) Gives the highest real-rated player the highest chance of winning (assuming transitivity).
4) Gives the second highest real-rated player the second highest chance of winning (assuming transitivity)... etc down the order monotonically.
5) Gives the highest real-rated non-winner the highest chance of taking second place.
6) Gives similar real-rated players similar chances of winning (but as per rule 2, can slightly favour the one with the highest measured-rating).
 
... more coming later, i'm getting called away.  Can anyone else think of others?  I haven't yet formulated one about repeat pairings - but I think it's not as self-evident as these others.
« Last Edit: Jan 27th, 2006, 4:56pm by 99of9 » IP Logged
Fritzlein
Forum Guru
*****



Arimaa player #706

   
Email

Gender: male
Posts: 5928
Re: WCCC Protest
« Reply #19 on: Jan 27th, 2006, 10:16pm »

on Jan 27th, 2006, 4:56pm, 99of9 wrote:
If the places are determined primarily by for-and-against record, there is now a strong disincentive to getting the early round byes.
Yes.  Also (although this may be an instance of the tail wagging the dog) swiss-paired chess tournaments usually use sliding pairing rather than folding pairing.  If they used folding pairing, the higher seeds would usually get worse tiebreaker points from playing weaker opponents.  Adanac had a tougher road than PMertens partly by losing later, but also partly by being a lower seed when folding pairing was in use.  In a swiss tournament where both players lose and win in the same rounds, sliding pairing is necessary to give the higher seed the better tiebreaker.
 
Quote:
3) Gives the highest real-rated player the highest chance of winning (assuming transitivity).

I don't think we should assume transitivity, for two reasons.  First, I think non-transitive situations are fairly common among bots, and not rare among humans.  We won't get a good measure of performance if we make assumptions that don't hold.
 
Second, and more importantly, I think that fairness is the primary principle here.  Wanting the best player to have the best chance of winning is an instance of fairness, but in looking out for the best player, we shouldn't unequally crush the chances of weaker players.
 
One instance of what I am talking about is the way Loc got hosed in round 3 to have to play Aamira again instead of having a shot at Clueless.  The fact that Clueless had already beaten Aamira and Aamira had already beaten Loc doesn't change my mind about that, because I don't trust transitivity.  Loc might well have had better winning chances against Clueless' defensive style of play than against Aamira's kamikaze assault.  This has negligible effect on determining the strongest player, but we still ought to be fair to Loc.
 
Here's a more involved example, but not too farfetched.  Suppose there are three bots A, B and C.  (Think of Aamira, Bomb, and Clueless if you must ...)  Suppose A beats B 60% of the time, B beats C 70% of the time, and C beats A 70% of the time.  In an elimination situation, each bot really, really needs the other two to play.  Bot B needs lots of C vs. A games, so that C will eliminate A, removing the threat to B.  Similarly A needs lots of B vs. C games so C will be eliminated and A's chances improve.  Whichever matchup doesn't happen (by assignment of byes or otherwise) hurts the third bot.
 
Now suppose that A, B, and C are in a double-elimination with bot D, which loses 80% of the time to each of them.  In a tenfold round-robin, the expected finishing order (and wins) would be B(19) C(18) A(17) D(6), so B-C-A-D is by definition an ordering of the bots from best to worst.  We want B to have the best chance of winning the tournament.  Let's assume they are seeded in true order, and that the first two rounds have B beats D and C beats A, then B beats C in the winner's bracket while A beats D in the loser's bracket.  D is eliminated, B is 2-0 and A and C are both 1-1.
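For anyone who wants to check the round-robin totals, here is a small Python sketch. The pairwise probabilities are the ones stated in the example; the helper names are invented for illustration.

```python
from itertools import combinations

# P[(x, y)] = probability that x beats y, taken from the example above.
P = {("A", "B"): 0.6, ("B", "C"): 0.7, ("C", "A"): 0.7,
     ("A", "D"): 0.8, ("B", "D"): 0.8, ("C", "D"): 0.8}

def win_prob(x: str, y: str) -> float:
    """Probability that x beats y (no draws, so the two directions sum to 1)."""
    return P[(x, y)] if (x, y) in P else 1.0 - P[(y, x)]

def expected_wins(players: list[str], games_per_pair: int) -> dict[str, float]:
    """Expected win totals when every pair meets games_per_pair times."""
    totals = {p: 0.0 for p in players}
    for x, y in combinations(players, 2):
        totals[x] += games_per_pair * win_prob(x, y)
        totals[y] += games_per_pair * win_prob(y, x)
    return totals

totals = expected_wins(["A", "B", "C", "D"], games_per_pair=10)
for player in sorted(totals, key=totals.get, reverse=True):
    print(player, round(totals[player], 1))
```

Sorting by expected wins gives the B-C-A-D ordering even though A is the favorite head to head against B.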
 
Now if we are trying to give the best bot the maximum chance of winning, the first bye should go to the undefeated bot, which would cause A vs. C for the second time.  Yes, this helps give the best bot the best chance of winning: the odds of tournament victory are then B(82.9%) C(6.3%) A(10.8%), but notice that C, the second-best bot, has been hosed because B vs. A doesn't happen before a repeat of C vs. A.
 
If we elevate the importance of avoiding repeat matchups, the first bye will go to C so that B vs. A can happen for the first time.  If there needs to be another bye because A wins, then C vs. B  and C vs. A have each happened once, and also B and A would each be 2-1, so pretournament rating could break the ties and dictate C vs. A as the next game.  By this pairing scheme the odds of tournament victory become B(73%) C(16.2%) A(10.8%).
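The two sets of odds can be reproduced with a short Python sketch. It hard-codes the two bye choices exactly as described above rather than implementing a general pairing algorithm, and the function names are invented for illustration. "Lives" means losses remaining before elimination in the double-elimination format.

```python
# P[(x, y)] = probability that x beats y, from the three-bot example.
P = {("A", "B"): 0.6, ("B", "C"): 0.7, ("C", "A"): 0.7}

def beats(x: str, y: str) -> float:
    return P[(x, y)] if (x, y) in P else 1.0 - P[(y, x)]

def final(x: str, x_lives: int, y: str, y_lives: int) -> float:
    """Probability that x wins a head-to-head finish where each loss costs a life."""
    if y_lives == 0:
        return 1.0
    if x_lives == 0:
        return 0.0
    p = beats(x, y)
    return (p * final(x, x_lives, y, y_lives - 1)
            + (1 - p) * final(x, x_lives - 1, y, y_lives))

def bye_to_b() -> dict[str, float]:
    """Undefeated B sits out; A and C (one life each) replay their game."""
    odds = {"A": 0.0, "B": 0.0, "C": 0.0}
    for winner, loser in (("A", "C"), ("C", "A")):
        p = beats(winner, loser)
        w = final(winner, 1, "B", 2)  # survivor has one life, B has two
        odds[winner] += p * w
        odds["B"] += p * (1 - w)
    return odds

def bye_to_c() -> dict[str, float]:
    """C sits out so that B vs. A can happen for the first time."""
    odds = {"A": 0.0, "B": 0.0, "C": 0.0}
    # B beats A: A is eliminated, B (two lives) finishes against C (one life).
    p = beats("B", "A")
    b = final("B", 2, "C", 1)
    odds["B"] += p * b
    odds["C"] += p * (1 - b)
    # A beats B: everyone has one life; B gets the bye on seeding, C plays A.
    q = beats("A", "B")
    for winner, loser in (("A", "C"), ("C", "A")):
        r = q * beats(winner, loser)
        w = final(winner, 1, "B", 1)
        odds[winner] += r * w
        odds["B"] += r * (1 - w)
    return odds

print({k: round(v, 3) for k, v in bye_to_b().items()})
print({k: round(v, 3) for k, v in bye_to_c().items()})
```

A's chances are the same under both schemes; the whole difference is a shift of about ten percentage points from B to C.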
 
In my mind the latter pairing scheme is manifestly more fair, and this is the sort of scenario where we need to keep non-transitivity in mind to avoid potential unfairness.
 
That said, I don't think that there is no transitivity whatsoever.  A round-robin is all you have to fall back on to define playing strength when there is no transitivity.  Yet I do believe somewhat in transitivity, i.e. that stronger players tend to be stronger against any opponent, in spite of certain quirks of style that affect particular matchups.  I would prefer more games among top bots to more games that are mismatches in an attempt to fill out a round-robin.
 
We need to keep generating scenarios.  Now that we have a couple of examples of unfairness when we don't avoid repeat matchups, I think we should try to generate examples where it is unfair if we do avoid repeat matchups.  We need to anticipate problems that haven't happened yet.
IP Logged

fotland
Forum Guru
*****



Arimaa player #211

   


Gender: male
Posts: 216
Re: WCCC Protest
« Reply #20 on: Jan 28th, 2006, 12:42am »

Why not combine round robin with triple elimination?  Do a single round robin, then eliminate all bots with 3 or more losses, then another single round robin, then another elimination of all with 3 or more losses (total), etc.  This eliminates the problem with assigning byes, and still lets you spend more playing time with the stronger bots playing.
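A rough Python sketch of how this format could be simulated, in case anyone wants to run experiments. Everything here is an assumption made for illustration: the function name, the win_prob callback, and in particular the rule for what happens if every remaining bot reaches the loss limit in the same cycle, which the proposal does not specify (the sketch simply returns them all as tied).

```python
import random
from itertools import combinations

def shrinking_round_robin(players, win_prob, max_losses=3, rng=random):
    """Repeat single round robins, dropping players once they reach max_losses.

    win_prob(x, y) is the probability that x beats y.
    Returns (survivors, games_played). Losses carry over between cycles.
    """
    losses = {p: 0 for p in players}
    alive = list(players)
    games = 0
    while len(alive) > 1:
        for x, y in combinations(alive, 2):
            loser = y if rng.random() < win_prob(x, y) else x
            losses[loser] += 1
            games += 1
        survivors = [p for p in alive if losses[p] < max_losses]
        if not survivors:
            # Assumption: nobody is left under the limit, so report a tie
            # among the players who entered this cycle.
            return alive, games
        alive = survivors
    return alive, games

# Deterministic check: five bots numbered by strength, stronger always wins.
winner, games = shrinking_round_robin(
    [0, 1, 2, 3, 4], lambda x, y: 1.0 if x < y else 0.0)
print(winner, games)
```

With realistic probabilities you would call this many times with a seeded random.Random and count how often each bot survives.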
IP Logged
omar
Forum Guru
*****



Arimaa player #2

   


Gender: male
Posts: 1003
Re: WCCC Protest
« Reply #21 on: Jan 28th, 2006, 8:22am »

on Jan 27th, 2006, 1:43am, Ryan_Cable wrote:

However, the thing I find really scary is that David made a change to the tournament pairing based on an opinion that went against the letter of the rules, against the spirit in which the pairing algorithm was developed, and to some degree against the reasoning with which the FTE was created and selected.  In this case, it wasn’t all that important since jdb’s protest was not disputed by doublep, but it is a bad precedent.  In the future, I think you should select someone from within the Arimaa community as tournament director if possible, so that he is more likely to be familiar with the process in which our unique tournament structure was developed and how it is intended to work.  I think Fritzlein would make an especially good choice for the CC director.

 
Yes, it scared me also :) I'm sure David, being the president of the ICGA, is quite busy and also was not familiar with the reasoning behind the current tournament rules. For next year's events I think it will be much better if we follow Ryan's suggestion and have active members of the Arimaa community serve as arbiters.
IP Logged
omar
Forum Guru
*****



Arimaa player #2

   


Gender: male
Posts: 1003
Re: WCCC Protest
« Reply #22 on: Jan 28th, 2006, 8:55am »

on Jan 28th, 2006, 12:42am, fotland wrote:
Why not combine round robin with triple elimination?  Do a single round robin, then eliminate all bots with 3 or more losses, then another single round robin, then another elimination of all with 3 or more losses (total), etc.  This eliminates the problem with assigning byes, and still lets you spend more playing time with the stronger bots playing.

 
We were trying to reduce the number of rounds as much as possible while trying to maximize the chances of the true highest rated player winning the tournament.
 
Our tournament was not really designed to recognize second place. I had assumed that the program which lost in the final round would be most deserving of second place. But I realize now that this is not always the case. I think trying to have a format recognize second place while still being as good and efficient at finding the best player will be difficult and will complicate the otherwise clean and simple rules of the current FxE formats. For next year I am inclined to eliminate second place and instead award a participation prize to all the contestants that entered and did not win first place.
IP Logged
jdb
Forum Guru
*****



Arimaa player #214

   


Gender: male
Posts: 682
Re: WCCC Protest
« Reply #23 on: Jan 28th, 2006, 9:09am »

I would be in favour of a proposal along the lines fotland suggested.  
 
Having extra games in a bot tourney is not the same as having extra games in a human event.
 
FxE does not address non-transitive results. Round robin does.  
 
In round robin, every opponent has the same "strength of schedule" which is not the case in some other formats.
 
His proposal also preserves the drama of a championship game.
 
IP Logged
Fritzlein
Forum Guru
*****



Arimaa player #706

   
Email

Gender: male
Posts: 5928
Re: WCCC Protest
« Reply #24 on: Jan 28th, 2006, 10:56am »

I like Fotland's proposal when the number of players is small, because it is fair and the number of games won't be that much larger than FXE tournaments.  For our five-player triple elimination, the first round would probably have had only one extra game, (probably Loc or Gnobot getting a fourth loss.)  The second round would also have had only one extra game at most (perhaps Aamira getting a fourth loss), and then it would have been head to head the rest of the way.  Two extra games in a tournament that was already 13 games long is a small price to pay for having no issues of fairness in pairing.
 
Already with 8 bots, however, there would start to be a significant number of extra games.  FTE with 8 takes 21 to 23 games, but shrinking round robin triple elimination (SSRTE) will take 28 games even if a miracle decides things in the first round robin, and more likely will take 32 or more games, for an increase of 50% over FTE.
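The game counts quoted above follow from simple counting, since every game produces exactly one loss. A short Python sketch, with function names invented for illustration and assuming no draws and no forfeits:

```python
def elimination_games(n: int, lives: int) -> tuple[int, int]:
    """(min, max) games for n players who are each out after `lives` losses.

    Every eliminated player absorbs exactly `lives` losses, and the champion
    absorbs between 0 and lives - 1.
    """
    low = lives * (n - 1)
    return low, low + lives - 1

def round_robin_games(n: int) -> int:
    """Games in one single round robin among n players."""
    return n * (n - 1) // 2

print(elimination_games(8, 3))   # triple elimination with 8 bots
print(round_robin_games(8))      # first round robin alone with 8 bots
```

For 8 bots this gives 21 to 23 games for triple elimination against 28 games for the first round robin alone, which matches the figures above.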
 
Maybe there could simply be a cutoff below which it is SSRTE and above which it falls back to FTE.  This makes sense for another reason: the larger the tournament, the less of an issue repeated pairings are.
IP Logged

omar
Forum Guru
*****



Arimaa player #2

   


Gender: male
Posts: 1003
Re: WCCC Protest
« Reply #25 on: Feb 3rd, 2006, 10:29pm »

on Jan 28th, 2006, 10:56am, Fritzlein wrote:

Maybe there could simply be a cutoff below which it is SSRTE and above which it falls back to FTE.  This makes sense for another reason: the larger the tournament, the less of an issue repeated pairings are.

 
Interesting idea. For example a 16 player tournament could start out as FTE and when the number of players is perhaps 4 or less it switches to SSRTE. When it switches to SSRTE would the remaining players start with a clean loss record, or would the previous loss record be preserved?
IP Logged
Fritzlein
Forum Guru
*****



Arimaa player #706

   
Email

Gender: male
Posts: 5928
Re: WCCC Protest
« Reply #26 on: Feb 4th, 2006, 7:53am »

My intuition is definitely to carry all losses forward in any elimination situation.
 
My hunch too is that an FXE, if the pairing algorithm is trying really hard to avoid repeat matchups, will look similar to a shrinking round robin when the number of players is small.
IP Logged

omar
Forum Guru
*****



Arimaa player #2

   


Gender: male
Posts: 1003
Re: WCCC Protest
« Reply #27 on: Sep 20th, 2006, 6:31am »

For this year's WCC games I am considering using single elimination while the number of players is more than 16, then switching to floating double elimination (FDE) while the number of players is more than 4, and finally switching to round robin triple elimination (RRTE) as described by David Fotland earlier in this thread. Losses are carried forward when the format is changed.
 
We may even consider using this for the WC games.
 
Some things I like about this format are:
 
* It allows for a large number of initial players without significantly increasing the number of rounds.
* Eliminates the pairing issues when the number of players is low.
* Better handles non-transitivity.
* Same format could be used for both the WCC and WC.
 
If others have a good feeling about this format we will just go with it. Unfortunately I did not have much time to experiment with tournament formats this year and I don't think anyone else has taken the initiative to experiment either. So with very little time left before the start of this year's events, we need to decide quickly on what format to use.
 
IP Logged
Fritzlein
Forum Guru
*****



Arimaa player #706

   
Email

Gender: male
Posts: 5928
Re: WCCC Protest
« Reply #28 on: Sep 20th, 2006, 10:18am »

Yes, we need to decide fairly quickly on the format for this year's events, particularly the World Championship.  I doubt there will be more than 16 bots entered in the Computer Championship, but for the World Championship it is a critical issue what to do with a large field, because I expect an open tournament to attract 24 or so this year, and conceivably a full 32.
 
The idea of starting with single elimination to save rounds and then switching to floating double elimination once it gets under 16 players doesn't seem to achieve its objective.  With 32 players, one round of single elimination cuts it to 16, after which FDE takes a maximum of eight more rounds to complete (as it took last year), for a total maximum of nine rounds.  But a full 32-player FDE also takes a maximum of nine rounds to complete, so wiping out half the field in the first round hasn't accomplished anything.  To put it another way, allowing the first-round losers to play on doesn't lengthen the tournament at that point.  Either way the price for doubling the size of the field is one extra round.
 
In any case, I'm somewhat skeptical of adopting the same format for the World Championship as for the Computer Championship, because the two have different constraints.  In the World Championship, the number of games doesn't matter as much as the number of rounds, because in the WC any number of games can be played in the same week.  Furthermore, it is particularly important to play an elimination format in the World Championship, to avoid possible collusion and to avoid dropouts who have no chance of being champion but are expected to play more games.
 
In the Computer Championship, the number of games is the limiting factor, not the number of rounds, because the hardware only allows one game at a time.  Also, round-robin formats make more sense, because the computers aren't going to throw games, drop out, or play less intensely when they are out of contention.
 
The idea of starting single elimination and allowing extra eliminations as the tourney progresses doesn't necessarily save rounds, but it does save games, so that would be an idea to consider for the Computer Championship.  One thought would be to give each contestant an extra life after every two victories.
 
For the Computer Championship, a round-robin would make a great deal of sense, were it not for the large number of games required, including a large number of mismatches when we would prefer more games to discriminate between the top contenders.  If there are N contestants, then round robin formats take O(N^2) games, whereas elimination formats take O(N) games.
 
I'm definitely still willing to kick around the pros and cons of various formats, but at the moment I would lean toward floating triple elimination again for the Computer Championship.  The problems last year arose because the pairing algorithm didn't sufficiently prioritize avoiding repeat matchups.  It seems to me that if we raise the priority of avoiding repeat matchups, then we get all the advantages of FTE without the disadvantage.  Round-robin formats remain slightly fairer, but take significantly more games, and I think the tradeoff is probably worth it.
 
Finally, if you start with FXE, then switching to round-robin when the number of players is small is nearly superfluous, because an FXE format, with proper aversion to repeat pairings, essentially turns into an XRR tournament towards the end anyway.
« Last Edit: Sep 20th, 2006, 10:24am by Fritzlein » IP Logged

Fritzlein
Forum Guru
*****



Arimaa player #706

   
Email

Gender: male
Posts: 5928
Re: WCCC Protest
« Reply #29 on: Sep 20th, 2006, 11:01am »

By the way, one format change I would suggest independent of the pairing algorithm is to have all games be played at a time control of 1:30/9:00/90/0/6/5, in all rounds of the World Championship, Computer Championship, and Arimaa Challenge.  Since all of these games are spectator games, having the five-minute limit per move will keep them rolling along.  Furthermore, banking only 90% of unused time on a move is an additional slight incentive to pace moves regularly, rather than making some fast and some slow.
 
For the Computer Championships, a little less time per move may allow more games to be squeezed into the same number of days, an important consideration with rented hardware.  Varying the time control also affects computers less than humans, so it should hardly affect the outcome.
 
For the World Championship, if it might go nine rounds this year instead of eight, there's no need to make the players who survive that long play extra-long games in the final rounds.  This time control keeps the total time commitment in check a little bit.
 
For the Arimaa Challenge, the faster time control favors the computers, but who is scared of the computers this year?  If they start to look threatening in the future we can slow it back down to two minutes a move.  For the present year, however, you would actually be doing the humans a favor to speed it up to 90 seconds a move, because they would get less bored during the games.
 
Mostly the faster time control is in service of making Arimaa a better spectator sport, but it seems to have lots of side benefits as well.
IP Logged
