Arimaa Forum (http://arimaa.com/arimaa/forum/cgi/YaBB.cgi)
Arimaa >> Site Discussion >> Floating Elimination, Polynomial Time Pairing
(Message started by: Janzert on Jan 12th, 2011, 10:38am)

Title: Floating Elimination, Polynomial Time Pairing
Post by Janzert on Jan 12th, 2011, 10:38am
After independently rediscovering the python module to do minimum weight perfect matching*, I now have some python scripts that will do folding elimination pairing in a reasonable time for any size tournament we will have in the foreseeable future.

Here are the timings and average number of rounds from running 100 triple elimination tournaments of each size using Omar's tournament simulator:


Code:
Players Time    Avg rounds
 16    1:09.43 10.48
 32    1:26.24 11.91
 64    2:15.73 13.12
128    6:42.97 14.69


The code can found at https://bitbucket.org/janzert/folding-pairing/overview

Currently there isn't any documentation, but simple_elim.py is a drop in replacement for doublep's pairing program and full_elim.py is a drop in for aaaa's program. I believe it will run under python 2.5 and could be easily modified for 2.4 if needed.

Janzert

* That aaaa discovered last year (http://arimaa.com/arimaa/forum/cgi/YaBB.cgi?board=events;action=display;num=1249655059;start=15#26), could have saved myself a good hour searching around google had I only gone and re-read that thread first :(

Title: Re: Floating Elimination, Polynomial Time Pairing
Post by ocmiente on Jan 12th, 2011, 3:01pm
Very nice!

Now that I understand this a little better, I'm curious to know how you decide whether one set of weights or criteria is better than another.  Do you optimize for average rating from best, percent of time that the highest rated wins, average number of rounds, some master formula that combines all of them, or do you just eyeball the results to get a sense of which is better?


Title: Re: Floating Elimination, Polynomial Time Pairing
Post by Janzert on Jan 12th, 2011, 3:41pm
The weights are derived from the tournament rules criteria for pairing (e.g. see the elimination section of this year's WC (http://arimaa.com/arimaa/wc/2011/rules.html) or CC (http://arimaa.com/arimaa/wcc/2011/rules.html)). Those criteria were derived from various conversations here over the last 5 years or so. The exact current criteria in use were proposed by Fritzlein a while back (there is a url to that thread in the source code of doublep's code but that url seems to have been broken with forum changes and I wasn't able to find it by searching either).

Mostly they were built up over time to avoid pairings that were considered undesirable (repeated matchups, extra byes to one player, etc.)

Janzert

Title: Re: Floating Elimination, Polynomial Time Pairing
Post by aaaa on Jan 13th, 2011, 6:03pm
I think that the current floating tuple elimination format with its hierarchy of strictly overruling priorities is pretty close to ideal, apart from the lopsided influence of seeding.

To combat this, I'd first make giving a bye to a higher rated player a lower priority than avoiding match-ups between players with different number of losses. This would also lead to the hierarchy nicely following a regular pattern, where concerning one particular aspect (repetitions, number of losses and ratings), the corresponding rule with respect to assigning a bye would be directly followed by the one concerning pairing players.

As for seeding, my proposal would be to do a reseeding each round based on the (Bradley-Terry) performance rating of each player in the tournament, adding a half-weighted draw against a virtual player that has the seeding rating of the player in question. I choose half a weight, so that each actual tournament game would have a stronger weight than the prior. An exception can be made for forfeited games, which, for the purpose of calculating performance ratings, can be considered not to have taken place at all.

This approach would also help with getting rid of the unfortunate qualifying phase for bots. A round-robin phase should (apart from having the advantage of being able to be scheduled completely in advance) supply reasonably differentiating performance ratings for the FTE part, even with every participant starting with the same prior.

Title: Re: Floating Elimination, Polynomial Time Pairing
Post by ocmiente on Jan 13th, 2011, 8:29pm
I suspect that the outcome of the tournament doesn't change much with any reasonable changes in the pairing algorithms.  From other (http://arimaa.com/arimaa/forum/cgi/YaBB.cgi?board=events;action=display;num=1249653915;start=75) threads (http://arimaa.com/arimaa/forum/cgi/YaBB.cgi?board=events;action=display;num=1290620477;start=112#112), comparing 4 different floating triple elimination approaches, the outcome looks about the same for each.
formatinacc=50inacc=200inacc=400
floatTripElim (10.3)35.3%34.7%33.5%
floatTripElim2 (11.4)35.7%34.9%34.4%
floatTripElimRand (11.0)35.2%33.9%33.5%
FoySim (10.6)34.1%35.3%33.4%
     
           
The simulations were run as follows:
run4 'formats/XYZ' 2000 16 500 50 9999
run4 'formats/XYZ' 2000 16 500 200 9999
run4 'formats/XYZ' 2000 16 500 400 9999

The number in parenthesis after the format name is the average number of rounds the format requires.



If it turns out that the complexity of the matching algorithm doesn't result in much difference in the result, then I would prefer to select the simplest approach.  Something that might be done by hand would be ideal, so that players could more easily understand how they are matched.  

So I'm curious to know what numbers Janzert's latest format comes up with.  If the percent or times player 1 wins is much higher than 35%, then we should spend more time fine tuning.

Title: Re: Floating Elimination, Polynomial Time Pairing
Post by Janzert on Jan 14th, 2011, 1:46pm

on 01/13/11 at 20:29:42, ocmiente wrote:
So I'm curious to know what numbers Janzert's latest format comes up with.  If the percent or times player 1 wins is much higher than 35%, then we should spend more time fine tuning.


To be clear my implementation is not a new format. At least barring any bugs it should be the same pairing as the current floating elimination that is currently in use (also the same as doublep's (floatTripElimRepair) and aaaa's (floatTripeElimAaaa) code in the tournament simulator). It just uses a more efficient algorithm to find the best pairing. The past implementations using a branch and bound method were limited to around 20 players maximum. With this new algorithm pairing a 1000 players1 takes about 3 to 7 minutes while using roughly 250MB of memory. So it should make it practical to use floating elimination for just about any size tournament.

Since I don't see the current floating triple elimination format numbers offhand, here's a couple of new runs:
rounds50200400
10.436%40%38%
Also my understanding is that for run4 the last parameter is percentage of ties so for standard Arimaa tournaments should be 0.

Janzert

1 Output from simulationn for 1 tournament of 1000 players:

Code:
run4 'formats/floatTripElimJanzert' 1 1000 1500 200 0
10  100.0%
average number of rounds = 18.00
minimum number of rounds = 18
maximum number of rounds = 18
average rating from best = 11.0

Elapsed Time: 20:26.66


2 Full output from triple elimination runs:
Inaccuracy 50

Code:
run4 'formats/floatTripElimJanzert' 2000 16 500 50 0
 1   36.8%
 2   25.4%
 3   16.4%
 4    9.3%
 5    5.3%
 6    3.3%
 7    1.9%
 8    1.0%
 9    0.5%
10    0.1%
11    0.1%
12    0.1%
average number of rounds = 10.40
minimum number of rounds = 9
maximum number of rounds = 12
average rating from best = 36.7

Elapsed Time: 22:34.87

Inaccuracy 200

Code:
run4 'formats/floatTripElimJanzert' 2000 16 500 200 0
 1   40.1%
 2   21.8%
 3   15.5%
 4   10.0%
 5    5.3%
 6    3.4%
 7    1.8%
 8    1.0%
 9    0.6%
10    0.3%
11    0.2%
13    0.1%
14    0.1%
15    0.1%
average number of rounds = 10.42
minimum number of rounds = 9
maximum number of rounds = 12
average rating from best = 36.0

Elapsed Time: 22:41.64

Inaccuracy 400

Code:
run4 'formats/floatTripElimJanzert' 2000 16 500 400 0
 1   38.8%
 2   23.8%
 3   15.2%
 4    8.7%
 5    5.3%
 6    3.9%
 7    2.3%
 8    1.2%
 9    0.5%
10    0.3%
11    0.1%
13    0.1%
average number of rounds = 10.43
minimum number of rounds = 9
maximum number of rounds = 12
average rating from best = 36.6

Elapsed Time: 22:36.85


Title: Re: Floating Elimination, Polynomial Time Pairing
Post by ocmiente on Jan 14th, 2011, 3:29pm
Thanks very much for the explanation.  I think I get it now and understand why the complex matching algorithm is a good way to go.  

I guess that the higher percentage of wins for player 1 in this format is due to the fact that it prefers to match stronger players with weaker players (while at the same time minimizing the win/loss difference between opponents for competitive reasons) in order to keep the stronger players around as long as possible.

I think that I agree with aaaa that the influence of seeding seems lopsided.  His recommendation seems like a good idea.

Regarding the final parameter being 0 rather than 9999, you are right.  I'm surprised that 9999 works at all, but running the simulations manually with 9999 gives plausible results.  I'm going to rerun one of the 2000 tournament simulations to see if there is any difference, just because I'm curious.

[EDIT]
curiously, the results are about the same

Code:
run4 'formats/FoySim' 2000 16 500 200 9999
 1   35.3%
 2   23.6%
 3   15.4%
 4    9.8%
 5    6.3%
 6    4.7%
 7    2.2%
 8    1.4%
 9    0.6%
10    0.4%
11    0.3%
12    0.1%
13    0.1%
14    0.1%
average number of rounds = 10.59
minimum number of rounds = 9
maximum number of rounds = 12
average rating from best = 40.7

run4 'formats/FoySim' 2000 16 500 200 0
 1   35.5%
 2   23.8%
 3   16.6%
 4   10.2%
 5    5.7%
 6    4.0%
 7    2.2%
 8    0.8%
 9    0.8%
10    0.3%
11    0.1%
12    0.1%
13    0.1%
average number of rounds = 11.52
minimum number of rounds = 9
maximum number of rounds = 14
average rating from best = 38.3

[END EDIT]

Do you have any results for 2000 tournaments of 33 players?  


Title: Re: Floating Elimination, Polynomial Time Pairing
Post by Janzert on Jan 15th, 2011, 12:24am
To keep with the power of 2s here are a few 32 player runs.
First this one uses a little more realistic Elo spread of 1500 and inaccuracy of 200.

Code:
run4 'formats/floatTripElimJanzert' 2000 32 1500 200 0
 1   49.5%
 2   24.9%
 3   12.8%
 4    6.1%
 5    4.0%
 6    1.3%
 7    0.7%
 8    0.5%
 9    0.1%
10    0.1%
11    0.1%
12    0.1%
average number of rounds = 11.95
minimum number of rounds = 10
maximum number of rounds = 14
average rating from best = 32.9

Elapsed Time: 28:18.14

And here are the more standard 500 spread and 50, 200, 400 inaccuracies:

Code:
run4 'formats/floatTripElimJanzert' 2000 32 500 50 0
 1   30.8%
 2   20.4%
 3   14.5%
 4   10.9%
 5    6.7%
 6    5.4%
 7    3.6%
 8    2.8%
 9    1.7%
10    1.3%
11    0.8%
12    0.3%
13    0.3%
14    0.3%
15    0.1%
16    0.2%
18    0.1%
average number of rounds = 11.91
minimum number of rounds = 10
maximum number of rounds = 13
average rating from best = 31.3

Elapsed Time: 28:09.31


Code:
run4 'formats/floatTripElimJanzert' 2000 32 500 200 0
 1   33.1%
 2   20.1%
 3   14.7%
 4    9.7%
 5    6.5%
 6    4.8%
 7    3.7%
 8    2.2%
 9    1.8%
10    1.1%
11    1.0%
12    0.5%
13    0.3%
14    0.1%
15    0.1%
16    0.3%
17    0.1%
18    0.1%
19    0.1%
20    0.1%
average number of rounds = 11.92
minimum number of rounds = 10
maximum number of rounds = 13
average rating from best = 30.2

Elapsed Time: 31:23.63


Code:
run4 'formats/floatTripElimJanzert' 2000 32 500 400 0
 1   32.7%
 2   20.4%
 3   15.1%
 4    9.1%
 5    7.3%
 6    4.2%
 7    3.5%
 8    1.8%
 9    2.1%
10    1.1%
11    0.8%
12    0.5%
13    0.5%
14    0.3%
15    0.3%
16    0.1%
17    0.1%
20    0.1%
average number of rounds = 11.92
minimum number of rounds = 10
maximum number of rounds = 13
average rating from best = 30.4

Elapsed Time: 32:38.66


Janzert

Title: Re: Floating Elimination, Polynomial Time Pairing
Post by omar on Jan 21st, 2011, 3:06pm
I've been a bit behind on reading the forum; so I had not seen this till now.

Wow, a 1000 player FTE. That's amazing.

Regarding using 0 instead of 9999 for the draw parameter. It won't make any difference if your tournament state file is using 'pick' instead of 'pair'.

aaaa, sorry I asked you to explain the advantages of your proposal in the 2012 WC format thread, but I see that you had already explained it here. Have you been able to implement this seeding method and compare it with our current FTE? Seeing as how robust FTE is to rating inaccuracy, I would guess that changing the seeding method won't make much difference. In fact even random seeding at each round would probably do just as well.


Title: Re: Floating Elimination, Polynomial Time Pairing
Post by aaaa on Jan 28th, 2011, 6:06am
To get an idea of how calculated ratings differ from the "real" ones, it is worth pointing out that the WHR system
also comes with a method of calculating rating uncertainties for players. So it would be worth looking at what these numbers are for participants of a championship at the time of its start.
It underscores, however, the fact that it makes more sense for the discrepancies to follow something like a normal rather than a uniform distribution in the simulations. It turns out that, given the randomness functionality provided by Boost, it's not so hard to create a generator for this under C++. As an executable, it could then simply be called by any script.

Title: Re: Floating Elimination, Polynomial Time Pairing
Post by aaaa on Feb 23rd, 2011, 8:58pm
Another dump of C++ code on the wiki, again totaling 3 files:

perform.hpp contains code to calculate performance ratings. It's a separate file in order to allow it to be used in a different context; in particular, I'm thinking of a program that simply outputs values that indicate the relative performance of the players.
fte2.cpp is the new floating tuple elimination scheduler, but it requires (besides, obviously, the already uploaded colors.hpp) the use of a maximum weighted matching algorithm (which doesn't have to restrict solutions to those with maximum cardinality, although it is slightly preferable).
match-py.cpp is one particular implementation of such an algorithm; it interfaces with the file mwmatching.py from http://www.xs4all.nl/~rjoris/maximummatching.html, which needs to be in the working directory when the scheduler is called. In addition, to compile it, one needs to have the Python library installed.

To get an idea of how to get an executable, I at least, have to do something like the following (notice also the use of optimization flags):

Code:
g++ -O3 -fomit-frame-pointer fte2.cpp match-py.cpp -I/usr/include/python2.6 -lpython2.6

I had also created a similar interface to some very old C code (which, thus, would have eliminated any library dependency), but that doesn't appear to be able to handle the large weights that arise from having to enforce a strict priority of scheduling rules.

The program expects at least five arguments:
  • Like with the earlier scheduler, an excess of arguments leads to the earliest arguments to be interpreted as names of files containing games to be remembered for color assignment purposes only. To recap, each line represents a game; first "word" is gold player, second is silver player and additional words are ignored. Given the plan of unifying the world championship, this feature should thus no longer be made use of.
  • The fifth-to-last argument (i.e. the first argument if the number of arguments is the minimum) indicates whether the seeds are to be randomized (non-zero integer) or not (0); in the latter case, they are determined by the order in which the player names appear in the player file, best seeded to worst. For the human championship at least, my preference is to still have seeding act as the weakest tiebreaker, or, to phrase it as a motto, “Seeding should have as little influence as possible, but no less.” If, however, the elimination phase of the bot championship were to be preceded by a round-robin phase in the future, then the qualifying phase could be eliminated and this randomization option could be turned on, as plenty of rank differentiation should come from playing out a fixed round-robin phase and then considering the resultant performance ratings.
  • The fourth-to-last argument is the name of the file that contains the names of the participants, this time separated by newlines, not just any whitespace. Additional words, like ratings, can be present, but they are completely ignored and will not cause any reordering to take place for the purpose of seeding.
  • The third-to-last argument takes a floating point number indicating the weight of the virtual prior for the purpose of calculating performance ratings; it is as if each player starts out with this many draws against a virtual player with a fixed rating that is the same against everyone; a half here would be the most straightforward choice of a positive weight that doesn't count as much as any real game. The scheduler calculates performance ratings on the basis of all games except only those in which both sides forfeited (see here (http://arimaa.com/arimaa/forum/cgi/YaBB.cgi?board=events;action=display;num=1295473432;start=15#18) why I made one-sided forfeits still count).
  • The second-to-last argument is the name of the file containing past games that, besides affecting color history, are fully considered by the elimination (part of the) tournament.
    • As expected, normal is to have three words on one line indicating the game participants and whoever of the two won.
    • If there are only two player names, then this indicates a double forfeit.
    • If there is only one player name, then this player is removed from consideration (as if he/she/it had run out of lives already); otherwise, no one who has any lives left is ever eliminated, i.e. there is no default of eliminating forfeiting players.
  • The last argument will be taken as the number of lives, provided that this will leave the tournament with any non-removed players; otherwise, the scheduler will use the minimum for which this is the case. This will make it possible for a round-robin phase to start handing out losses to players without everybody being eliminated at the end of it.
In conclusion, I'd expect the following command-line arguments to be given when scheduling future human world championships:

Code:
0 seeds.txt .5 games.txt 3

Title: Re: Floating Elimination, Polynomial Time Pairing
Post by omar on Feb 23rd, 2011, 11:40pm
Thanks for posting this aaaa. Will check it out after the events are over. Too many things going on right now.

You have an account on the server, so feel free to upload it there and set it up.

Title: Re: Floating Elimination, Polynomial Time Pairing
Post by Janzert on Mar 2nd, 2011, 1:00am
aaaa, could the excess arguments come at the end instead of beginning of the argument list?

Also even for tournaments where you want a randomized seeding, won't you want to do that external to the program so there can be a permanent record of the seeding? In which case the first argument can be eliminated.

Janzert

Title: Re: Floating Elimination, Polynomial Time Pairing
Post by aaaa on Mar 7th, 2011, 11:11am
In case one just has to break a tie, which in our case is currently necessary to determine which bot will join the champion in the screening period, one useful technique could be to manually extend the tournament as follows:
Continue to make use of the file containing all the games of the tournament, starting by explicitly eliminating everyone not in the running, i.e. the winner and, here, all those not amongst the losers with the most wins. Because that would normally eliminate everyone, the program will revive all not-explicitly-removed players, giving them each one more life (provided the number of lives supplied as argument is no more than the number used during the original course of the tournament plus one), and schedule pairings, still taking into account the earlier games of the tournament with respect to byes, pairings and performance ratings. If further rounds are necessary, then just continue to update the history file as normal.
I would think that having the games of a main tournament affect the scheduling of a mini-tournament for runner-up in this fashion should actually be a desirable thing.

Title: Re: Floating Elimination, Polynomial Time Pairing
Post by aaaa on Mar 26th, 2011, 8:47pm
I have succeeded in converting the Python implementation of the algorithm to C++, thus removing the dependency on the Python library. The code is on the wiki as the article named mwmatching.cpp and obviously simply takes the place of match-py.cpp in the building process of the scheduler executable.

Title: Re: Floating Elimination, Polynomial Time Pairing
Post by omar on Mar 28th, 2011, 1:59pm

on 03/26/11 at 20:47:26, aaaa wrote:
I have succeeded in converting the Python implementation of the algorithm to C++, thus removing the dependency on the Python library. The code is on the wiki as the article named mwmatching.cpp and obviously simply takes the place of match-py.cpp in the building process of the scheduler executable.


Wow, that was a lot of coding aaaa. Thanks. I'll try compiling it all and will let you know if I run into any problems.



Arimaa Forum » Powered by YaBB 1 Gold - SP 1.3.1!
YaBB © 2000-2003. All Rights Reserved.