Arimaa Forum (http://arimaa.com/arimaa/forum/cgi/YaBB.cgi)
Arimaa >> Events >> Minimum-weight perfect matching
(Message started by: Fritzlein on Dec 30th, 2006, 9:54pm)

Title: Minimum-weight perfect matching
Post by Fritzlein on Dec 30th, 2006, 9:54pm
The floating elimination tournament format has some nice features, but one drawback is that it isn't obvious how to move from general features to a specific pairing.  For example, we can say we want winners to play winners, we want to avoid repeated pairings, we want to give the bye to the top remaining player with no bye yet, etc., but given those agreed principles, how do we find the optimal pairing?

Paul Pogonyshev (author of bot_Aamira) generously provided some code which would find the globally optimal pairing, using branch-and-bound techniques.  The great thing about his code is that we can plug in whatever formula we want, depending on what we value.  For example, if we decide that it is good to give the bye to the lowest remaining player rather than the highest, we just need to change a formula in one place, and his code cranks out the globally optimal pairing under the new assumptions.

The problem with the code, however, is that branch-and-bound takes exponential time in the worst case.  Omar successfully ran Pogonyshev's program on a hypothetical 18-player tournament to test it out, and it worked fine, but when he tried to run it on our actual tournament of 20 players, the server crashed.

The workaround for this year was to use the simplistic pairing algorithm from last year, which finds the optimal pairing in easy cases.  For the first two rounds, it worked fine, because repeat pairings were not an issue until the third round.  And by the third round only 15 players remained, so Pogonyshev's code could be used from then on.  This workaround, however, will only work for a tournament size up to 24 players, because if any more players enter, there will be more than 18 left in round three.

The good news is that an efficient algorithm for finding the globally best pairing is known.  It could pair a tournament of hundreds of players in seconds.  Our tournament pairing problem is known in operations research circles as "minimum weight perfect matching", which was solved in O(n^4) time in 1965 by Jack Edmonds.  That algorithm has been improved and re-implemented many times in the forty years since, so that O(n^3) is now considered outdated, although it is perfectly adequate for our purposes.

The bad news is that the efficient algorithm is not easy to describe.  I will eventually attempt a description here so that some community-minded soul can implement it, for the larger good of Arimaa.  Before I take that plunge, however, it pays to at least look for a free existing implementation.

When I search Google for variations on "minimum weight perfect matching" I keep hitting a 1999 paper by Cook and Rohe.  That paper says their code is freely available from http://www.or.uni-bonn.de/home/rohe/matching.html.  Sadly, the link is now dead.  However, it is just possible that Andre Rohe is still at Universitaet Bonn, and that his code is still available for free.  I tried to wade through the Web site to look for him among the faculty, but my German apparently isn't quite good enough for that.

PMertens, would you be willing to look up Andre Rohe, if he is still at uni-bonn?  If we get his code, you will be spared a long-winded description of a moderately complex algorithm.  :)

Title: Re: Minimum-weight perfect matching
Post by woh on Dec 31st, 2006, 4:36am
At the web site of Cook http://www2.isye.gatech.edu/~wcook/blossom4/ you can download the code for 'The Blossom IV Code for Min-Weight Perfect Matching'. To build it you'll need a library called Concorde, but there is a link where you can download this as well.

PS: when I had a look both links were really slow...

Title: Re: Minimum-weight perfect matching
Post by PMertens on Dec 31st, 2006, 5:52am
thanks a lot ... saves me a lot of work  :D

I could dl the stuff at 50K .... not sure what is considered slow in Belgium ;-)

Title: Re: Minimum-weight perfect matching
Post by Fritzlein on Dec 31st, 2006, 8:16am
Nice find, woh!  I assume that Cook will let us use the code for a non-academic purpose, as long as we aren't making any money from it.  I gather Blossom IV is optimized for geometric instances (i.e. where the weights are actually distances between points in a plane), which our pairing is decidedly not, but since we have such small numbers of nodes (players) the last ounce of efficiency doesn't matter as much as the fact that running time is asymptotically polynomial.  We could be cracking a walnut with a sledgehammer here, but still it could be the easiest answer to our problem.

Title: Algorithm Description #1
Post by Fritzlein on Apr 13th, 2007, 11:21pm
I found a book which describes Edmonds' minimum-weight perfect matching algorithm in enough detail that one could implement it.  I believe I understand the description, but I am afraid my programming skills are not up to the coding task.  Omar has suggested that I describe the algorithm so that he can implement it.  I thought I might as well put the description up in the Forum so that Omar's code could be debugged and made more efficient by others who also understand what the algorithm is supposed to do.

However, before we even get started on the minimum-weight perfect matching, we need to come up with a matrix of weights which represent the "badness" of each pairing.  How to generate those weights is an entirely separate discussion.  We already agree about some things, for example that (winner vs. winner) and (loser vs. loser) should both have lower weights than (winner vs. loser), to encourage the algorithm to pick a pairing which keeps the losers' bracket separate from the winners' bracket.  We don't all agree on other things, like folding pairing.  We can hash out those differences later.

A perfect matching can only exist when there are an even number of players.  If we have an odd number of players left, then we have to introduce an extra player called "bye".  The weight for the pairing (Player X vs. bye) represents the badness of Player X getting a bye, so the weight will be huge if Player X already had a bye in an earlier round, astronomical if Player X has already had two byes, etc.
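
In code, padding the field might look something like this (a rough sketch in Python; the weight scale and the bye bookkeeping are placeholders I made up, since the real weight formula is a separate discussion):

Code:
# pad an odd field with a dummy "bye" player
def pad_with_bye(players, w, byes_so_far):
    # byes_so_far maps each player to the byes he has already received
    if len(players) % 2 == 1:
        for p in players:
            # grows a thousandfold with each bye already received
            w[frozenset((p, 'bye'))] = 10**6 * 1000**byes_so_far[p]
        players = players + ['bye']
    return players, w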

For the purposes of this description, I will assume we already have calculated the weight of each pairing, for an even number of players (2N), and I will assume all the weights are positive.  The algorithm works fine with negative weights, but it makes the initialization step marginally more complicated, so we might as well disallow them.

We will start with an empty matching, and iterate N times, each time producing a matching containing one more pairing than the matching of the previous iteration.  After the Nth iteration we will have N pairs, i.e. a perfect matching.  Given that anyone can play anyone, it is not hard to pair everyone (i.e. not hard to find some perfect matching), but due to the extremely clever way we proceed, the matching we will produce at the end of our Nth iteration will be guaranteed to be the best one, i.e. one of minimum weight.

Players only join the set of paired players; they never leave it.  Each iteration adds two players to the set of paired players.  However, we don't necessarily keep all the old pairings from one iteration to the next.  If the first iteration gave us the matching (A vs. B), the second iteration might produce the matching (C vs. A) and (B vs. D).

The augmentation will always take the form of a chain.  For example, suppose that after four iterations we have the pairing (A vs. B), (C vs. D), (E vs. F) and (G vs. H).  Suppose that on the fifth iteration we add the two players X and Y.  Some of the possible ways we could add them are:

1. Put (X vs. Y) in.
or
2. Put (X vs. A) in, take (A vs. B) out, and put (B vs. Y) in.
or
3. Put (X vs. F) in, take (F vs. E) out, and put (E vs. Y) in.
or
4. Put (X vs. G) in, take (G vs. H) out, put (H vs. B) in, take (B vs. A) out, put (A vs. D) in, take (D vs. C) out, and put (C vs. Y) in.

There are many more possibilities, but all of them will be of this pattern, and any of them will leave us with one more pairing in our matching than we had before.  Of course, we don't like building chains because it takes time, and we would rather do the simplest augmentation, so as soon as we find a "legal" augmentation (which I will define soon) we will halt the iteration and just add those players to the matching.
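
Applying such a chain to a matching is purely mechanical.  Here is a sketch in Python (my own notation, nothing official):

Code:
def augment(matching, chain):
    # chain alternates put-in, take-out, put-in, ...; possibility 4 above
    # is [(X,G),(G,H),(H,B),(B,A),(A,D),(D,C),(C,Y)].  The matching always
    # ends up exactly one pairing bigger than it started.
    for i, pair in enumerate(chain):
        if i % 2 == 0:
            matching.add(frozenset(pair))
        else:
            matching.discard(frozenset(pair))
    return matching

m = {frozenset('AB'), frozenset('CD'), frozenset('EF'), frozenset('GH')}
augment(m, [('X','G'), ('G','H'), ('H','B'), ('B','A'),
            ('A','D'), ('D','C'), ('C','Y')])
# m is now {XG, HB, AD, CY, EF}: five pairings where there were four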

The first iteration is easy.  Pick any player at random, look at the weight of his pairing with each opponent, pick the minimum weight, and pair those two players.  Whee!  One down, N-1 to go!  If only it stayed that simple...

This forum limits the length of each post, and there is no way I am going to get the whole description into one post, so let me put in a page break here.

Title: Algorithm Description #2
Post by Fritzlein on Apr 14th, 2007, 9:36am
OK, before I explain how each of our N iterations is going to find a legal augmentation, I had better step back and define what I mean by "legal".  We are going to do some fancy bookkeeping which magically ensures that the pairings we add to our matching are good pairings, i.e. pairings that make progress towards our minimum-weight perfect matching.  We don't want to add some pairing of enormous weight to our matching unless we have to, because it may mess us up so much that later iterations can't fix it.  In fact, the definition of a legal augmentation is exactly that it consists of "legal pairings".  So what I really need to define here is when a pairing is good enough for us to consider adding to our matching.

However, whether or not a pairing is good enough depends in some sense on the two players involved in the pairing.  You may have a player who is "hard to pair" because, for example, he has already played each of the other players once, so each of his possible pairings has a high weight.  We can't avoid all of those high-weight pairings, because we have to pair him with somebody!  It may turn out that one particular pairing with a high weight is actually very desirable, because it lets you pair a player who is otherwise extremely hard to pair, and you would have gotten an even worse weight if you paired him with anyone else.

Therefore we will introduce a second set of numbers, one for each player.  Let's think of these numbers as showing how expensive it is to pair each player.  For each player we now have an expense.  This is in addition to having the weights we started with for each pair of players.

All of our expenses start at zero, because initially we don't know how costly each player is going to be.  As we iterate, we will move player expenses up and down depending on who appears to be the bottleneck at any given time.  It may turn out that a player who initially seemed very expensive later happens to be someone that lots of different people want to pair, which will make him seem more desirable, and which will reduce his expense number later.

While we are moving our expenses up and down, we will always obey a strict rule.  The expense of A plus the expense of B will always be less than or equal to the weight of pairing (A vs. B).  This is intuitive, right?  It doesn't make any sense to say that A costs us 50 to pair, and B costs us 60 to pair, if the weight of (A vs. B) is only 80.  If the weight of (A vs. B) is 80, then the combined expenses of A and B can be at most eighty, and will usually be less.

It is less intuitive, but still possible, for a player's expense to become negative, even though the weights are all positive.  (The weights never change.)  A negative expense can happen if one player is particularly desirable to pair.  So in the above example where the weight of (A vs. B) is 80, it is possible for A to have an expense of 90 while B has an expense of -10.  The important thing is that the sum of two players' expenses is at most equal to the weight of their pairing against each other.

Now that we have an expense for each player, we can define what pairings are good enough to be considered.  A legal pairing is one whose weight exactly matches the sum of the expenses of the two players.  This also makes intuitive sense, right?  If the weight of the pairing is greater than the sum of the two expenses, then the pairing is not good enough, because we think maybe we can substitute a different pairing with less weight instead.
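
In code terms (again just my sketch), legality is a zero-slack test:

Code:
def slack(w, y, a, b):
    # how far pairing (a vs. b) is from being legal; never allowed below zero
    return w[frozenset((a, b))] - y[a] - y[b]

def legal(w, y, a, b):
    return slack(w, y, a, b) == 0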

The only trouble is that, at the outset, all pairings are illegal because we set all the expenses to zero!  At the start, none of the pairings looks good enough to us because we are underestimating everyone's true expense.  So one of our main tasks as we iterate is to increase everyone's expense as much as possible.  We will be going back and forth between two problems, using each problem to determine the answer to the other problem.  These two are called "dual linear programs" in math speak.

At the end of the day, we only want the best matching, and we don't give a hoot about expenses, but it turns out that we have to accurately calculate expenses in order to solve our real problem.  We will use the expense of each player to help us find a low-weight matching, and we will use our low-weight matching to help us find the true expense of each player, in a kind of feedback loop.  When we are done, the answer to each question proves that the answer to the other question is correct.  This is a genuinely subtle mathematical point, which I certainly can't prove in the space of the forum, but let me just say that Edmonds must have been a genius to see how it would solve this problem.  For mathematical correctness, you are just going to have to trust me.

Returning to our first iteration, I can now state it in terms of player expenses as well as in terms of player weights.  Pick a player at random.  It will appear that it is illegal to pair that player with anyone, but that is merely because we are underestimating the expense of everyone.  Let us raise the expense of the player until some pairing becomes legal.  When you think about it, there is only one way to do this.  We are not allowed to raise the expense past any of the pairing weights, so we can only raise it up to the weight of the least bad pairing for that player.  As soon as we raise the expense that high, the least bad pairing becomes legal.  We add that pairing to our matching, and the first iteration is over.  Whee!  If only it stayed this easy...

An example may help clarify.  Suppose our tournament has four players left, A through D, and the weights of their pairings are

(A vs. B) = 80
(A vs. C) = 25
(A vs. D) = 600
(B vs. C) = 15
(B vs. D) = 700
(C vs. D) = 500

Maybe D's weights are all high because he has played each of the other three earlier in the tournament.  First, we initialize all of our player expenses to zero.

A=0
B=0
C=0
D=0

Now we choose any player.  Let's say we choose C.  None of C's pairings are legal.

(A vs. C) = 25, but combined expense = 0+0 = 0 < 25.  Illegal
(B vs. C) = 15, but combined expense = 0+0 = 0 < 15.  Illegal
(C vs. D) = 500, but combined expense = 0+0 = 0 < 500.  Illegal

Therefore we have definitely underestimated expenses.  We raise C's expense until some pairing becomes legal.  This happens when C's expense is 15.  Then we have

(A vs. C) = 25, but combined expense = 0+15 = 15 < 25.  Illegal
(B vs. C) = 15, but combined expense = 0+15 = 15 = 15.  Legal.  Yay!
(C vs. D) = 500, but combined expense = 15+0 = 15 < 500.  Illegal

We have now created a legal pairing, which turns out to be a legal augmentation as well.  We add (B vs. C) to our matching, and the first iteration is complete.  We also save the expenses

A=0
B=0
C=15
D=0

because we will need them when we start the next iteration.

Title: Algorithm Description #3
Post by Fritzlein on Apr 14th, 2007, 12:26pm
Before I give a general way to find a legal augmentation, let's continue the above four-player example to get used to the idea of augmenting.  We have so far paired (B vs. C), so we pick an unpaired player at random to add to the matching.  Let's say we pick D.  At the moment D has no legal pairings:

(A vs. D) = 600, but combined expense = 0+0 = 0 < 600.  Illegal
(B vs. D) = 700, but combined expense = 0+0 = 0 < 700.  Illegal
(C vs. D) = 500, but combined expense = 15+0 = 15 < 500.  Illegal

We increase D's expense as much as we can.  The nearest constraint is the pairing (C vs. D), which is 485 from being maxed out, because C already has expense 15.  So we update the expense of D to 485, leaving our expenses at:

A=0
B=0
C=15
D=485

Now we have made (D vs. C) a legal pairing.  Sadly, though, it isn't a legal augmentation, because C is already paired to B.  We can't put (D vs. C) into our matching without taking (C vs. B) out of our matching.  Swapping out one for the other doesn't increase the size of our matching, unless we can simultaneously swap in a new pairing for B.

Like the motivational speakers, we consider this an opportunity, not an obstacle.  Instead of having one way to get a legal augmentation, i.e. finding an opponent for D, we have two ways to find a legal augmentation, i.e. finding an opponent for D or B.  We will have an augmentation if we can get either of those guys paired.

In order to get a new legal pairing for D or B, we need to raise their expenses.  Unfortunately, we can't raise D's expense without violating the maxed-out weight of (D vs. C).  Also we can't raise B's expense without violating the maxed-out weight of (B vs. C).  We are in a pickle.  The solution isn't too hard, though.  We can raise the expenses of both D and B, while simultaneously lowering the expense of C by exactly the same amount.  If we do that, the pairings (D vs. C) and (B vs. C) both stay legal, and eventually either D or B will gain a new legal pairing.

So, how much do we raise D and B while lowering C?  We look at how much slack all the pairing weights have left us:

(A vs. D) = 600, while combined expense = 0+485.  Slack = 600-485 = 115.
(A vs. B) = 80, while combined expense = 0+0.  Slack = 80-0 = 80.
(B vs. D) = 700, while combined expense = 0+485.  Slack = (700-485)/2 = 107.5.

Did I surprise you by dividing by two in the third line?  Remember that we are raising the expense of both B and D by the same amount.  They have to rise in tandem, so that lowering the expense of C by an equal amount can keep both (B vs. C) and (D vs. C) legal.  Thus the pairing (B vs. D) has half as much slack as you might think.

In any case (A vs. B) provides the tightest constraint with only 80 slack, so 80 is the amount by which we raise the expenses of B and D while lowering the expense of C.  Our new expenses are:

A=0
B=80
C=-65
D=565

Note that we now have three legal pairings, and those three pairings make up a chain that is a legal augmentation.  Let me list all the pairings just for completeness, although we don't really care about the illegal pairings since we aren't going to use them.

(A vs. B) = 80, combined expense = 0+80 = 80.  Legal
(A vs. C) = 25, combined expense = 0-65 = -65 < 25.  Illegal
(A vs. D) = 600, combined expense = 0+565 = 565 < 600.  Illegal
(B vs. C) = 15, combined expense = 80-65 = 15.  Legal
(B vs. D) = 700, combined expense = 80+565 = 645 < 700.  Illegal
(C vs. D) = 500, combined expense = -65+565 = 500.  Legal

We can augment along a chain of legal pairings.  Put (D vs. C) in, take (C vs. B) out, and put (B vs. A) in.

That was our second iteration.  Everyone is paired now, and we are done!  Note that the sum of all the expenses is 580 and the sum of the weights in our perfect matching is also 580, i.e. 500 from (D vs. C) and 80 from (A vs. B).  This is not an accident.  Indeed, the equality is our guarantee.  You could pick a different set of expenses that doesn't violate any weight, for example

A=40
B=40
C=-30
D=530

However, you could NOT find any set of expenses which totals more than 580 without violating some pairing weight.  Similarly, I can find other perfect matchings, but not with a total weight of less than 580.  Mathematically speaking, whenever the values of two dual linear programs are equal, then both are optimal solutions.
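
For anyone who wants to experiment, here is a minimal runnable sketch in Python of the iteration we just walked through.  It is my own code, not something from the book, and it deliberately has NO support for odd cycles (the blossom cases I get to in the next posts); it simply gives up if one forms.  That is also why it takes the order of starting players as a parameter: as the next post explains, starting the second iteration from A instead of D runs into exactly such a cycle.

Code:
INF = float('inf')

def match_simple(names, roots, w):
    # w maps frozenset({a, b}) -> weight; y holds each player's expense
    y = {p: 0 for p in names}
    mate = {p: None for p in names}
    slack = lambda a, b: w[frozenset((a, b))] - y[a] - y[b]
    for root in roots:                     # one iteration per new pairing
        even, odd = {root}, set()          # expenses rise / fall in tandem
        parent = {root: None}
        while True:
            # look for a legal pairing from the even set out of the tree
            link = next(((u, v) for u in even for v in names
                         if v != u and v not in even and v not in odd
                         and slack(u, v) == 0), None)
            if link is None:
                # nothing legal yet: raise even expenses, lower odd ones
                d1 = min((slack(u, v) for u in even for v in names
                          if v != u and v not in even and v not in odd),
                         default=INF)
                d2 = min((slack(u, v) for u in even for v in even
                          if u < v), default=INF) / 2
                if d2 < d1:
                    raise NotImplementedError('odd cycle: needs a blossom')
                for u in even:
                    y[u] += d1
                for v in odd:
                    y[v] -= d1
                continue
            u, v = link
            if mate[v] is not None:
                # v is already paired: absorb v and v's partner into the tree
                odd.add(v)
                even.add(mate[v])
                parent[v], parent[mate[v]] = u, v
            else:
                # v is unpaired: augment along the chain back to the root
                while u is not None:
                    out = mate[u]              # pairing to swap out
                    mate[u], mate[v] = v, u    # pairing to swap in
                    if out is None:
                        break                  # reached the root
                    u, v = parent[out], out
                break
    return mate, y

w = {frozenset(p): wt for p, wt in [('AB', 80), ('AC', 25), ('AD', 600),
                                    ('BC', 15), ('BD', 700), ('CD', 500)]}
mate, y = match_simple('ABCD', 'CD', w)
print(mate)             # A-B and C-D, as above
print(sum(y.values()))  # 580: total expense equals the matching weight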

Title: Algorithm Description #4
Post by Fritzlein on Apr 14th, 2007, 8:05pm
Eagle-eyed readers may have spotted me being tricky in the above example.  For the second iteration I picked an unpaired player at random, namely D.  This implies that I could have picked A just as well.  However, if you go back and try to pick A instead, you will find that you get stuck.  You won't be able to increase the expenses of A and B enough for either of them to get a legal pairing with D, because the weight of (A vs. B) gets in the way.  So you might ask how I knew that picking D was correct.

Actually, the issue is worse than that.  There are situations where you can get stuck no matter which unpaired player you use to start an iteration.  Let me give one of these troublesome examples, and then explain how we are going to get around it.

Suppose we have a tournament of six players, A through F, seeded in alphabetical order.  Suppose that in the first round A beats F, B beats E, and C beats D.  Our task is to pair the second round, and we have generated the following table of weights:
-       A     B     C     D     E     F
A       -     3     2    57    59   510
B       3     -     6    54   506    57
C       2     6     -   502    54    55
D      57    54   502     -     2     1
E      59   506    54     2     -     3
F     510    57    55     1     3     -
The weights reflect the fact that we would like to pair winners against winners, and losers against losers.  Also the weights reflect the fact that we really, really don't want to pair players who have already played each other.  Some trial and error indicates that the best perfect matching is (A vs. B), (C vs. E), and (D vs. F) with a total weight of 58.  It isn't easy to be sure this is best, even with only six players, but the number of reasonable cases is still small, so eventually one can feel confident.

However, if we try to assign expenses in the dual problem, we will end up with expenses like

A = 0
B = 3
C = 2
D = 0
E = 2
F = 1

for a total expense of 8.  We can try all day to raise the expenses, but the total will never get up to 58.  Indeed, the optimal set of expenses adds up to 8.5.  Even worse, with these expenses, none of the pairings between a winner and a loser is legal.  What went wrong?

It turns out that the dual solution corresponds to a matching that is essentially (A vs. B vs. C) and (D vs. E vs. F).  If we could play three-way Arimaa, that probably would be the best way to do it.  But since we are stuck with a two-player game, we have to find a way to exclude such degenerate solutions.

The problem arises because sometimes it is difficult to pair a whole group of players with any player outside of that group.  If the group has an even number of players, we don't care that we can't pair outside the group, because we can pair them all internally.  But if the group has an odd number of players, we have to pair someone inside the group with someone outside.

The workaround to this problem is in the spirit of assigning expenses to players.  If we find an odd group that is difficult to pair, we assign an expense to the group as a whole, rather than to any player inside the group.  To reflect the difficulty in pairing a winner to a loser, we need to give an expense to the group ABC or the group DEF.  If we had the following set of expenses

A = 0
B = 3
C = 2
D = 0
E = 2
F = 1
ABC = 50

then the pairing (C vs. E) would be legal.  To be more precise, a pairing is now legal if its weight is equal to the sum of expenses of the two players, plus the expense of any group that one of the players is in and the other player is not in.  (C vs. E) has a weight of 54, which equals 2 (the expense of C) plus 50 (the expense of ABC) plus 2 (the expense of E).

We have to be just as careful with our group expenses as we were with our player expenses to never have any pairing with a weight less than the sum of the expenses.  In the above example, we couldn't increase ABC to an expense of 51 because that would exceed the weight of (C vs. E).  Also we couldn't have weights

A = 0
B = 3
C = 2
ABD = 5

because (A vs. C) has weight of only two, and the combined expense of 0 (A) + 2 (C) + 5 (ABD) is greater.  
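
In code, the only change from the earlier zero-slack test is that separating groups join the sum.  A sketch (my notation; the weights are the six-player table above):

Code:
def group_slack(w, y, groups, a, b):
    # a group's expense counts only when it separates a from b
    s = w[frozenset((a, b))] - y[a] - y[b]
    for members, expense in groups.items():
        if (a in members) != (b in members):
            s -= expense
    return s

w6 = {frozenset(p): wt for p, wt in
      [('AB', 3), ('AC', 2), ('AD', 57), ('AE', 59), ('AF', 510),
       ('BC', 6), ('BD', 54), ('BE', 506), ('BF', 57), ('CD', 502),
       ('CE', 54), ('CF', 55), ('DE', 2), ('DF', 1), ('EF', 3)]}
y6 = {'A': 0, 'B': 3, 'C': 2, 'D': 0, 'E': 2, 'F': 1}
print(group_slack(w6, y6, {frozenset('ABC'): 50}, 'C', 'E'))  # 0: legal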

By now some folks' warning bells of algorithmic efficiency are probably going off.  There are a ton of different odd groups when there get to be any sizable number of players.  For fifty players, there are 344,680,279,929,482 different possible odd groups.  How could we possibly find the best set of expenses when the number of variables is exponential?  Fortunately, almost all of those expenses will be zero almost all of the time.  We don't need to use all the variables to find an optimal solution; we only need a few.  Indeed, we will never give an expense to more than N odd groups, and the groups will never overlap in funky ways; they will either be disjoint or nested.

The algorithm which incorporates odd groups is hard to explain in the abstract, so I would like instead to explain by example using the six-player table of weights given above.  Afterwards I will try to define the algorithm more formally.  This time through, instead of picking players randomly, I will start each iteration with the unpaired player who is alphabetically first.

Initialization:  Set expenses to
A = 0
B = 0
C = 0
D = 0
E = 0
F = 0

Iteration 1: select unpaired player A.  Of the five possible pairings involving A, the most constraining is (A vs. C), with a slack of only 2.  Therefore we increase the expense of A to 2, making (A vs. C) a legal pairing, and add this pairing to our matching.

State at end of first iteration:
matching = (A vs. C)
A = 2
B = 0
C = 0
D = 0
E = 0
F = 0

Iteration 2: select unpaired player B.  Of the five possible pairings involving B, the most constraining is (B vs. A), with a slack of 1.  Therefore we increase the expense of B to 1, making (B vs. A) a legal pairing.  This doesn't finish the iteration because A was already paired.
A = 2
B = 1
C = 0
D = 0
E = 0
F = 0

In order to keep increasing the expense of B, we need to reduce the expense of A and increase the expense of C.  This will keep both (B vs. A) and (A vs. C) legal.  Also it will give either B or C a new legal pairing.  So we check the constraints to increasing the expense of B and C:

(B vs. D) has weight 54, expense 1+0=1, slack 54-1=53
(B vs. E) has weight 506, expense 1+0=1, slack 506-1=505
(B vs. F) has weight 57, expense 1+0=1, slack 57-1=56
(C vs. D) has weight 502, expense 0+0=0, slack 502-0=502
(C vs. E) has weight 54, expense 0+0=0, slack 54-0=54
(C vs. F) has weight 55, expense 0+0=0, slack 55-0=55
(B vs. C) has weight 6, expense 1+0=1, slack (6-1)/2=2.5

By far the pairing (B vs. C) is the most constraining.  We increase the expenses of B and C each by 2.5, while decreasing the expense of A by 2.5.  Our new expenses are
A = -0.5
B = 3.5
C = 2.5
D = 0
E = 0
F = 0

This makes the pairing (B vs. C) legal.  Unfortunately, we still don't have an augmenting chain, even though (A vs. B), (B vs. C), and (C vs. A) are all legal.  It's a closed circuit of three players.  Although we could take any one of those three pairings into our matching, we can't get more than one pairing out of it.

We need to find some way to get one of these three players paired outside the group.  Curiously, we don't even care which of the three gets paired outside.  For example, if C gets paired to someone else, we can use the legal pairing (A vs. B).  It is as if ABC were a single entity requiring a partner.

Following this train of thought to its logical conclusion, we could spawn a sub-problem, namely pairing the four players D, E, F, and ABC.  If we solve that problem, we will be able to expand our solution to a solution for the original six-player problem.  There are some tricky bits to the collapsing and re-expanding, but the basic idea is to simplify the problem so we can keep making progress.

When we collapse A, B, and C into a single entity, we need to save all the possible pairings.  We will have three different ways to pair ABC to D, one corresponding to each of (A vs. D), (B vs. D), and (C vs. D).  This is necessary so that if we happen to pair (ABC vs. D), we will know which of the three original players gets the pairing when we expand back up to the original problem.  So the collapse is actually all about getting a single expense for the group ABC, not about reducing the number of possible pairings.

Also, our collapsed problem must respect all the constraints of the original problem.  How can we ensure that, for example, the weight of (B vs. D) is never exceeded by the sum of the expense of B, the expense of ABC, and the expense of D?  There is no B in the collapsed problem.  The answer is to reduce the weight of (B vs. D) by the expense of B in the collapsed problem, after which the constraints of the collapsed problem correspond to the original constraints.  Thus we take the state:
-       A     B     C     D     E     F
A       -     3     2    57    59   510
B       3     -     6    54   506    57
C       2     6     -   502    54    55
D      57    54   502     -     2     1
E      59   506    54     2     -     3
F     510    57    55     1     3     -
A = -0.5
B = 3.5
C = 2.5
D = 0
E = 0
F = 0

and we collapse to the state
-           ABC(A)  ABC(B)  ABC(C)       D       E       F
ABC(A)           -       -       -    57.5    59.5   510.5
ABC(B)           -       -       -    50.5   502.5    53.5
ABC(C)           -       -       -   499.5    51.5    52.5
D             57.5    50.5   499.5       -       2       1
E             59.5   502.5    51.5       2       -       3
F            510.5    53.5    52.5       1       3       -
(The (A), (B), (C) tags record which of the collapsed players each ABC pairing goes through.)
ABC = 0
D = 0
E = 0
F = 0

We are still in the middle of the second iteration, which started because we were trying to add player B to our matching.  In our collapsed world that means we are trying to add player ABC to our matching.  This means increasing the expense of ABC until some pairing becomes legal.  The smallest constraint is 50.5, coming from the pairing (ABC vs. D).  We increase the expense of ABC to 50.5.  Now (ABC vs. D) is not only a legal pairing, it is a legal augmentation.  We add it to our matching, and the second iteration is done.  Whew!
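
Here is what that collapse looks like in code (my sketch again, reusing the w6 weight table from earlier):

Code:
def collapse(w, y, group):
    # Contract an odd cycle into one pseudo-player.  Pairings inside the
    # group vanish; a pairing (member vs. outsider) survives, tagged with
    # the member so we can expand later, and its weight drops by the
    # member's expense so the original constraints carry over unchanged.
    new_w = {}
    for pair, weight in w.items():
        inside = pair & group
        if len(inside) == 1:
            (member,), (outsider,) = inside, pair - group
            new_w[(member, outsider)] = weight - y[member]
        elif not inside:
            new_w[tuple(sorted(pair))] = weight
    return new_w

y = {'A': -0.5, 'B': 3.5, 'C': 2.5, 'D': 0, 'E': 0, 'F': 0}
print(collapse(w6, y, frozenset('ABC')))
# ('A','D'): 57.5, ('B','D'): 50.5, ('C','D'): 499.5, ('D','E'): 2, ...
# matching the collapsed table above; the new pseudo-player ABC starts
# with an expense of zero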

I had better start another thread before hitting the length cutoff.

Title: Algorithm Description #5
Post by Fritzlein on Apr 15th, 2007, 10:47am
Notice that, at the end of iteration 2, we didn't bother to re-expand our problem.  There is no need.  After we are totally done we can trace all the way back up to the perfect matching on the original problem.  Sadly, there is one case where we might be forced to re-expand the problem in the middle of an iteration, but I'll explain that later.

We begin iteration 3 with the matching consisting of the pairing (ABC vs. D), the weights
-           ABC(A)  ABC(B)  ABC(C)       D       E       F
ABC(A)           -       -       -    57.5    59.5   510.5
ABC(B)           -       -       -    50.5   502.5    53.5
ABC(C)           -       -       -   499.5    51.5    52.5
D             57.5    50.5   499.5       -       2       1
E             59.5   502.5    51.5       2       -       3
F            510.5    53.5    52.5       1       3       -
and the expenses
ABC = 50.5
D = 0
E = 0
F = 0

For this iteration we attempt to add the unpaired player E.  None of E's pairings are legal at present, so we need to increase E's expense.  We check the five possible pairings to see which is most constraining:
(ABC vs. E via A) has weight 59.5; expenses total 50.5+0 = 50.5; slack=59.5-50.5=9
(ABC vs. E via B) has weight 502.5; expenses total 50.5+0 = 50.5; slack=502.5-50.5=452
(ABC vs. E via C) has weight 51.5; expenses total 50.5+0 = 50.5; slack=51.5-50.5=1
(D vs. E) has weight 2; expenses total 0+0 = 0; slack=2-0=2
(E vs. F) has weight 3; expenses total 0+0 = 0; slack=3-0=3

The most constraining is (ABC vs. E via C), so we increase E's expense to 1, making this pairing legal.  Unfortunately, it isn't a legal augmentation, because we can't put (E vs. ABC) in without taking (ABC vs. D) out.  As usual, this gives us two ways to win.  We will increase the expenses of E and D in tandem, while reducing the expense of ABC, until either E or D gets a new legal pairing.

When we check the constraints to increasing D and E, note that the ABC pairings don't enter into it, because ABC's expense is being reduced at the same time.

(D vs. F) has weight 1; expenses total 0+0 = 0; slack=1-0=1
(E vs. F) has weight 3; expenses total 1+0 = 1; slack=3-1=2
(D vs. E) has weight 2; expenses total 1+0 = 1; slack=(2-1)/2=0.5

We find that (D vs. E) is the most constraining, and limits us to increase the expense by 0.5.  When we make that change our expenses become

ABC = 50
D = 0.5
E = 1.5
F = 0

and (D vs. E) becomes legal.  But we are again in a pickle with an odd cycle, this time consisting of ABC, D, and E.  We can pair any of the three to any other in the circle, but we can only get one total pairing out of it, so we can't augment.  Furthermore, we can't increase the expenses without violating some constraint.  Therefore, we pull out our favorite trick, and collapse the problem again!  This time we merge ABC, D, and E into one entity.  Remembering to reduce our weights by the current expenses, our newly collapsed problem becomes
-            ABCDE(A)  ABCDE(B)  ABCDE(C)  ABCDE(D)  ABCDE(E)       F
ABCDE(A)            -         -         -         -         -   460.5
ABCDE(B)            -         -         -         -         -     3.5
ABCDE(C)            -         -         -         -         -     2.5
ABCDE(D)            -         -         -         -         -     0.5
ABCDE(E)            -         -         -         -         -     1.5
F               460.5       3.5       2.5       0.5       1.5       -
with expenses
ABCDE = 0
F = 0

Recall that we started the iteration trying to pair player E, which falls into the collapsed problem as trying to pair player ABCDE.  We increase ABCDE's expense as much as possible, which is 0.5 in this case, resulting in (ABCDE vs. F) becoming a legal pairing.  This is also a legal augmentation, so we add the pairing to our matching.  This completes the third iteration.

What's more, now that we have made three pairings, we are done with the problem as a whole.  It only remains to expand the collapsed problem and recover the corresponding pairing in our original problem.  In the first expansion, from (ABCDE vs. F), we recover (D vs. F).  Since D was paired out of our cycle (ABC vs. D), (D vs. E), and (E vs. ABC), and D can't be paired twice, we are left to put (E vs. ABC) into our matching.

If we expand (E vs. ABC), we get the pairing (E vs. C).  Since C was paired out of our cycle (A vs. B), (B vs. C), and (C vs. A), we are forced to put (A vs. B) into our matching.  Thus our perfect matching in the original problem is (D vs. F), (E vs. C), and (A vs. B), just as we originally expected.
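
The un-collapsing step is mechanical enough to sketch as well (my code, not the book's): once we know which member of the cycle got the outside pairing, the rest of the cycle pairs up around it.

Code:
def expand(cycle, outside_member):
    # cycle lists the odd cycle in order, e.g. ['A', 'B', 'C'] for the
    # legal pairings (A-B), (B-C), (C-A); once outside_member is paired
    # externally, the remaining even number of players pair up in order
    # around the cycle.
    i = cycle.index(outside_member)
    rest = cycle[i+1:] + cycle[:i]
    return [(rest[k], rest[k+1]) for k in range(0, len(rest), 2)]

print(expand(['ABC', 'D', 'E'], 'D'))   # [('E', 'ABC')]
print(expand(['A', 'B', 'C'], 'C'))     # [('A', 'B')]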

The total weight of our matching is 58.  Note that our final expenses were

A = -0.5
B = 3.5
C = 2.5
ABC = 50
D = 0.5
E = 1.5
ABCDE = 0.5
F = 0

which not coincidentally adds up to 58.  As before, this is our guarantee of an optimal solution, and now the optimal solution only includes games between pairs of people.  Yay!

Title: Algorithm Description #6
Post by Fritzlein on Apr 15th, 2007, 6:56pm
Now that I've worked all the way through a small example, let me try to be more general about what each iteration is doing.  In the examples, when I picked an unpaired player to add to the matching, I always found an augmentation in a few steps.  It doesn't have to be that quick.

For example, I could be trying to add player X to the matching.  I raise X's expense until some pairing becomes legal, say (X vs. A), but I can't add that to the matching because (A vs. B) is already in the matching.  So I raise the expenses of X and B while lowering the expense of A until some pairing becomes legal, say (B vs. C), but I can't add that to the matching because (C vs. D) is already in the matching.  So I increase the expenses of X, B, and D, while lowering the expenses of A and C until some pairing becomes legal, say (B vs. P), but I can't add that to the matching because (P vs. W) is already in the matching, etc.  I could build up quite a tree before finding a legal augmentation, something like

Code:
X---A===B---C===D---E===F
|    .   \    .  \
|    .    \    .  \
M===N    . P===W   Q===V
   |\
   | \
H===G  L===Z

In the above tree, each line represents a legal pairing, i.e. a pairing with a weight equal to the sum of the expenses of the two involved players.  The heavy lines indicate pairings that are in the current matching.  The dots are meaningless but I have to add them to prevent the forum from collapsing the spaces.

On each step we would be raising the expense of everyone an even distance from the root, including the root, namely {X, B, D, F, N, W, V, H, Z}, and would be lowering the expense of every player an odd distance from the root, namely {A, C, E, M, P, Q, G, L}.  The amount of change to all of these expenses is the minimum of three things:

1) the minimum slack in any pairing which has one player in the even set and another player that is in neither of the two sets.

2) half the minimum slack in any pairing which has both players in the even set.

3) the minimum expense of any player in the odd set that is actually a collapsed group.

The third constraint comes from the fact that, although an individual player can have a negative expense, a group of players can't have a negative expense.  There is a mathematical reason a group can't have a negative expense, while a player can, but in addition to the math it also sort of makes sense intuitively.  We don't care if a group is really easy to pair to someone outside.  In fact it might have multiple easy ways to pair outside, and we don't want to give a bonus for each.  For example, consider the weights below:
-       A     B     C     D     E     F
A       -   100   100   100   100     1
B     100     -   100   100     1   100
C     100   100     -     1   100   100
D     100   100     1     -   100   100
E     100     1   100   100     -   100
F       1   100   100   100   100     -
The minimum-weight perfect matching is obviously (A vs. F), (B vs. E), and (C vs. D), with a total weight of 3.  Also the maximum total expense we could find is three, for example

A = 1
B = 1
C = 1
D = 0
E = 0
F = 0

But if you allow groups to have negative expenses, you could find silly expenses like

ABC = -99
A = 50
B = 50
C = 50
D = 50
E = 50
F = 50

which reflects how easy it is to pair someone inside ABC to someone outside ABC.  It's so easy we do it three times!  The negative group expense gives us a new total expense of 201, which is unrelated to reality.  So, to stop such nonsense, every group must have an expense of zero or above.

This raises the question of what to do when our tree can't grow any more due to hitting constraint #3.  Let's suppose we were stuck because player Q in the tree above was really player QRSTU that we collapsed earlier.

Code:
X---A===B---C===D---E===F
|    .   \    .  \
|    .    \    .  \
M===N    . P===W   QRSTU===V
   |\
   | \
H===G  L===Z

If QRSTU has expense zero that we can't lower any more, then we can't raise the expenses we need to raise.  This is the situation I alluded to earlier where we have to expand a collapse before the end of the algorithm.  But how to expand in the middle of building a tree?

The way we expand depends on which player inside the cycle QRSTU the player V is paired with.  Since QRSTU is an odd cycle, there will always be an even-length path around one side and an odd-length path around the other side.  We keep the even-length path in our tree and throw the rest away.  For example, suppose it expands to

Code:
X---A===B---C===D---E===F
|    .   \    .  \
|    .    \    .  \
M===N    . P===W   Q===R---S===V
   |\    .    .    \    . /
   | \    .    .    \    /
H===G  L===Z    .    U===T

Then we throw away the odd-length Q-U-T-S and keep the even-length path Q-R-S, for an updated tree

Code:
X---A===B---C===D---E===F
|    .   \    .  \
|    .    \    .  \
M===N    . P===W   Q===R---S===V
   |\
   | \
H===G  L===Z

After the expansion Q and S represent players, not groups, so we can reduce their expenses in order to grow the tree, even if they need to be reduced to negative numbers.

We can only expand groups with zero expense.  The expenses of the constituents of that group, however, can be anything.  We have to remember to add their expenses back into the weights of the pairing.  In the above example, the pairing (S vs. V) will have a weight equal to the expense of S plus the weight of the disappearing pairing (QRSTU vs. V), and similarly for all other pairings involving QRSTU.

I think I have finally explained all the weird conditions.  Just to be clear, the algorithm branches depending on which type of constraint we run into when trying to grow the tree.  Consider the cases

1) the minimum slack in any pairing which has one player in the even set and another player that is in neither of the two sets.

2) one half the minimum slack in any pairing which has both players in the even set.

3) the minimum expense of any player in the odd set that is actually a collapsed group.

In case #3, we adjust all the expenses, and then expand the collapsed group whose expense we just reduced to zero.

In case #2, we adjust the expenses, and we notice that we have formed an odd-length cycle, which we collapse into a single pseudo-player in the tree.

In case #1, we adjust the expenses, and add the newly made legal pairing to our tree.  If that player was already paired, we have to add his pairing to the tree and keep on growing.  If, however, we managed to connect to a previously unpaired player, we can trace the augmenting path all the way back up to the root, swapping pairings into and out of our matching, ultimately increasing the matching to one pairing more than it had before.  That's another iteration completed.
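
To make the three-way branch concrete, here is a sketch of the delta computation, run on the state we had in the middle of iteration 3 above (even set {D, E}, odd set {ABC}).  It is my own code; for the collapsed ABC pairings I keep only the cheapest of the three weights, which is all the delta calculation needs.

Code:
INF = float('inf')

def dual_delta(w, y, even, odd, groups):
    # the minimum of the three constraints listed above
    in_tree = even | odd
    d1 = min((w[frozenset((u, v))] - y[u] - y[v]
              for u in even for v in set(y) - in_tree), default=INF)
    d2 = min((w[frozenset((u, v))] - y[u] - y[v]
              for u in even for v in even if u < v), default=INF) / 2
    d3 = min((y[g] for g in odd if g in groups), default=INF)
    return min(d1, d2, d3)

w = {frozenset(p): wt for p, wt in
     [(('ABC', 'D'), 50.5), (('ABC', 'E'), 51.5), (('ABC', 'F'), 52.5),
      (('D', 'E'), 2), (('D', 'F'), 1), (('E', 'F'), 3)]}
y = {'ABC': 50.5, 'D': 0, 'E': 1, 'F': 0}
print(dual_delta(w, y, {'D', 'E'}, {'ABC'}, {'ABC'}))  # 0.5, from (D vs. E)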

The book suggests that we don't have to throw away all the bits of tree that might be left over after an iteration.  I am not sure how this works in the data structure, but apparently if we are growing a new tree on the next iteration, and we hit a piece of tree that was left over from before, we can incorporate it directly into the new tree we are growing.

That's everything I can think of.  Any questions?

Title: Re: Minimum-weight perfect matching
Post by omar on Apr 16th, 2007, 7:53am
Thanks for that incredible explanation, Karl.  It's going to take me some time to digest this.  It will be really great to eventually have a program that implements this.

Title: Re: Algorithm Description #1
Post by BlackKnight on Apr 17th, 2007, 1:55pm
Great story, yes! I was always eagerly waiting for the next part. And you didn't disappoint your readers ... :D

I just have a few simple questions first:


on 04/13/07 at 23:21:44, Fritzlein wrote:
I found a book which describes Edmonds' minimum-weight perfect matching algorithm in enough detail that one could implement it.

This seems to be an interesting book.  Do you mind mentioning the author and title?


Quote:
2. Put (X vs. A) in, take (A vs. B) out, and put (B vs. Y) in.
or
3. Put (X vs. F) in, take (F vs. E) out, and put (E vs. Y) in.

What's the difference? Or was 3.) supposed to have another pair involved?


Quote:
Then we throw away the odd-length Q-U-T-S and keep the even-length path Q-R-S, for an updated tree

What does actually happen to the pair U==T? Shouldn't U remain a child of Q?

I first thought collapsing could be implemented through recursion but because of possible expansions this might not be very useful.

After an expansion, can you easily "restore" the weight matrix?

As you understand the algorithm very well now, could you transform it into a structogram?
Or is there any (partial) pseudo-code provided in the book?
I started my own structogram but first need to understand a few more things.

Thanks a lot!

Title: Re: Algorithm Description #1
Post by Fritzlein on Apr 17th, 2007, 2:34pm

on 04/17/07 at 13:55:26, BlackKnight wrote:
Great story, yes! I was always eagerly waiting for the next part. And you didn't disappoint your readers ... :D

Wow, I didn't know I had any readers; I was just having fun writing.  Thanks for the positive feedback!  Maybe I should serialize publication of algorithms like Charles Dikkens used to do for novels.  (Note: name intentionally misspelled because the Forum censors "thingy ens")


Quote:
This seems to be an interesting book. Do you mind to mention the author and title?

The book is "Combinatorial Optimization", with William Cook as co-author.  Presumably this is the same Cook whose blossom algorithm code is available for academic use, as mentioned earlier in this thread.  Other co-authors are Cunningham, Pulleyblank, and Schrijver.


Quote:
What's the difference? Or was 3.) supposed to have another pair involved?

Hmmm, I think I included that example to make the point that we can use pairings in either order as we make our chain, i.e. (F vs. E) instead of (E vs. F).


Quote:
What does actually happen to the pair U==T? Shouldn't U remain a child of Q?

No, U can't remain a child of Q, because the path from X to U is not alternating.  If we try to swap in (D vs. Q) then we can't also swap in (Q vs. U), no matter what we swap out, because Q can pair only one player.  Similarly we can't swap in both (R vs. S) and (S vs. T), even if we swap out (S vs. V).

I can see that my explanation of which part of the cycle to throw away is a bit mystifying, but it comes from trying to retain the property that all paths out of the root alternate between pairings we would like to swap in and pairings we would have to swap out, until we can find a path of this type terminating at an unpaired player.


Quote:
I first thought collapsing could be implemented through recursion but because of possible expansions this might not be very useful.

The book explains that spawning copies of the problem can have some conveniences, but the worst-case running time is O(N^4).  If, instead, you merge the nodes "virtually", the worst-case running time can be reduced from O(N^4) to O(N^3).  They explained the data structure necessary to efficiently do the merges and splits in a different section of the book which I didn't read.  I thought maybe the important thing was to get us from exponential running time down to O(N^4)...


Quote:
After an expansion, can you easily "restore" the weight matrix?

You probably can, if you are more clever than I am. :)


Quote:
Or is there any (partial) pseudo-code provided in the book?

I will transcribe the condensed description of the algorithm when I have it in front of me.
[EDIT: Actually, the pseudocode in the book relies so heavily on notation that it's basically useless out of context.]


Quote:
Thanks a lot!

It is quite literally my pleasure.

Title: Re: Minimum-weight perfect matching
Post by seanick on Apr 18th, 2007, 10:48am
this is indeed quite interesting. I have a book on algorithms that could benefit from being rewritten by someone such as yourself. Except, of course, for the code sample part; the authors collaborating with you might be worthwhile, though.
I'll see if I can make any sense of combining the logic you described with any of the algorithms that it provides code to implement, and if so there might be a fairly efficient way to work it into something usable.

The only problem, of course, is that I don't seem to have enough free time to ever finish anything like this lately...
one would have thought that after shipping Vista my group would have some free time. No, it was just catch-up time, and was spent before we ever earned it...  ::)

Title: Re: Algorithm Description #1
Post by BlackKnight on Apr 18th, 2007, 4:30pm

on 04/13/07 at 23:21:44, Fritzlein wrote:
However, before we even get started on the minimum-weight perfect matching, we need to come up with a matrix of weights which represent the "badness" of each pairing.  How to generate those weights is an entirely separate discussion.  We already agree about some things, for example that (winner vs. winner) and (loser vs. loser) should both have lower weights than (winner vs. loser), to encourage the algorithm to pick a pairing which keeps the losers' bracket separate from the winners' bracket.  We don't all agree on other things, like folding pairing.  We can hash out those differences later.

So the weights to be assigned to pairs might just be approximations.
Do we really need a minimum weight perfect matching then, and how good is good enough?

Maybe we could use an approximation algorithm. Even Greedy's performance is only twice as bad as the optimum. Those algorithms are much simpler to implement.

Is there a link to the current pairing script? I only found a link to the pairing script for the postal.

Title: Re: Algorithm Description #1
Post by PMertens on Apr 18th, 2007, 6:27pm

on 04/17/07 at 14:34:58, Fritzlein wrote:
Wow, I didn't know I had any readers; I was just having fun writing.


:o

but surely you knew before that your stories have a decent fanbase?

Title: Re: Algorithm Description #1
Post by Fritzlein on Apr 19th, 2007, 8:15am

on 04/18/07 at 16:30:38, BlackKnight wrote:
So the weights to be assigned to pairs might just be approximations.  Do we really need a minimum weight perfect matching then, and how good is good enough?

For 2006 we didn't try to find the global minimum; we used a form of greedy algorithm, namely to first assign the bye, then pair the top player, then pair the top unpaired player, etc.  This caused serious problems in the computer championship by pairing too many repeat match-ups.  That's why we switched to finding a global minimum in 2007.


Quote:
Maybe we could use an approximation algorithm. Even Greedy's performance is only twice as bad as the optimum. Those algorithms are much simpler to implement.

The most obvious greedy algorithm can be arbitrarily bad, much more than twice as bad as optimal.  Consider the following weights:

(A vs. B) = 1
(A vs. C) = 2
(A vs. D) = 3
(B vs. C) = 4
(B vs. D) = 5
(C vs. D) = 10,000

The greedy algorithm will first pair (A vs. B) and then have no choice but to pair (C vs. D).
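
You can see the blow-up in a few lines (a sketch with made-up helper names):

Code:
w = {frozenset('AB'): 1, frozenset('AC'): 2, frozenset('AD'): 3,
     frozenset('BC'): 4, frozenset('BD'): 5, frozenset('CD'): 10000}

def greedy(players):
    # repeatedly take the cheapest pairing among still-unpaired players
    left, total = set(players), 0
    while left:
        pair = min((p for p in w if p <= left), key=w.get)
        total += w[pair]
        left -= pair
    return total

def optimal(players):
    # brute force: fix the first player, try each partner, pair the rest
    a, rest = players[0], players[1:]
    return min(w[frozenset((a, b))] + w[frozenset(set(rest) - {b})]
               for b in rest)

print(greedy('ABCD'), optimal(list('ABCD')))  # 10001 versus 7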

The only approximation algorithm I have read relies on the triangle inequality to hold between the weights, and that is definitely not the case for us.


Quote:
Is there a link to the current pairing script? I only found a link to the pairing script for the postal.

Paul Pogonyshev's branch-and-bound code for finding minimum-weight perfect matching is here:


on 01/22/06 at 13:31:06, doublep wrote:
OK, I implemented an overkill solution (1000 lines of C++.) You can grab it at http://download.gna.org/quarry/tournament.cpp (modified BSD license, use as you please.)

However, his way of generating the weight matrix is not what we actually used for the 2007 World Championship, and for 2008 we probably want to use something different again.  Also, for the first two rounds of 2007 the running time of doublep's code was intolerable (it is worst-case exponential) so we only used it from the third round onwards, i.e. after there had been some eliminations.  Luckily, for the first two rounds we could use the 2006 greedy algorithm, because it happened to produce the same pairings as the global minimum for those rounds.

Title: Re: Algorithm Description #1
Post by BlackKnight on Apr 19th, 2007, 4:02pm

on 04/19/07 at 08:15:42, Fritzlein wrote:
The most obvious greedy algorithm can be arbitrarily bad, much more than twice as bad as optimal.

This is of course absolutely right. I am sorry for jumping around in my last post. I was referring to maximum weight matching and to the 2-approximation for minimum weight matching of Goemans and Williamson. But you are right, this holds only under the triangle inequality.

And I guess my main concern was whether we should really (re-)implement 1000+ lines of code, if we can find a simple approximation algorithm.

But I wasn't aware of the fact that it might be difficult to find one, as I could only remember the two aforementioned algorithms.

If there is no such algorithm, then maybe we can come up with one ... ;)

Title: Re: Algorithm Description #1
Post by BlackKnight on Apr 22nd, 2007, 4:04pm

on 04/19/07 at 08:15:42, Fritzlein wrote:
Paul Pogonyshev's branch-and-bound code

Also, for the first two rounds of 2007 the running time of doublep's code was intolerable (it is worst-case exponential) so we only used it from the third round onwards, i.e. after there had been some eliminations.  Luckily, for the first two rounds we could use the 2006 greedy algorithm, because it happened to produce the same pairings as the global minimum for those rounds.


Thanks a lot for the link.

Do you still have the weight matrices for the first two rounds? I'd like to try them with the solver in Excel.

Title: Re: Algorithm Description #1
Post by Fritzlein on Apr 22nd, 2007, 5:26pm

on 04/22/07 at 16:04:47, BlackKnight wrote:
Do you still have the weight matrices for the first two rounds? I'd like to try them with the solver in Excel.

I suggested a formula in the thread below...


on 09/24/06 at 21:01:18, Fritzlein wrote:
To this end, let me rip apart the eval function, and put it back together in the order of my priorities.

and I think that is what omar eventually plugged into doublep's code.  I think we should change the weight function again for next year, but that's another story.

I'll be extremely impressed if the Excel solver can do the job, given that it must either be framed as an integer program  (paired=1, unpaired=0), or if you state it as a linear program, then the number of variables is exponential in the number of players.

Title: Re: Algorithm Description #1
Post by BlackKnight on Apr 22nd, 2007, 5:56pm

on 04/22/07 at 17:26:21, Fritzlein wrote:
I'll be extremely impressed if the Excel solver can do the job, given that it must either be framed as an integer program  (paired=1, unpaired=0), or if you state it as a linear program, then the number of variables is exponential in the number of players.

Yes, 20 players is the maximum number of players we can pair. And this will not be enough in the future ...  :)

Title: Re: Algorithm Description #1
Post by BlackKnight on Apr 23rd, 2007, 3:01pm

Quote:
... if we can find a simple approximation algorithm.
... it might be difficult to find one ...

There was some recent (2003) research on simple approximation algorithms for maximum weighted matching.

But for minimum WM, the only thing I came across is a PhD thesis from 2000 with an implementation (free and Open Source(tm)) of the Lin-Kernighan heuristic for the Traveling Salesman Problem (TSP) and the minimum weight perfect matching problem.
It seems, according to the experiments, that the weights of the matchings are up to about 2% over the optimum.
But it also takes more than five lines to implement ... ;)

Title: Re: Algorithm Description #1
Post by Fritzlein on Apr 23rd, 2007, 3:53pm
I confess, I'm not too excited about the idea of using an approximation algorithm.  With an exact algorithm we can choose weights to guarantee principles such as, "No player can receive an Nth bye unless all remaining players have received their (N-1)st bye."  Other parts of weighting may be approximate and disputable, but the bye rule should be ironclad.  I also think it should be ironclad that a matching which repeats a pairing from a previous round can never be preferred over a matching without repeats, except if necessary to respect the bye rule.

I would hate to see a situation where one of these rules was violated unnecessarily, and we have to say, "Well, the whole pairing scheme is only approximate anyway."  If it came to that, I would rather have a human tournament director do the pairings, and accept that the human would not find the optimal solution, but would have better judgment than a computer.

Not that I am volunteering to code up Edmonds' algorithm!  It's just that I would rather have a human approximation than a computer approximation, if we couldn't be exact.

Title: Re: Minimum-weight perfect matching
Post by BlackKnight on Apr 23rd, 2007, 5:44pm
And there might not be a simple approximation algorithm anyway?!

Jon Bentley: "Algorithm designers sleep peacefully only when they know their algorithms are the best possible;"

So, let's go for the optimum! :)

Title: Re: Minimum-weight perfect matching
Post by BlackKnight on Apr 25th, 2007, 12:43am
How about using a solver at the NEOS server?

I just tried MINTO with a random instance of 32 players. The solver itself took 0.04 secs to find the minimum weight perfect matching.

Title: Re: Minimum-weight perfect matching
Post by Fritzlein on Apr 25th, 2007, 8:51am

on 04/25/07 at 00:43:15, BlackKnight wrote:
How about using a solver at the NEOS server?

I just tried MINTO with a random instance of 32 players. The solver itself took 0.04 secs to find the minimum weight perfect matching.

Hey, wow!  I didn't know there was a free, public solver that would do minimum weight perfect matching.  I guess we can forget about implementing our own matching code, and start arguing appropriate weights instead!

Title: Re: Minimum-weight perfect matching
Post by aaaa on Feb 8th, 2010, 8:18pm
I'm happy to report that our tractability problem is finally over, and not just in theory. I found this page (http://www.xs4all.nl/~rjoris/maximummatching.html) dedicated to the class of problems in question. From there it led me to this virtual directory listing (ftp://dimacs.rutgers.edu/pub/netflow/matching/weighted/solver-1/) containing bare-bones code implementing Gabow's O(N^3) algorithm. It actually finds the maximum sum of weights, and it's important that every weight is positive (not just non-negative) in order for all vertices to get paired. Here is the input and output corresponding to each of the above examples (format descriptions can be found in separate files):


Code:
4 6 U
3 0 0 0
2 621
3 676
4 101
3 0 0 0
1 621
3 686
4 1
3 0 0 0
1 676
2 686
4 201
3 0 0 0
1 101
2 1
3 201


Code:
1 2
2 1
3 4
4 3


Code:
6 15 U
5 0 0 0
2 508
3 509
4 454
5 452
6 1
5 0 0 0
1 508
3 505
4 457
5 5
6 454
5 0 0 0
1 509
2 505
4 9
5 457
6 456
5 0 0 0
1 454
2 457
3 9
5 509
6 510
5 0 0 0
1 452
2 5
3 457
4 509
6 508
5 0 0 0
1 1
2 454
3 456
4 510
5 508


Code:
1 2
2 1
3 5
4 6
5 3
6 4


Code:
6 15 U
5 0 0 0
2 1
3 1
4 1
5 1
6 100
5 0 0 0
1 1
3 1
4 1
5 100
6 1
5 0 0 0
1 1
2 1
4 100
5 1
6 1
5 0 0 0
1 1
2 1
3 100
5 1
6 1
5 0 0 0
1 1
2 100
3 1
4 1
6 1
5 0 0 0
1 100
2 1
3 1
4 1
5 1


Code:
1 6
2 5
3 4
4 3
5 2
6 1

It shouldn't be hard to figure out that it's working.
Now anyone can indulge himself in working out global rules for any tournament format.
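
To hook this up to our minimization problem: since the solver maximizes and wants strictly positive weights, one can feed it M + 1 - w for each cost w, where M is the largest cost.  A rough Python sketch of an input writer (the field layout is inferred from the sample files above, so double-check it against the format descriptions):

Code:
# Rough sketch: write the solver's input file from a table of
# minimization costs.  The solver maximizes, so each cost w is
# flipped to M + 1 - w, which is always positive and turns the
# maximum-weight matching into our minimum-weight one.
def write_input(costs, filename):
    # costs: dict mapping frozenset({i, j}) -> cost, players numbered 1..n
    n = max(max(pair) for pair in costs)
    M = max(costs.values())
    with open(filename, "w") as f:
        f.write("%d %d U\n" % (n, len(costs)))
        for i in range(1, n + 1):
            nbrs = [j for j in range(1, n + 1)
                    if j != i and frozenset((i, j)) in costs]
            # degree, then three zeros as in the sample files
            f.write("%d 0 0 0\n" % len(nbrs))
            for j in nbrs:
                f.write("%d %d\n" % (j, M + 1 - costs[frozenset((i, j))]))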

Title: Re: Minimum-weight perfect matching
Post by Fritzlein on Feb 9th, 2010, 12:01am

on 02/08/10 at 20:18:51, aaaa wrote:
I'm happy to report that our tractability problem is finally over and not just in theory.

Sweet!  Now I can revive my campaign to eliminate the current preliminary/final dichotomy with a unified floating triple elimination.  :D

Title: Re: Minimum-weight perfect matching
Post by omar on Feb 10th, 2010, 6:58pm
Awesome find aaaa. Now Karl can go wild with defining global rules for a floating elimination tournament :-)

One of our main reasons for having a Swiss preliminary was to provide a venue for players to take part in an organized tournament. Hopefully this year we will have more organized tournaments during the summer (like the Blitz tournament), so making the 2011 WC an FTE sounds possible. Although I would like it to still finish by the end of March (i.e. within 12 weeks).

Title: Re: Minimum-weight perfect matching
Post by Fritzlein on Feb 10th, 2010, 7:59pm

on 02/10/10 at 18:58:46, omar wrote:
Although I would like it to still finish by the end of March (i.e. within 12 weeks).

Given the same number of participants, floating triple elimination should actually finish faster than our current format.  Our two problems with the format were not being able to handle a large crowd, and giving too much advantage to the top seed.  If the size problem is now resolved, then all that remains is tweaking the pairing rules to be slightly more fair, i.e. slightly less dependent on pre-tournament ratings.

Title: Re: Minimum-weight perfect matching
Post by Hippo on Feb 11th, 2010, 1:23am
There would be one more change with the triple elimination format. Now we use different time controls for the preliminary than for the finals. In triple elimination only one time control can be used ... which?

Title: Re: Minimum-weight perfect matching
Post by ChrisB on Feb 11th, 2010, 7:30am

on 02/11/10 at 01:23:16, Hippo wrote:
There would be one more change with the triple elimination format. Now we use different time controls for the preliminary than for the finals. In triple elimination only one time control can be used ... which?

Another possibility would be to start with 60 secs. per turn for the early rounds and then increase the time control to 90 secs. per turn for the rounds having 8 or fewer remaining players.  That would be consistent with our current approach.

Title: Re: Minimum-weight perfect matching
Post by Fritzlein on Feb 11th, 2010, 9:03am

on 02/11/10 at 07:30:56, ChrisB wrote:
Another possibility would be to start with 60 secs. per turn for the early rounds and then increase the time control to 90 secs. per turn for the rounds having 8 or fewer remaining players.  That would be consistent with our current approach.

That makes sense to me.  In a 16-player floating triple elimination, there would probably still be five "preliminary" rounds before reducing the field to eight players, just as this year.  However, instead of having the slate wiped clean for the finals so that everyone has two lives left, we would more likely have one person with three lives left, two people with two lives left, and the other five with only one life left.  The effect of carrying losses forward is to slightly shorten the "finals".

Also, FTE would avoid repeat pairings at all costs, resorting to them only when they are truly unavoidable.  For example, in 2010 we have chessandgo vs. The_Jeh and 99of9 vs. Tuks repeating from the preliminaries in the first round of the finals, but since we could avoid that by having chessandgo vs. 99of9 and The_Jeh vs. Tuks, there would be no repeat pairings if this were round 6 of FTE instead of round 1 of the finals.  I consider that a decided advantage of FTE.
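
To see the matching machinery do this on the example above, here is a brute-force toy (the seed numbers and weights are invented for the demo; a real implementation would use tiered weights as discussed earlier in the thread):

Code:
# Toy version of the round-6 example: four players, two previous
# pairings, and a repeat penalty that dwarfs the tie-breaking term.
players = ["chessandgo", "99of9", "Tuks", "The_Jeh"]
seed    = {"chessandgo": 1, "99of9": 2, "Tuks": 3, "The_Jeh": 4}
played  = {frozenset(("chessandgo", "The_Jeh")),
           frozenset(("99of9", "Tuks"))}

def weight(a, b):
    w = (seed[a] - seed[b]) ** 2     # illustrative tie-breaker only
    if frozenset((a, b)) in played:
        w += 1000                    # repeat tier
    return w

def matchings(ps):                   # all perfect matchings of a list
    if not ps:
        yield []
        return
    a, rest = ps[0], ps[1:]
    for i, b in enumerate(rest):
        for m in matchings(rest[:i] + rest[i+1:]):
            yield [(a, b)] + m

best = min(matchings(players),
           key=lambda m: sum(weight(a, b) for a, b in m))
print(best)  # [('chessandgo', '99of9'), ('Tuks', 'The_Jeh')]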

Title: Re: Minimum-weight perfect matching
Post by Fritzlein on Feb 11th, 2010, 10:38am
I think the problem of giving too much advantage to the top seed can be solved simply by implementing sliding pairing, e.g.

1 vs 5
2 vs 6
3 vs 7
4 vs 8

Unfortunately, I'm not sure what the global criterion would be.  Right now we maximize the sum of squares of rank difference to get folding pairing, e.g.

1 vs 8
2 vs 7
3 vs 6
4 vs 5

But if we instead minimize the sum of squares of rank difference, we get the "process" pairing

1 vs 2
3 vs 4
5 vs 6
7 vs 8

which is a terrible way to refine the sorting of a list that is already mostly sorted.

I guess with sliding pairing we are trying to set a constant size of mismatch and minimize the deviation from that size.  Therefore to get sliding pairing we would need another kind of global calculation of how big each score group is and what size of mismatch is optimal within that score group.
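
To make the three criteria concrete, here is a brute-force check over all 105 perfect matchings of one eight-player group; the "slide" objective is only my guess at a formalization (a fixed target gap of half the group size):

Code:
# Brute-force comparison of pairing criteria on ranks 1..8.
def matchings(ps):                   # all perfect matchings of a list
    if not ps:
        yield []
        return
    a, rest = ps[0], ps[1:]
    for i, b in enumerate(rest):
        for m in matchings(rest[:i] + rest[i+1:]):
            yield [(a, b)] + m

ranks = list(range(1, 9))
sq = lambda m: sum((a - b) ** 2 for a, b in m)

print(max(matchings(ranks), key=sq))   # fold:    (1,8) (2,7) (3,6) (4,5)
print(min(matchings(ranks), key=sq))   # process: (1,2) (3,4) (5,6) (7,8)
print(min(matchings(ranks),            # slide:   (1,5) (2,6) (3,7) (4,8)
          key=lambda m: sum((abs(a - b) - 4) ** 2 for a, b in m)))

At least on a single group of eight, a fixed target of half the group size reproduces the sliding pairing exactly; the open question is what the target should be once score groups of different sizes interact.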

Title: Re: Minimum-weight perfect matching
Post by aaaa on Feb 12th, 2010, 7:23am
Most single elimination tournaments, of which floating tuple elimination is a generalization, have the winner of a game travel the same bracket path regardless of who that winner is. This suggests a similar extension here: in case of an upset, defined as a lower seed beating a higher one, the players trade seeds. That way, there would be less of a problem of the top seed(s) "vacuuming up" giant killers. Fold pairing would then be desirable on the basis that one's (initial) seed would be quite variable and dependent on tournament results, and thus less predetermining by itself.

Title: Re: Minimum-weight perfect matching
Post by Hannoskaj on Feb 12th, 2010, 9:57am
Another possibility to avoid any influence of the seed is simply to randomise.
In the weights we would use only the players' scores and whom they have played, and we would randomise the order of the players in the input to the algorithm.

If, for example, we have exactly two equivalent players, they will have the same chance of playing each given opponent in this round.


Alternatively, at much higher computational cost and programming complexity, we might try to choose between equivalent pairings by modelling the probable results of the games (by sampling, or by computing exactly), and choose the matching that usually leads to an easier matching in the next round.

Title: Re: Minimum-weight perfect matching
Post by Fritzlein on Feb 12th, 2010, 1:54pm
Aaaa, your proposal of "stolen seeds" is intriguing, but I'm not sure how it would work in rounds past the second.  If #15 steals the seed from #2, can that #2 be stolen from him by #7 later?  If so, it seems that #7 could buy himself a better seed in round three on the basis of having beaten a worse opponent.  I guess I would need to see some examples of your proposal in action to develop my intuition.

PMertens was a big fan of randomizing pairings, or rather using only in-tournament information to make pairings so that seeding could not provide any advantage to the players.  I think this is what Hannoskaj is suggesting.

On the other extreme are people who don't mind giving a big advantage to the top seed as long as the seeding is accurate.  It is fair to acknowledge the importance of the regular season in the playoffs, as most sports do.  Omar showed us, however, that most of us wouldn't want to take that principle too far.  He proposed the SwissKnife format, which turns out to be one of the most accurate means of crowning the best player as World Champion: simply give the title to the player with the highest rating.  But nobody is willing to go to that extreme.  We all want the World Championship to be at least partly, if not entirely, decided by a tournament.

Even the way the finals of bowling tournaments are paired (or used to be) would probably be too extreme for us.  They have #5 play #4, and then the winner plays #3, and then the winner of that plays #2, and the winner of that plays #1.  Thus the number of times you need to win in order to win the tournament corresponds exactly to your seed entering the finals.

We had broad consensus about one principle at least: everyone should need the same number of wins to become champion (within one, because an odd number of players may force byes).  But we also recognize that not all wins are equal.  The assignment of opponents and byes leaves some range for giving an advantage to the high seeds or not.

In the 2007 World Championship, I was the #1 seed in a field of 20.  Due to the folding pairing, I got to play the #20 seed in the first round.  Of the ten winners in the first round, the lowest was #15, who had won by forfeit over the #6 seed, so I got to play the #15 seed.  In the third round, there was an odd number of players left in total, so I got a bye while the other 2-0 players duked it out, i.e. chessandgo vs. robinson and 99of9 vs. PMertens.  In the fourth round there were three 3-0 players, so one of us got to "play down".  I was the lucky one again, getting to play the 13th seed, who was the lowest 2-1 player.

Thus, entering the fifth round, chessandgo and I were both 4-0, but my opponents had been a bye, #20, #15, and #13, whereas his opponents had been #17, #8, #7, and #3.  That's when a broad consensus emerged that the tournament format was giving too much advantage to the top-seeded player, even though it was requiring the same number of wins from both chessandgo and myself.  (Fortunately, chessandgo won despite having the deck stacked against him.  I would have felt bad winning that year.)

So how much should the advantage of the top seed be weakened, and by what means?  I tend to lean towards making the World Championship tournament as self-contained as possible.  That should mean that I jump right on the "random pairing" bandwagon, which gives no advantage to anybody.  However, intuitively I am afraid that it will randomly create just as much unfairness as I got by institutional means in 2007.  Maybe we will have #1 vs. #20 and #2 vs. #3 in the first round just by luck.  The principle I would like to see preserved is that when two 4-0 players meet in the fifth round, they had an approximately equally tough road to get there.

The principle of "equally tough road" is enforced first and foremost by pairing winners against winners and losers against losers.  Within that framework, however, there is a lot of room for improvement.  In particular, I don't like that the #1 seed gets an easier pairing than the #2 seed in each round until they meet.  Folding pairing is fine for the first round, but a harder pairing in one round should be balanced with an easier pairing in a later round.

To some extent this problem can be alleviated by using strength of schedule to order the players for the folding pairing.  If the top player gets easier opponents in the first two rounds, by round three he might have a weaker SoS, thus losing his top seed and giving the advantages of the top seed to the #2 player.  Quite likely, though, SoS won't kick in that soon.  With folding pairing it is very likely that the opponents of both #1 and #2 lost in the second round too, providing no discrimination.

After the third round, though, SoS will definitely be in force, particularly if we use WHR which will probably break all ties by then.  I guess I can stomach the #1 seed getting a break for two or three rounds in exchange for the #2 seed getting a break in the fourth round, when the break is more likely to be a significant one.  So maybe I am leaning towards folding pairing throughout, but with pairing order determined by in-tournament WHR, not by seed except to break ties.

If we do adopt folding pairing ordered by SoS, however, we have the detail to consider of what to do with an odd number of players in a score group.  Who gets to play down, and who must play up?  In keeping with the spirit of rewarding earlier tough pairings with later easy pairings, it should be the top member of a score group that gets to play down, and the bottom member of the next lower score group who has to play up.  I know this is counter-intuitive and will draw some opposition, but I think it works quite well with the "equal road" principle.

The bad case for this pairing that people are going to point out is an 18-player tournament where there are 9 winners and 9 losers after one round.  In round 2, my proposal would have the top 1-0 play the bottom 0-1, which means the top seed will get a second easy game while the other 1-0 players have to duke it out with each other.  BUT this will wreck the strength of schedule of the top seed, meaning that going into round three he will get a much worse pairing, and that will happen at a time when having an easier/harder pairing will matter more.  The karma comes back around in later rounds, which is the whole point of my proposal.

So, that's the best I can do, and I am curious what people think.  Folding pairing in every round, but ordering is by W-L first, SoS second, and seed only third, which will cause the pairing order to churn from the original seeding order.  The bye goes to the top remaining player with no bye.  The top player in a score group gets to play down, while the bottom player in a score group must play up.  The top seed will have a huge advantage but only at first and will have to pay the price for any early-round benefits by getting hosed (e.g. not getting the bye) later on.

(On second thought, I would probably prefer good old-fashioned SoS to WHR in this scenario.  (SoS = sum of opponent wins, zero for bye, zero for forfeit win.)  More ties in SoS mean that seeds are respected longer, which is OK in this format.  Also WHR has issues with byes and forfeits which the traditional SoS handles better by zeroing out.)
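
For concreteness, the proposed ordering might be sketched as follows (a minimal sketch with hypothetical data structures, not working tournament code):

Code:
# Sketch of the proposed pairing order.  results[p] is assumed to be a
# list of (opponent, kind) tuples with kind in {"game", "bye", "forfeit"};
# wins, losses, and seed are assumed to be dicts.  All hypothetical.
def sos(p, results, wins):
    # traditional SoS: sum of opponents' wins; byes and forfeit wins add zero
    return sum(wins[opp] for opp, kind in results[p] if kind == "game")

def pairing_order(players, results, wins, losses, seed):
    # better record first, then higher SoS, then original seed as tie-break
    return sorted(players, key=lambda p: (-(wins[p] - losses[p]),
                                          -sos(p, results, wins),
                                          seed[p]))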

(The churning of the pairing order will introduce a lot of luck into the pairings, which is ironic because it will make it a lot more like the random pairing method I was trying to avoid.  Oh, well.  :) But I can see the case for using a simpler system (simply random) than a more complex system with a similar effect.)

Title: Re: Minimum-weight perfect matching
Post by Janzert on Feb 12th, 2010, 2:35pm
Sounds promising. It'd be nice if someone could code it up to run through the tournament simulator Omar has and check how it compares with the other formats that have been tried.

Janzert

Title: Re: Minimum-weight perfect matching
Post by aaaa on Feb 12th, 2010, 3:47pm
As a compromise, we could keep it as it is, except that for each round, the seeds are recalculated by considering all in-tournament games so far plus a quarter win and a quarter loss against an anchor player with the rating used for the initial seeding.

Moreover, I believe a forfeit should be treated like any other game result, since it's reasonable to expect a bias towards forfeits by underdogs.
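
In code, the recalculation would be something along these lines (assuming a Bradley-Terry model fitted by the standard fixed-point iteration; the Elo-style scale and the iteration count are illustrative, not essential to the proposal):

Code:
import math

def recalculated_ratings(initial, games, iters=200):
    # initial: {player: rating used for the initial seeding}
    # games:   list of (winner, loser) pairs played so far
    strength = {p: 10 ** (r / 400.0) for p, r in initial.items()}
    anchors  = dict(strength)                  # anchors never move
    weighted = [(w, l, 1.0) for w, l in games]
    for p in initial:                          # quarter win and quarter loss
        weighted.append((p, ("anchor", p), 0.25))
        weighted.append((("anchor", p), p, 0.25))

    def s(x):                                  # strength of player or anchor
        return anchors[x[1]] if isinstance(x, tuple) else strength[x]

    for _ in range(iters):                     # Bradley-Terry fixed point
        for p in initial:
            won   = sum(wt for w, l, wt in weighted if w == p)
            denom = sum(wt / (strength[p] + s(l if w == p else w))
                        for w, l, wt in weighted if p in (w, l))
            strength[p] = won / denom
    return {p: 400.0 * math.log10(strength[p]) for p in initial}

The fixed quarter games tie each player loosely to his initial seeding, so a rating can drift from its seed only as far as actual results push it.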

Title: Re: Minimum-weight perfect matching
Post by Fritzlein on Feb 12th, 2010, 4:02pm

on 02/12/10 at 14:35:18, Janzert wrote:
Sounds promising. It'd be nice if someone could code it up to run through the tournament simulator Omar has and check how it compares with the other formats that have been tried.

Thanks, Janzert.  It would be nice to have it coded up for simulations, so that we could step through some tournaments and see how it works.  If I remember correctly, however, Omar's simulator wasn't measuring fairness or reasonableness; it was just measuring how often the best player actually won the tournament.  Therefore I expect any proposal that reduces the advantages to the #1 seed will fare worse according to his way of measuring.

If we are going to run the simulator at all, there should be a companion measurement, namely how often the best player wins the tournament if the initial seeds are totally random.  That is to say, the format should work well both when the ratings are approximately accurate and when the ratings are 100% nonsense.

But in any case I am less worried about how often the best player will win the tournament than about whether there will be unforeseen strange pairings or bye assignments that appear unreasonable or unfair.  The reason we keep tweaking our World Championship format is not that it is failing to select worthy champions, but that we keep noticing things that strike us as unfair in some respect.

For example, this year we had a few games in the last round of the preliminaries where one or both players had no incentive to win, or maybe even an incentive to lose.  Since the slate is wiped clean between preliminary and final, taking a preliminary loss to jockey for final seeding position is a reasonable strategy.  I can assure you, from my game against Adanac, that it is no fun to be in a position where if I lose, I will be suspected of losing on purpose.  It's even worse to be in that position when I am behind on the board.

In a unified triple-elimination, however, there will never (if it is properly structured) be a case where losing a game is better than winning.  The three lives are too precious to be squandered for any kind of positioning in pairing.  There will not be players who are "safe", nor will there be players who are "out" but still playing.  The players who can no longer win, no longer play.  All the players who are still playing have every reason to try to win.

Despite all the wrinkles that we still need to work out in FTE, I would be happy to get back to an elimination format.

Title: Re: Minimum-weight perfect matching
Post by Janzert on Feb 12th, 2010, 4:17pm

on 02/12/10 at 16:02:11, Fritzlein wrote:
Therefore I expect any proposal that reduces the advantages to the #1 seed will fare worse according to his way of measuring.


Yes, but what I want to know is how much worse. :}


Quote:
If we are going to run the simulator at all, there should be a companion measurement, namely how often the best player wins the tournament if the initial seeds are totally random.  That is to say, the format should work well both when the ratings are approximately accurate and when the ratings are 100% nonsense.


If I'm remembering how the simulator worked correctly, one of the parameters you gave it was the error size for the ratings.  So you ended up with each player having a true rating that determined the outcome of games and an apparent rating that was used for the seeding.  Looking at how well the different formats performed in the face of large errors in the apparent ratings was a key concern at the time, since we were using straight gameroom ratings and knew that they were often quite far off and open to manipulation.


Quote:
For example, this year we had a few games in the last round of the preliminaries where one or both players had no incentive to win, or maybe even an incentive to lose. ... Despite all the wrinkles that we still need to work out in FTE, I would be happy to get back to an elimination format.


Yes, the way swiss tournaments can end up with terrible incentives for players is something I really dislike.

Janzert

Title: Re: Minimum-weight perfect matching
Post by Fritzlein on Feb 15th, 2010, 2:48pm

on 02/12/10 at 15:47:54, aaaa wrote:
As a compromise, we could keep it as it is, except that for each round, the seeds are recalculated by considering all in-tournament games so far plus a quarter win and a quarter loss against an anchor player with the rating used for the initial seeding.

Calculating in-tournament ratings with a weak anchor will surely result in inversions between score groups.  The players will tend to layer along fault lines where nobody below a line has beaten anybody above the line.  For example, after three rounds of this year's preliminary tournament, there would be a fault line with Simon above it and 99of9 below it, even though Simon was 1-2 and 99of9 was 2-1.  Thus Simon would probably be ranked higher than 99of9 by an in-tournament WHR with a weak anchor.

But perhaps when you talk of only recalculating the seeds, you mean that we should also preserve pairing within score groups, in which case the in-tournament ratings would only re-order players within a score group and not globally.  In that case, I would approve of some churn from the initial seeding, as long as it isn't so much as to simply randomize the players.

If I understand the formula correctly, the churn could begin as early as the second round.  For fun I tried folding pairing on the first round of this year's tournament with the actual players and ratings.  Supposing that the favorites had all won, the seeds would be recalculated as follows:

Seed  Participant      WHR  New WHR  New Seed
----  -------------   ----  -------  --------
  1   Fritzlein       2568    2572       1
  2   chessandgo      2561    2568       2
  3   Adanac          2319    2353       3
  4   99of9           2227    2277       4
  5   Tuks            2199    2263       5
  6   ChrisB          2095    2196       6
  7   omar            2012    2139       8
  8   PMertens        2004    2174       7
  9   The_Jeh         1987    1817       9
 10   naveed          1859    1731      12
 11   woh             1851    1750      10
 12   Simon           1799    1734      11
 13   Nevermind       1756    1706      14
 14   Nombril         1753    1718      13
 15   fritzlforprez   1658    1650      15
 16   Hippo           1581    1577      16


There is not much change at the extremes, as it should be, because the difference between an opponent 900 points stronger and one 1000 points stronger is minimal.  Towards the middle, however, PMertens had to face a significantly stronger opponent than omar did, and is rewarded with a better seed (and a theoretically easier opponent from folding pairing) because of it.  Similarly, naveed got a break in the first round compared to woh and Simon, playing a significantly easier opponent, but the penalty is to get knocked down a couple of seeds for round 2, earning a (probably) harder pairing then.

Based on just this one test run, I like the way the idea is working.  If you get an easier road in one round, you have to pay with a harder road next round.  That's just the kind of equalization I would like to see.  Unfortunately, my programming skills don't suffice to extend the simulation to a second round.  (In fact, there's a fair probability that my skills didn't suffice for the first round.  Believe my numbers at your peril. :P)  I would really like to see a couple of realistic simulations to step through the consequences of the proposed combination of seeding/SoS, because it is not a situation where my intuition has something to grab hold of.

Title: Re: Minimum-weight perfect matching
Post by Fritzlein on Feb 15th, 2010, 3:07pm

on 02/12/10 at 15:47:54, aaaa wrote:
Moreover, I believe a forfeit should be treated like any other game result, since it's reasonable to expect a bias towards forfeits by underdogs.

This goes completely against the notion of equalizing the toughness of the paths of players.  A forfeit win is as easy a path as one could get, i.e. equivalent to a bye.  If you had a forfeit win against a strong player, it would be absurd for it to increase your strength of schedule, earning you an easier pairing the following round.  The exact opposite should happen, whereby a forfeit win earns you a tougher opponent the following round to keep you on par with other players of your same W/L record.  If we use any kind of in-tournament WHR, I advocate that byes and forfeit wins each count as a win against an anchor player with rating 1000.

Title: Re: Minimum-weight perfect matching
Post by aaaa on Feb 17th, 2010, 6:14pm
Here's how my hybrid score would have evolved during the Swiss part of the current championship:

Before the tournament:
Fritzlein 2568
chessandgo 2561
Adanac 2319
99of9 2227
Tuks 2199
ChrisB 2095
omar 2012
PMertens 2004
The_Jeh 1987
naveed 1859
woh 1851
Simon 1799
Nevermind 1756
Nombril 1753
fritzlforpresident 1658
Hippo 1581

After round 1:
Fritzlein 2601
chessandgo 2580
Adanac 2370
Tuks 2254
Simon 2140
omar 2086
PMertens 2063
Nombril 2058
The_Jeh 1954
99of9 1886
naveed 1840
woh 1800
ChrisB 1790
Nevermind 1701
fritzlforpresident 1584
Hippo 1522

After round 2:
Fritzlein 2647
chessandgo 2625
Adanac 2490
Tuks 2352
Simon 2122
PMertens 2090
omar 2037
The_Jeh 2021
99of9 1966
Nevermind 1848
Nombril 1847
Hippo 1718
woh 1636
fritzlforpresident 1514
ChrisB 1497
naveed 1476

After round 3:
chessandgo 2802
Adanac 2674
Fritzlein 2491
Tuks 2394
Nombril 2116
Nevermind 2057
The_Jeh 2050
99of9 2015
Simon 2009
PMertens 1896
omar 1778
woh 1736
Hippo 1670
naveed 1583
ChrisB 1356
fritzlforpresident 1338

After round 4:
Adanac 2909
chessandgo 2671
Fritzlein 2498
99of9 2231
Tuks 2205
The_Jeh 2160
Simon 2123
Nombril 2077
Nevermind 1917
woh 1899
PMertens 1855
omar 1759
Hippo 1602
naveed 1528
ChrisB 1416
fritzlforpresident 1151

After round 5:
Fritzlein 2641.4
chessandgo 2640.5
Adanac 2624
99of9 2236
The_Jeh 2177
Tuks 2165
woh 2078
Simon 1937
Nevermind 1883
PMertens 1870
Nombril 1821
Hippo 1791
omar 1729
naveed 1590
ChrisB 1349
fritzlforpresident 1064
Let's see whether anyone will manage to feel offended by this as well. :P

Title: Re: Minimum-weight perfect matching
Post by aaaa on Feb 18th, 2010, 11:01pm

on 02/17/10 at 18:14:18, aaaa wrote:
After round 5:
Fritzlein 2641.4
chessandgo 2640.5
Adanac 2624
99of9 2236
The_Jeh 2177
Tuks 2165
woh 2078
Simon 1937
Nevermind 1883
PMertens 1870
Nombril 1821
Hippo 1791
omar 1729
naveed 1590
ChrisB 1349
fritzlforpresident 1064

Let's make that final ranking dictate the outcome of every game in a simulated floating triple elimination tournament with reseeding. Then each player's opponents, listed by final-ranking seed round by round (b marking a bye, e marking elimination), would become:
 1. Fritzlein            12  7  4  2  b  3  5  2  b  2
 2. chessandgo           16 13  3  1  5  4  b  1  3  1
 3. Adanac               11  5  2 10  4  1  6  b  2  e
 4. 99of9                 9  6  1  7  3  2  e
 5. The_Jeh              10  3  8  6  2  7  1  e
 6. Tuks                  8  4 12  5  9 10  3  e
 7. woh                  15  1 11  4  8  5  e
 8. Simon                 6  9  5 14  7  e
 9. Nevermind             4  8 15 13  6  e
10. PMertens              5 16 13  3 11  6  e
11. Nombril               3 14  7 12 10  e
12. Hippo                 1 15  6 11  e
13. omar                 14  2 10  9  e
14. naveed               13 11 16  8  e
15. ChrisB                7 12  9  e
16. fritzlforpresident    2 10 14  e


