Arimaa Forum (http://arimaa.com/arimaa/forum/cgi/YaBB.cgi)
(Message started by: rbarreira on Nov 5th, 2011, 7:53am)

Title: 2012 Computer Championship
Post by rbarreira on Nov 5th, 2011, 7:53am
The registration for the 2012 Computer Championship has started and there are two bots registered so far:

bot_sharp
bot_briareus

I wonder how many bots there will be this year? I'm expecting at least marwin and badger to join, hopefully clueless too even though it hasn't been around much. So there should be at least 5 participants which is not bad. Perhaps 6 if Omar enters Bomb2005 again.

Then I wonder if Janzert has started the new bot he was planning, or if he will re-enter Opfor at least?

What about Gnobot, is it ever coming back? :)

Any new bots planning to enter?

PS: Do not forget there's a new time control this year, 2m/6m/100/0/8h/6m.
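
For anyone wiring the new time control into a bot, here is a minimal sketch of how the string could be unpacked. It assumes the usual Arimaa M/R/P/L/G/T layout (time per move, starting reserve, percent of unused move time added to the reserve, reserve limit with 0 meaning none, game time limit, maximum time per turn). The function names and dictionary keys are made up for illustration, and reading a bare number as minutes is also an assumption.

```python
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(token):
    """Convert a token such as '30s', '2m' or '8h' to seconds."""
    token = token.strip()
    if token[-1] in UNIT_SECONDS:
        return int(token[:-1]) * UNIT_SECONDS[token[-1]]
    # Assumption: a bare number (such as the '0' reserve limit) is in minutes.
    return int(token) * 60


def parse_time_control(spec):
    """Split an M/R/P/L/G/T string into named fields (times in seconds)."""
    move, reserve, percent, limit, game, turn = spec.split("/")
    return {
        "move_seconds": parse_duration(move),
        "reserve_seconds": parse_duration(reserve),
        "percent_to_reserve": int(percent),
        "reserve_limit_seconds": parse_duration(limit),  # 0 read as "no limit"
        "game_limit_seconds": parse_duration(game),
        "turn_limit_seconds": parse_duration(turn),
    }


time_control = parse_time_control("2m/6m/100/0/8h/6m")
```

Read this way, the 2012 setting is 2 minutes per move, a 6 minute starting reserve, all unused move time going to the reserve, an 8 hour game limit and at most 6 minutes on any one turn.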

Title: Re: 2012 Computer Championship
Post by Janzert on Nov 7th, 2011, 3:06am
OpFor's replacement has had almost no work done on it. At this point I'm not planning on entering with any bot.

Title: Re: 2012 Computer Championship
Post by Fritzlein on Nov 11th, 2011, 7:17pm
Given that we will in all probability not have nine entrants this year, does the qualifying procedure still make sense?  There is always a danger of a last-minute surge of bot-vs.-bot qualifying games, which can eat up server resources and degrade quality of service.

The qualifying games originally served a double purpose: seeding and limiting the field to eight.  When you take away the need to limit the field, it's not clear to me that the bureaucracy is worth it just for seeding.  A simple alternative would be to seed based on last year's finish, with brand new bots seeded below them in order of rating.

Title: Re: 2012 Computer Championship
Post by rbarreira on Nov 12th, 2011, 3:51am

on 11/11/11 at 19:17:55, Fritzlein wrote:
Given that we will in all probability not have nine entrants this year, does the qualifying procedure still make sense?


Was there a probability of > 8 entrants in any year? In 2010 there were 8 (including bomb which didn't really "enter"), in 2011 less than that.


on 11/11/11 at 19:17:55, Fritzlein wrote:
There is always a danger of a last-minute surge of bot-vs.-bot qualifying games, which can eat up server resources and degrade quality of service.


It seems that most of the bots used for qualification have a limit of 1 running instance, so it would be quite risky to wait until the last minute to play many games. And even if people did, that limit should prevent many simultaneous games, especially since 3 of the benchmark bots are weak enough to not require many games against them for a winning streak of 4.


on 11/11/11 at 19:17:55, Fritzlein wrote:
A simple alternative would be to seed based on last year's finish, with brand new bots seeded below them in order of rating.


Assuming that the seed has any impact on winning probability (has anyone calculated that btw?), I think it's a good thing to have a seeding based on the actual current skill of the bots. A triple-elimination tournament already involves some amount of luck, I think anything which can reduce the impact of luck is a good thing.

Title: Re: 2012 Computer Championship
Post by omar on Nov 12th, 2011, 5:25pm
If we only have 4 bots registering it might be better to just do a double round robin tournament. That would eliminate the need for seeding.

Title: Re: 2012 Computer Championship
Post by Fritzlein on Nov 12th, 2011, 5:45pm
The only objections I have to round robins are the possibility of collusion and ties, neither of which is a problem in this case.  I trust the bots won't throw games, and a tie is merely an occasion for a climactic playoff game!

Title: Re: 2012 Computer Championship
Post by aaaa on Nov 12th, 2011, 6:29pm
Especially with so few entrants, a double round robin will be considerably less differentiating. What if the two top bots happen to be way ahead of the field? They would have only two games between them. My proposal is to keep FTE and do the seeding based on the achievement of the bot's programmer(s) in the most recent computer championship, with ties broken by the result in the previous one, and so on, with remaining ties broken by earliest signup time.

Title: Re: 2012 Computer Championship
Post by Janzert on Nov 13th, 2011, 1:14am
I really liked the idea of a round robin followed by FTE with losses carried over and seeding from the round robin. A secondary seed to break ties from the round robin is probably needed though. Also maybe the FTE should be 4 losses for elimination.

Title: Re: 2012 Computer Championship
Post by rbarreira on Nov 13th, 2011, 4:30am

on 11/13/11 at 01:14:31, Janzert wrote:
I really liked the idea of a round robin followed by FTE with losses carried over and seeding from the round robin. A secondary seed to break ties from the round robin is probably needed though. Also maybe the FTE should be 4 losses for elimination.


I like this idea. Any round-robin tournament would at minimum need a final phase to break the ties Fritz mentioned, this looks like a good solution. It also solves aaaa's objection about cases where the two top bots are quite close.

Title: Re: 2012 Computer Championship
Post by jdb on Nov 14th, 2011, 2:10pm

on 11/13/11 at 04:30:19, rbarreira wrote:
I like this idea. Any round-robin tournament would at minimum need a final phase to break the ties Fritz mentioned, this looks like a good solution. It also solves aaaa's objection about cases where the two top bots are quite close.


I also like this idea.

Title: Re: 2012 Computer Championship
Post by Fritzlein on Nov 14th, 2011, 3:45pm

on 11/13/11 at 01:14:31, Janzert wrote:
I really liked the idea of a round robin followed by FTE with losses carried over and seeding from the round robin.

I rather like your idea of using a single round-robin to eliminate the need for seeding, and carrying forward losses into a multiple-elimination format to sort out the top.  The secondary seeding could be random after that without seeming unfair, because it would have such a small impact.

Just a note, however, that with only four entrants, an FTE format automatically plays a round-robin for the first three rounds, and losses carry forward, so your proposal doesn't change anything until the number of participants gets larger than the number of eliminations plus one.

Title: Re: 2012 Computer Championship
Post by jdb on Nov 17th, 2011, 9:09am

on 11/14/11 at 15:45:13, Fritzlein wrote:
I rather like your idea of using a single round-robin to eliminate the need for seeding, and carrying forward losses into a multiple-elimination format to sort out the top.  The secondary seeding could be random after that without seeming unfair, because it would have such a small impact.

Just a note, however, that with only four entrants, an FTE format automatically plays a round-robin for the first three rounds, and losses carry forward, so your proposal doesn't change anything until the number of participants gets larger than the number of eliminations plus one.


From a practical standpoint there is a slight difference. If there are administrative problems running the tournament, a round robin can continue. The FTE format is delayed until the problems are fixed.

Title: Re: 2012 Computer Championship
Post by aaaa on Nov 25th, 2011, 7:14pm
Another difference is that using FTE with enough lives such that it would start out as a de facto round robin tournament would not guarantee a balanced assignment of colors.

The new scheduling software already minimizes the influence of the initial seeding as much as possible, mostly by having current tournament performance trump it in any intermediate ranking for the purpose of assigning byes and pairings. That means that simply giving the scheduler differently scheduled round robin games as part of the tournament history should be enough. I'd still like to avoid a random seeding if possible though. Past performance seems like a reasonable choice (here, indented bots share a programmer):
    1. Sharp (2011: #1)
    2. Marwin (2011: #2)
    3. Clueless (2011: #3)
    4. Bomb (2011: #4, 2010: #5)
    5. Briareus (2011: #4)
    6. OpFor (2011: #6)
    7. GnoBot (2010: #5, 2009: #2)
    8. Badger (2010: #5)
      Aamira (2007: #5)
    9. Pragmatictheory (2010: #8)
    10. Zombie (2009: #6)
      Faerie (2007: #3)
    11. Rat (2009: #8)
      Loc (2008: #6)
    12. Occam (2007: #6)
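
The ordering rule behind this list (most recent championship first, ties broken by the one before, and so on) can be sketched as a sort on a tuple of placements. This is only an illustration: the table holds just the placements shown in the list above, bots that share a programmer are merged into one entry under a made-up combined name, a missing year counts as worse than any placement, and the final signup-time tiebreak is not modelled.

```python
import math

# Championship years, most recent first.
YEARS = [2011, 2010, 2009, 2008, 2007]

# Placements per programmer, taken only from the list above.
HISTORY = {
    "Sharp": {2011: 1},
    "Marwin": {2011: 2},
    "Clueless": {2011: 3},
    "Bomb": {2011: 4, 2010: 5},
    "Briareus": {2011: 4},
    "OpFor": {2011: 6},
    "GnoBot": {2010: 5, 2009: 2},
    "Badger/Aamira": {2010: 5, 2007: 5},
    "Pragmatictheory": {2010: 8},
    "Zombie/Faerie": {2009: 6, 2007: 3},
    "Rat/Loc": {2009: 8, 2008: 6},
    "Occam": {2007: 6},
}


def seeding_key(results):
    """Placement in each year, newest first; no entry that year sorts last."""
    return tuple(results.get(year, math.inf) for year in YEARS)


def seed_order(history):
    """Return names sorted by most recent result, then earlier results."""
    return sorted(history, key=lambda name: seeding_key(history[name]))


seeds = seed_order(HISTORY)
```

Comparing tuples element by element is what makes Bomb land ahead of Briareus here: both have #4 in 2011, and only Bomb has a 2010 result to break the tie.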

Title: Re: 2012 Computer Championship
Post by omar on Nov 26th, 2011, 4:58pm
Round robin followed by FTE would make the tournament a little longer. With only 4 bots registered, it would only add about two days, but in general it could get too long. My reason for suggesting a double round robin was because only 4 bots were registered. But we really should not use formats that we can't use if more bots were registered. Using the previous years' results seems like a good alternative. Thanks aaaa for ordering the bots based on previous WCC performance.

Title: Re: 2012 Computer Championship
Post by rbarreira on Nov 26th, 2011, 6:29pm
I'm not sure I understand the direction of the discussion. Are we talking about the format/seeding for the 2012 WCC? Or is it for later years, as this year's rules were already published?

Omar, are you agreeing with aaaa's idea about there being no qualification if at most 8 bots participate? So the seeding will be determined by how good the bots were in the championship of the previous year or how quickly their authors signed up? What if more than 8 bots participate, there would be a qualification but it wouldn't count for seeding?


on 11/26/11 at 16:58:32, omar wrote:
But we really should not use formats that we can't use if more bots were registered. Using the previous years' results seems like a good alternative.


If we want to use a format which can handle more than 4 bots why do we need a new alternative? Wouldn't the 2011 WCC format be good enough? Meaning qualification/seeding games followed by FTE?

As mentioned before I liked Janzert's idea, but if that's not possible then what about a floating elimination tournament with more than 3 lives? Does that lengthen the tournament too much?

Title: Re: 2012 Computer Championship
Post by omar on Nov 28th, 2011, 12:03am
We really should stick with the published rules for this year. But I know it can be a bit tedious to have your bot play all the qualifying games. Since there are only 4 bots registered this year, I am open to using previous years' performance for the rankings if a majority of the registered bot developers want to use that to avoid the qualifying games. The format of the tournament will still be FTE.

Title: Re: 2012 Computer Championship
Post by rbarreira on Nov 28th, 2011, 2:26am
Thanks for the clarification Omar. I prefer that the seeding be done via the qualifying games.

I would be fine with Arimaazilla and/or Aamira2006P2 being eliminated from the list of benchmark bots though. They are too weak to contribute much (or likely, anything) to the ranking.

Title: Re: 2012 Computer Championship
Post by Nombril on Nov 28th, 2011, 6:07pm
Not that I have a bot to enter... so maybe my opinion doesn't count... but it seems that seeding based on last year's performance will penalize the developers that have made the most improvements in their bots.

I know I had similar misgivings about having ratings play any significant role in the human tournament.  It seems the point of a tournament is to establish the best player at the time of the tournament, not crown the player that has already accumulated a high rating.

Title: Re: 2012 Computer Championship
Post by rbarreira on Nov 29th, 2011, 4:50am
I was looking at the rules page and saw the following:


Quote:
This tournament format is designed only to clearly recognize a first place winner.


But it's also important to recognize the second place, as that bot goes to the Challenge screening together with the winner? So I guess a tie break might be needed if there's a tie for 2nd place.

Title: Re: 2012 Computer Championship
Post by Fritzlein on Nov 29th, 2011, 7:51am
It would be silly indeed to have the second bot for the qualifying phase be determined by a tiebreaker, or to have a tie for 2nd-3rd not be broken!

Title: Re: 2012 Computer Championship
Post by aaaa on Nov 29th, 2011, 7:51am

on 11/29/11 at 04:50:27, rbarreira wrote:
But it's also important to recognize the second place, as that bot goes to the Challenge screening together with the winner? So I guess a tie break might be needed if there's a tie for 2nd place.

This is a good point, which I had forgotten about. Here is my earlier proposal to handle that:


on 03/07/11 at 11:11:19, aaaa wrote:
In case one just has to break a tie, which in our case is currently necessary to determine which bot will join the champion in the screening period, one useful technique could be to manually extend the tournament as follows:
Continue to make use of the file containing all the games of the tournament, starting by explicitly eliminating everyone not in the running, i.e. the winner and, here, all those not amongst the losers with the most wins. Because that would normally eliminate everyone, the program will revive all not-explicitly-removed players, giving them each one more life (provided the number of lives supplied as argument is no more than the number used during the original course of the tournament plus one), and schedule pairings, still taking into account the earlier games of the tournament with respect to byes, pairings and performance ratings. If further rounds are necessary, then just continue to update the history file as normal.
I would think that having the games of a main tournament affect the scheduling of a mini-tournament for runner-up in this fashion should actually be a desirable thing.

Title: Re: 2012 Computer Championship
Post by tize on Nov 29th, 2011, 3:39pm
If we would like to have the qualification games for seeding but decrease the "burden" on the developers, we could allow the developer to accept last year's score against a qualification bot by not playing it at all. If at least one game is played against the bot then the developer has passed on that offer.

It would at least lower the number of games needed for the returning developer.

Title: Re: 2012 Computer Championship
Post by aaaa on Nov 30th, 2011, 8:08am
If this is going to be a transitional event cycle anyway, we might as well take advantage of the comparatively small number of entrants and use this championship to lay the groundwork for the desired future tournament format for which a consensus has already existed for quite some time now: round robin with losses carried over to a floating triple elimination phase.

As people have already remarked several times before, bot games are already not very attractive to witness live with their long time controls, so there should be no need to make any compromises towards any prospective spectators; the games should be scheduled and run automatically as much as possible, maximizing their possible number in any time period. The two important problems to look out for here are making sure that no leftover processes are stealing resources during games and that time lost during the communication of moves is not taken off the clock.

Title: Re: 2012 Computer Championship
Post by rbarreira on Nov 30th, 2011, 9:01am

on 11/30/11 at 08:08:07, aaaa wrote:
The two important problems to look out for here are making sure that no leftover processes are stealing resources during games and that time lost during the communication of moves is not taken off the clock.


Two related questions for Omar:

1- Would it be hard to have a script kill off any remaining bot processes before each scheduled game? As the processes are running on the bot's own accounts, it should be enough to kill all processes on bot accounts.

2- If this were done, would it fix the biggest issue you have with the round robin + FTE format? As jdb pointed out in the chat, the good thing about the round-robin phase is that even if there are problems with any particular game, the other scheduled games can still go on. This is in contrast to the current format which requires immediate manual intervention at each failed game (the result is soon needed for the next round's pairings).

Due to that, I wonder if aaaa's suggestion could be a win-win situation both for Omar and the bot developers. Omar would not need to be on-call for each tournament game, and I know several developers would prefer a tournament format with more games (in order to make the result more representative).
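
To make question 1 concrete, here is a rough sketch of what such a pre-game cleanup could look like. It assumes a Linux server with the standard `pkill` utility and that the script runs with enough privileges to signal the bot accounts; the account names passed in are hypothetical, and this is not how the game server actually does anything today.

```python
import subprocess


def build_kill_commands(accounts):
    """One 'pkill -KILL -u <account>' command per bot account."""
    return [["pkill", "-KILL", "-u", account] for account in accounts]


def kill_leftover_processes(accounts, runner=subprocess.run):
    """Kill every process owned by the given accounts before a game starts."""
    for command in build_kill_commands(accounts):
        # pkill exits with status 1 when nothing matched, which is fine here,
        # so the exit status is deliberately not checked.
        runner(command, check=False)
```

The `runner` parameter exists only so the function can be tried with a stand-in instead of really killing anything, for example `kill_leftover_processes(["bot_example"], runner=lambda cmd, check: print(cmd))`.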

Title: Re: 2012 Computer Championship
Post by omar on Dec 1st, 2011, 10:04pm

on 11/29/11 at 04:50:27, rbarreira wrote:
I was looking at the rules page and saw the following:

But it's also important to recognize the second place, as that bot goes to the Challenge screening together with the winner? So I guess a tie break might be needed if there's a tie for 2nd place.


Yes, you are right. We will have to resolve second place if there is a tie.

Looks like aaaa had already considered how to resolve this, so we can go with that suggestion. I'll update the page to include it.

Title: Re: 2012 Computer Championship
Post by omar on Dec 1st, 2011, 10:20pm

on 11/30/11 at 09:01:41, rbarreira wrote:
Two related questions for Omar:

1- Would it be hard to have a script kill off any remaining bot processes before each scheduled game? As the processes are running on the bot's own accounts, it should be enough to kill all processes on bot accounts.

2- If this were done, would it fix the biggest issue you have with the round robin + FTE format? As jdb pointed out in the chat, the good thing about the round-robin phase is that even if there are problems with any particular game, the other scheduled games can still go on. This is in contrast to the current format which requires immediate manual intervention at each failed game (the result is soon needed for the next round's pairings).

Due to that, I wonder if aaaa's suggestion could be a win-win situation both for Omar and the bot developers. Omar would not need to be on-call for each tournament game, and I know several developers would prefer a tournament format with more games (in order to make the result more representative).


Yes, a script could be set up to kill the processes before a game starts.

But, I would still prefer not to use a round robin. It's not too bad with 4 bots, but if we had 8 it would be way too many games. However, I would like to encourage the bot developers to make use of the tournament management tool to organize their own tournaments outside of the WCC.


Title: Re: 2012 Computer Championship
Post by omar on Dec 1st, 2011, 10:25pm

on 11/28/11 at 18:07:15, Nombril wrote:
Not that I have a bot to enter... so maybe my opinion doesn't count... but it seems that seeding based on last year's performance will penalize the developers that have made the most improvements in their bots.

I know I had similar misgivings about having ratings play any significant role in the human tournament.  It seems the point of a tournament is to establish the best player at the time of the tournament, not crown the player that has already accumulated a high rating.


True. But sometimes I think we worry too much about seeding. The difference between a randomly seeded FTE and a perfectly seeded FTE is not that much in terms of the format's chance of selecting the best player. I had posted the results of such simulations once, but can't seem to find them right now.

Title: Re: 2012 Computer Championship
Post by rbarreira on Dec 3rd, 2011, 10:38am

on 12/01/11 at 22:20:34, omar wrote:
But, I would still prefer not to use a round robin. It's not too bad with 4 bots, but if we had 8 it would be way too many games.


If the length of the tournament is the biggest issue (and I understand why it would be) here's an idea:

What about using Fast (30s per move) games for the round-robin phase, and move on to the regular time control of 2 minutes per move only during the FTE phase which has much fewer games (especially with losses in the round-robin carried forward)?

30s per move games would be four times faster than 2 minutes per move, so it would be possible to pack 12 games into a single day of the tournament. Even with 8 bots, the double round-robin phase could be played in less than 5 days.

Of course this assumes that bot developers are fine with having their bot playing Fast games in the round-robin phase. I for one would not mind it... In the human championship (or at least past ones) there were already different time controls for different phases of the tournament.
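
The arithmetic behind the "less than 5 days" figure, as a quick check. The 12 games per day comes from the paragraph above; playing the games one at a time is an assumption.

```python
def double_round_robin_games(bots):
    """Every ordered pair of distinct bots plays once (each pairing twice)."""
    return bots * (bots - 1)


def days_needed(games, games_per_day):
    """Days of play if games_per_day games fit into each day."""
    return games / games_per_day


games_with_8 = double_round_robin_games(8)   # 8 * 7 = 56 games
days_with_8 = days_needed(games_with_8, 12)  # 56 / 12, a bit under 4.7 days
games_with_4 = double_round_robin_games(4)   # 4 * 3 = 12 games, a single day
```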

Title: Re: 2012 Computer Championship
Post by aaaa on Dec 5th, 2011, 5:27pm
I've changed my mind. Perhaps a uniform tournament structure is better after all, just like with the human championship. Quadruple elimination scales better than the hybrid format; the former will have many more "relevant" games, whereas the latter could waste lots of them on already-doomed weaker bots. If the number of games ever were to become an issue, then one could just easily decree that the number of lives should be a function of the number of participants, just like how it was with the number of rounds in the Swiss Open Classic.

Title: Re: 2012 Computer Championship
Post by rbarreira on Dec 6th, 2011, 11:13am

on 12/05/11 at 17:27:09, aaaa wrote:
I've changed my mind. Perhaps a uniform tournament structure is better after all, just like with the human championship. Quadruple elimination scales better than the hybrid format; the former will have much more "relevant" games, whereas the latter could waste lots of them on already-doomed weaker bots. If the number of games ever were to become an issue, then one could just easily decree that the number of lives should be a function of the number of participants, just like how it was with the number of rounds in the Swiss Open Classic.


I would prefer quadruple elimination (FQE) over FTE too. The top bots seem to be quite close which makes the current format not very discriminating.

To illustrate that point, I fed the 2010 WCC and 2011 WCC games (separately) into BayesElo and it gave a probability of superiority of the winner over the 2nd place of just 67% and 58% respectively. In other words this tournament format is not much better than a coin toss in terms of telling which is better between the two top bots (not to mention a case where three top bots are close enough, where one of them has to be out of the challenge screening).

Title: Re: 2012 Computer Championship
Post by ingwa on Dec 6th, 2011, 11:49am
There was a discussion about this in the chat today, and for what it's worth I also agree that quadruple elimination makes sense.

 -Inge (co-author of Badger)

Title: Re: 2012 Computer Championship
Post by omar on Dec 9th, 2011, 12:27pm
The nice thing about the round robin format is that there is no need for seeding. However, it comes at the cost of many games. Since those extra games don't really improve the performance of the format over FTE it's hard to justify using round robin. Especially since the FTE does a good job even when the seeding is practically random.

I ran some simulations:

Using round robin with 8 players with a true rating distribution of 200, rating inaccuracy of 50 and 0 chance of draw. Result of 1000 tournaments. Measuring the probability of picking the player with the highest true rating.

./run4 'formats/roundRobin' 1000 8 200 50 0
 1   26.1%

With a measured rating inaccuracy of 0. Should not make any difference since roundRobin does not depend on measured ratings:
./run4 'formats/roundRobin' 1000 8 200 0 0
 1   25.5%

With a measured rating inaccuracy of 500. Should not make any difference since roundRobin does not depend on measured ratings:
./run4 'formats/roundRobin' 1000 8 200 500 0
 1   25.2%

Now using double round robin:
./run4 'formats/roundRobinDouble' 1000 8 200 50 0
 1   31.4%

./run4 'formats/roundRobinDouble' 1000 8 200 0 0
 1   31.7%

./run4 'formats/roundRobinDouble' 1000 8 200 500 0
 1   29.8%

Now using FTE:
./run4 'formats/floatTripElim' 1000 8 200 50 0
 1   32.3%

With a measured rating inaccuracy of 0. Perfect seeding.
./run4 'formats/floatTripElim' 1000 8 200 0 0
 1   33.6%

With a measured rating inaccuracy of 500. Almost random seeding.
./run4 'formats/floatTripElim' 1000 8 200 500 0
 1   30.8%


Now using FQE (Quad elimination):
./run4 'formats/floatQuadElim' 1000 8 200 50 0
 1   35.3%

./run4 'formats/floatQuadElim' 1000 8 200 0 0
 1   37.5%

./run4 'formats/floatQuadElim' 1000 8 200 500 0
 1   32.0%


Even though the results are measured to a tenth of a percent they can vary by about 2% from one run to another. So the measured error is at least +-1%.
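
As a rough cross-check on that error estimate, the binomial standard error of a success rate measured over 1000 tournaments can be computed directly. This is ordinary sampling arithmetic and not something taken from the simulator; 0.323 is just the FTE figure above used as an example.

```python
import math


def standard_error(rate, trials):
    """Standard error of a proportion estimated from independent trials."""
    return math.sqrt(rate * (1.0 - rate) / trials)


single_run = standard_error(0.323, 1000)        # roughly 0.015, i.e. 1.5 points
between_two_runs = math.sqrt(2.0) * single_run  # roughly 0.02, i.e. 2 points
```

So a swing of about 2% between two runs of 1000 tournaments is what sampling noise alone would produce, and differences of a point or two between formats should be read with that in mind.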

Even with bad seeding FTE performs almost as well as double round robin. FQE performs slightly better. Certainly FQE could be justified since it does improve performance with minimal additional rounds (about 11.2 rounds). But it is really cutting it close since I would have liked the number of rounds to be 10 or less. However, I am willing to go with FQE next year. Will also use it this year if all the participating bot developers agree.

Just for fun I tried FPE (Penta elimination) and it is still slightly better than FQE.

./run4 'formats/floatPentElim' 1000 8 200 50 0
 1   35.6%

./run4 'formats/floatPentElim' 1000 8 200 0 0
 1   37.9%

./run4 'formats/floatPentElim' 1000 8 200 500 0
 1   33.4%

If you want to try your own experiments, the simulator can be downloaded from here:

http://arimaa.com/arimaa/tourn/compare/
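
For readers who just want the flavor of the experiment without downloading anything, here is a toy Monte Carlo for the single round robin case. It is not the code behind ./run4 and should not be expected to reproduce the percentages above. The assumptions are mine: true ratings are drawn from a normal distribution whose standard deviation is the "200" parameter, game results follow the standard Elo logistic formula, and ties for first place are broken at random.

```python
import random


def win_probability(rating_a, rating_b):
    """Standard Elo expectation that A beats B (no draws)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def round_robin_winner(ratings, rng):
    """Play every pairing once; return the index of the winner."""
    wins = [0] * len(ratings)
    for i in range(len(ratings)):
        for j in range(i + 1, len(ratings)):
            if rng.random() < win_probability(ratings[i], ratings[j]):
                wins[i] += 1
            else:
                wins[j] += 1
    best = max(wins)
    # Assumption: ties for first are broken by a random draw.
    return rng.choice([i for i, w in enumerate(wins) if w == best])


def top_player_success_rate(trials, players, spread, seed=0):
    """Fraction of tournaments won by the player with the highest true rating."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        ratings = [rng.gauss(0.0, spread) for _ in range(players)]
        strongest = max(range(players), key=ratings.__getitem__)
        if round_robin_winner(ratings, rng) == strongest:
            hits += 1
    return hits / trials


rate = top_player_success_rate(trials=1000, players=8, spread=200.0)
```

Measured ratings are left out on purpose, since a round robin needs no seeding; an elimination format would need a second, noisy copy of the ratings to seed from.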



Title: Re: 2012 Computer Championship
Post by rbarreira on Dec 9th, 2011, 2:02pm
Thanks for all the simulations Omar and for offering to make the tournament quadruple elimination (depending on developers' votes)!

According to your results it seems perfect seeding also has a noticeable impact vs random seeding, almost as much as adding another round.


on 12/09/11 at 12:27:04, omar wrote:
However, I am willing to go with FQE next year. Will also use it this year if all the participating bot developers agree.


So it seems that's two votes already (me and ingwa, although he hasn't registered bot_badger yet). Three participating developers to go (if no one else registers).

Title: Re: 2012 Computer Championship
Post by jdb on Dec 9th, 2011, 6:07pm
I support quad. elimination format.

Title: Re: 2012 Computer Championship
Post by lightvector on Dec 9th, 2011, 7:41pm
FQE sounds good to me.


on 12/06/11 at 11:13:16, rbarreira wrote:
I fed the 2010 WCC and 2011 WCC games (separately) into BayesElo and it gave a probability of superiority of the winner over the 2nd place of just 67% and 58% respectively.


It always amazes me how many games it takes to get statistical significance one way or the other.
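
A back-of-the-envelope way to see this, which is a much simpler model than the one BayesElo uses: treat the games between two bots as coin flips with an unknown bias, start from a flat prior on that bias, and ask how likely it is that the bot with the better score really is the stronger one. For whole-number scores the answer reduces to a binomial sum. The function name is made up for this sketch.

```python
from math import comb


def probability_stronger(wins, losses):
    """P(true win rate > 0.5) given the score, under a uniform prior.

    The posterior is Beta(wins + 1, losses + 1); for integer parameters its
    upper tail at 0.5 equals P(Binomial(wins + losses + 1, 0.5) <= wins).
    """
    n = wins + losses + 1
    return sum(comb(n, k) for k in range(wins + 1)) / 2 ** n


close_match = probability_stronger(3, 2)
long_match = probability_stronger(30, 20)
```

Under this model a 3-2 score gives only about a two-in-three chance that the winner is actually better, and even 30-20 stays short of 95%.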

Title: Re: 2012 Computer Championship
Post by tize on Dec 10th, 2011, 1:48pm
I'm also willing to accept a change to FQE.

Title: Re: 2012 Computer Championship
Post by Fritzlein on Dec 10th, 2011, 2:16pm
Omar, thanks for being willing to add a fourth elimination.  That will be great for the participants.  Even though the games are a bit slow-paced for spectators, I think the additional games will please the audience as well.

Each of the last four years, the first-place bot lost between zero and two times to the second place bot, but not to any other bot, while the second-place bot lost three times to the winner and zero times to anyone else.  That made it relatively clear which bots were the top two and therefore should advance to qualifying, but there was presumably still some luck involved.  I'm going to go out on a limb to predict that with four lives for each bot, the situation won't repeat.  At least one of the top two bots (presumably sharp and marwin) will lose a game to a bot that doesn't finish in the top two.

Title: Re: 2012 Computer Championship
Post by omar on Dec 13th, 2011, 11:32pm
OK, quad elimination it is.

Seeding will still be based on games against the benchmark bots, as opposed to previous years' results.

Any suggestions for the benchmark bots?

Title: Re: 2012 Computer Championship
Post by rbarreira on Dec 14th, 2011, 2:25am

on 12/13/11 at 23:32:18, omar wrote:
Any suggestions for the benchmark bots?


I suggest removing Arimaazilla and Aamira2006P2. That still leaves Gnobot2005Blitz to differentiate between weak potential entrants that can't beat any of the better bots.

Title: Re: 2012 Computer Championship
Post by omar on Dec 17th, 2011, 9:30pm

on 12/14/11 at 02:25:55, rbarreira wrote:
I suggest removing Arimaazilla and Aamira2006P2. That still leaves Gnobot2005Blitz to differentiate between weak potential entrants that can't beat any of the better bots.


That would reduce the number of bots and make the qualifying phase a bit less work. I am OK with this. If there are no objections raised soon, I'll update the WCC page.

Title: Re: 2012 Computer Championship
Post by tize on Dec 20th, 2011, 3:11pm
I have no objection to reducing the number of qualification bots. Actually I wouldn't object to using the bot's name sum (a=1,b=2...) modulo some nice number, as long as no more bots enter the tournament. :o

That is because the test simulations that I ran here at home indicate that the winner (in a tight 4 bot tournament) will be picked 49-53 percent of the time as the seeding goes from random to perfect.

Title: Re: 2012 Computer Championship
Post by aaaa on Dec 23rd, 2011, 2:15pm
Will there be any plans to prevent games from being affected by undue time loss, maybe by means of a verification script parsing the game log?

Title: Re: 2012 Computer Championship
Post by omar on Dec 28th, 2011, 11:46am

on 12/23/11 at 14:15:14, aaaa wrote:
Will there be any plans to prevent games from being affected by undue time loss, maybe by means of a verification script parsing the game log?


If a game times out, we will have humans look at the logs and determine what might have caused the problem, present it to the TD and go based on that. If it is a case covered in the Technical Problems section of the rules then a decision by the TD is not required and it can be resolved as directed in that section. I don't plan on writing a script to parse the game logs.

Title: Re: 2012 Computer Championship
Post by aaaa on Dec 28th, 2011, 2:09pm
The problem then is, that it is, perversely enough, exactly those bots that are programmed to recognize and take into account unfairly lost time, that end up suffering for it in silence.

Title: Re: 2012 Computer Championship
Post by omar on Dec 28th, 2011, 2:42pm

on 12/28/11 at 14:09:05, aaaa wrote:
The problem then is, that it is, perversely enough, exactly those bots that are programmed to recognize and take into account unfairly lost time, that end up suffering for it in silence.


Can you elaborate on this?

Title: Re: 2012 Computer Championship
Post by aaaa on Dec 28th, 2011, 3:27pm
In the gamestate file, the wused/bused parameter for the side to move will show how much time has been deducted from the clock already as the communication comes through. My bot, for one, takes this number into account in its time management.
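
A minimal sketch of what "taking this number into account" could mean in practice. Only the wused/bused idea comes from the gamestate file; the function, its parameters and the default fractions are invented for illustration and are not the logic of any particular bot.

```python
def search_budget(move_seconds, reserve_seconds, used_seconds,
                  reserve_fraction=0.1, safety_margin=2.0):
    """Seconds the bot may still spend thinking on the current move.

    used_seconds is the wused/bused value for the side to move, i.e. time
    the server has already charged before the position reached the bot.
    """
    # Normal plan: the per-move time plus a slice of the reserve.
    target = move_seconds + reserve_fraction * reserve_seconds - used_seconds
    # Hard ceiling: everything on the clock minus a margin for sending the move.
    ceiling = move_seconds + reserve_seconds - used_seconds - safety_margin
    return max(0.0, min(target, ceiling))
```

With 2 minutes per move and a 6 minute reserve, `search_budget(120, 360, 0)` allows 156 seconds, while the same call with 30 seconds already charged allows 126, so the lost time is absorbed quietly instead of showing up as a timeout.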

Title: Re: 2012 Computer Championship
Post by Janzert on Dec 28th, 2011, 6:29pm

on 12/28/11 at 14:09:05, aaaa wrote:
The problem then is, that it is, perversely enough, exactly those bots that are programmed to recognize and take into account unfairly lost time, that end up suffering for it in silence.


If I'm remembering events correctly, this isn't just a theoretical concern either. I'm pretty sure I found in reviewing logs from OpFor that it had avoided a timeout by taking into account the lost time. While during the same tournament a few games later another bot did time out because of it and ended up replaying the game. I remember debating at the time modifying OpFor to time out if it detected the situation arising but deciding it was rather more in the spirit of good sportsmanship to keep the current behavior.

Title: Re: 2012 Computer Championship
Post by tize on Dec 29th, 2011, 10:28am
Before the last WCC I was in favor of having the bots support the (w/b)used variables (even though marwin has always ignored them). But after the marwin vs sharp game, where sharp lost on time because it did not look at the bused variable, I'm more in favor of restarting the games. In that game sharp would have lost 2 minutes by looking at bused (the same position was sent to it twice) and the problem would have gone unnoticed.
(Is this problem fixed in AEI, so that if the same position is sent twice the same move is sent again directly?)

Also, by looking at the (w/b)used variables only half of the problem can be "solved", because all delays from the bot to the server will still tick down on the bot's timer.

For casual games it would still be preferable if the bots didn't ignore (w/b)used, so as to minimize the timeouts.

Title: Re: 2012 Computer Championship
Post by rbarreira on Dec 29th, 2011, 5:11pm
The problem with not taking W/Bused into account is that if a bot crashes half-way through a search, it won't be aware of the used time when AEI restarts it. (not sure if something similar is the case for the older bot interface)

Title: Re: 2012 Computer Championship
Post by tize on Dec 30th, 2011, 1:41am
I'm not sure but I think that if marwin crashes an empty move is sent to the server. (I'm using the old interface.)

Title: Re: 2012 Computer Championship
Post by tize on Jan 2nd, 2012, 6:21am
So the registration period has ended with 6 bots registered, let's all welcome the new bot lucy.

Title: Re: 2012 Computer Championship
Post by Hippo on Jan 2nd, 2012, 12:00pm

on 01/02/12 at 06:21:54, tize wrote:
So the registration period has ended with 6 bots registered, let's all welcome the new bot lucy.


Hmm, this is the third year I miss bot_quad in the championship.

Title: Re: 2012 Computer Championship
Post by rbarreira on Jan 3rd, 2012, 4:43am

on 01/02/12 at 06:21:54, tize wrote:
So the registration period has ended with 6 bots registered, let's all welcome the new bot lucy.


Indeed, welcome to the new bot :)

It's nice to see more bots registered this year than last year (not counting Bomb last year since it was not registered by its author).

Title: Re: 2012 Computer Championship
Post by Swynndla on Jan 3rd, 2012, 4:27pm

on 01/02/12 at 06:21:54, tize wrote:
...let's all welcome the new bot lucy.

Thanks! :)   I knew that arimaa bot development was difficult, but I didn't appreciate just how d**n hard it was until now ... kudos to all you developers!

Title: Re: 2012 Computer Championship
Post by omar on Jan 6th, 2012, 7:17pm

on 12/28/11 at 15:27:02, aaaa wrote:
In the gamestate file, the wused/bused parameter for the side to move will show how much time has been deducted from the clock already as the communication comes through. My bot, for one, takes this number into account in its time management.


When things are working fine, the time a bot has when its move starts is given by
tcmove+tc?reserve (where ? is w or b). However, if there is a problem and the bot is restarted in the middle of its move, it will most likely assume that the time it has to make the move is still given by
tcmove+tc?reserve. In fact the bot has lost some time due to the restart, and it will time out. So the bots do need to use the ?used parameter to see how much time has already been used up. The time remaining should be:
tcmove+tc?reserve-?used
But this could be off by a few seconds due to network delays in receiving and sending the move, so it's good to deduct about 10 seconds from this.
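
A minimal sketch of this calculation, assuming the per-move time, the reserve and the used time have already been read from the game state as whole seconds. The function name `remaining_time` and its optional margin argument are illustrative only and not part of any bot interface:

```shell
#!/bin/bash

# remaining_time TCMOVE RESERVE USED [MARGIN]
# Prints the seconds a bot can still safely spend on the current move:
#   tcmove + reserve - used - margin   (never below 0)
# MARGIN covers network delay and defaults to 10 seconds (hypothetical default).
remaining_time() {
    local tcmove=$1 reserve=$2 used=$3 margin=${4:-10}
    local left=$(( tcmove + reserve - used - margin ))
    if (( left < 0 )); then
        left=0
    fi
    echo "$left"
}

# Example: 2 minutes per move, 6 minutes in reserve, 45 seconds already used
remaining_time 120 360 45   # prints 425
```

Clamping at zero keeps a badly delayed restart from producing a negative budget; what the bot does with a zero budget (for example, play its best move found so far) is left to the bot.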


Title: Re: 2012 Computer Championship
Post by aaaa on Jan 8th, 2012, 5:58am
But you still haven't addressed the potential problem that a bot could unfairly lose because on one or more moves it got less thinking time than it was entitled to and that this is missed because the game didn't end in a timeout. Hence my suggestion that each game is logged in some way that such a thing can at least be discovered after the fact.

Title: Re: 2012 Computer Championship
Post by omar on Jan 9th, 2012, 10:07am

on 01/08/12 at 05:58:33, aaaa wrote:
But you still haven't addressed the potential problem that a bot could unfairly lose because on one or more moves it got less thinking time than it was entitled to and that this is missed because the game didn't end in a timeout. Hence my suggestion that each game is logged in some way that such a thing can at least be discovered after the fact.


If you go to the game's comments page and view the frame source there is a log of how the game appeared to the server.

Title: Re: 2012 Computer Championship
Post by aaaa on Jan 9th, 2012, 1:07pm

on 01/09/12 at 10:07:09, omar wrote:
If you go to the game's comments page and view the frame source there is a log of how the game appeared to the server.

From the looks of it, that isn't enough, and that's not surprising; unless it's communicated back to the server, only the client knows how much time has passed between the resumption of its player's clock and its reception of the signal to move.

Title: Re: 2012 Computer Championship
Post by omar on Jan 11th, 2012, 12:25am

on 01/09/12 at 13:07:00, aaaa wrote:
From the looks of it, that isn't enough, and that's not surprising; unless it's communicated back to the server, only the client knows how much time has passed between the resumption of its player's clock and its reception of the signal to move.


There are also the net log files which the bots keep. These have timestamps of when the bot received the game state and when the bot sent the move.


Title: Re: 2012 Computer Championship
Post by rbarreira on Jan 11th, 2012, 2:04pm
The qualification page is showing bot_OpFor as a participating bot.

Title: Re: 2012 Computer Championship
Post by omar on Jan 13th, 2012, 8:03am
Oops. I'll fix that.

Title: Re: 2012 Computer Championship
Post by lightvector on Jan 16th, 2012, 10:34pm

on 01/11/12 at 00:25:36, omar wrote:
There are also the net log files which the bots keep. These have timestamps of when the bot received the game state and when the bot sent the move.


For what it's worth, sharp now looks at the wused and bused fields and takes them into account faithfully but will also output a "g<gameid>.time" file logging a warning whenever they are larger than 10 seconds.
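
A rough sketch of that kind of check, not sharp's actual code. The function name, the message wording and the directory argument are assumptions; only the `g<gameid>.time` file name and the 10-second threshold come from the description above:

```shell
#!/bin/bash

# log_lost_time GAMEID USED [LOGDIR]
# Appends a warning to LOGDIR/g<GAMEID>.time when USED (seconds already
# deducted from the clock when the position arrived) is larger than 10.
# LOGDIR defaults to the current directory (hypothetical choice).
log_lost_time() {
    local gameid=$1 used=$2 logdir=${3:-.}
    if (( used > 10 )); then
        echo "warning: ${used}s already used on arrival" >> "${logdir}/g${gameid}.time"
    fi
}
```

Because the warning is appended rather than overwritten, a game with several delayed moves leaves one line per incident, which makes lost time visible after the fact even if the game never ended in a timeout.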

Title: Re: 2012 Computer Championship
Post by omar on Jan 26th, 2012, 10:37pm
Just a quick reminder to the bot developers about trying to finish off as many of the qualifying games as possible. The qualifying phase ends at the end of January.

Title: Re: 2012 Computer Championship
Post by Fritzlein on Jan 31st, 2012, 9:04pm

on 01/26/12 at 22:37:48, omar wrote:
Just a quick reminder to the bot developers about trying to finish off as many of the qualifying games as possible. The qualifying phase ends at the end of January.

So the seeding is now fixed?

Title: Re: 2012 Computer Championship
Post by omar on Feb 1st, 2012, 12:07am
Yes,

bot_sharp score=24 losses=-1 time=1327983155
bot_marwin score=24 losses=-2 time=1327994583
bot_briareus score=24 losses=-9 time=1327884842
bot_Lucy score=8 losses=-10 time=1328039673
bot_clueless score=1 losses=-0 time=1327939133
bot_Badger2 score=0 losses=-0 time=0

Title: Re: 2012 Computer Championship
Post by tize on Feb 6th, 2012, 3:27pm
I see that the hardware for the tournament has been decided, and according to some benchmarks that I found it should be almost 50% faster than last year. Nice!

It also has the popcnt instruction (which I believe the gameroom server lacks). This instruction makes marwin about 10% faster if compiled to use it on my computer. The thing is that if I compile marwin to use the instruction it will not run on the gameroom server after the tournament.

So my question is this: Am I allowed to compile marwin for popcnt usage?

I could of course also provide a version that doesn't demand the popcnt instruction.

Title: Re: 2012 Computer Championship
Post by Hippo on Feb 6th, 2012, 3:52pm
I expect using it will be OK, especially if you test for the instruction's availability and choose "emulation mode" instead if it is missing.
Yes, a popcount routine is needed often.
(I use 2 variants ... lowpopcount when expecting at most around 3 bits set and popcount when all 64 bits are possible. Using a single instruction would make it much easier).
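
For readers unfamiliar with the sparse variant, here is a small illustration, assuming that "lowpopcount" means the common clear-the-lowest-set-bit loop (an assumption; the actual routine is not shown in this thread). It is written in shell only to show the idea; a bot would do this in its own language:

```shell
#!/bin/bash

# low_popcount X
# Counts set bits by repeatedly clearing the lowest one (x & (x - 1)).
# The loop runs once per set bit, so it is cheap when few bits are set.
low_popcount() {
    local x=$1 n=0
    while (( x != 0 )); do
        x=$(( x & (x - 1) ))
        n=$(( n + 1 ))
    done
    echo "$n"
}

# A 64-bit board mask with bits 0, 11 and 40 set
low_popcount 0x0000010000000801   # prints 3
```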

Title: Re: 2012 Computer Championship
Post by rbarreira on Feb 6th, 2012, 4:44pm
IMO the bots should be allowed to use the power of the hardware to the fullest.

Compiling two versions and mentioning that in the README of the bot is probably the best way of making it easy for Omar to set up the gameroom version of the bot later on.

Title: Re: 2012 Computer Championship
Post by aaaa on Feb 6th, 2012, 5:09pm
It doesn't say that much, but here is an earlier thread (http://arimaa.com/arimaa/forum/cgi/YaBB.cgi?board=events;action=display;num=1264886467) on the issue of how hardware-specific binaries should be allowed to be.

Title: Re: 2012 Computer Championship
Post by tize on Feb 7th, 2012, 12:50am

on 02/06/12 at 16:44:32, rbarreira wrote:
Compiling two versions and mentioning that in the README of the bot is probably the best way of making it easy for Omar to set up the gameroom version of the bot later on.


If Omar doesn't object to it, this is what I will do for marwin this year.

Title: Re: 2012 Computer Championship
Post by omar on Feb 8th, 2012, 2:51pm

on 02/06/12 at 15:27:59, tize wrote:
I see that the hardware for the tournament has been decided, and according to some benchmarks that I found it should be almost 50% faster than last year. Nice!

It also has the popcnt instruction (which I believe the gameroom server lacks). This instruction makes marwin about 10% faster if compiled to use it on my computer. The thing is that if I compile marwin to use the instruction it will not run on the gameroom server after the tournament.

So my question is this: Am I allowed to compile marwin for popcnt usage?

I could of course also provide a version that doesn't demand the popcnt instruction.


Interesting. Maybe CPUs are moving back towards complex instruction sets now.


on 01/31/10 at 05:59:10, omar wrote:
Well the intent of the Arimaa challenge is to encourage algorithmic advances as opposed to hardware advances,  so I wouldn't want to encourage the bot developers to tightly couple their code to the current hardware configuration. If you use the standard libraries provided by the OS and language you shouldn't run into any problems with the executable being tied to the hardware configuration.


I guess in this case the compiler provides an option to use or not use the instruction; so it's not covered by what I said earlier.

Again the intent is that the challenge should encourage algorithmic advances as opposed to building specialized hardware to play Arimaa. However, advances will continue to happen in general purpose hardware; that is to be expected. Another intent is to be able to bring the advances to the masses as opposed to just a demonstration which isn't accessible to others.

Imagine if you were going to provide your bot executable to many people. You would have a download for Windows, Mac and Linux. Within each platform you might have one for 32-bit and another for 64-bit. You would not have a separate executable for each kind of end-user hardware. If the end user's computer had 4 cores instead of a single core, or 8 GB RAM instead of 4 GB, you could check for it in the code and use it that way. If your code could run faster on an Intel than on an AMD due to some instruction it provided, you should check for it in code and use it that way.

So if it can be done within one executable, I would be OK with your bot using this instruction. Perhaps in a few years all computers will have this instruction and your code won't even need to check for it. Your target platform though is a 64-bit Linux system. You shouldn't assume any specific processor, RAM or number of cores. However, you can detect and use whatever is available.

Now it is quite possible that a bot which wins the challenge could only win it on the hardware being used for the challenge match and not on a machine with anything less. I am OK with that since the hardware is chosen to be representative of a current commodity computer.


Title: Re: 2012 Computer Championship
Post by Swynndla on Feb 8th, 2012, 4:39pm
Omar - I don't know how to detect if hardware popcount is supported by the CPU or not, and I just assumed that we would be allowed to use popcount since it's on the CPU provided, and so I've used *heaps* of popcounts in Lucy for that reason.  If in doubt, I've popcounted it.  I may have nested popcounts too.  As far as I can see, the hardware popcount *isn't* something that one executable looks to see if it's available after running, but rather it has to be compiled in!  So I'm not exactly sure what you're saying ... but are you saying that if I don't know how to detect hardware popcount, then I have to *disable* hardware popcount when I compile Lucy?  :-[ :'(

Edit: also I'm a bit confused, as other hardware instructions like "ctz" have been a hardware instruction in cpu's for years, and we aren't expected to detect if that's available (as it's available in the gameroom and everywhere), right? ... how far does this go?  Do we have to detect if the cpu is a 386 too?  Maybe I've misunderstood what you've said and it's ok to compile Lucy to use the cpu provided??

Edit2: My (AMD) cpu at home is a couple of years old or so, and it wasn't the latest cpu when I got it (it was a fairly budget one), and it supports hardware popcount ... so that's another reason why I thought it'd be pretty much standard to use in the comp.  Not only that, but gcc will turn on popcount by default for my cpu type.

Title: Re: 2012 Computer Championship
Post by omar on Feb 8th, 2012, 10:54pm
The account provided is really just to upload and test your program. The target platform your program needs to work on is Linux 64 bit. Beyond that you can't presume things about the hardware. Your program is free to detect whatever it can about the hardware and make use of it.

I am sure there is a way to detect if the CPU provides popcnt from C. Have you checked cpufeature.h? Or your program can read the /proc/cpuinfo file and check the 'cpu flags' section. If none of these work, you can also have your bot read a config file which sets a flag to indicate if popcnt is available.

At the very least what you can do is create two executables and exec the appropriate one via the main executable that detects what's available.
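
A minimal sketch combining two of these options, under stated assumptions: the config file format (a line `popcnt=0` or `popcnt=1`) and the function name are made up for illustration, and the function takes the cpuinfo path as an argument so it can be tried against a saved copy of the file:

```shell
#!/bin/bash

# has_popcnt [CPUINFO] [CONFIG]
# Succeeds (exit status 0) if popcnt may be used.
# A line "popcnt=1" or "popcnt=0" in CONFIG (hypothetical format), if present,
# overrides detection; otherwise the first "flags" line of CPUINFO is
# searched for the whole word "popcnt".
has_popcnt() {
    local cpuinfo=${1:-/proc/cpuinfo} config=${2:-}
    if [ -n "$config" ] && [ -r "$config" ]; then
        if grep -q '^popcnt=1$' "$config"; then return 0; fi
        if grep -q '^popcnt=0$' "$config"; then return 1; fi
    fi
    grep -m1 '^flags' "$cpuinfo" | grep -qw 'popcnt'
}
```

A launcher could then branch on `if has_popcnt; then ... fi` and exec whichever of the two binaries applies.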


Title: Re: 2012 Computer Championship
Post by Swynndla on Feb 9th, 2012, 2:17am
Thank you Omar for the clarification and for the pointers!  I'm sure I'll be able to get something going.  Cheers!  :)

Title: Re: 2012 Computer Championship
Post by tize on Feb 9th, 2012, 3:06pm
Ok thanks Omar, now I know what to do if I'd like marwin to be able to use the special instructions, maybe I have time maybe not.

Title: Re: 2012 Computer Championship
Post by tize on Feb 9th, 2012, 5:08pm

on 02/08/12 at 22:54:56, omar wrote:
At the very least what you can do is create two executables and exec the appropriate one via the main executable that detects what's available.


After going away from the computer I realized that this probably would allow something like this:

Code:
#!/bin/bash

## Count the /proc/cpuinfo lines that list the popcnt flag
POPCNT=$(grep -c -w 'popcnt' /proc/cpuinfo)

## Check for popcnt support in the CPU
if [ "$POPCNT" -gt 0 ]
then
     echo "popcnt supported" 1>&2
     PROG="./marwin_popcnt"
else
     echo "popcnt not supported" 1>&2
     PROG="./marwin_no_popcnt"
fi

## Replace this script with the appropriate binary, passing all arguments through
exec "$PROG" "$@"

If that's the case then it's just a matter of providing two binaries to go along with the script.

Title: Re: 2012 Computer Championship
Post by Swynndla on Feb 10th, 2012, 2:18am
Thank you tize!  :)

Title: Re: 2012 Computer Championship
Post by rbarreira on Feb 23rd, 2012, 2:30am
Did all developers set up their bots? I haven't seen ingwa online in a while...

Title: Re: 2012 Computer Championship
Post by omar on Feb 25th, 2012, 11:56pm
bot_badger2 was not submitted. I did not hear anything from Inge.

Title: Re: 2012 Computer Championship
Post by chessandgo on Mar 3rd, 2012, 7:16am
The bot games seem to be terribly exciting. Do you think it would be a good idea to commentate one of the WCC games between two of the leading bots if such a game occurs at a spectator-friendly time?

Title: Re: 2012 Computer Championship
Post by omar on Mar 3rd, 2012, 9:49am

on 03/03/12 at 07:16:01, chessandgo wrote:
The bot games seem to be terribly exciting. Do you think it would be a good idea to commentate one of the WCC games between two of the leading bots if such a game occurs at a spectator-friendly time?


That would be great. Just post here or send me a message if you find a game that suits your time. It would be great if the bot developers could join in as well.

Title: Re: 2012 Computer Championship
Post by chessandgo on Mar 3rd, 2012, 12:03pm
Do you schedule the WCC games at a fixed time? I hope that my schedule will be completely flexible next week, so I should be able to assist Greg for Monday's WC game if he's looking for a co-commentator, and hopefully a WCC game as well... What would be a good time for spectators (if you can pick a spectator-friendly slot for a game involving two of marwin, sharp and briareus for example)?

EDIT: oh my bad, Monday's game is at 3am my time, I thought it was much earlier. Was it rescheduled?

Title: Re: 2012 Computer Championship
Post by omar on Mar 3rd, 2012, 4:50pm
I usually schedule the WCC games at a time that is convenient for me so that I can keep an eye on the games in case of a problem. The times are usually 15:00 GMT and 19:00 GMT on weekdays. If you let me know what time is convenient for you, I can set it up for that.

I don't think the time for the hanzack - Nombril game changed.

Title: Re: 2012 Computer Championship
Post by Eltripas on Mar 3rd, 2012, 7:36pm

on 03/03/12 at 16:50:33, omar wrote:
I usually schedule the WCC games at a time that is convenient for me so that I can keep an eye on the games in case of a problem. The times are usually 15:00 GMT and 19:00 GMT on weekdays. If you let me know what time is convenient for you, I can set it up for that.

I don't think the time for the hanzack - Nombril game changed.


It was 17 hours later when it was scheduled.

Title: Re: 2012 Computer Championship
Post by Nombril on Mar 3rd, 2012, 11:09pm
hanzack-Nombril has not moved.  But the spreadsheet and Google Calendar were originally showing the game 12 hours earlier than the gameroom time.

Title: Re: 2012 Computer Championship
Post by Eltripas on Mar 3rd, 2012, 11:28pm

on 03/03/12 at 23:09:28, Nombril wrote:
hanzack-Nombril has not moved.  But the spreadsheet and Google Calendar were originally showing the game 12 hours earlier than the gameroom time.


I guess I must have confused it with a WCC game or something.

Title: Re: 2012 Computer Championship
Post by chessandgo on Mar 4th, 2012, 6:03am
I must have confused it with another game as well.

I could commentate on the marwin-sharp game, if you can set up the radio relay, Omar! I'm very much looking for co-commentators, be they bot programmers or not. I'll have a go at recording as well.

Title: Re: 2012 Computer Championship
Post by aaaa on Mar 4th, 2012, 6:53am
I wouldn't commit to commentating any of the CC games until the bug that causes the game client to miss moves until refreshed, which has been plaguing all of these games so far, has been fixed.

Title: Re: 2012 Computer Championship
Post by rbarreira on Mar 5th, 2012, 11:18am
It appears that the bug is fixed, at least it's not happening in the current WCC game.

Title: Re: 2012 Computer Championship
Post by Fritzlein on Mar 7th, 2012, 5:28pm

on 12/10/11 at 14:16:50, Fritzlein wrote:
Each of the last four years, the first-place bot lost between zero and two times to the second place bot, but not to any other bot, while the second-place bot lost three times to the winner and zero times to anyone else.  That made it relatively clear which bots were the top two and therefore should advance to qualifying, but there was presumably still some luck involved.  I'm going to go out on a limb to predict that with four lives for each bot, the situation won't repeat.  At least one of the top two bots (presumably sharp and marwin) will lose a game to a bot that doesn't finish in the top two.

If sharp wins another game before getting eliminated, then my prediction about the lack of clear dominance will have come true.  I think the odds are in my favor.  But my other prediction (that sharp would repeat as champion) is looking like a long shot now that sharp is in third place.

Title: Re: 2012 Computer Championship
Post by aaaa on Mar 8th, 2012, 11:49am

on 12/10/11 at 14:16:50, Fritzlein wrote:
Each of the last four years, the first-place bot lost between zero and two times to the second place bot, but not to any other bot

In 2008, eventual champion Bomb's only loss was to third-place finisher OpFor.

Title: Re: 2012 Computer Championship
Post by Fritzlein on Mar 8th, 2012, 2:19pm

on 03/08/12 at 11:49:23, aaaa wrote:
In 2008, eventual champion Bomb's only loss was to third-place finisher OpFor.

Oops, thanks for the correction.  There has indeed been some ambiguity between second and third place in the past.

Title: Re: 2012 Computer Championship
Post by Fritzlein on Mar 10th, 2012, 5:17pm
28 peak listeners :o for chessandgo and Swynndla commentating briareus's victory over marwin to set up a grand finale.  That's a great sign for the community; I remember computer championship games in the past where there weren't even two people chatting in the chat room.

Title: Re: 2012 Computer Championship
Post by Fritzlein on Mar 11th, 2012, 3:10pm
Holy cow, 49 listeners for marwin's victory to claim the championship!  This is not only a record for Computer Championship evaluation, it ties the record of 49 set in the World Championship this year.

Title: Re: 2012 Computer Championship
Post by omar on Mar 12th, 2012, 12:21am
I had sent out an email shortly before the game to give an update on the status of the world championship events. So maybe that helped. Hope we get a turnout like that for tomorrow's game.

Title: Re: 2012 Computer Championship
Post by jdb on Mar 12th, 2012, 9:30am
Congratulations to all the other bot developers. By far the strongest WCC ever.

Title: Re: 2012 Computer Championship
Post by tize on Mar 24th, 2012, 4:14am
Yes, it was a very good WCC we had. (When fortune smiles on you, you can't think otherwise...) And I'm already starting to look forward to next year; hopefully the bots will keep learning things this year.

I have now had time to look at the recorded games and would like to thank everybody who made those possible. It's much more fun to watch the games with commentators, thank you, and I got a lot of interesting pointers to where the problems are in marwin's play.


