Arimaa Forum (http://arimaa.com/arimaa/forum/cgi/YaBB.cgi)
(Message started by: Fritzlein on Nov 11th, 2005, 5:44pm)

Title: 2007 qualifying for computer championship
Post by Fritzlein on Nov 11th, 2005, 5:44pm
Omar,

It's probably too late to change the rules for how bots qualify for the 2006 championships, but for next year I think a different qualifying process would be in order.  Under the current rules bots have to expose themselves online for 40 days.  Not only is that a long time for the developer to dedicate a special-purpose machine, it also is a long time for opponents to figure out exactly how to exploit any weaknesses.

Now I'm all in favor of humans having a fair shot at learning something about a bot if that bot is going to be playing for the Arimaa Challenge.  But that's different from letting all the other bot developers know everything about your bot.  If I were an unscrupulous bot programmer, I would test out winning lines against specific bots that aren't random enough.  I could have my bot think on its own if it got to a unique position, but if it were in a known position my bot would play a known winning line I tested out during those 40 days.

This vulnerability means that bot developers have every incentive to make changes at the last minute, so their bot doesn't play the same way in the championship as it did while on-line.  You don't forbid developers from changing the bot during the 40 days (on the contrary, I expect you would welcome continuing development) so it isn't against the rules to make last-minute changes.  And the incentive is all towards making any adjustments as late as possible.

The consequence of rules that encourage last-minute changes is to undermine the original point of the 40 days on-line.  Humans could get familiar with a bot, and then be shocked to discover the same bot playing differently in the challenge match.  I think the qualifying rules should totally change, so that there is no incentive for a bot to play weakly or differently in the run-up to the computer championship.

My idea is this: Let anyone who wants to participate and who has a working bot submit the code by December 31, 2006.  Make those bots available for on-line play for two weeks before the tournament, but  after the code is finalized.  So for the first two weeks of 2007, humans would have a shot at Bomb2007CC, Clueless2007CC, etc.  That would give the bots some ratings to seed them into the computer championship, which would then start on January 15.   Also, it would preserve fairness between the bot developers, because no developer would have to put up their bot for the others to scope out ahead of time.  Finally, it would make the challenge defense fair, because the humans would have a bit of practice against the bot they will actually be playing in the Challenge, as opposed to a bot that might be modified later.

Actually, in my opinion, two weeks of practice against all the bots is way more than necessary.  The humans couldn't be taken by surprise even if only the three humans defending the challenge match had the chance to practice, and even if they only got three practice games each.  But I'm erring on the side of caution here: maybe in some future year that won't seem like enough time.

I want to stress that I don't think any current developers are intentionally putting up brain-dead versions of their bots.  I'm sure Bomb's present weakness has to do with a registration code or some such, and I'm sure Clueless' present weakness has to do with new features that aren't properly tuned yet.  I'm just saying that we should totally remove the current incentive for developers to make their bots play weakly on purpose.

Title: Re: 2007 qualifying for computer championship
Post by 99of9 on Nov 11th, 2005, 6:27pm
I support this, it sounds like a nice way of preserving the mystery between bot developers, but allowing reasonable human experimentation.

Title: Re: 2007 qualifying for computer championship
Post by Ryan_Cable on Nov 12th, 2005, 4:03am
I agree that this would be a better system, if only to assure humanity a shot at the bots in exactly their final configuration before the Challenge.  However, for fairness' sake, I think the three humans who are selected to defend the challenge should be prohibited from personally playing these final configuration bots.  Otherwise they will be able to do what Fritzlein is describing only more so.  I think reviewing the games other people play with the bots, swapping analysis in the forum, and playing the pre-freeze bot if the developer does voluntarily place it online should give them plenty of information and better approximates what preparation for a match with a human is like.  In any event, I think that a bot developer would be at a serious disadvantage by programming in secret and giving up all of the play testing opportunities offered by the gameroom.

Slightly off topic:  I do think that a bot, especially one of the weaker bots, could gain an advantage in the Computer Championship if it had an opening book that incorporated one or more of the bot bashing strategies.  The only hard part would be designing the conditionals to break out of the book if the target bot didn’t cooperate.  It would be particularly advantageous if bot_bomb is going to be minimally changed from last year.
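
A minimal sketch of the book-with-break-out idea described above. It assumes a position can be reduced to a hashable key (here just a string) and that `search` stands in for the bot's normal move search; `choose_move`, the position keys, and the move names are all hypothetical and not taken from any actual Arimaa bot.

```python
from typing import Callable, Dict

# Hypothetical types: a position is any hashable key (e.g. a board string),
# and a move is a string in whatever notation the bot uses.
Position = str
Move = str


def choose_move(
    position: Position,
    book: Dict[Position, Move],
    search: Callable[[Position], Move],
) -> Move:
    """Play the prepared line while the game stays in the book,
    and break out to normal search as soon as it does not."""
    booked = book.get(position)
    if booked is not None:
        return booked        # target bot cooperated: stay on the known line
    return search(position)  # unknown position: fall back to thinking


# Toy usage with made-up position keys and moves.
book = {"start": "book-move-1", "after-reply-A": "book-move-2"}
fallback = lambda pos: "searched-move"

print(choose_move("start", book, fallback))          # book-move-1
print(choose_move("after-reply-B", book, fallback))  # searched-move
```

Keying the book on the exact position makes the break-out condition automatic: any deviation by the target bot produces a key that is not in the book. The real difficulty would lie in building a book that still covers the target bot's actual replies.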

Title: Re: 2007 qualifying for computer championship
Post by Fritzlein on Nov 12th, 2005, 10:21am
Ryan,

It is a good point that humans, when practicing against the final versions of the bots, might be able to discover a line which wins every time.  However, there is a huge distinction in objective:

If I, as a bot developer, discover a line that beats your bot every time, that doesn't mean my bot is better, and it doesn't mean my bot deserves to win heads up against your bot in the Computer Championship.

If I, as a human defender of the Arimaa Challenge, discover a line that beats your bot every time, then your bot is no match for human intelligence, and therefore unworthy to win the challenge.

Admittedly, it is an open question how much time humans should get to probe for bugs and weaknesses.  How good is a bot that wins the first ten games against the human World Champion, and then loses the next twenty?  I would be very afraid of that bot the following year if its bugs got fixed, but  I still say a machine that can't stand up under even two weeks of scrutiny deserves to lose right then.  Once the challenge prize seems more within reach of the developers, they can focus more on preventing cheap tricks by the humans, just as Deep Blue was tuned to not be snookered by the anti-computer tactics of 1997.

(For what it's worth, I think Kasparov would have won against Deep Blue given two weeks of practice, but against Hydra today, two weeks would be insufficient.  Training time is a buffer to allow human adaptability to come into play, but it doesn't protect us indefinitely.)

By the way (and maybe Fotland can clarify this) I'm pretty sure the winning program in the Gifu computer go championship did play differently depending on its opponent.  Thus we may well currently have a computer champion in go which does not play as well against a human opponent as programs which finished lower in the standings.  I don't think my concern here is idle speculation: It doesn't do Arimaa any good if bots start using cheap tricks against each other, especially if that results in the champion bot not being the strongest overall player against humans.  We can't prevent that from happening, but we can discourage it.

Title: Re: 2007 qualifying for computer championship
Post by acheron on Nov 14th, 2005, 3:44pm
I'd suggest changing it more than that.

Exposing your 'bot to online play beforehand is both a risk and an advantage.  You gain the advantage of extensive testing to reveal flaws, and yet simultaneously expose any such flaws to outside analysis.  

However, it is inequitable to treat computer players differently than human players in this respect.  No human is forced to make themselves available for forty days to have their own weaknesses probed.  Instead, they enter the matches from a neutral starting point.

I'd remove the early showing requirement altogether.  Let the developers choose whether or not to use the online facilities to refine their 'bot.  

Let's face it, pre-generated paths for use against specific 'bots are in no way a test of intelligence, for either the 'bot or the humans.  Is not the goal to create an environment where thinking is the key (adaptive particularly)?  If so, then all this scouting and pre-generation is purely counter to that goal.

Title: Re: 2007 qualifying for computer championship
Post by Fritzlein on Nov 15th, 2005, 12:36pm

on 11/14/05 at 15:44:08, acheron wrote:
Let's face it, pre-generated paths for use against specific 'bots are in no way a test of intelligence, for either the 'bot or the humans.


On the contrary, this is a test of intelligence in my book.  If a bot can be beaten the same way every time, it clearly isn't intelligent.  Conversely if a human can come up with a winning line against a bot, that shows intelligence somewhere in humanity, albeit more in the person who originates the line than in the person just clever enough to play it.


Quote:
Is not the goal to create an environment where thinking is the key (adaptive particularly)?  If so, then all this scouting and pre-generation is purely counter to that goal.


I agree that thinking should definitely be the key in whatever format we choose.  If bots become at all adaptive in the future, then a bot which wins 90% at first will hardly be driven down to a 10% or 0% win rate by two weeks of exposure, since the bot can be learning at the same time.  (Incidentally, by "final code", I mean only no further human modification.  Self-modifying bots are fine.)

I think the general assumption is that humans are more adaptable than bots, and will continue to be more adaptable even than bots which can learn.  Therefore we suppose that a shorter exposure favors machines while a longer exposure favors humans.  The humans would absolutely not fear exposing themselves to bot challenges for forty days or for any period of time.  So when we discuss longer or shorter practice before a match it is in the back of our minds how much we want to handicap the match in favor of bots or in favor of humans.

If we didn't think about that balance, and instead thought only about measuring adaptability and intelligence, then we could just rephrase the Arimaa Challenge to be "Put your bot online for a year with no intervention (only self-modification) and if no human can win 30 against it in a 40 game span, the bot wins."  I actually would quite enjoy such a challenge structure, but in my opinion, that would handicap the bots even more than my proposal of two weeks' exposure of the final code and three mini-matches thereafter.  Making a bot that plays Arimaa well and learns is probably much harder than just making a bot that plays well.
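
The "30 wins in a 40 game span" condition above can be sketched as a sliding-window check. This is only an illustration under assumptions: it tracks one human's results against the bot, in the order played, and the function name `human_beats_bot` and its parameters are hypothetical.

```python
from typing import Sequence


def human_beats_bot(
    results: Sequence[bool],
    wins_needed: int = 30,
    span: int = 40,
) -> bool:
    """results[i] is True if the human won game i against the bot.

    Returns True if any run of at most `span` consecutive games
    contains at least `wins_needed` human wins.
    """
    wins = 0
    for i, won in enumerate(results):
        wins += won                    # game i enters the window
        if i >= span:
            wins -= results[i - span]  # game i - span leaves the window
        if wins >= wins_needed:
            return True
    return False
```

For example, ten losses followed by thirty wins satisfies the condition, while 29 wins followed by 11 losses does not.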


Title: Re: 2007 qualifying for computer championship
Post by acheron on Nov 15th, 2005, 1:19pm
My real problem there is that now you've created an environment that is not reflective of the environment for a human player.

No human makes themselves available to such exhaustive study and repetitive testing.  

Longer exposure only favors humans because it implies testing specifically for means of beating the one bot in question, not by the one human opponent, but by a multitude of bot bashers looking for weakness.  Therein lies the problem.

Now the bright side of game records being listed does mean that it is somewhat possible to do this in reverse.  It might be interesting, for example, to scour the records of the top few players and make particular note of the situations that led to their losses.  Does player #2 have a tendency to under-value his Horses and suffer losses for trading them too lightly?  Does player #3 struggle whenever the focal point is on the righthand side of the field instead of the lefthand where his thinking is more comfortable?  Etc...  An interesting concept, but unless you had a very large staff working on such a bot, impractical.

With the relatively low prize value, you're really talking hobby programmers.  Such beforehand scouting then would fall upon his or their shoulders, as opposed to the hunter for 'bot weakness which has a larger crowd to draw upon.

My underlying point... don't set up a system that handicaps at all.  Have the challenge decided by the performance at the actual event (even if this means extending the number of games played from 3 to 5 or 7) and not influenced by beforehand efforts of those not even directly involved.

Title: Re: 2007 qualifying for computer championship
Post by Janzert on Nov 15th, 2005, 2:07pm

on 11/15/05 at 13:19:30, acheron wrote:
No human makes themselves available to such exhaustive study and repetitive testing.


The lowest number of games played by a +2000 rated player is currently 299. Discounting omar_fast and bot_lightning, +1900 is 168 games. Minimum games +1800 is 99 games.

I think two weeks exposure without outside modification is a very short time and favors the bot.

Janzert

Title: Re: 2007 qualifying for computer championship
Post by Fritzlein on Nov 15th, 2005, 8:16pm

on 11/15/05 at 13:19:30, acheron wrote:
Have the challenge decided by the performance at the actual event (even if this means extending the number of games played from 3 to 5 or 7) and not influenced by beforehand efforts of those not even directly involved.

It seems very fair and reasonable to simply hold an event and let the winner be the winner, without a bunch of preconditions.  But you also hint at a big problem with this method, namely that a short match doesn't necessarily showcase intelligence, in particular not adaptability.

I believe that Kasparov would have beaten Deep Blue in a 20-game match instead of a 6-game match.  He barely had any chance to adjust his strategies when he discovered his anti-computer ideas weren't working.  Supposing that I am right, and Kasparov would have turned it around, then would it have been correct to say that Deep Blue was NOT the better chess player, even though it won the 6-game match?

For me, the ability to learn is built in to the definition of who is the best.  I've beaten Omar in all eight of our games.  What if next time we play, he surprises me with a new opening and beats me three times in a row with it?  Would that make him better than me?  I say it wouldn't, if I learned from those games and came back to beat him the next eight.

I guess what we're trying to avoid in the challenge rules is giving the prize to a bot that can win a short match but not a long one.  Given that bots are less adaptable than humans, this is a reasonable probability.  I hold that being beaten by a surprise you didn't figure out immediately shouldn't count if you can figure it out soon and win consistently after that.  This certainly leaves open the question of how long is long enough, but it seems that the more you emphasize learning in your definition of being best, the longer you have to allow.  For my preferences, a 20-game match would be long enough, even if the bot had no exposure other than the games themselves.

Title: Re: 2007 qualifying for computer championship
Post by omar on Nov 16th, 2005, 1:36am
Thanks for bringing up this topic, Karl. It is a very difficult and complex issue and one that I've thought about quite a bit. Yet I'm also not quite satisfied with the current solution and open to suggestions for improving it.

First of all, let's step back and think about why we need any qualifying games at all. Consider what happened in the DB vs GK match. The DB team had complete access to all the games GK had ever played while they developed DB in secrecy and it never played any public games until the day of the match. That I think is totally unfair. When two players face each other in a match they should have equal access to each other's historical game records for preparation. Also the game record of both players should contain some minimum number of "serious" games; or both should contain nothing. Otherwise the situation is unfair to the player that has a publicly available historical game record, but has none or very limited such information about the opponent. Therefore bots need to play some minimum number of "serious" games so that there is a publicly available game record for opponents to review when they prepare to play against it in match games.

Now consider the situation of the 2002 Kramnik vs Fritz match. Kramnik was given an exact copy of Fritz that he would play against in the match one month in advance so he could prepare for the event. Although this has been considered fair, I personally think that it is unfair to the bot. I don't think it is fair to have such complete access to a bot.

What I would really prefer is to have the bots play the "serious" games in a natural way throughout the year to establish a game record comparable to that of top human players. Then there would not be a need to impose any qualifying restrictions on them. But as we have seen the bots being developed usually are not available to play against throughout the year. Thus the need for imposing some qualifying restrictions. Perhaps someday when there are many Arimaa bots and the number of bots that can enter the championship tournament is limited with rating being a selection criteria then the bots will more naturally play "serious" games to qualify for the tournament and there will not be a need for any other qualifying requirements. However, the current situation does not support this. Thus the need for imposing some qualifying conditions.

I actually am not too concerned about the bots being changed till the last minute. I think it is very fair to allow the bot developers to continue changing the bots as much as they want. In fact the bots can even be changed between the match games as long as it is done in an automated way and does not require manual intervention from the bot developer. The model that I use for the bots is that of a program distributed to millions of people. The program developer will not be able to hand tweak the program to customize it to each end user, but he can ask the end user to run a program that automates the customization. Also there is no way one can prevent a bot developer from having the bot play differently during the freeze period than it does during the challenge match. The only thing that can be done is to discourage such practice. A freeze period would not prevent it. Also keep in mind that whenever a bot is changed the developer does not really know if the change actually improved the bot or made it worse. This gets to be a bigger and bigger problem as the bots improve to the point where hundreds of games are needed to determine if a change actually improved the bot. So I am not at all concerned about the bots being changed. What I am concerned about is that the bot developers be honest and present their current best bot as they would if they were trying to maximize its ratings.
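
A back-of-the-envelope sketch of why "hundreds of games" is a plausible figure. It uses a crude normal approximation and assumes the new version plays the old one in independent games with no draws. The function `games_needed` and its parameters are hypothetical, and the calculation ignores statistical power, so the true requirement would be somewhat higher.

```python
import math


def games_needed(true_win_rate: float, z: float = 1.96) -> int:
    """Rough number of games for a win rate of `true_win_rate` to sit
    about `z` standard errors above 50% (normal approximation)."""
    edge = abs(true_win_rate - 0.5)
    if edge == 0:
        raise ValueError("a 50% win rate can never be told apart from 50%")
    # Standard error of a proportion near 0.5 is about 0.5 / sqrt(n);
    # solve z * 0.5 / sqrt(n) = edge for n.
    return math.ceil((z * 0.5 / edge) ** 2)


print(games_needed(0.60))  # 97
print(games_needed(0.55))  # 385
```

Under these assumptions a change that lifts the win rate from 50% to 55% needs several hundred games to show up reliably, and the requirement grows with the inverse square of the improvement.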

I initially had required the bots to play some minimum number of games to qualify; with a portion of those games being against humans. However, we know that some bots are played against more than others. So it could be possible that a bot does not meet the requirement just because humans did not play it; not because it was not available. So I changed the requirement to being available to play against; which is much more in the control of a bot developer.

Now 40 days might seem as you mentioned a long time for opponents to figure out exactly how to exploit a weakness. However, I think it is balanced out by allowing the bots to be changed during this time and up to the last minute. If an exploit is found a developer can try to fix it. They can even postpone fixing it to the last minute, but then they won't know if the change really fixed it especially if it is a strategic exploit and not just a simple bug in the code.

I think a solution which completely eliminates the need for qualifying conditions indirectly through other requirements which encourage the bots to play more regularly and try to maximize their ratings would be the best way to improve this.


Title: Re: 2007 qualifying for computer championship
Post by Fritzlein on Nov 16th, 2005, 6:20pm

on 11/16/05 at 01:36:30, omar wrote:
Also there is no way one can prevent a bot developer from having the bot play differently during the freeze period than it does during the challenge match. The only thing that can be done is to discourage such practice. A freeze period would not prevent it.

Really?  How would the bot know whether it was playing a challenge game or a practice game?  If the bot bases its decision of how to play on the date, you could alter the system clock or something to prevent that.  I'm obviously not very savvy with this kind of thing, but I'm surprised that you don't think you could set it up somehow.


Quote:
I think a solution which completely eliminates the need for qualifying conditions indirectly through other requirements which encourage the bots to play more regularly and try to maximize their ratings would be the best way to improve this.

Ensuring an adequate number of games for each bot at full strength is what I was trying to get at with a code freeze, so I'm curious to hear more about the problems with that idea.  If the environment in which the submitted bots operated would somehow ensure that they had to play each game to win, because any game against another bot might be for the computer championship, and any game against a human might be part of the challenge, then you would have complete control over the amount of exposure you think is fair.

By the way, I'm still sort of intrigued by the notion of putting up bots on the server and declaring open season.  I know this is a radical departure from the challenge match structure, but it seems to have an intuitive appeal to me.

* developer submits code
* bot is made available on the server
* if no human can do X within the next Y months, humanity acknowledges defeat

X could be any individual compiling a +20 score (wins minus losses) or whatever seems fair.  If humans were brave enough, they could choose a faster time control than 2 minutes per move, but they wouldn't have the option of going slower than 2 minutes per move.
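
A minimal sketch of how the "+20 score within Y months" rule might be checked from a game log. Everything here is an assumption for illustration: the `Game` tuple layout, the function name `first_to_reach_margin`, and expressing the window as a `timedelta` instead of calendar months.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical log entry: (time played, human player, did the human win?)
Game = Tuple[datetime, str, bool]


def first_to_reach_margin(
    games: Iterable[Game],
    start: datetime,
    window: timedelta,
    margin: int = 20,
) -> Optional[str]:
    """Return the first human whose wins minus losses against the bot
    reaches `margin` inside the challenge window, or None if the bot survives."""
    deadline = start + window
    score: Dict[str, int] = defaultdict(int)
    for played_at, human, human_won in sorted(games):
        if not (start <= played_at < deadline):
            continue  # games outside the challenge window do not count
        score[human] += 1 if human_won else -1
        if score[human] >= margin:
            return human
    return None
```

Because the score is kept per individual, a crowd of players each winning a few games would not end the challenge; one person has to compile the whole margin.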

The developers could get in line for a turn at the challenge.  Whenever a bot gets shot down, its developer would have to go to the back of the line, and the next contending bot could be put in place.  The challenge could become a perpetual thing, a way of life for the Arimaa community.  You can be sure humans would be falling over themselves to be the one to shoot down whichever contender was currently in place.

I dunno, it just seems kind of fun, but maybe it wouldn't work in practice.  Or even if it would work, it would seem to the developers we are just moving the goalposts yet again.

Does anyone else think this would be an interesting idea?


Title: Re: 2007 qualifying for computer championship
Post by nbarriga on Nov 16th, 2005, 6:41pm

on 11/16/05 at 18:20:00, Fritzlein wrote:
By the way, I'm still sort of intrigued by the notion of putting up bots on the server and declaring open season.  I know this is a radical departure from the challenge match structure, but it seems to have an intuitive appeal to me.

* developer submits code
* bot is made available on the server
* if no human can do X within the next Y months, humanity acknowledges defeat

X could be any individual compiling a +20 score (wins minus losses) or whatever seems fair.  If humans were brave enough, they could choose a faster time control than 2 minutes per move, but they wouldn't have the option of going slower than 2 minutes per move.

The developers could get in line for a turn at the challenge.  Whenever a bot gets shot down, its developer would have to go to the back of the line, and the next contending bot could be put in place.  The challenge could become a perpetual thing, a way of life for the Arimaa community.  You can be sure humans would be falling over themselves to be the one to shoot down whichever contender was currently in place.

I dunno, it just seems kind of fun, but maybe it wouldn't work in practice.  Or even if it would work, it would seem to the developers we are just moving the goalposts yet again.

Does anyone else think this would be an interesting idea?

I really like the idea. I think that having a system like this would encourage developers to work all year round. And you could set up a parallel bot ranking with the "highest number of [days|matches] undefeated while competing for the challenge".

Title: Re: 2007 qualifying for computer championship
Post by Janzert on Nov 16th, 2005, 11:50pm

on 11/16/05 at 18:20:00, Fritzlein wrote:
By the way, I'm still sort of intrigued by the notion of putting up bots on the server and declaring open season.  I know this is a radical departure from the challenge match structure, but it seems to have an intuitive appeal to me.


While I really like this idea from the pure "let's best try and determine if bots or humans are better at arimaa" point of view, I think it has a number of shortcomings when looking at ancillary concerns.

Probably top most of these would be, for spectators there is no annual "Grand match" focal point. Probably resulting in greater difficulty getting more sponsors as well.

Janzert

Title: Re: 2007 qualifying for computer championship
Post by Fritzlein on Nov 18th, 2005, 9:39am

on 11/16/05 at 23:50:08, Janzert wrote:
for spectators there is no annual "Grand match" focal point. Probably resulting in greater difficulty getting more sponsors as well.


Good point.  I wonder if bot developers would participate if it were set up as a side contest, independent of the Arimaa Challenge.  We have the Player of the Month contest to encourage humans to play each other.  In parallel there could be a regular Bot Contender prize to encourage developers to work on their bots year-round.  The prize money could be split between the developer and the first player to garner a +20 score, depending on how long the bot survived.  (There I go giving away Omar's money...)  We could have a hall of fame so that defenders of human supremacy would gain eternal glory.

I'm just brainstorming here.  The Arimaa server is already a way cool place to play.  It seems that bot development has cooled off a bit from last year, but it is probably cyclical and will pick up again without any structural changes.

Title: Re: 2007 qualifying for computer championship
Post by acheron on Nov 18th, 2005, 12:21pm
If you want to boost potential prize money I wouldn't look to Omar to provide it.

The real trick is to acquire corporate sponsors.  These aren't going to be interested right now because no one has ever heard of Arimaa.  Which then leads us to step one - acquire publicity.

Someone with a good mind for exposure could probably come up with an appropriate targeted list for media contact.  Generate some press in the right circles and paint Arimaa as a more mainstream activity and corporate sponsorship will follow.

Title: Re: 2007 qualifying for computer championship
Post by PMertens on Nov 18th, 2005, 3:57pm

Quote:
It seems that bot development has cooled off a bit from last year, but it is probably cyclical and will pick up again without any structural changes.


let's see what can be done about that :-)

Title: Re: 2007 qualifying for computer championship
Post by nbarriga on Nov 18th, 2005, 4:44pm

on 11/18/05 at 12:21:23, acheron wrote:
Someone with a good mind for exposure could probably come up with an appropriate targeted list for media contact.  Generate some press in the right circles and paint Arimaa as a more mainstream activity and corporate sponsorship will follow.


One word: Slashdot.

Title: Re: 2007 qualifying for computer championship
Post by 99of9 on Nov 18th, 2005, 6:57pm

on 11/18/05 at 09:39:27, Fritzlein wrote:
It seems that bot development has cooled off a bit from last year, but it is probably cyclical and will pick up again without any structural changes.

Actually I think there has been more bot development this year than ever before.  It's just that the development hasn't been in the top bots.  Haizhi, Weiser, and Aamira!

I agree us old-timers haven't put in the hard yards this year, but even that may pick up in December, as we all seem to be last minute people.

Title: Re: 2007 qualifying for computer championship
Post by Fritzlein on Nov 18th, 2005, 11:55pm

on 11/18/05 at 18:57:45, 99of9 wrote:
Actually I think there has been more bot development this year than ever before.  It's just that the development hasn't been in the top bots.  Haizhi, Weiser, and Aamira!


That's true, there has been a lot of development.  I guess I was forgetting Haizhi and Aami-ra because it looks like they aren't spending their qualifying time on line.  I'm happy Weiser is still around, though, and still under active development.

Title: Re: 2007 qualifying for computer championship
Post by omar on Nov 19th, 2005, 10:53am

on 11/18/05 at 16:44:51, nbarriga wrote:
One word: Slashdot.


The server would probably crash though :-)

Although this new server seems to be running pretty smoothly even when there are a lot of people in the gameroom. I checked the load on the server during my WC game against Paul and the load was only like 0.5; I think there were about 12 to 14 people logged in at that time.

The load actually goes up when people start playing the bots that are also running on the server.

Title: Re: 2007 qualifying for computer championship
Post by omar on Nov 19th, 2005, 10:55am

on 11/16/05 at 23:50:08, Janzert wrote:
While I really like this idea from the pure "let's best try and determine if bots or humans are better at arimaa" point of view, I think it has a number of shortcomings when looking at ancillary concerns.

Probably top most of these would be, for spectators there is no annual "Grand match" focal point. Probably resulting in greater difficulty getting more sponsors as well.

Janzert


Precisely. I would like to preserve that annual Grand Match nature of the current structure.


Title: Re: 2007 qualifying for computer championship
Post by omar on Nov 19th, 2005, 11:05am

on 11/16/05 at 18:20:00, Fritzlein wrote:
Really?  How would the bot know whether it was playing a challenge game or a practice game?  If the bot bases its decision of how to play on the date, you could alter the system clock or something to prevent that.  I'm obviously not very savvy with this kind of thing, but I'm surprised that you don't think you could set it up somehow.


Exactly. The date could be used by the program to play differently. I would never alter the date on a computer because a lot of other programs depend on it being right, including the bot interface scripts.

Title: Re: 2007 qualifying for computer championship
Post by omar on Nov 19th, 2005, 11:38am

on 11/18/05 at 09:39:27, Fritzlein wrote:
Good point.  I wonder if bot developers would participate if it were set up as a side contest, independent of the Arimaa Challenge.  We have the Player of the Month contest to encourage humans to play each other.  In parallel there could be a regular Bot Contender prize to encourage developers to work on their bots year-round.  The prize money could be split between the developer and the first player to garner a +20 score, depending on how long the bot survived.  (There I go giving away Omar's money...)  We could have a hall of fame so that defenders of human supremacy would gain eternal glory.

I'm just brainstorming here.  The Arimaa server is already a way cool place to play.  It seems that bot development has cooled off a bit from last year, but it is probably cyclical and will pick up again without any structural changes.


It takes a lot of effort to develop a bot and even more to improve a bot that is already playing pretty well. So I don't think the amount of money I could offer would produce much encouragement. The bot developers are really doing it for other reasons. The two main motives I see right now are school-related projects and a personal hobby of developing game programs. Even in chess, top programs like Shredder and Junior are developed by hobbyists, although in chess they can probably make a living off the program.

What I've been meaning to do, but haven't done yet, is to mail out a call-for-participation letter to the computer science departments of major universities, asking them to develop a bot to represent their university in the annual Arimaa computer championship. This would help Arimaa become known in the academic community. The professors might start mentioning Arimaa in their intro to AI classes, and more students might choose to develop a bot for their project.


Title: Re: 2007 qualifying for computer championship
Post by omar on Nov 19th, 2005, 11:53am
Returning to the topic of this discussion.

Let's state the goals of what the qualifier should achieve. Here is what I would like to shoot for:

1. Require the bots to produce some minimum game record with some minimum games being against humans.
2. Require the bots to show their best performance in these games.
3. Be easy enough that someone can qualify even if they started late in the year.
4. Be hard enough that even the well established and best bots need to make some effort.


Title: Re: 2007 qualifying for computer championship
Post by omar on Nov 19th, 2005, 12:13pm

on 11/18/05 at 23:55:37, Fritzlein wrote:
That's true, there has been a lot of development.  I guess I was forgetting Haizhi and Aamira because it looks like they aren't spending their qualifying time online.  I'm happy Weiser is still around, though, and still under active development.


Also don't underestimate bot_bomb. Remember David can change it till the last minute. The publicly available database of the games and the match offline scripts now allow a developer to improve their bot without it actually needing to play games in the gameroom.

I think we might be surprised by how well bomb2006CC will actually play in the tournament and perhaps in the challenge match ;-)


Title: Re: 2007 qualifying for computer championship
Post by 99of9 on Nov 19th, 2005, 4:00pm

on 11/19/05 at 12:13:39, omar wrote:
Also don't underestimate bot_bomb. Remember David can change it till the last minute. The publicly available database of the games and the match offline scripts now allow a developer to improve their bot without it actually needing to play games in the gameroom.

I think we might be surprised by how well bomb2006CC will actually play in the tournament and perhaps in the challenge match ;-)


We may well be surprised, but that is partly because to date a broken version of bomb has been in the gameroom.

Title: Re: 2007 qualifying for computer championship
Post by Fritzlein on Nov 20th, 2005, 9:10pm

on 11/19/05 at 11:53:52, omar wrote:
Returning to the topic of this discussion.

Let's state the goals of what the qualifier should achieve. Here is what I would like to shoot for:

1. Require the bots to produce some minimum game record with some minimum games being against humans.
2. Require the bots to show their best performance in these games.
3. Be easy enough that someone can qualify even if they started late in the year.
4. Be hard enough that even the well established and best bots need to make some effort.


Heheh, sorry I started to wander pretty far afield with suggestions for radically changing the structure of the Arimaa Challenge.  I agree with the comments that a climactic match once a year is by far the best for publicity and eventual corporate sponsorship, so my above brainstorming notwithstanding, I wouldn't want to mess with that setup.

It is good, however, for us to be clear that the qualifying for the Computer Championship is, at the present, very much entangled with the goals for the Challenge.  For example, the first condition about having some minimum game record against humans really has nothing to do with the Computer Championship, does it?  What relevance do human games have to a bot versus bot contest?

If a bot competed in and won the Computer Championship without having any record whatsoever against humans, that would in no way undermine the integrity of the Computer Championship.   Moreover, as long as that bot thereafter played enough games against humans before it played in the Challenge, it would in no way undermine the integrity of the Challenge that the game record was compiled after the Computer Championship rather than before.

Conditions 1 and 2 seem to be directed at the Challenge, while 3 and 4 seem directed more towards the Computer Championship.  Yet one might think, "Why not kill two birds with one stone, and have the conditions for a bot to qualify for one event be exactly the same as the conditions to qualify for the other event?"  I think that's where the problem starts, and is what inspired my initial post in this thread.

For a bot to have a game record against humans is not only unnecessary for the Computer Championship, it is a competitive disadvantage.  Another bot developer could create an opening book against the bot he wants to beat, basing it on the weaknesses exposed by human play.  A bot might be hurt in the Computer Championship by its prior exposure to humans, even though getting bashed around by humans says nothing about the fitness of a bot to compete against other bots.

Thus it makes sense, in the qualifying for the Computer Championship, to remove any criterion of having a certain number of games against humans or of spending a certain amount of time on line accepting challenges.  Let the developers choose how much to test their bots, and against which opponents.  There is no need to begrudge developers this control as they compete with each other.

Why not wait until after the Computer Championship is over, or at least until the final code for the bots has been submitted, to mandate play against humans?  At that point it can't make any difference in the Computer Championship how badly the bots get kicked around by humans.  It will only make a difference in the Challenge, which is as it should be.

You say, Omar, that there is no way to prevent a developer from having his bot play differently in practice games than in the Challenge itself.  Since that is true no matter when those practice games are held, this is no argument in favor of having the practice games later or sooner.  Cheating can happen at any time.   So let's put aside the question of cheating, assume the developers will play fair, and consider only the process of developing a bot.  If it is important that a bot have a record of games against humans to qualify for the Challenge, isn't it equally important that those games be played by the bot in its final version?  Otherwise last-minute improvements may make the entire game record useless to the humans.  It's as if we would allow a developer to qualify for the Challenge by handing us a record of games compiled by some other bot.

Indeed, if you recall, Kasparov had played a previous match against Deep Blue, and had those games from which to study and prepare.  However, Deep Blue had changed and improved so much that this experience was useless to Kasparov.  The computer had the same name, but it wasn't the same computer, much as there is currently a bot online that is named Bomb without being Bomb.

Assuming the developers play fair, why not mandate the exposure of the final version of the bot, i.e. the actual bot that competes in the Computer Championship?  Exactly that bot is going to play in the Challenge, so exactly that bot needs to have a game record against humans.  As a side benefit, the bots that lose the Computer Championship wouldn't have to be forced to qualify for the Challenge.   Only the bot that wins and will actually play in the Challenge needs to play humans.  (Of course, all bots eventually become members of the CC fold of bots, but that is separate from the qualifying process.)

To recap, I would split the qualifying criteria of the Computer Championship from the qualifying criteria of the Challenge.  Conditions 3 and 4, for the Computer Championship qualifying only, could be met by specifying five bots from previous championships and requiring two games against each of them at two minutes per move.  Even a latecomer could qualify, but it shows some seriousness on the part of the developer to clear this hurdle.
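
As a rough illustration only, the Computer Championship hurdle described here could be checked mechanically along these lines.  This is a minimal sketch: the record layout (a list of dicts with "opponent" and "seconds_per_move" keys) and the function name has_qualified are assumptions made up for the example, not anything the Arimaa server provides.  It only counts games played, since the suggestion does not require the new bot to win them.

```python
from collections import Counter

def has_qualified(games, reference_bots, games_per_bot=2, seconds_per_move=120):
    """Return True if the bot has played enough games against every reference bot.

    games: list of dicts with hypothetical keys "opponent" and "seconds_per_move".
    reference_bots: names of the bots from previous championships.
    """
    # Count only games at the required time control against a reference bot.
    counts = Counter(
        g["opponent"]
        for g in games
        if g["opponent"] in reference_bots
        and g["seconds_per_move"] == seconds_per_move
    )
    # Every reference bot must have been played at least games_per_bot times.
    return all(counts[bot] >= games_per_bot for bot in reference_bots)
```

With five reference bots this asks for ten games in total, which a latecomer could finish in a few days.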

Conditions 1 and 2, for the Challenge, could be met by exposing the final version of the bots (or only the winner) during the Computer Championship (or after) for whatever amount of time or number of games seems appropriate.


Title: Re: 2007 qualifying for computer championship
Post by 99of9 on Nov 20th, 2005, 10:55pm

on 11/20/05 at 21:10:38, Fritzlein wrote:
As a side benefit, the bots that lose the Computer Championship wouldn't have to be forced to qualify for the Challenge.   Only the bot that wins and will actually play in the Challenge needs to play humans.


I quite like your suggestion overall Fritz, and I don't have time now to prepare a proper response, but the quote above is one thing you should think about a bit more:

The issue is that the humans can focus all their attentions on finding ways to beat one particular bot, because they know which one they will play.  Bots do not know their Challenge opponent(s) in advance, so either they have to be set up very generally, or they have to have separate tuning done for every different possible human.

Therefore I think you would be unintentionally skewing the Challenge a little further in the favour of humans.

Title: Re: 2007 qualifying for computer championship
Post by Fritzlein on Nov 21st, 2005, 10:22am

on 11/20/05 at 22:55:21, 99of9 wrote:
The issue is that the humans can focus all their attentions on finding ways to beat one particular bot, because they know which one they will play.  Bots do not know their Challenge opponent(s) in advance, so either they have to be set up very generally, or they have to have separate tuning done for every different possible human.

Therefore I think you would be unintentionally skewing the Challenge a little further in the favour of humans.

I quite agree that this is an important issue of fairness, but it could be addressed by reducing the contending bot's exposure to human games.  To tell you the truth, I think it is quite fair to limit the exposure to three practice games for each of the three defenders.  I would still put up my money that the Challenge won't be won under those circumstances, but by the same token I don't think bot developers would have anything to complain about in terms of humanity ganging up on their bot.

The reason I said "whatever amount of time or number of games seems appropriate" is that I wanted to write a focused post (for once) purely about separating the qualifying for the Computer Championship from the qualifying for the Challenge.  When I got into questions of how to prevent cheating, or how to structure the challenge, or how much exposure to humans is appropriate, I was distracting from what I think is a quite useful suggestion, and I didn't want that central suggestion to be rejected due to my opinions on other issues. ;-)

Title: Re: 2007 qualifying for computer championship
Post by omar on Nov 22nd, 2005, 10:39am
Karl, you are absolutely right, the qualifying games really are trying to kill two birds with one stone, and although the two events are separate they are in some ways entangled. It really helps to recognise this fact explicitly.

Perhaps it would be better to separate the qualifying games for the two events, but as Toby mentioned it raises some issues about fairness to the bots. So we must be very careful in how we collect the record of games against humans.

Consider the following:

After the computer championship tournament is over, the two bots which finished with the best results compete to determine which will play in the challenge match.

The two bots will be kept online for two weeks for humans to play against. Any human player who wants to play the bots must play exactly two games against each bot, once with each color. So many humans can play against the bots during the two weeks, but each plays only four games in total: two against each bot. The human players selected for the challenge match must not play the bots during this period; they can only observe and analyse the games. Remember that the human players for the challenge match are already determined before the computer championship tournament begins. The bot which has the better record after the two weeks, with ties broken using the number of moves in the games, will go on to play in the challenge match.
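
To make the participation rule concrete, here is a minimal sketch of how the games that count toward the runoff might be filtered. Everything about the data layout is an assumption for illustration: the dict keys ("human", "bot", "human_color"), the color names "gold" and "silver", and the function name countable_games are hypothetical. The sketch also assumes that a human who has not completed the full set of four games simply has none of their games counted, which the proposal does not spell out.

```python
from collections import defaultdict

COLORS = ("gold", "silver")  # assumed color names

def countable_games(games, bots, defenders):
    """Return the games that count toward the two-week runoff.

    games: list of dicts with hypothetical keys "human", "bot", "human_color".
    bots: the two bots in the runoff.
    defenders: humans selected for the challenge match (not allowed to play).
    """
    bots = set(bots)
    by_human = defaultdict(list)
    for g in games:
        # Defenders may only observe; games against other bots are irrelevant.
        if g["human"] in defenders or g["bot"] not in bots:
            continue
        by_human[g["human"]].append(g)

    # Each human must play each bot exactly once with each color.
    required = {(bot, color) for bot in bots for color in COLORS}
    counted = []
    for played in by_human.values():
        pairs = [(g["bot"], g["human_color"]) for g in played]
        if len(pairs) == len(required) and set(pairs) == required:
            counted.extend(played)
    return counted
```

The check on both the length and the set of (bot, color) pairs rejects a human who played extra games as well as one who played too few.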


Title: Re: 2007 qualifying for computer championship
Post by Fritzlein on Nov 22nd, 2005, 2:10pm
That's a fantastic idea, Omar.  Intuitively it addresses the fairness issue to forbid the actual defenders of the Challenge match from playing practice games against the bot contenders, and to limit everyone else to two each.

I don't think bot developers can complain too much about it.  In particular, if a human outside of the top three can come up with a winning line during qualifying that the bot falls for verbatim in the Challenge, then that bot deserves to lose the Challenge.

This format would be extremely exciting if bots ever advanced to the point that only three humans could beat the best bot consistently.  Imagine the fever pitch of interest in the Challenge match if some bot were winning against all its human opponents during the two-week qualifying phase!

At the same time, I think the proposed structure is fair to humanity, because there is a large enough pool of strong players nowadays to try out all the basic strategies, which gives the defenders some idea of what to expect no matter how those games turn out.

But what I like best about your proposal, Omar, is that it makes the qualifying more appropriate to the event.  The bot we would most want to participate in the Challenge isn't the best anti-bot bot, it is the best anti-human bot.

If humans continue to be way ahead of bots, I can imagine developers will simply give up on the Challenge, and instead focus on getting their bot to beat other bots.  Someone who does a good job of that could theoretically win the Computer Championship with a bot that is easier for humans to beat than the losers of the Computer Championship.

If the top two bots from the Computer Championship have a run-off qualifier for the Challenge based on anti-human play, it will reward a bot developer who focuses also on beating humans, and not purely on beating other bots.  And regardless of developers' intentions, your qualifying rules would give us a better chance of selecting the bot most skilled in anti-human play.

Title: Re: 2007 qualifying for computer championship
Post by 99of9 on Nov 23rd, 2005, 12:05am

on 11/22/05 at 10:39:03, omar wrote:
The bot which has a better record after the two weeks with ties broken using number of moves in the games will go on to play in the challenge match.


An interesting twist.  Unfortunately it opens the possibility of humans trying to manipulate the bot qualification.  If the best bot was way ahead of the second best bot, humans could throw games against the second best bot in order to ensure their champions did not have to face the number 1 bot.  Obviously I don't think this is all that likely, but it would be nice if it were excluded.

I still think the simplest temporal order would be:
1) Choose human participants (submit to ICGA only)
2) Submit bots
3) Human participants revealed
4) Humans allowed to play any bots they want for time T
5) Bot tournament.
6) Winner of bot tournament is challenger
7) Challenge

Title: Re: 2007 qualifying for computer championship
Post by Fritzlein on Nov 24th, 2005, 1:07pm

on 11/23/05 at 00:05:56, 99of9 wrote:
An interesting twist.  Unfortunately it opens the possibility of humans trying to manipulate the bot qualification.  If the best bot was way ahead of the second best bot, humans could throw games against the second best bot in order to ensure their champions did not have to face the number 1 bot.  Obviously I don't think this is all that likely, but it would be nice if it were excluded.

This is a valid objection, but I so much like the idea of using bot vs. human games in qualifying for a bot vs. human event that I want to take a close look at whether this objection can be overcome.

First, I do think that at least developers should be excluded from playing their own bots, because clearly developers have an incentive to throw games against their own bot.  So the two developers and the three defenders are out.  I concede that at present this makes the pool of potential human players smaller, and that a smaller pool of humans makes this qualifying scheme more open to manipulation.  However, the small numbers could be meliorated somewhat by putting the two contending bots in the lobby and encouraging everyone who has beaten Arimaazilla to play them.  Also there would be more participation if the time control were set to something reasonably brisk, say 45 seconds per move.

Second, Omar could write into the rules a provision that all four games will be disregarded for anyone who has, in Omar's opinion, purposely played weakly in any game.  But I admit that the subjective nature of poor play means this rule could be enforced only in extreme cases.

The third and most important protection, however, comes from the motivation of human players.  How will humanity (apart from the five already ruled out) behave, and why?

Right now, I imagine most people will want to see the strongest available bot play for the title, even if they are rooting for the bot to lose.  Nobody thinks even the best bot has a chance to win at present, so bot-haters will want to see the best bot in the Challenge so it can be decisively crushed.   For different reasons, the people who would want the bot to win the Challenge will certainly want the best bot there.  And then there are the majority of honest people who want the top bot playing for the prize just because that's the way it should be.

But let's fast forward a few years to a worst case scenario.  Suppose a new, super-strong bot comes out and crushes the competition in the bot tournament.  Suppose also the bot has no game history against humans, and no losses against bots, so nobody knows if it even has any weaknesses.  Suddenly humanity would be afraid of losing the Challenge.

A few people (probably a small percentage, but let's say they exist) are motivated by not wanting humanity to fail, and are also unscrupulous enough to try to throw the qualifying in favor of the weaker bot.  However, in order to succeed, they must be able to beat the strongest bot, and indeed be able to do so in their first two tries ever against that bot.  I say that if there exists a cadre of people outside of the defenders who beat the bot without practice, presumably that bot was scant threat to humanity anyway.

One could say that even if humanity has a 100% chance of winning the Challenge, it is still unfair to the bot developer whose bot lost in qualifying to an inferior bot.  It is true that this would be unfair.  But if humanity has no chance of losing the Challenge, I don't see where humans get the motivation to manipulate bot qualifying.  What's the point?  So either the motivation or the ability to be dishonest will always be lacking.  It seems that this minimizes the potential problem to the point that we might reasonably ignore it.

But I do see it is a balancing act.  If bots qualify for the Challenge by playing humans rather than by playing other bots, it has some very nice positives, but I see how negatives can creep in as well.


Quote:
I still think the simplest temporal order would be:
1) Choose human participants (submit to ICGA only)
2) Submit bots
3) Human participants revealed
4) Humans allowed to play any bots they want for time T
5) Bot tournament.
6) Winner of bot tournament is challenger
7) Challenge

If Omar decides to keep it that the winner of the bot tourney automatically becomes the challenger, then this order of events appears to be very sensible.  Depending on the length of time T, it protects the interests of both the bot developers and the Challenge defenders.

Title: Re: 2007 qualifying for computer championship
Post by omar on Nov 28th, 2005, 4:09pm

on 11/23/05 at 00:05:56, 99of9 wrote:
An interesting twist.  Unfortunately it opens the possibility of humans trying to manipulate the bot qualification.  If the best bot was way ahead of the second best bot, humans could throw games against the second best bot in order to ensure their champions did not have to face the number 1 bot.  Obviously I don't think this is all that likely, but it would be nice if it were excluded.


Good point, Toby. I had a feeling someone would bring this up. This possibility occurred to me, but I didn't think it would be too much of a problem. For your statement to hold, we have to make the assumption that somehow people already know which bot is stronger against humans. I would argue that people playing the two bots would not already know this, if the bots have not built up much of a historical record against humans. The new format does not require any record against humans prior to the computer championship.

Title: Re: 2007 qualifying for computer championship
Post by omar on Nov 28th, 2005, 4:20pm

on 11/24/05 at 13:07:36, Fritzlein wrote:
A few people (probably a small percentage, but let's say they exist) are motivated by not wanting humanity to fail, and are also unscrupulous enough to try to throw the qualifying in favor of the weaker bot.  However, in order to succeed, they must be able to beat the strongest bot, and indeed be able to do so in their first two tries ever against that bot.  I say that if there exists a cadre of people outside of the defenders who beat the bot without practice, presumably that bot was scant threat to humanity anyway.


Yes, I think this is also a very strong argument as to why we need not worry about everyone conspiring to throw games against the weaker bot to get it into the challenge.

However, consider a scenario where both bots have not lost any games, but because people purposely lost faster to the weaker bot (assuming they somehow already knew which was weaker), the weaker bot gets selected based on the number of moves rule.

Perhaps to further eliminate that possibility we could add an exception to the rules that if both bots have not been defeated then the bot which won the computer championship goes on to play in the challenge match.

Title: Re: 2007 qualifying for computer championship
Post by omar on Nov 28th, 2005, 4:33pm

on 11/23/05 at 00:05:56, 99of9 wrote:
I still think the simplest temporal order would be:
1) Choose human participants (submit to ICGA only)
2) Submit bots
3) Human participants revealed
4) Humans allowed to play any bots they want for time T
5) Bot tournament.
6) Winner of bot tournament is challenger
7) Challenge


We would still have the problem that in step 4 we don't know if the bot was really playing to its full potential. The bot has no stake in those games. So as I mentioned earlier, it would be possible to look at the time to decide how well to play. I also plan to add an 'Event' field to the game info that is available to the bots, so even that could be used to decide how to play. The bots need to have a stake in the games they play against the humans when establishing their record against humans. Ratings are usually enough motivation, but tournaments and contests make it even stronger.

Title: Re: 2007 qualifying for computer championship
Post by Fritzlein on Nov 29th, 2005, 5:35pm

on 11/28/05 at 16:20:44, omar wrote:
However, consider a scenario where both bots have not lost any games, but because people purposely lost faster to the weaker bot (assuming they somehow already knew which was weaker), the weaker bot gets selected based on the number of moves rule.

Perhaps to further eliminate that possibility we could add an exception to the rules that if both bots have not been defeated then the bot which won the computer championship goes on to play in the challenge match.


Or, more simply, if the top two bots end up tied on won-loss record, then the tie is broken in favor of the bot that won the Computer Championship, and game length doesn't enter into it ever.  With that tiebreak in effect, you would actually have to beat a bot to conspire against it, rather than merely losing faster to the other bot.
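
To show how little machinery this tiebreak needs, here is a minimal sketch.  The function name pick_challenger and the (wins, losses) record layout are made up for the example, and it assumes both bots played the same number of runoff games, so that comparing wins first and losses second is the same as comparing won-loss records.

```python
def pick_challenger(records, champion):
    """Pick the bot that goes on to the Challenge.

    records: dict mapping bot name -> (wins, losses) against humans in the runoff.
    champion: name of the bot that won the Computer Championship.
    """
    def key(bot):
        wins, losses = records[bot]
        # Better record first; on an exact tie the Computer Championship
        # winner is preferred.  Game length is never consulted.
        return (wins, -losses, bot == champion)

    return max(records, key=key)
```

For example, pick_challenger({"A": (10, 2), "B": (10, 2)}, champion="B") picks "B", and a conspirator can only change that by actually beating B, not by losing quickly to A.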

Title: Re: 2007 qualifying for computer championship
Post by Fritzlein on Nov 29th, 2005, 5:44pm

on 11/28/05 at 16:33:51, omar wrote:
The bots need to have a stake in the games they play against the humans when establishing their record against humans.

I hadn't quite understood this part of your proposal before, Omar.  By having the bot vs. human games be part of a runoff qualifying process, you help ensure that the bots will try their hardest in the very games where we need them to be playing at full strength.

When you add it to all the other benefits of qualifying for the Challenge on the basis of games versus humans, the benefit of encouraging full-strength play ices the cake.  Even though I'm still slightly worried about the possibility of conspiracy against a particular bot, I think the proposed runoff system is much better than what we have now.


