Arimaa Forum - Print Page


    
      
        Arimaa Forum
        (http://arimaa.com/arimaa/forum/cgi/YaBB.cgi)
      

        Arimaa >> Off Topic Discussion >> Rating of a perfect chess player
        
(Message started by: omar on Oct 21^st, 2004, 11:43pm)

Title: Rating of a perfect chess player
Post by omar on Oct 21^st, 2004, 11:43pm

I've often wondered what the rating of a perfect chess player would be. I know it has to be some finite number, although it would probably be an appoximate number with an error bound. No one has ever crossed a rating of 3000; so are we almost there; are we half way there; or are we not even close. Im not too concerened about having an exact number, I would be happy to have an approximate number that would give me some idea of where we are compared to a perfect player.

Is there any way to find this number; even approximately?

The other day I thought of a way to approximate this number. But I don't if it will actually work and it requires a lot of data. So Im posting it to see what others think.

First lets assume that a perfect game of chess is a draw (kind of like a complex version of tic-tac-toe). This assumption could be wrong, but for the moment lets assume it is true. Now if two perfect players were playing each other the result would always be a draw. The draw percentage would be 100%. Suppose we make a graph of chess rating (x-axis) vs percentage of draws (y-axis). The games used for this graph should be games between two players that have a fairly close rating (maybe 50 points of one another). We know that the percentage of draws among beginners is about 3% and among the top players it is above 50%. I wonder what this line or curve looks like. Maybe we can extrapolate the line to 100% draws and see what rating it corresponds to. Would this be the appoximate rating of a perfect chess player. Has anyone seen a graph like this?

Now there is the problem that in games between humans mutual draws increase the percentage quite a bit more than it really needs to be. So we have to keep in mind that the rating number may not really be the rating of a perfect chess player, but rather an approximation of what rating we can expect 100% draws among the human players. But if we use game between computers where draws were not so easily offered or accepted then it might give is a better ideas of the real chess max rating.

Jeff Sonus of www.chessmetrics.com has a database of lots of human games. I've contacted him to ask about this and waiting to hear from him.

Does anyone know if there is a big data base of computer games? That might give a better approximation.

Omar

Title: Re: Rating of a perfect chess player
Post by MrBrain on Oct 22^nd, 2004, 10:56am

As with a previous discussion, I think "perfect chessplayer" is ill-defined. What does this mean? There may be many moves in a given position that lead to a draw, while other moves are a loss. But which of the drawing moves is the "perfect" move? This may depend highly on the opponent. Against another "perfect player", the distinction is moot. But against someone with say a random strategy, a C player, an Expert, etc, a different "optimal" move may result.

Title: Re: Rating of a perfect chess player
Post by Fritzlein on Oct 22^nd, 2004, 12:06pm

I think MrBrain is quite right about the difficulty of defining perfect play. This problem has been much-discussed over the years, at least as far back at the late 1890's when the "modern" players thought they could basically codify all the principles of good chess play, which would eventually lead to equalization and a draw. Lasker was somewhat of an anomaly in his day because he wanted to provoke a fight, even if doing so required making an objectively bad move. Capablanca was sure that chess was heading for a death of draws, but years after his grim predictions, Tal thrashed his opponents with "unsound" moves.

To put MrBrain's question in a different way, would a perfect player playing against me choose a "good" move (i.e. one which can't lose with further perfect play) but which I know how to counter in preference to a "bad" move (i.e. one which will necessary lose with perfect play by me) but which I will likely blunder in response to? Is 10% chance of winning and 90% chance of drawing more perfect than 80% chance of winning and 20% chance of losing? In practice it is rather limiting to define perfect play as excluding any chance of losing.

The widespread (and unjustified) faith in the objective meaningfulness of chess ratings has given a new spin to an old discussion. Somehow it seems less ill-defined to ask "What would the rating of a perfect player be?" than it does to ask "What is perfect play?". In truth, however, assigning a rating to perfect play adds a new layer of ill-definedness on top of the previously unresolved questions of perfect play.

I would argue that a high percentage of draws, while correlated with good play, is sufficiently distinct to scuttle the project of extrapolating the percentage of draws to a point of 100% draws and considering that perfect play. One excellent counter-example is Ulf Andersson, who in his prime drew a huge percentage of his games due to incredible technique at saving theoretically lost endgame positions. Yet even when he was drawing a higher percentage of games than everyone else at the top of the chess world, he still wasn't the best, because he didn't win as many as some others.

Now, having made it perfectly clear that I think Omar's proposal is doomed and will produce a result that is not meaningful, let me say that it holds some mathematical interest for me. I don't have time right now, but I will try at some point to post a mathematical model which, if true, would allow the desired calculation to be made. Of course the model doesn't correspond to reality, but I am interested in the result of the calculation nonetheless.

Title: Re: Rating of a perfect chess player
Post by 99of9 on Oct 22^nd, 2004, 1:12pm

Here's my definition of a perfect player that I'd be interested in knowing the rating of:

1) when in a theoretically winning position, randomly chooses from the moves which will keep it in a winning position.

2) when in a theoretically drawn position, randomly chooses from the moves which will keep it in a drawn position.

3) when in a theoretically lost position, randomly chooses from all legal moves.

Of course, as we discussed for tic-tac-toe, this is not the smartest player, and nor is it the player which will obtain the highest rating possible when playing in a human tournament. But it is a well defined, and traditionally "perfect" player.

I do not agree with Fritz that it is limiting to talk about perfect play as not having any chance of losing. His "winning probability" argument is not good when you don't know in advance what your opponent is. The opponent may well be perfect too. You are really talking about psychological exploitation of imperfect humans. Although I agree this is certainly possible, I don't think an alien (who had solved chess) would call this anything near perfect. Call it "exceptional used car salesman", able to sell junk.

However, as Mr Brain points out that we talked about in regard to tic-tac-toe, there are even different types of the traditional perfect. Some deliberately lay traps for humans when playing from a lost/drawn position. These are also psychology-dependent, so for simplicity this model player does not include this factor.

I think it is appropriate to ask the question Omar asked if this model player was placed in a pool of non-learning entities. (Perhaps different pools with incrementally increasing ratings as the rating of the player goes up, and wait til it reaches equilibrium). I don't know how well this draw % method would approximate the exact result, but I don't know of any better methods either. I'll be interested in seeing your model Fritz.

Title: Re: Rating of a perfect chess player
Post by Fritzlein on Oct 22^nd, 2004, 3:18pm

OK, here's a (dubious) mathematical model for chess performance, based on the idea that you don't win with good moves, but you lose with bad moves, and if both players make good moves (or offsetting bad moves) it's a draw.

Imagine two riflemen shooting at a target. If both hit the target, they keep playing. Also if both miss the target they keep playing. But if one hits and the other misses, the one who hits wins the game.

If you assume that each player has a fixed percentage chance to hit the target on each round, independent of the other player and independent of previous shots, and you assume that the game goes on indefinitely until one or the other is victorious (i.e. draws are not possible), then it turns out that the Elo ratings model is a perfect match for this scenario. That is to say, from knowing how often A beats B, and how often B beats C, you can predict precisely how often A beats C. The prediction is independent of the players being good shots or poor shots; you can't deduce from wins and losses alone what anyone's percentage chance to hit the target is. The relative gaps are all that matter, and Elo ratings make the right transitive predictions. (A somewhat remarkable mathematical fact, actually.)

Now this is a terrible model for chess precisely because your move does influence the chance of the other player making a blunder. You aren't each trying to make good moves in a vacuum, you are trying to make good moves against each other. But let's forget that for a while and run with this model.

Suppose that instead of allowing the target shooting to go on indefinitely, we cut off the game after a finite number of shots, say 40, and call it a draw if it wasn't decisive before then. In chess this might correspond to the game trading down equally to a point that no more mistakes can be made after move 40. Now some more complicated equations emerge, with rather different behavior.

First, if we count a draw as half a win and half a loss, Elo's model no longer holds true. The ratings at the top end start getting squeezed together by the draws. In a manner of speaking, the game isn't long enough any more for the most excellent players to demonstrate their full superiority: A 99.99% chance of hitting usually draws versus a 99.999% chance of hitting, even though the latter would win about ten times as often if the game were indefinitely long.

Second, when draws are allowed we lose the property that the scale doesn't matter. That is to say, from knowing how often A beats B, and how often B beats C, you can't predict precisely how often A beats C.

Third, and most relevant for this discussion, is that now if you know for two players A and B the chance that A wins, the chance that B wins, and the chance that A and B draw, you can infer exactly where they are on the absolute scale. That is to say, you can deduce the probability with which each player hits the target on any shot. Solving the question analagous to Omar's chess question, you can deduce how close each of them is to hitting 100% of the time, and calculate from that how often each would draw against a perfect (100% accurate) player.

(Actually, there is still a problem because draws also get more common as the players both get close to zero percent accuracy. To eliminate the problem one could assume that no one is intentionally trying to miss, and that both players have at least a 50% chance of hitting on every shot, so that increasing the skill of equal players always increases the chance of a draw.)

Anyway, I emphasize again how bogus it would be to apply this model to chess. There could be two aggressive players who seldom draw against each other, and we would infer from this that they aren't very good, whereas two conservative players who often draw would be assumed (by this model) to be excellent players. Still, even knowing how bogus the model is, we could take any given set of chess game result data, fit it as closely as possible to the model, and see what results fall out.

Just for fun.

Title: Re: Rating of a perfect chess player
Post by RonWeasley on Oct 22^nd, 2004, 3:59pm

I wasn't part of the previous discussions about perfect play, but I would like to propose a variation on 99's definition.

1) when in a theoretically winning position, chooses the variation forcing the win in the fewest number of moves. If there is more than one with equal number, chooses the variation where the opponent is most likely to make a mistake resulting in a win in fewer than the original number of moves.

2) when in a theoretically drawn position, chooses the variation maintaining the draw where the opponent is most likely to make a mistake resulting in a winning position.

3) when in a losing position, chooses the variation where the opponent is most likely to make a mistake resulting in a non-losing position.

First, I think fewest moves to a win is most perfect.

So what do I mean about the opponent being most likely to make a mistake? It depends on the type of opponent, but relates to maximizing the likelihood of the result. For a computer opponent, this may mean the largest number of plys before its value function detects the change in state. For a human opponent, based on perceived experience level, this could mean seeking non-intuitive variations that you think you can analyze better. I'm falling short of a quantitative description of this quality, but I think it is an essential part of the intelligence question Omar is posing. Comments?

The play of this perfect player would produce draws against other perfect players, presumably, and wins almost every time against non-perfect players. The rating would depend on the presence of perfect, near-perfect, and not-close-enough-to-perfect players.

So what's the rating if you always win? What if half the games are draws and the other half wins? If a fixed fraction, f, of opponents are perfect, what is the rating as a function of f, assuming an equal number of games with all opponents?

Sorry in advance if this is obsolete thinking.

Title: Re: Rating of a perfect chess player
Post by Fritzlein on Oct 22^nd, 2004, 4:06pm

on 10/22/04 at 13:12:12, 99of9 wrote:

That is a reasonable definition of a perfect player. I would predict, however, that such a player, active on the FIDE tour, would have a lower rating than Kasparov. I suspect that most grandmasters would have little trouble preserving the draw (assuming the initial position in chess is a draw) against aimless play like the above, whereas most grandmasters have considerable trouble preserving a draw against Kasparov.

For example, I expect that in the opening position in chess, 1.a3 preserves the draw for white, so your perfect player is as likely to play that as 1.e4. Even later in the game, whenever your perfect player builds up any pressure that isn't enough for the game to be "winning", it will be happy to give away that advantage, and even make a downright disadvantageous move as long as such a move doesn't make the position "theoretically lost".

Just on a hunch I'll peg your perfect player at a FIDE rating of 2600. Yes, it will gain rating points from playing anyone higher-rated, but it will lose rating points to anyone lower-rated who can hold the draw. Playing against the whole field, I'd expect its rating to average out somewhere well below world-championship level.

It's an interesting question the way you pose it, but I think most people have something more like Weasley's definition in mind, so as to insure that the perfect player will have a rating higher than any human.

Title: Re: Rating of a perfect chess player
Post by Fritzlein on Oct 22^nd, 2004, 4:23pm

on 10/21/04 at 23:43:40, omar wrote:

First lets assume that a perfect game of chess is a draw (kind of like a complex version of tic-tac-toe). This assumption could be wrong, but for the moment lets assume it is true. Now if two perfect players were playing each other the result would always be a draw. The draw percentage would be 100%. Suppose we make a graph of chess rating (x-axis) vs percentage of draws (y-axis). The games used for this graph should be games between two players that have a fairly close rating (maybe 50 points of one another). We know that the percentage of draws among beginners is about 3% and among the top players it is above 50%. I wonder what this line or curve looks like. Maybe we can extrapolate the line to 100% draws and see what rating it corresponds to. Would this be the appoximate rating of a perfect chess player. Has anyone seen a graph like this?

I have not seen a graph like this, but I doubt the data would cooperate to the extent of producing a straight line. How will you extrapolate if it is a curve? What if it appears that the percentage of draws approaches 100% only asymptotically as the ratings incease? That would imply that the perfect player has an infinite rating.

Or what if (like winning percentage as a function of rating difference) the graph is concave up until 50% and then concave down thereafter? The part of the graph we have (i.e. up to 50%) will look radically different from the remainder, making extrapolation very dicey.

That said, your idea is probably more likely to produce a reasonable result than mine is. :-) I'm curious to see such a graph, regardless of what we think it means about perfect chess. It might well be concave up everywhere, which is to say that the percentage of draws actually approaches 100% more and more quickly as the ratings increase. I won't even venture a guess on this score.

Title: Re: Rating of a perfect chess player
Post by omar on Oct 22^nd, 2004, 5:50pm

I wasn't expecting this to get into a discussion of what is a perfect player. But since it hasn't ever been discussed to death (at least on this forum) lets do it.

The model of a perfect player which Toby has described is what I generally think of as a perfect player. It randomly samples only the perfect moves from any given position. Maybe we can call it the 'random perfect player' (RPP) :-) There can also be perfect players which use some heuristics to select their move, but always select from the set of perfect moves for any given position. Lets call these the 'heuristic perfect players' (HPP).

Now in order to have ratings for our perfect players the field of players must contain some 'non-perfect players' (NPP); otherwise all games would have the same result and all our perfect players would always have the initial rating we assigned them.

If our field of players includes non-perfect players it is possible for some of the perfect players to have a higher rating than other perfect players. The moves selected by some of the HPP may lead more often to positions where the NPP are more likely to blunder (i.e. selecting a move from the losing set of moves) and thus resulting in more wins for the HPP. It is also possible that the moves selected by some of the HPP may lead more often to positions where the NPP are less likely to blunder and thus result in more draws for the HPP. So there will be a range of ratings for the perfect players. Now keep in mind that even the lowest rated perfect player will kill you if you make even one mistake throughout the game; after all it is a perfect player :-) So against most of the NPP all of the perfect players will win. Only against some of the NPP which are on the verge of playing almost perfectly will some perfect players win more often while others draw more often. So I would not expect the range of ratings for the perfect players to be very wide when compared to the difference between the average player in the field and the lowest rated perfect player.

I would suspect that the rating of the RPP will be somewhere in the middle of this range.

So the bottom line is: it is possible to define at least conceptually what a perfect player is; there can be a range of ratings among the perfect players; and that (at least rating wise) the RPP is probably a good example of an ideal perfect player.

Now as I said in my first post, I am only interested in an appoximate rating of a perfect chess player. The word approximate is very important here. So in this case there is no need for us to really even define what a perfect player is (but it is fun to think about it :-) ).

All Im proposing is making a graph of ratings vs 'draw percentage', extrapolating the line and reading off the rating where the line crosses 100% draws. The number might not be accurate, but I figure it is a lot easier to try doing this than to solve chess :-). It can't hurt to see what comes out; who cares if it is not very accurate.

So has anyone ever seen such a graph?

Omar

Title: Re: Rating of a perfect chess player
Post by 99of9 on Oct 22^nd, 2004, 6:35pm

on 10/22/04 at 16:06:18, Fritzlein wrote:

That is a reasonable definition of a perfect player. I would predict, however, that such a player, active on the FIDE tour, would have a lower rating than Kasparov.

I agree. Basically one's ability to hold a 0 net score against this beast relies on the player being able to not blunder for an entire game. It appears that humans may be able to do this most of the time now in chess, therefore they have surpassed Random-Perfect. I don't think we have surpassed Random-Perfect in arimaa yet (if arimaa is a draw, and we certainly haven't if arimaa is a win).

By the way: I participated in a computer Rock-Scissors-Paper (Roshambo) competition. Random-Perfect does badly in such competitions, because although it never loses and always draws (by playing randomly), it never exploits weak players. So that is another game where I think we have surpassed Random-Perfect.

http://www.cs.unimaas.nl/~donkers/games/roshambo03/

I agree however that the "answer" omar's method will provide is the answer to Ron's player's strength. Because humans are always trying to trap one another, and actively force their opponent toward a loss, they are probably on the way toward approximating Ron's player in some human-context kind of way.

Ron, here's one way you could have a non-context-dependent way of defining "good" choices of moves:

Explore the full game subtree corresponding to each satisfactory move. Count the number of leaves that are wins, subtract the number of leaves that are losses, and the highest total score is the move you should play.

This is still not the best player a human could ever face, but it would be pretty scary! I think this player would win any non-learning human chess tournament today. (the only problem is that since this player is deterministic, once someone drew, humans could learn the correct path and it would forever more be a draw - that's why I specify non-learning.)

Title: Re: Rating of a perfect chess player
Post by omar on Oct 25^th, 2004, 10:22am

on 10/22/04 at 16:23:48, Fritzlein wrote:

Actually I was also wondering what the graph might be like. I don't think the graph will be concave down or asymptotically approach 100%. Because at some finite rating it has to actually hit 100%. Think of the set of end nodes being sampled at a certian rating when the draw percentage is say 90%. As the rating (of both players) increases we are sampling a smaller subset of these end nodes. A subset which most likely contains fewer non-drawing nodes. Convex down would mean that we are throwing out drawing nodes faster than non-drawing nodes.

My guess is that it will probably be a curve that is convex up the whole way from 0 to maxRating.

Title: Re: Rating of a perfect chess player
Post by omar on Oct 27^th, 2004, 5:10pm

on 10/22/04 at 18:35:05, 99of9 wrote:

If humans are able to surpass Random-Perfect then we may never see a day when humans are consistently defeated no matter how good the computers get. Kasparov once said in an interview that he worries about the day when humans will never be able to win a game against the best computer. But this may not happen if we can reach Random-Perfect. Chess may well be a complex game in which humans can reach a level of perfect play. This may have already happend in Checkers. See:
http://www.wylliedraughts.com/Tinsley.htm

But I don't think that humans have surpassed random perfect (most of the time) in chess yet. If that was the case then most of the games among the top rated humans would be natural draws. Currently over 60% of such games are draws, but many of them are due to mutal draws. If mutual draws were not allowed I think it would be less than 50% draws.

Title: Re: Rating of a perfect chess player
Post by 99of9 on Oct 27^th, 2004, 7:17pm

on 10/27/04 at 17:10:27, omar wrote:

If humans are able to surpass Random-Perfect then we may never see a day when humans are consistently defeated no matter how good the computers get.

But I don't think that humans have surpassed random perfect (most of the time) in chess yet. If that was the case then most of the games among the top rated humans would be natural draws.

No, I disagree.

Humans do not need to *become* perfect to hold their own ELO-wise against random-perfect. They just have to be able to:

1) hold their own (draw) most of the time against R-P

and

2) be able to beat the rest of the field more often than R-P does.

I believe that even I might have the chess ability to achieve number 1. This may sound crazy, but think about it. You have to remember the massive difference between RANDOM-perfect and INTELLIGENT-perfect. R-P is just trying to stay in the drawing regime, it is not trying to achieve anything but that. Every time I threaten a piece swap, it will most likely move some other random piece, and allow me to make the swap. The more pieces I swap, the easier it becomes for even me to draw an endgame against a player playing without any purpose at all.

Thinking about the fact that R-P does not even try to lay traps, I'm pretty sure all grandmasters would be able to get draws against it, as long as they didn't try for a win. So it would perform exactly mid-field in any high level tournament (or slightly above that if someone blundered), as long as everyone knows who they're playing. In form Kasparov/Kramnik/Anand would still beat lower GMs, so condition #2 would also be achieved.

Intelligent-Perfect is an entirely different kettle of fish. I agree with Kasparov that there will come a point where humans can hardly ever get a draw against a beast of this nature (or even a 100-ply searcher). I think this is what you're really going to get an estimate of when you talk about draw percentages converging to 100%.

Title: Re: Rating of a perfect chess player
Post by MrBrain on Oct 28^th, 2004, 9:53am

Random perfect would not be as strong as what most would consider perfect. This reminds me of a story I read (I believe in the excellent book on computer checkers "One Jump Ahead") where a parameter was reversed in a chess computer in a computer tournament. (Going from memory here, so the story may not be exact, but this is the gist of it.)

When the computer found winning positions in the endgame database, instead of picking the move to win in the least number of moves, it instead picked the move to win in the most moves. As a result, the computer played very bizarre looking moves that made everyone think that the computer blundered. But eventually, the 50-move draw rule comes into effect forcing the computer to eventually mate.

Title: Re: Rating of a perfect chess player
Post by Fritzlein on Oct 29^th, 2004, 10:39am

on 10/22/04 at 15:59:22, RonWeasley wrote:

This is a reasonable definition of perfect play, and I think it corresponds well with most people's intuitions about perfect, but this definition has the possible flaw that an "imperfect" player can get a higher rating. According to the Elo system, the more points you score the higher your rating is. So suppose you are in a theoretically drawn position where the trickiest (against this particular human opponent) sound move is still fairly boring and will win only 20% of the time and draw 80% of the time, whereas making the best dangerous, unsound move will win 80% of the time and lose 20% of the time. Your perfect player will choose the former because it is only choosing among moves which preserve the draw, but a more aggressive player who will risk losing scores more points on average (0.8 vs. 0.6) and thus earns a higher Elo rating.

This sort of conundrum complicates the question "What is the Elo rating of a perfect player", because the notion of having as high a rating as possible may actually contradict the notion of perfect play, unless you want to actually define perfect play in terms of Elo rating, in which case you have to specify the opposition.

Title: Re: Rating of a perfect chess player
Post by RonWeasley on Nov 1^st, 2004, 7:35am

Fritzlein,

You've provided a great example of the difference between what I call a perfect player and a player that maximizes its ELO rating. My version of perfect player would not risk a loss, even if the expected ELO value were higher by taking the risk.

With this description of the ELO maximizer, we now must consider a co-evolving landscape where all ELO maximizers must tune their estimates of each other. And their estimates of their opponents' estimates. This is closer to the human game. Did Omar have this in mind specifically for the Arimaa machine intelligence question?

Title: Re: Rating of a perfect chess player
Post by omar on Nov 1^st, 2004, 3:25pm

It's quite interesting that an imperfect player could possibly have a higher ELO than the highest rated perfect player.

But I was really just interested in the rating of a perfect player as modeled by the random perfect player described by Toby.

I ran some experiments recently to see what the rating of a random perfect player would be in tic-tac-toe. With zero being the rating of the random player the rating of a random perfect player comes out to about 540.

Of course as we've discussed before, the rating of a player depends somewhat on the other players in the field.
What I did for the other players was to limit them by the number of plys they search. The players look a certain number of plys ahead using only win or lose information (no heuristics) to narrow down the list of valid moves and randomly select from the remaining moves. So player 0 looks zero ply ahead (it is the random player). Player 1 looks one ply ahead and selects the winning move if there is one; otherwise picks a random move. Player 2 looks two plys ahead and is able to block a simple win and find one move wins. At player 6 we have the perfect player as defined by Toby's definition of random perfect.

I had the players play 2000 games against each other player. Half of the games as the first to move (X) and the other half as second (O). Here is the data from those runs.

p1-p0 would mean player 1 against player 0.

p0-p0
as X wins=577 draws=121 lost=302 games=1000
as O wins=288 draws=137 lost=575 games=1000
p1-p0
as X wins=831 draws=72 lost=97 games=1000
as O wins=515 draws=81 lost=404 games=1000
p1-p1
as X wins=693 draws=40 lost=267 games=1000
as O wins=260 draws=51 lost=689 games=1000
p2-p0
as X wins=896 draws=93 lost=11 games=1000
as O wins=653 draws=258 lost=89 games=1000
p2-p1
as X wins=864 draws=110 lost=26 games=1000
as O wins=649 draws=211 lost=140 games=1000
p2-p2
as X wins=302 draws=515 lost=183 games=1000
as O wins=191 draws=495 lost=314 games=1000
p3-p0
as X wins=935 draws=55 lost=10 games=1000
as O wins=692 draws=215 lost=93 games=1000
p3-p1
as X wins=945 draws=43 lost=12 games=1000
as O wins=674 draws=175 lost=151 games=1000
p3-p2
as X wins=512 draws=350 lost=138 games=1000
as O wins=232 draws=459 lost=309 games=1000
p3-p3
as X wins=541 draws=264 lost=195 games=1000
as O wins=168 draws=296 lost=536 games=1000
p4-p0
as X wins=950 draws=46 lost=4 games=1000
as O wins=750 draws=211 lost=39 games=1000
p4-p1
as X wins=942 draws=46 lost=12 games=1000
as O wins=734 draws=192 lost=74 games=1000
p4-p2
as X wins=531 draws=376 lost=93 games=1000
as O wins=213 draws=614 lost=173 games=1000
p4-p3
as X wins=534 draws=330 lost=136 games=1000
as O wins=170 draws=502 lost=328 games=1000
p4-p4
as X wins=318 draws=518 lost=164 games=1000
as O wins=149 draws=513 lost=338 games=1000
p5-p0
as X wins=950 draws=35 lost=15 games=1000
as O wins=723 draws=159 lost=118 games=1000
p5-p1
as X wins=959 draws=27 lost=14 games=1000
as O wins=616 draws=136 lost=248 games=1000
p5-p2
as X wins=785 draws=190 lost=25 games=1000
as O wins=222 draws=503 lost=275 games=1000
p5-p3
as X wins=762 draws=199 lost=39 games=1000
as O wins=222 draws=470 lost=308 games=1000
p5-p4
as X wins=682 draws=282 lost=36 games=1000
as O wins=207 draws=504 lost=289 games=1000
p5-p5
as X wins=697 draws=254 lost=49 games=1000
as O wins=57 draws=266 lost=677 games=1000
p6-p0
as X wins=960 draws=40 lost=0 games=1000
as O wins=790 draws=210 lost=0 games=1000
p6-p1
as X wins=972 draws=28 lost=0 games=1000
as O wins=768 draws=232 lost=0 games=1000
p6-p2
as X wins=749 draws=251 lost=0 games=1000
as O wins=153 draws=847 lost=0 games=1000
p6-p3
as X wins=762 draws=238 lost=0 games=1000
as O wins=163 draws=837 lost=0 games=1000
p6-p4
as X wins=675 draws=325 lost=0 games=1000
as O wins=96 draws=904 lost=0 games=1000
p6-p5
as X wins=702 draws=298 lost=0 games=1000
as O wins=100 draws=900 lost=0 games=1000
p6-p6
as X wins=0 draws=1000 lost=0 games=1000
as O wins=0 draws=1000 lost=0 games=1000

I used this data to iteratively calculate the ratings using the performance rating formula. The ratings of the players came out as:

p1=94 p2=331 p3=364 p4=399 p5=441 p6=540
with p0 fixed at 0.

Initially it shows the same decreasing returns in ratings with increasing plys as seen in Chess and other games. But when it starts getting close to perfect the ratings shoot up again.

If chess also has a similar curve (especially when it approaches random perfect) then trying to determine the rating of the random perfect player by extropolating will probably not be possible (as Karl had mentioned earlier).

It would be interesting to see if this kind of a curve also appears in games which are slightly more complex than tic-tac-toe.

Title: Re: Rating of a perfect chess player
Post by Fritzlein on Nov 1^st, 2004, 6:59pm

on 11/01/04 at 15:25:26, omar wrote:

p0-p0
as X wins=577 draws=121 lost=302 games=1000
as O wins=288 draws=137 lost=575 games=1000
p1-p1
as X wins=693 draws=40 lost=267 games=1000
as O wins=260 draws=51 lost=689 games=1000
p2-p2
as X wins=302 draws=515 lost=183 games=1000
as O wins=191 draws=495 lost=314 games=1000
p3-p3
as X wins=541 draws=264 lost=195 games=1000
as O wins=168 draws=296 lost=536 games=1000
p4-p4
as X wins=318 draws=518 lost=164 games=1000
as O wins=149 draws=513 lost=338 games=1000
p5-p5
as X wins=697 draws=254 lost=49 games=1000
as O wins=57 draws=266 lost=677 games=1000
p6-p6
as X wins=0 draws=1000 lost=0 games=1000
as O wins=0 draws=1000 lost=0 games=1000

This is bad news for your project with respect to chess. The perfect player is p6. Plot the drawing percentages of games between equal players of increasing strength from p0 to p5. How the heck would you extrapolate that forward?

It is interesting that this exactly models one of the difficulties you anticipated for chess: different styles of player have different drawing percentages. The even-ply players are good at drawing, whereas the odd-ply players are good at winning. So being a stronger player doesn't necessarily translate into drawing more.

Title: Re: Rating of a perfect chess player
Post by omar on Nov 2^nd, 2004, 9:27am

Yes, it doesn't look very promising. The only hope is that maybe this was too simple of a problem and so has a very descrete and choppy space. Maybe the space for chess is more smoother. Also the behavior of the players I used was very descrete and choppy. The behaviour of real players would be more smooth. So it may still be that the plot for chess is more smoother.

Even so, if the curve shoots up suddenly when it approachs perfect play, then extropolating will cause us to predict a much higher rating for perfect play than what it actually is.

Title: Re: Rating of a perfect chess player
Post by omar on Nov 6^th, 2004, 12:12pm

I finally got around to finding the data and trying out this proposal. The trend in the data is much better for chess than it was for T-T-T. I would say that a perfect chess player is probably around 3700.

Have a look at the graph and see what you think.

http://arimaa.com/arimaa/rating/humanChessDrawPerc.xls

You might have to right click on the link and select 'Save Link Target As' to download it.

Title: Re: Rating of a perfect chess player
Post by 99of9 on Nov 6^th, 2004, 1:09pm

Interesting.

A linear regression through those points would give a 100% draw at a rating of 4500. But I can see why you estimated smaller, because the points do seem to be curving up a little.

Title: Re: Rating of a perfect chess player
Post by Fritzlein on Nov 6^th, 2004, 2:15pm

Interesting graph. I'm pretty skeptical of the extrapolation to 3700. The points that most make it look tipped up are the first and the last, which are based on 5 and 8 games respectively. Throw out those unreliable points and you've got a very different picture.

It's interesting that the projected perfect rating on this basis is so high.

Title: Re: Rating of a perfect chess player
Post by omar on Nov 7^th, 2004, 10:50pm

The 3700 number is really just a bit of an educated guess. A straight line would suggest a higher rating, especially if we throw out the first and last points. However, I have a feeling that it isn't a straight line and approaches 100% very sharply at some point. The TTT results tend to confirm this. That's why I guessed a rating number that was quite a bit smaller.

Title: Re: Rating of a perfect chess player
Post by omar on Aug 3^rd, 2008, 11:54pm

Recently I was looking at some rating lists for chess programs.
http://en.wikipedia.org/wiki/Chess_Engines_rating_lists
Some rating lists include the bots rating along with what percent of the games it played were draws. The different playing style causes different bots of even the same skill level to have pretty wide range of draw percentages. However it seemed like there was a trend towards higher percentage of draws as the ratings increased. I pulled the data into a spread sheet and looked at a scatter plot of rating (x axis) vs draw percentage (y axis).

http://arimaa.com/arimaa/rating/ChessDrawPerc.xls

Look on sheet2 and sheet3.

It sure looks like computer programs are drawing more often as their ratings increase. Since these are computers playing, I don't think any of the games on which this data is based were early unfought draws.

Sheet 1 has data of rating vs draw percentage for human players. At a rating of around 2500 humans have a draw percentage of about 45% while computers rated 2500 are at 30%.

I had previously tried to guess the rating of a perfect chess player based on human game data and came up with a rating of about 3700. After looking at the computer data I would want to increase that guess to about 4000 to 4500.

Title: Re: Rating of a perfect chess player
Post by Fritzlein on Aug 4^th, 2008, 6:25am

on 08/03/08 at 23:54:23, omar wrote:

Sheet 1 has data of rating vs draw percentage for human players. At a rating of around 2500 humans have a draw percentage of about 45% while computers rated 2500 are at 30%.

If I'm not mistaken, you are comparing data from different methodologies. In your HvH data you restricted the games to ones played within a hundred point rating range. For the computers, it looks like the drawing percentage is among all games (including unequal games) that the computer plays. The inclusion of unequal games will naturally lower the drawing percentage.

Still, it is interesting that computers are forging right on past human skill level without drawing all the time. It seems we weren't as close to perfect chess as we thought.

Title: Re: Rating of a perfect chess player
Post by omar on Aug 4^th, 2008, 7:43am

on 08/04/08 at 06:25:17, Fritzlein wrote:

Good point. Maybe that is what's causing the computer draw percentages to be lower. Perhaps the computer draw percentages would also be higher if only game between closely rated programs were considered. Before you mentioned this I was thinking it might be due only to humans more easily offering draws. Perhaps it's a combination of both. So maybe I'll stay conservative and guess at 4000 :-)

Quote:

Still, it is interesting that computers are forging right on past human skill level without drawing all the time. It seems we weren't as close to perfect chess as we thought.

Yes, it is interesting to think that there are still many levels of unexplored depth in even a game like chess which has been so throughly studied. With Moore's law expected to continue until at least 2020, we definitely can expect even todays programs to keep forging ahead. With even a 25 rating point increase per year the best programs should be reaching 3300 by 2020. It makes me wonder if humans will be able to keep up by learning from the computers or hit some upper limit of human thinking capacity.

Title: Re: Rating of a perfect chess player
Post by Fritzlein on Aug 4^th, 2008, 8:06am

on 08/04/08 at 07:43:19, omar wrote:

It makes me wonder if humans will be able to keep up by learning from the computers or hit some upper limit of human thinking capacity.

I thought it was already clear that humans aren't able to keep up at chess by learning from computers. I wonder whether computer chess will start to be something we are bored by, like we are by computers calculating a billion digits of pi. But maybe we will still want computer chess engines around to provide accurate commentary on grandmaster games.

By the way, should we move this thread to "off-topic" ?

Title: Re: Rating of a perfect chess player
Post by omar on Aug 4^th, 2008, 5:39pm

on 08/04/08 at 08:06:04, Fritzlein wrote:

I thought it was already clear that humans aren't able to keep up at chess by learning from computers.

Really I didn't know that. I knew that in end games humans could not keep up because some know sequences are too long and bazaar to memorize and don't lend to any guiding principles, but I thought there might be more we could learn from computers in the middle game. But maybe not.

If humans had some upper limit in their thinking capacity and computers continued to march on then it could be said that chess has more depth than can ever be explored by the human mind (or perhaps I should say the unaltered human mind :-) ). Quite amazing.

I would expect that in comparing two games which are both much deeper than can be explored by the human mind we would see the same number of levels (or ranks) attained by humans in both games. For example if 19x19 go had 30 human ranks than I would expect 23x23 go to also have the same (if it were played as much). But there is a big difference in the number of human ranks between chess and go. 30 for go compared to only 13 for chess.

http://arimaa.com/arimaa/forum/cgi/YaBB.cgi?board=other;action=display;num=1144819081;start=2#2

I am wonder what is causing this big difference. Both games have been around for a long time and have very large player pools; so humans are probably approaching their limits in both games. Could the difference be due to chess having a high draw percentage?

Title: Re: Rating of a perfect chess player
Post by Fritzlein on Aug 4^th, 2008, 8:01pm

I am sure 23x23 Go would have more human ranks than 19x19 Go. Part of the reason that 19x19 Go has so many ranks in the first place is that there are about two hundred moves per game. That means there are many, many opportunities for the stronger player to outdo the weaker player. Since 23x23 Go would last almost three hundred moves, even finer distinctions in skill would translate into ranks of depth.

I think the drawishness of chess does indeed crimp its depth as measured in Elo points. If we define rank as a 75% chance of winning, then we can guess how the scale would expand if chess were drawless. Between chess players who have 50% draws, but the better player wins 75% of the decisive games, it comes out to a 62.5% score, or only 89 Elo points difference instead of 191 Elo points different. So the upper end of the chess range would probably expand to have twice as many ranks if draws were eliminated. The lower end of the chess rating scale would expand less because draws are less frequent there, but it too would expand somewhat.

Go, meanwhile, has bent over backwards to prevent draws. Not only is there a ko rule to forbid undoing the opponent's move, there is a superko rule to prevent complex cycles of repetition. Furthermore, since the scoring is in integers, the komi (point handicap) given to the second player always includes a half-point so the game can't end tied on score.

All in all, I think the assertion that Go is deeper than chess is a bit overblown. Go might be more subtle and amenable to deep insights of the human mind than chess is, but the measuring stick of ranks of depth is not as precise as we pretend.

Title: Re: Rating of a perfect chess player
Post by aaaa on Aug 4^th, 2008, 9:17pm

on 08/04/08 at 20:01:17, Fritzlein wrote:

I am sure 23x23 Go would have more human ranks than 19x19 Go.

I contest the flat-out boldness of this statement. It's my understanding that 19x19 Go may possibly be optimal in the sense that it demands the greatest finesse from a human player and thus exhibit the largest human skill depth possible, if not of all known games, then at least of all the differently-sized Go versions. This, on account of the fact that on a 19x19 board the balance between territory and influence is the most ideal.

Title: Re: Rating of a perfect chess player
Post by Fritzlein on Aug 4^th, 2008, 9:39pm

One should never be "sure" of anything. I wouldn't bet that I was right at odds of 10 to 0, so in that sense I agree with your challenge of the boldness of my statement. But (if there were any way to test it) I would bet at odds of 10 to 1. Whatever a 23x23 board would do to the influence/territory tradeoff, I seriously doubt it would make each of the 300 decisions more obvious, i.e. less conducive to separating the more skilled from the less skilled. If anything, I suspect it would not only create more decisions, but make each of those decisions more telling, if only because there would be more choices for each play. But even if the discrimination represented in each decision were less, I could still win my bet about there being more levels of skill overall in the 23x23 game.

Title: Re: Rating of a perfect chess player
Post by aaaa on Aug 4^th, 2008, 10:18pm

From what I've gathered, the mechanics actually work in the other direction; that is, as the board size increases beyond 19x19, human comprehensibility of the game (as a whole) quickly plummets to the point that between roughly equal players, it becomes much more a question of luck who wins such a game, meaning a more compressed potential skill range.

Title: Re: Rating of a perfect chess player
Post by Fritzlein on Aug 5^th, 2008, 5:18am

That could be true at present without being true inherently, simply because the 19x19 game is what everyone has studied. The game of 23x23 Go might initially have fewer ranks because it is a new game, just like Arimaa didn't have many ranks in the first year it was released. Looking at some early Arimaa games you might be tempted to say the outcome was largely a matter of luck, but that was a function of our collective inexperience, not of the game per se.

Title: Re: Rating of a perfect chess player
Post by aaaa on Aug 5^th, 2008, 6:58am

Don't tell me you're thinking the at-first-sight intuition about a brand new game is in any way comparable to the amount of expertise of masters of an ancient game that can be transferred to a different size with that game already having shown itself being flexibly played at different sizes in the first place?
On the one hand, the evolution of Go has shown itself to have been progressive enough to the point of the standard size having been changed from 17x17 to 19x19, while on the other, the long time that has passed since then suggests that there may be something intrinsically preferable about the current size. Although a mystic connection has been made about the closeness of the number of squares on the board to the number of days in a year, this strikes me as an ad hoc justification.

With chess on the other hand, having undergone significant changes much more recently, I think one could well make the argument that the preserving nature of modernity may well have short-circuited any potential, beneficial changes to the game as popularly played. In fact, the fact that people like Capablanca, Fischer and to a lesser extent Lasker, all legends of the game, proposed drastic changes to the game, is strong supporting evidence in this regard.

Title: Re: Rating of a perfect chess player
Post by Fritzlein on Aug 5^th, 2008, 10:46am

You could easily persuade me that there is something intrinsically preferable about the 19x19 board, and that is why Go is usually played on a board of that size. But that preference probably is not an attempt to maximize the number of player ranks as defined by 75% winning chance. I wouldn't be surprised if even at a 17x17 board size the game felt rather long, and the increase to 19x19 was reluctantly made to add depth at the cost of making a too-long game even longer. Perhaps the current size is a balance between being "deep enough" but "not too ridiculously long". Just a guess.

If you ask master Go players why they don't play on a 23x23 board, and they say that there would be less strategy (incidentally, I'm curious to see your reference for this), I would indeed expect that they say this because their skills and insights don't completely transfer to the larger size, not because all of their expertise remains applicable and there would be an inherent lack of insights and skills to be attained.

I once read a quote from a Go expert who said that playing on a 21x21 board would be very difficult, i.e. openly admitting that he wouldn't know what to do. To my mind the fact that experts can be baffled doesn't show that the larger game would inherently be beyond human comprehension; on the contrary it strongly undermines the notion that the Go community somehow knows perfectly well that 19x19 creates the most strategic game possible.

Title: Re: Rating of a perfect chess player
Post by aaaa on Aug 5^th, 2008, 12:01pm

It's not a question of whether amongst the best players one can set oneself apart from the others when it comes to playing larger sized boards. The question is, whether this is enough to offset the considerably confounding effect there would be on players over the entire range of possible skill levels, whose ranges between them would in terms of the players that dwell there metaphorically stretch. That is, it would become harder to improve to the point of reaching a certain winning percentage versus a fixed reference player.
It depends on your point of view whether you consider such a change to make the game more strategic or not.

Title: Re: Rating of a perfect chess player
Post by Fritzlein on Aug 5^th, 2008, 12:07pm

My contention is the that "confounding effect" is due to inexperience only. You might as well ask a beginning player whether a 9x9 board is more strategic than a 19x19 board. Just because play on a 19x19 board is totally opaque and confusing to a beginner does not mean the larger board size is inherently incomprehensible. Therefore I believe that

Quote:

That is, it would become harder to improve to the point of reaching a certain winning percentage versus a fixed reference player.

would simply not be true.

Indeed, I would still offer 10:1 odds to the contrary. :)

Title: Re: Rating of a perfect chess player
Post by lightvector on Aug 5^th, 2008, 11:05pm

If we measure ranks purely by win percentage, then I tend to agree with Fritzlein that a larger board almost certainly means more ranks. Consider if you will, the game of N-iterated Go - play N regular games of 19x19 Go, and the winner is the player who wins more in total. Purely by virtue of involving a greater total number decisions, this will magnify any win percentage differences between players potentially many times if N is large. Essentially, we are doing a statistical sampling. A larger board size should do the same thing. More space, a longer game, more local fights, will give more room for the slightly stronger player to overcome variation and chance.

Of course, if we want to talk about levels of strategy and understanding, then this is quite unsatisfying. It is hard to argue that N-iterated Go has a greater "strategic depth" than regular Go.

In the case where we do want to measure levels of skill in terms by levels of understanding and strategic ability (whatever that means!), I wouldn't go so far as to say a slightly larger board (like 23x23) would be incomprehensible. Shape knowledge, life and death, local tactics, extensions and stone relationships, etc, would all remain completely unchanged. In fact, the great majority of playing ability is determined by 2 things: local, group-level tactics/shapes, and the ability to judge their relative urgencies.

Even larger-scale decisions remain similar - the fact that there's a few extra lines of space way *over there* shouldn't too greatly affect how one decides to handle an invasion, or extend from a wall, or secure a large area *over here*.

Together, these account for almost all of a player's strength, and none of it is fundamentally affected by board size.

A larger board would, however, radically change the flavor of whole-board evaluation and the influence/territory balance, which makes it a little hard to tell.

Title: Re: Rating of a perfect chess player
Post by clauchau on Aug 6^th, 2008, 3:36am

There might be the same quasi-transcendental leap between 19x19 Go and, say, 723x723 Go as there is between 3x3 Go and 19x19 Go.

Someday someone will study Go on much bigger boards, probably through computers experiments. Surprising concepts and beautiful patterns may emerge. 1459x1459 Go might even be quite different from 1460x1460 Go due to some harmonics. Who knows yet.

Title: Re: Rating of a perfect chess player
Post by Janzert on Aug 6^th, 2008, 7:23am

Chris Fant did a few experiments with very large board sizes last year. Here you can see the result of a random playout on a 1600x1200 board (http://fantius.com/RandomGo1600x1200.png) (the final single point eyes are filled in with the owners color for display). It also generated a fair amount of discussion on the computer-go (http://computer-go.org/pipermail/computer-go/2007-February/thread.html#8721) list. If I'm recalling correctly there are some more playout result images buried in that thread as well.

Janzert

Title: Re: Rating of a perfect chess player
Post by 99of9 on Aug 6^th, 2008, 4:37pm

Great picture.

Title: Re: Rating of a perfect chess player
Post by Fritzlein on Aug 6^th, 2008, 5:16pm

on 08/06/08 at 16:37:23, 99of9 wrote:

Great picture.

Elmo and I just spent fifteen minutes finding faces, animals, and sundry objects in it. Do you see the camel?

Title: Re: Rating of a perfect chess player
Post by 99of9 on Aug 6^th, 2008, 6:13pm

on 08/06/08 at 17:16:47, Fritzlein wrote:

Do you see the camel?

No, because I didn't spend 15 minutes finding animals in it. :)

Title: Re: Rating of a perfect chess player
Post by clauchau on Aug 7^th, 2008, 12:28am

lol.

Cool. They do mention Go experts find 21x21 less interesting. Maybe the bigger the board, the more drawish the game, avoiding NP-complete endgames.

Title: Re: Rating of a perfect chess player
Post by omar on Aug 7^th, 2008, 9:52am

on 08/04/08 at 20:01:17, Fritzlein wrote:

I think the drawishness of chess does indeed crimp its depth as measured in Elo points. If we define rank as a 75% chance of winning, then we can guess how the scale would expand if chess were drawless. Between chess players who have 50% draws, but the better player wins 75% of the decisive games, it comes out to a 62.5% score, or only 89 Elo points difference instead of 191 Elo points different. So the upper end of the chess range would probably expand to have twice as many ranks if draws were eliminated. The lower end of the chess rating scale would expand less because draws are less frequent there, but it too would expand somewhat.

Thanks for your insights on this Karl. The implications of what you have said here are really amazing.

First this mean that if draw games are included in the rating of players then the shape of the "Rating vs Draw Percentage" should start curving up very quickly and perhaphs get close to 100% at even lower rating then 4000. Only if draw games are not included in the ratings will the graph be more linear.

But perhaps the most serious implication of this is that chess ratings which include draws don't fit the ELO model. I have always thought that 200 rating points meant about 75% winning changes for the higher player. But what you are saying is that though this might be true between 1000 vs 1200 players where the draw percentage is low, a 200 point differece between a 2600 vs 2800 is actually much higher than 75% winning chance for the higher rated player. Very interesting.

Quote:

All in all, I think the assertion that Go is deeper than chess is a bit overblown. Go might be more subtle and amenable to deep insights of the human mind than chess is, but the measuring stick of ranks of depth is not as precise as we pretend.

So perhaps we would see much more than 13 ranks in chess if draws were not included in the ratings.

Title: Re: Rating of a perfect chess player
Post by Fritzlein on Aug 7^th, 2008, 10:21am

on 08/07/08 at 09:52:54, omar wrote:

a 200 point differece between a 2600 vs 2800 is actually much higher than 75% winning chance for the higher rated player. Very interesting.

That's right. A expected score of 0.75 when there are 40% draws would mean that the stronger player wins 55% of the time and the weaker player wins 5% of the time. If we discard the draws the win ratio is actually 11 to 1 instead of 3:1, which would be a gap of 416 points instead of 191 points.

Title: Re: Rating of a perfect chess player
Post by omar on Aug 7^th, 2008, 10:23am

on 08/05/08 at 23:05:19, lightvector wrote:

Humm. Will the N-iterated Go actually have a larger range of ratings or are the ratings just more accurate? I would tend to think the ratings are just more accurate and the rating range is still the same regardless of how large N is.

Title: Re: Rating of a perfect chess player
Post by Fritzlein on Aug 7^th, 2008, 10:33am

on 08/07/08 at 10:23:44, omar wrote:

No, the range of ratings will be larger. By definition a rank is being able to win 3 of 4 games, or 75%.

Now if I can beat you 64% of the time at Go, I am by definition less than one rank better than you. (In fact, I am about 0.52 ranks better than you.)

But now let's change the game to best-of-five-Go. I will beat you 75% of the time at best-of-five-Go. Hey, presto, at this new game I am suddenly a full rank better than you, by definition.

If Go has about 40 ranks, then best-of-five-Go will have about 80 ranks.

Title: Re: Rating of a perfect chess player
Post by omar on Aug 7^th, 2008, 12:29pm

You are right, N-iterations of a game would expand the rating range of the original game. So I am inclined to agree that games which last longer and have more choices at each turn will have a larger rating range. Perhaps the ultimate rating range of a game is proportional to the game-tree complexity of the game.

But maybe one could argue that the N-iteration model does apply to actual longer games because each iteration is independent of the other.