Arimaa Forum (http://arimaa.com/arimaa/forum/cgi/YaBB.cgi)
Arimaa >> Bot Development >> bot that learns the rules without NN
(Message started by: leo on May 1st, 2006, 2:39pm)

Title: bot that learns the rules without NN
Post by leo on May 1st, 2006, 2:39pm
As I was rewriting from scratch my old 2002 code to fit better the needs of my goal-seeking bot, and was adding pattern recognition and sequence recognition modules, I began to bias the coding towards this radical idea: since the bot learns to memorize and classify and generalize local board states and state changes, why not use that for the very beginning of its learning: assimilating the game rules.

Here's how it works: the bot observes and memorizes the motions on the board with various widths of vision field, and it also has an "instinctive urge" to experiment with its "motor outputs" (moving pieces). The step and move validity checker acts as the "harsh reality" that will allow or forbid some actions from having a physical effect (the motor action is not followed by any motion in the visual field).

How should the rules be assimilated: the modules that generalize the memorized situations and actions will permit to extract some features such as "you can't move your own rabbits backwards" or "when your piece is near a stronger opponent piece and no friendly piece is around, your piece can't do anything" (the bot won't name that freezing since it doesn't have any language, but it's the equivalent).

The thing is not as simple as it looks though, there are many many levels of generalization and classification to extract these things out of the basic perception and proprioception (motor feedback). This looks very similar to neural net techniques but it doesn't use any NN. I'll report the results when I find the time to get all that up and running and can adapt the goal seeking modules to the general architecture.

Has anyone of you coded something similar?

Title: Re: bot that learns the rules without NN
Post by clauchau on May 2nd, 2006, 11:21am
Hi, dear Leo  :D

There is a promising genetic program that seems to be able to evolve programs written in a subset of the C language into shorter and faster ones :
http://critticall.com/

It's not public domain, but it makes me want to do the same :)

You could start with a silly program that lists the data you've got about valid moves, and it is supposed to give you a shorter version, in other words some better abstract understanding of what

Title: Re: bot that learns the rules without NN
Post by Swynndla on May 2nd, 2006, 9:02pm
clauchau, that's an interesting link!

leo, what techniques are you using for generalization and classification?  What sort of AI is it called?

Title: Re: bot that learns the rules without NN
Post by leo on May 2nd, 2006, 10:36pm

on 05/02/06 at 11:21:01, clauchau wrote:
There is a promising genetic program that seems to be able to evolve programs written in a subset of the C language into shorter and faster ones :
http://critticall.com/


Hi Claude :D

Amazing stuff you found... It's stunning in that the resulting algorithms have to be studied as we do when we find a new protein or micro-organism. They are simply not obvious, because they were not devised by our own mental tools...


Quote:
It's not public domain, but it makes me want to do the same :)

You could start with a silly program that lists the data you've got about valid moves, and it is supposed to give you a shorter version, in other words some better abstract understanding of what


If Critticall is able to improve code beyond a function block, that is apply the genetic selection over the whole code with embedded function calls and recursive function calls, it could indeed produce interesting things. The sad part would be that the resulting code couldn't explain us what it is doing :( Analysing it would be exciting exopsychology...

But it could happen that the resulting evolved code was similar to a human understanding of the game, that would tell us something about the evolution of the mind...

How would you devise such a genetic optimizer? Would you give C code for input, or pseudo-assembly as in the core war programs?

Title: Re: bot that learns the rules without NN
Post by leo on May 2nd, 2006, 11:11pm

on 05/02/06 at 21:02:01, Swynndla wrote:
leo, what techniques are you using for generalization and classification?  What sort of AI is it called?


This sort of AI has no name yet. Maybe Leo-AI for Low Earth Orbit AI, because it's down-to-earth ;D

Now seriously: The system is very similar to NN but it takes advantage of the particular nature of the game (only local interactions, the simple way the pieces step, the smooth strength scale of the pieces...) to hard-code what in a NN takes a lot of computation and time.

Generalization is done by statistically comparing local situations and actions, and extract more abstract patterns. Location, color and strength are the basic inputs for the higher levels. I'm not sure yet whether to hard-code relative strength information or let the program learn to extract it.

The higher generalization modules (that take other generalizated data as input) should not be hard-coded but will take time to perform although matter is rarefied at that altitude.

Low-level classification is based on several hard-coded channels: focus on one piece type, focus on the interaction between two piece types, presence of a trap, absence of action (illegal steps), trap action, piece moved by opponent (push and pull) and a lot of patterns of various sizes and actions of various numbers of steps whose categorization is not defined yet.

High-level classification should address complex tactics and strategy but that will be for version 2 of the bot I guess.

I'll need many more weeks to have something working and I'm not sure the bot will behave the way I want it to :-/

Title: Re: bot that learns the rules without NN
Post by omar on May 12th, 2006, 10:06pm
Interesting thread you've started Leo. I look forward to hearing how thing go with this approach.

You might also want to look at Genetic Programming. I remember from my graduate student days that Professor John Koza at Standford University had been researching this since the early 1990's.

http://www.genetic-programming.com/
http://www.genetic-programming.org/

Title: Re: bot that learns the rules without NN
Post by leo on May 28th, 2006, 9:06pm

on 05/12/06 at 22:06:31, omar wrote:
You might also want to look at Genetic Programming. I remember from my graduate student days that Professor John Koza at Standford University had been researching this since the early 1990's.

http://www.genetic-programming.com/
http://www.genetic-programming.org/


Thank you for the links, Omar. Genetic Programming shows some impressive achievements! I'm not sure I'm able to understand it well enough to use it though. For now I'm going on with my stubborn personal way of programing :)

I'm advancing slowly. The next step I'm preparing is training the bot by playing with only one rabbit and one cat on each side. It should learn by itself that it's good to move forward, and incredibly gratifying to push its own rabbit the other side. It will also learn, much more slowly, that its rabbit jumping down into a trap is a very very bad idea. I'll play very patiently with it in the beginning, not winning too often so as to let it adjust its parameters, otherwise it would simply consider any move he made a bad move that invariably leads to it losing the game.

Title: Re: bot that learns the rules without NN
Post by seanick on Oct 8th, 2006, 1:01am
just reread this topic, but have a better appreciation for it now as I have done more reading up on AI.
First of all, this demonstrates the problem with AI theory. We humans learn relatively fast, in that each one of us have learned how to do many very complex things, just to get to this page and being able to read it. However, this learning also involves social interaction - without teachers, TV, books, friends, parents, etc, how quickly would we have learned what we currently know? it  would take quite a bit longer. anyway the moral is, if you already know its foolish to pull a "bomb" and take the cat that has nothing protecting it behind that trap, you can hard-code that in the code. Why do you think some people are superstitious about walking under ladders? Not every person has to have a paint bucket to fall on their head before they learn to fear walking under ladders.. we can learn from other's mistakes too. after a certain point it becomes more difficult because experience proves that superstitions don't always have valid reasons, among other things, but the point remains, that the reason humans have a distinct advantage is the ability to reuse information we observe from others.
Learning everything from scratch is a useful tactic for AI in research, perhaps, but for a specific-use bot (not a virtual human, just a game-playing machine, literally - read the thread about "the big difference between chess and arimaa" in the off topic discussion area where Hydra's chess-specific ASIC's are mentioned), the ability to hard-code domain specific knowledge will greatly increase the potential speed at which any given AI based bot seems to learn.

Aside from that, I also was thinking about generalizing the board states in a bot, and/or subdividing it for a mask - overlay type comparison to replace a hashtable of possible board configurations with a more-reusable subpattern search. havent fully thought it through yet but it starts with the quadrants instead of the full board. 16 locations per quadrant, but the values that are stored to describe it include three other relative-quadrant-values, and each position involves 4 insertions- 1 for each structure of a quadrant of 16 spots + 3 other-quadrant values. If each of the other 3 quadrants are similar and the current quadrant where the move is being evaluated, currently already exists (because a similar state was evaluated before but not identical) then skip the eval unless it is later determined to be the best move. Then do a more thorough eval on it to ensure it doesnt have any faults in the final-choice verification eval. passes that one? give it as the best move.
the only problem is, not only does it not solve things like the windmill problem, aka horizon effect but it also does not easily handle cross-quadrant boundaries. I would have to make it a 3 or 4 cycle eval instead of just two to handle the extra boundaries, and if I do that it would make just as much sense to make it a 6(for types of pieces), 7(for rank crossings) or 8, (for board ranks and/or rows)- cycle eval. complexity and too much code duplication = bad. so maybe that theory doesn't have any practical use (or maybe just not yet).



Arimaa Forum » Powered by YaBB 1 Gold - SP 1.3.1!
YaBB © 2000-2003. All Rights Reserved.