Winboard Forum

by **Michael Sherwin** » 19 May 2006, 23:41

Since I touted Romis' learning ability in another thread I thought that I would elaborate on it a bit here. Especially since Romis' source code is no longer available from WBEC.

First let me start with ver. p2i. In p2i Romi maintained two threads/trees in a learn file, one for white and one for black. After each game the moves of the game up to ply 160 along with some pertinate information such as the score for each move was placed in the learn file using the aproprtate thread. if enough moves had been tried at any given node then the best move is backed up one ply. This eventually allows in theory, information from upto 160 ply away from the start position to be used in the search. On beginning a search the current node and its subtree is loaded into the hash file. The search then of course uses the values in the hash tables to help select a move.

The results of this is that Romi would learn to beat much stronger programs in a match, if given enough rounds to do so, even if the other program also has learning. As an example only, since I ran this test many months ago, Romi played a match with Crafty19.19 with no opening books, learning turned on and after about 50 rounds Romi won every game up to round 250 when the matched was stoped.

Now I will explain Romis' learning as it became to be in p3h. In p3c (or there abouts) was added the ability to use the learn file as an opening book by using the white win, black win and draw statistics to just pick a move with out searching. Later the two threads of white and black were made into one thread so Romi could learn and play the moves of her opponents when she is beaten by them. This book learning worked so well that I did not notice that I had broken regular learning. Then I made extensive changes to the compute function to solve a move selection bug and some other changes to make Romi play better and didn't even realize that I had broken book learning as well. Version p3g has no learning at all, because it is all broken.

This decribes the way it is now, in p3h. One thread. Can load several million games from pgn file (must strip comments and variations first) into the learn file. Can select random book moves. If no pgn file is loaded then Romi creates her own opening book from her own games as they are played. In a match with Crafty19.19 as described above with 60 games having been played, Romi is in the lead +28 -23 =9. Romi is still loosing games though and this is most likely due to some random selection. I will edit this post when the marathon match is complete.

RomiChessP3h is not available yet, but as soon as I am satisfied with it I will make the executibles available at Leos' WBEC website. There currently is no download available.

by **Dann Corbit** » 20 May 2006, 00:13

Are you saving statistics associated with board positions, are you using td-leaf or td-lamda learning for the eval, or is it something else?

by **Michael Sherwin** » 20 May 2006, 03:18

Hi Dann,

I have heard of td-thingies, but do not have the slightest idea what they are. All I do is save in a linked list that is linked to the right by sibling links and down by child links to form a tree that games are overlaid into. I save the following additional information: from square, to square, score, depth searched, a flag (not used yet), white wins, black wins and draws. I adjust the score stored in a node based on win-loss-drawn statistics, so as the statistics get better the score grows and if the statistics get worse the score falls.

After 84 games the score is in favor of Romi +45 -29 = 10.

by **bob** » 20 May 2006, 03:27

There is one serious flaw in your testing. By playing against an opponent with no book, you can certainly find winning lines. But give him even a minimalistic book, and the tree becomes impossibly big and this kind of learning simply will not work...

Trying to learn specific positions/moves has been proven to not work over the years, I don't think that has (or will) change. If you want a simple example, run some nunn-type testing, where there are 40 starting positions rather than one, then see how long it takes to stabilize and win every game. And then take a program with thousands (or millions) of book lines and the problem is simply beyond intractable...

by **Michael Sherwin** » 20 May 2006, 04:11

Hi Bob,

I agree with you completely. But I have a couple of points too make. In a competition such as division 4 of WBEC a lot of programs do not have opening books and they also tend to play many of the same lines, therefore this type of learning can pay off big in this circumstance. Even following a self created book for a few moves saves time on the clock. And with the ability to load pgn files into the learn file, such as your enormous.pgn which creates an 800+ mb opening book that Romi can modify herself and follow for 60 ply, with the ability to extend up to 160 ply, I just do not see the down side.

Hi again,

I also have a philosophical point that I would like to make. RomiChess was mostly designed with human opponents in mind. I wanted to create a lively and aggressive program that would challange people and yet not play so well that people would feel as if they had no chance. Romi makes long term sacrifices that are not necessarily sound and does provide chances for a human to win. Then if a human does win, Romi can turn around and use (because of the type of learning involved) the persons own moves against them. This type of reaction can make for more exciting psycological encounters. Also, a tournament organizer indicated that 'Romi can even loose from a better position', implying the number of times Romi wins from an inferior or even hopeless position.

I stoped the match between Romi and Crafty at 100 games, because I need my computer back. Final score Romi +55 -30 =15

by **bob** » 20 May 2006, 14:42

If the goal is to "learn" to beat programs without an opening book, ok. But for mainstream chess, that's inapplicable. You should try Crafty with its default book to get a better feel, since Crafty also learns as it plays, meaning you will never get the chance to play a won game a second time, nor get the chance to "turn the tables" and play a lost game from the other side... In the case of Crafty, it too learns when white wins a specific line and will play that when playing white, and avoid it when playing black, neutralizing this completely...

by **Michael Sherwin** » 20 May 2006, 18:22

My main goal is to play interesting chess against people. I loose track of this sometimes, because of all the engine-engine competitions that Romi is playing in and the very little feedback from individual human opponents. The feedback I have gotten is very positive though. As for testing against Crafty, I have always tested against Crafty since Crafty is a strong and even program that is neverless beatable by my program. It was pointless to test against say fruit, because Romi would get very few points and after an improvement Romi would still get very few points. However Romi has gone from 8/50 to 17/50 for p2i. This has fallen back to about 12/50 due to LMR, even though LMR for p3g has given 200-300+ Elo boost in 40/5 engine-engine competitions over p2i. Here I go again loosing track of my main goal of playing interesting chess against humans. To see Romi creating her own book from scratch as a result of the games that she plays is very interesting. Romi can grow as a player grows. This could be very helpful especially for young players.

by **Dann Corbit** » 22 May 2006, 19:43

bob wrote:There is one serious flaw in your testing. By playing against an opponent with no book, you can certainly find winning lines. But give him even a minimalistic book, and the tree becomes impossibly big and this kind of learning simply will not work...

Trying to learn specific positions/moves has been proven to not work over the years, I don't think that has (or will) change. If you want a simple example, run some nunn-type testing, where there are 40 starting positions rather than one, then see how long it takes to stabilize and win every game. And then take a program with thousands (or millions) of book lines and the problem is simply beyond intractable...

If you only save the pv nodes from actual play and load them into a hash table on start-up, I don't think that the tree will ever get too large.

By the time your tree grows to 100 million nodes, the memory to store it will be very cheap.

If you play standard chess instead of blitz, I don't think there is much worry about too much memory consumption if you store actual pv nodes only.

If the time control is too fast (e.g. lightning games) then I think that the learning is pretty well valueless anyway.

by **Michael Sherwin** » 22 May 2006, 21:17

If the time control is too fast (e.g. lightning games) then I think that the learning is pretty well valueless anyway.

The above mentioned match between Romi and Crafty was played at a TL of 40/2.
Final score Romi +55 -30 =15

P.S. I have had as good or better results at TL 80/1.

by **bob** » 22 May 2006, 22:55

I was not worrying about the "memory size". I was worried about the "space size". Play a program with a decent book, and let it use any sort of book learning at all, and this kind of learning will simply not work. After only 16 moves by white and black, the tree space is beyond enormous... The probability of getting a "hit" becomes tiny unless your opponent never varies until you do. This is not a reasonable assumption for most programs...

by **Dann Corbit** » 23 May 2006, 03:03

bob wrote:I was not worrying about the "memory size". I was worried about the "space size". Play a program with a decent book, and let it use any sort of book learning at all, and this kind of learning will simply not work. After only 16 moves by white and black, the tree space is beyond enormous... The probability of getting a "hit" becomes tiny unless your opponent never varies until you do. This is not a reasonable assumption for most programs...

It seems to me that running into a problem here would be a function of the size of the opponent's book. (IOW, if he has 100 moves, you will learn the whole book right away -- and also his learned responses if you record all pv nodes.) If he has 1000000 distinct moves in his book, you won't learn a very large percentage of them, but I think you will still learn enough to give you some kind of advantage over not learning.

It also seems probable that there will only be a few distinct responses to many standard book moves. And so (for instance) if you play 1.e4 you are not going to see 1. .. a6 as a response.

On the other hand, if you play 1000 different opponents (e.g. you might on FICS) each of whom has 1000000 distinct moves in his/her opening book, then the approach will probably not yield anything useful against the majority of the opponents (unless you discover some wonderful novelty that the majority of opponents fall for again and again).

On the other hand, it might be possible to use distributed learning to cope with even enourmous book spaces.

For instance, you could have your chess program include the equivalent of a 'ping' that sends a simple message to an echo server on some central PC. Every time your program plays another one, the current board position and predicted move is sent to your ping server, along with the depth and evaluation.

You give the program away to 100,000 users. Eventually, you might create a very capable book via the collected data, even against a wide variety of opponents. Of course, that book would consume a huge amount of memory. But it might be possible to minimize that by minimaxing branches of the book.

by **bob** » 23 May 2006, 03:18

Here's the problem. suppose I have a book with just 100 "branches". That is, I can play 100 book lines before I have to repeat something previously played.

Question: How many times do I have to repeat a single game before a program can learn to beat mine? I'd bet at least 25-50 games. So we have to play 2,500 to 5,000 games before he has learned to beat every book line I throw up?

Not going to happen.

And that is just for a book with 100 different endpoints. Mine typically have 1,000,000 or more... minimum... I think I can rest easy...

by **Michael Sherwin** » 23 May 2006, 04:28

If Romi learns to determine between 1.e4 and 1.d4 which move is better for Romi then Romi has benifited.

by **bob** » 23 May 2006, 05:41

the problem is both are good...

you begin to see the problem...

And if I start with 1. f4, you see yet _another_ problem..

by **Michael Sherwin** » 23 May 2006, 07:31

the problem is both are good...

you begin to see the problem...

And if I start with 1. f4, you see yet _another_ problem..

Yes, both 1.e4 and 1.d4 are good moves, however, Romi may do better by playing one or the other. And if Romi is playing black and 1.f4 is played then if Romi learns to determine between 1. ... e5 and 1. ... d5 which is better for Romi then Romi has benifited. I realize that way out by the leaves Romi may take a million years to determine between moves, because it may take that long to reach any particular leaf enough times to make a determination. However, the lower plys can be determined in relatively few games.

by **Michael Sherwin** » 23 May 2006, 11:52

If the time control is too fast (e.g. lightning games) then I think that the learning is pretty well valueless anyway.

TL 80/1. Is this lightning?

RomiChessP3h - Cra1919 : 44.0/50 43-5-2 (1=10101010111=111111101111111111111111111111111111) 88% +346

by **bob** » 23 May 2006, 14:39

Again, you make it sound simple. But how many games will it take you to "learn" the best response to 1. f4? 25 is not going to do it. 2500 is not enough because my book has _plenty of lines below that point. After 1. f4 d5 my book has 8 alternatives. How long to learn the best response for each of those? And so it goes.

This is an _exponential_ (intractable) problem because the games take so long to play, and it takes so many to learn to beat that particular opening, if you can, which causes to "play without learned data" in almost every game you play when the opponent has a book and learning as well... It takes a long time for losses at move 90 to back up to change your play at move 2.

by **bob** » 23 May 2006, 14:41

also, you should be playing opponents with an opening book or you are never going to see the "effect" I am talking about. What serious program would ever play in an important tournament with no book at all??? So "learning" against a program that is crippled offers nothing when you play real games where your opponent uses a book also...

by **Michael Sherwin** » 23 May 2006, 16:44

But, what does it hurt, since my main goal is to play against humans that often have (by choice) a very narrow book and most likely may want to practice one particular line. Testing a computer oponent with even a small book or several Nunn positions would not be apropriate, However, testing against a computer oponent that has no book (book not presant) that does learn would be. 8-)

Darn, you just had to make me resort to the use of smilies!

by **bob** » 23 May 2006, 17:44

It doesn't hurt a thing to try anything.

A little history since I don't know how long you have been interested in computer chess...

What you are trying to do is _exactly_ what humans try to do on the chess servers. Find an opening you will play, given the chance, and then they "fish" around in that opening until they learn some specific line that they can use to beat you. Then they do it over and over, draining your rating to nothing. Then they move on to the next program/opening/etc and repeat.

Crafty "grew up" in this hostile environment, and I have taken great pains, both with book learning/result learning which influences opening move choices from the book to avoid repeated losses, and then with position learning to slowly learn to avoid repeated losses if an opponent finds some two-move opening book line that takes me out of book before I have a chance to make any choices. And in the past 5 years, Crafty hasn't lost the same game multiple times, which was my goal.

It seems to me that you are trying to do just what the humans are trying to do. And I have a solution that simply blocks that completely. If you "turn that solution" off by using no opening book for Crafty, then yes you should be able to learn to beat it since it will play the same moves without learning ability. But there is no way to do this on (say) a chess server, or at a chess tournament, so the practical uses of it are very limited...

That's my only concern with the approach. It is so easy to defeat that specific kind of learning that the results will never be that great overall

Now if you use that against me (me being a human me) then I don't play like you suggest. If I lose, I am not going to play the same exact opening line again to see if you will change and let me win, I'm going to vary earlier in the game to see if I can find a line you play poorly. You are not really testing against a human with your current testing, because a computer will simply play the same moves every time unless something changes (time, cpu speed, hash size, etc). No human behaves like that, which limits the effectiveness of your idea...

Winboard Forum

RomiChess && learning or the emperor has no clothes

RomiChess && learning or the emperor has no clothes

Re: RomiChess && learning or the emperor has no clot

Re: RomiChess && learning or the emperor has no clot

Re: RomiChess && learning or the emperor has no clot

Re: RomiChess && learning or the emperor has no clot

Re: RomiChess && learning or the emperor has no clot

Re: RomiChess && learning or the emperor has no clot

Re: RomiChess && learning or the emperor has no clot

Re: RomiChess && learning or the emperor has no clot

Re: RomiChess && learning or the emperor has no clot

Re: RomiChess && learning or the emperor has no clot

Re: RomiChess && learning or the emperor has no clot

Re: RomiChess && learning or the emperor has no clot

Re: RomiChess && learning or the emperor has no clot

Re: RomiChess && learning or the emperor has no clot

Re: RomiChess && learning or the emperor has no clot

Re: RomiChess && learning or the emperor has no clot

Re: RomiChess && learning or the emperor has no clot

Re: RomiChess && learning or the emperor has no clot

Re: RomiChess && learning or the emperor has no clot

Who is online