Opening Book

Programming Topics (Computer Chess) and technical aspects as test techniques, book building, program tuning etc

Moderator: Andres Valverde

Opening Book

Postby Bernd Nürnberger » 28 Apr 2006, 20:11

Hello,

I'm new to this forum. My name is Bernd, I'm a CS student and I am in computer chess since roughly two years now working on a Java engine called King's'Out which I rewrote several times :?

Now to my question(s).

(1)
I want to include an opening book. I already asked at CCC and got the answer that there is no standard for opening book file formats.. nethertheless I would be glad if someone could briefly explain how to organize such a file.. after browsing some sources, messages, I consider to store hash keys (sorted) together with the moves into a file and do a binary search for each position while in book. Is this a common way to do it? Are there better ways? Are there at least some formats some programs share?

(2)
As I'm not good at chess, I do not plan to make up a book manually. I want to generate one form a PGN file. Could someone just give me some general ideas about that. How many plies are common? Should some lines be longer than others? What can I do with the information about who won the game/is it used? Are moves always chosen by the frequence they are played at a certain position?...

(3)
From the starting position you are "in book". That means I try to lookup a book move for each position encountered and will do *no* search.. until a position with no book move is found. Upon then I am out of book and will not try to lookup book moves. Is this o.k. or are there better ways to handle openings...

Thanks much in advance for your answers...

Greetings,

Bernd
Bernd Nürnberger
 
Posts: 23
Joined: 28 Apr 2006, 10:55

Re: Opening Book

Postby Ilari Pihlajisto » 28 Apr 2006, 21:06

Hi, I'm also a newbie (started writing my chess program about 8 months ago), but I do have a working opening book already so I may be able to help.

(1)
I don't think you need to store the moves in the book, just the positions and a score for each position. The score is the number of times the position was reached in the PGN collection. That's a very simple way to score book positions, but it's enough to make sure that your program isn't likely to play any weird moves from the book.

But you got it right that the positions should be ordered so that they can be found quickly with a binary search.

(2)
The depth of opening books seems to vary a lot. I use a maximum depth of 13 full moves by default. No lines in my book can be exceed that max depth. I use the result of the PGN game to decide which player's moves to store in the book. By default my engine doesn't store any moves from the losing side.

(3)
My program doesn't have any "in book" mode. It checks for a book move on every turn. It takes so little time that it won't degrade the performance at all. And sometimes the program gets back in the book after a couple of "out of book" moves.
User avatar
Ilari Pihlajisto
 
Posts: 78
Joined: 18 Jul 2005, 06:58

Re: Opening Book

Postby Bernd Nürnberger » 29 Apr 2006, 09:16

Hi,

thanks for your answer. It brought me some new ideas. I think I will also browse some source codes to see how others do it..

One more question:
How do you decide how many plies are stored for each game?

Bernd
Bernd Nürnberger
 
Posts: 23
Joined: 28 Apr 2006, 10:55

Re: Opening Book

Postby Ilari Pihlajisto » 29 Apr 2006, 17:13

Bernd wrote:How do you decide how many plies are stored for each game?


Like I said the maximum depth is 13 full moves (26 plies) by default in my engine. So it always stores 26 plies unless the game is shorter than that. It's very rare that my engine stays in book for more than 10 full moves so it wouldn't make any sense to increase the max depth of 13 moves. In fact, no ELO points would probably be lost if I lowered the depth down by a couple of moves.

The data structure for a book position in my engine is as simple as this:
Code: Select all
struct {
  uint64 hashKey; /* Zorbis hash key for the position */
  uint16 score;     /* The number of times the position was reached */
}


My book file was created from 100MB of PGN files with a max depth of 13 full moves and no moves from the losing side. The book file is about 8MB in size.
User avatar
Ilari Pihlajisto
 
Posts: 78
Joined: 18 Jul 2005, 06:58

Re: Opening Book

Postby Bernd Nürnberger » 29 Apr 2006, 18:24

Thanks for additional information, Ilari

I wonder why that this simple method is fast..

If I got it right, it does mean to:
- generate all moves
- make/unmake all of them
- for each resulting position lookup its frequency
- chose a move randomly while moves that lead to positions with higher score are more probable..

Doesn't that mean hundreds of seeks in the file :? .. or do you hold the book in memory?

Greetings,

Bernd
Bernd Nürnberger
 
Posts: 23
Joined: 28 Apr 2006, 10:55

Re: Opening Book

Postby Ilari Pihlajisto » 29 Apr 2006, 21:27

If I got it right, it does mean to:
- generate all moves
- make/unmake all of them
- for each resulting position lookup its frequency
- chose a move randomly while moves that lead to positions with higher score are more probable..


Yes, you got it right, I think.


Doesn't that mean hundreds of seeks in the file?


Yes, assuming that the book holds 1 million positions, about 20 fseek & fread operations per move in the worst case, which would be 600 operations in a position where there are 30 legal moves. That sounds like a lot of work, but it still doesn't take any noticeable time. So there's no need to hold the book in RAM.

I'll post some benchmark results tomorrow (it's midnight in Finland right now) and my source code (in C) for the opening book.
User avatar
Ilari Pihlajisto
 
Posts: 78
Joined: 18 Jul 2005, 06:58

Re: Opening Book

Postby Bernd Nürnberger » 29 Apr 2006, 21:40

Yes, assuming that the book holds 1 million positions, about 20 fseek & fread operations per move in the worst case, which would be 600 operations in a position where there are 30 legal moves. That sounds like a lot of work, but it still doesn't take any noticeable time. So there's no need to hold the book in RAM.

I'll post some benchmark results tomorrow (it's midnight in Finland right now) and my source code (in C) for the opening book.


Nice to hear that external storage is sufficient.. looking forward to your numbers..

Bernd
Bernd Nürnberger
 
Posts: 23
Joined: 28 Apr 2006, 10:55

Re: Opening Book

Postby Ilari Pihlajisto » 30 Apr 2006, 15:22

Okay, I have some numbers. I had my engine play a quick game against Gerbil, and here's how much time was spent searching the book:

Code: Select all
Legal moves: 20, Book moves: 16, time: 37 ms
Legal moves: 28, Book moves: 15, time: 54 ms
Legal moves: 29, Book moves: 3, time: 763 ms
Legal moves: 31, Book moves: 4, time: 58 ms
Legal moves: 37, Book moves: 4, time: 74 ms
Legal moves: 36, Book moves: 3, time: 70 ms
Legal moves: 34, Book moves: 4, time: 99 ms
Legal moves: 37, Book moves: 7, time: 99 ms
Legal moves: 36, Book moves: 0, time: 68 ms
Legal moves: 40, Book moves: 0, time: 77 ms
Legal moves: 44, Book moves: 0, time: 115 ms
Legal moves: 45, Book moves: 0, time: 90 ms
Legal moves: 40, Book moves: 0, time: 79 ms
Legal moves: 43, Book moves: 0, time: 117 ms
Legal moves: 45, Book moves: 0, time: 119 ms
Legal moves: 41, Book moves: 0, time: 77 ms
Legal moves: 39, Book moves: 0, time: 73 ms
Legal moves: 37, Book moves: 0, time: 74 ms
Legal moves: 35, Book moves: 0, time: 70 ms
Legal moves: 38, Book moves: 0, time: 73 ms
Legal moves: 39, Book moves: 0, time: 74 ms
Legal moves: 40, Book moves: 0, time: 74 ms
Legal moves: 47, Book moves: 0, time: 120 ms
Legal moves: 47, Book moves: 0, time: 91 ms
Legal moves: 48, Book moves: 0, time: 121 ms
Legal moves: 42, Book moves: 0, time: 82 ms
Legal moves: 42, Book moves: 0, time: 80 ms
Legal moves: 40, Book moves: 0, time: 78 ms
Legal moves: 5, Book moves: 0, time: 11 ms
Legal moves: 4, Book moves: 0, time: 8 ms
Legal moves: 49, Book moves: 0, time: 95 ms
Legal moves: 29, Book moves: 0, time: 66 ms
Legal moves: 29, Book moves: 0, time: 57 ms
Legal moves: 30, Book moves: 0, time: 59 ms
Legal moves: 30, Book moves: 0, time: 55 ms
Legal moves: 29, Book moves: 0, time: 57 ms
Legal moves: 28, Book moves: 0, time: 55 ms
Legal moves: 29, Book moves: 0, time: 143 ms
Legal moves: 27, Book moves: 0, time: 51 ms
Legal moves: 27, Book moves: 0, time: 52 ms
Legal moves: 24, Book moves: 0, time: 46 ms
Legal moves: 3, Book moves: 0, time: 5 ms
Legal moves: 26, Book moves: 0, time: 50 ms
Legal moves: 27, Book moves: 0, time: 51 ms
Legal moves: 24, Book moves: 0, time: 47 ms
Legal moves: 24, Book moves: 0, time: 48 ms
Legal moves: 20, Book moves: 0, time: 39 ms
Legal moves: 20, Book moves: 0, time: 40 ms
Legal moves: 21, Book moves: 0, time: 43 ms
Legal moves: 22, Book moves: 0, time: 41 ms
Legal moves: 4, Book moves: 0, time: 6 ms
Legal moves: 1, Book moves: 0, time: 2 ms
Legal moves: 5, Book moves: 0, time: 7 ms
Legal moves: 24, Book moves: 0, time: 49 ms
Legal moves: 30, Book moves: 0, time: 60 ms
Legal moves: 33, Book moves: 0, time: 68 ms
Legal moves: 27, Book moves: 0, time: 55 ms
Legal moves: 36, Book moves: 0, time: 77 ms
Legal moves: 6, Book moves: 0, time: 12 ms
Legal moves: 35, Book moves: 0, time: 73 ms
Legal moves: 35, Book moves: 0, time: 71 ms
Legal moves: 18, Book moves: 0, time: 40 ms
Legal moves: 19, Book moves: 0, time: 39 ms
Legal moves: 17, Book moves: 0, time: 35 ms
Legal moves: 19, Book moves: 0, time: 41 ms
Legal moves: 23, Book moves: 0, time: 46 ms
Legal moves: 17, Book moves: 0, time: 35 ms
Legal moves: 19, Book moves: 0, time: 38 ms
Legal moves: 16, Book moves: 0, time: 33 ms
Legal moves: 14, Book moves: 0, time: 29 ms
Legal moves: 22, Book moves: 0, time: 44 ms

average time: 70 ms


According to this test the average time is 70 milliseconds, which should be very acceptable unless you're playing lightning fast games. The time for the third position in the list is so long probably because the system was under heavy load at that time.

The engine is written in C and my system is:
Athlon XP 2600+ @2Ghz
7800 RPM IDE hd
Linux with 2.6.15-21-686 kernel
User avatar
Ilari Pihlajisto
 
Posts: 78
Joined: 18 Jul 2005, 06:58

Re: Opening Book

Postby Bernd Nürnberger » 30 Apr 2006, 15:32

Ok, thanx..

btw. I see you use Linux. I also do. Is your engine public available?

Bernd
Bernd Nürnberger
 
Posts: 23
Joined: 28 Apr 2006, 10:55

Re: Opening Book

Postby Ilari Pihlajisto » 30 Apr 2006, 16:00

Yeah, I use the beta version of Ubuntu 6.06. My engine (called Sloppy, because that's how it plays) is still under development but I'm going to release it very soon under the GPL. It's a bitboard engine with a lot of potential, but the eval function is just horrible so the ELO is currently only about 2200. It's the first real application I've ever written in C, so right now I just want to finish the damn thing.
User avatar
Ilari Pihlajisto
 
Posts: 78
Joined: 18 Jul 2005, 06:58

Re: Opening Book

Postby Bernd Nürnberger » 30 Apr 2006, 16:37

Yep, IMO Ubuntu is the best.. still using official 5.10 on PPC at home. I rewrote my engine several times.. could still be finished, but I want to do it a little too perfect.. hope my current rewrite will survive and be published.. will be the best Java engine :wink:
Bernd Nürnberger
 
Posts: 23
Joined: 28 Apr 2006, 10:55

Re: Opening Book

Postby Volker Annuss » 30 Apr 2006, 19:16

Hello Bernd,

my opening book format is like in this example:
Code: Select all
e2e4 99 e7e5 99 *
        e7e6 15 * *
d2d4 33 d7d5 99 * * *


Every move is followed by a weight. * takes back a move.
In the starting position, there are two alternatives: e2e4 is chosen with a probability of 0.75 /* 99 / (99+33) */ and d2d4 with a probability of 0.25.
After e2e4 the move e7e5 is chosen with a probability of ~0.87 /* 99 / (99+15) */ and e7e6 with a probability of ~0.13 /* 15 / (99+15) */.
In the starting position, the book is read into the same memory that is used for the hash table later in the game. It is not used like the hash table but in a very different way. A hash key, one move and its weight is stored in one hash table slot. When there is more than one book move in a position, there is also more than one element with the same hash key (but different move and weight) in the table. The table is sorted by hash key for a binary search.

There are some extensions for Chess960 books but they are still experimental.

Advantages:
- fast lookup
- no extra memory required
- human readable book format (In public books I remove blanks from the file to get a better compression for the .zip file.)

Disadvantages:
- no chance to find book moves when out of book and back into book by transposition
- maximum book size depends on Hash size


How many plies are common?
I don't know how many are common, my limit is 45 plies but most lines are shorter (see below).


Should some lines be longer than others?

Normally a position only goes into the book, when it was reached in more than one game. The number of games is a parameter of the book creation. The weight of a game is also a parameter of the book creation, so you can make high quality games more important.

What can I do with the information about who won the game/is it used? Are moves always chosen by the frequence they are played at a certain position?...

When you put much work in selecting your games, there is no need to use the game result. My engine Hermann prefers moves that have a high average score.

From the starting position you are "in book". That means I try to lookup a book move for each position encountered and will do *no* search.. until a position with no book move is found. Upon then I am out of book and will not try to lookup book moves. Is this o.k. or are there better ways to handle openings...

There are other ways, but I also do it like that. One disadvantage is, that once you are out of book, there is no way to find book moves again, even if another book position is reached later in the game.
A little improvement is possible for some engines: Also cancel your "in book" state, when you can be sure, your next move will be out of book. When you can start pondering earlier, this is more than just one saved book lookup.

Greetings
Volker
Volker Annuss
 
Posts: 49
Joined: 25 Jan 2005, 11:14

Re: Opening Book

Postby Bernd Nürnberger » 01 May 2006, 10:03

Hello Volker,

thanks for your reply.

Volker Annuss wrote:my opening book format is like in this example:
Code: Select all
e2e4 99 e7e5 99 *
        e7e6 15 * *
d2d4 33 d7d5 99 * * *


Every move is followed by a weight. * takes back a move.
In the starting position, there are two alternatives: e2e4 is chosen with a probability of 0.75 /* 99 / (99+33) */ and d2d4 with a probability of 0.25.
After e2e4 the move e7e5 is chosen with a probability of ~0.87 /* 99 / (99+15) */ and e7e6 with a probability of ~0.13 /* 15 / (99+15) */.
In the starting position, the book is read into the same memory that is used for the hash table later in the game. It is not used like the hash table but in a very different way. A hash key, one move and its weight is stored in one hash table slot. When there is more than one book move in a position, there is also more than one element with the same hash key (but different move and weight) in the table. The table is sorted by hash key for a binary search.


On the one hand the idea to use the TT for book is nice ... on the other hand, I have two reasons to not do it that way:
- I wanna write clean OO code.. each object has a clear concern; I would have to clutter my TT code too much
- Initially I also thought of holding book in main memory.. but Ilari's performance test showed me that there is little time wasted by looking up book moves externally..

Volker Annuss wrote:There are some extensions for Chess960 books but they are still experimental.


Wasn't the indention of Chess960 to get rid of opening books? :D

Volker Annuss wrote:Advantages:
- fast lookup
- no extra memory required
- human readable book format (In public books I remove blanks from the file to get a better compression for the .zip file.)


Yes, these are clearly advantages.. but as noted, the external lookups seems to need really less time as I thought earlier..

Disadvantages:
- no chance to find book moves when out of book and back into book by transposition
- maximum book size depends on Hash size


The second point would really bother me.. don't like limits and avoid it as much as I can..

Should some lines be longer than others?

Normally a position only goes into the book, when it was reached in more than one game. The number of games is a parameter of the book creation. The weight of a game is also a parameter of the book creation, so you can make high quality games more important.


How do you rate a game to be of high quality? By ELO of participants?

What can I do with the information about who won the game/is it used? Are moves always chosen by the frequence they are played at a certain position?...

When you put much work in selecting your games, there is no need to use the game result. My engine Hermann prefers moves that have a high average score.


What do you mean with high average score? That the positoin, the moves leads to occurs in many games?

Bernd
Bernd Nürnberger
 
Posts: 23
Joined: 28 Apr 2006, 10:55

Re: Opening Book

Postby Ilari Pihlajisto » 01 May 2006, 12:30

I did another benchmark, this time with a 900KB book instead of the 8MB book that I used earlier.

Code: Select all
Legal moves: 20, Book moves: 12, time: 26 ms
Legal moves: 21, Book moves: 5,  time: 28 ms
Legal moves: 23, Book moves: 4,  time: 30 ms
Legal moves: 27, Book moves: 4,  time: 35 ms
Legal moves: 30, Book moves: 0,  time: 39 ms
Legal moves: 34, Book moves: 0,  time: 41 ms
Legal moves: 33, Book moves: 0,  time: 46 ms
Legal moves: 35, Book moves: 0,  time: 45 ms
Legal moves: 38, Book moves: 0,  time: 49 ms
Legal moves: 37, Book moves: 0,  time: 49 ms
Legal moves: 35, Book moves: 0,  time: 46 ms
Legal moves: 37, Book moves: 0,  time: 49 ms
Legal moves: 41, Book moves: 0,  time: 55 ms
Legal moves: 45, Book moves: 0,  time: 55 ms
Legal moves: 7,  Book moves: 0,  time: 9 ms
Legal moves: 45, Book moves: 0,  time: 60 ms
Legal moves: 34, Book moves: 0,  time: 43 ms
Legal moves: 44, Book moves: 0,  time: 60 ms
Legal moves: 47, Book moves: 0,  time: 63 ms
Legal moves: 47, Book moves: 0,  time: 61 ms
Legal moves: 47, Book moves: 0,  time: 63 ms
Legal moves: 51, Book moves: 0,  time: 65 ms
Legal moves: 49, Book moves: 0,  time: 68 ms
Legal moves: 47, Book moves: 0,  time: 61 ms
Legal moves: 42, Book moves: 0,  time: 53 ms
Legal moves: 40, Book moves: 0,  time: 52 ms
Legal moves: 3,  Book moves: 0,  time: 4 ms
Legal moves: 39, Book moves: 0,  time: 136 ms
Legal moves: 45, Book moves: 0,  time: 92 ms

Average time: 51 ms


Even though the book is now 9 times smaller, the lookup time was reduced by only about 30 percent, thanks to the good performance of the binary search.
User avatar
Ilari Pihlajisto
 
Posts: 78
Joined: 18 Jul 2005, 06:58

Re: Opening Book

Postby Bernd Nürnberger » 01 May 2006, 13:49

Hi Ilari,

yes, I think the time is negligible.. an binary search does scale well... doubling size should only cause one more seek per move...

I think I will also use this approach...

Bernd
Bernd Nürnberger
 
Posts: 23
Joined: 28 Apr 2006, 10:55

Re: Opening Book

Postby Ilari Pihlajisto » 01 May 2006, 15:40

Btw, do you have any plan how you're going to store the positions from PGN files into the book? I know that it doesn't really matter how quickly a book can be generated, but it was a nice challenge to optimize. Currently I'm storing the positions in an Avl tree (a balanced binary tree), and when all the PGN games have been read, write the positions into the book file. This was 50x faster than using an array and quicksorting it, and allows my engine to create an 8MB book from 100MB of PGN files in just 20 seconds.
User avatar
Ilari Pihlajisto
 
Posts: 78
Joined: 18 Jul 2005, 06:58

Re: Opening Book

Postby Bernd Nürnberger » 01 May 2006, 16:05

Hi,

I also thought of some sort of tree. Initially I thought that I would use a tree to hold the opening book internal. But this is not necessary looking at the numbers you posted (hopefully, I will get similiar results with Java.. :| ).

But for creation I think, I will also stick to some tree. I think a *binary* tree is a good idea. While generating, you can find the position whose score to update quickly in O(log n) and writing the book to a file is a simple depth first traversal of the tree. If memory is too small, you could also think of a B-Tree and write to a file simultaneously while browsing PGN.. but I think memory is not a problem any more these days... the computer I am just using has 2 GB.. should be enough :)

I will tell you my numbers when I will have finished my opening book... at the moment I am about rewriting the core.. after that I will come back to the opening book...

Greetings,

Bernd
Bernd Nürnberger
 
Posts: 23
Joined: 28 Apr 2006, 10:55

Re: Opening Book

Postby Volker Böhm » 01 May 2006, 17:24

Hi,

Spike does use a different approach for a book. It is a very old colde (where memory was expensive). Spike stores a move in two bytes including a weight and the tree-management information.
In these two bytes one bit is set if the current position has a child-move and another bit is set if the current move has one more sibling move. This is enought to be able to walk through the tree. Three bits are used to store the weight from 0-7 and 11 bits are used to store the move.
For every lookup Spike runs through the whole amount of moves (depth first search through a tree) until it finds the position it looks for. With some simple tricks this is fast enough for some million of nodes in the tree.

Pros:
1. Very small book even with large amount of moves
2. Supports transpositions
3. No disk access while playing (I like to turn of disk while playing tournaments)

Cons:
1. Complicated code, bat to debug
2. Binary format, not human readable
3. Doesnt work well with really huge books, thus with Chess960 Books it could get problems (largest Book I tried were somewhere above 5 million moves that is no problem on current computers).

Greetings Volker
Volker B?hm, Spike Team
Volker Böhm
 
Posts: 66
Joined: 27 Sep 2004, 18:52
Location: Germany

Re: Opening Book

Postby Ilari Pihlajisto » 01 May 2006, 17:46

Yes, a binary tree is definitely a good data structure here. But it should be a balanced binary tree (avl tree, red-black tree etc.) because when you want to add new positions to an existing book file, the book has to be first added to the tree. And because the book is already ordered we get a worst-case scenario if we use a normal binary tree.
User avatar
Ilari Pihlajisto
 
Posts: 78
Joined: 18 Jul 2005, 06:58

Re: Opening Book

Postby Volker Böhm » 02 May 2006, 14:55

Hi Ilari,

I don?t understand your posting. Why must the tree be "balanced"? As far as I remember a balanced tree is a tree where each subtree of a node has the as near as possible the same amount of nodes.

What does this condition help for a binary book tree?

Greetings Volker
Volker B?hm, Spike Team
Volker Böhm
 
Posts: 66
Joined: 27 Sep 2004, 18:52
Location: Germany

Next

Return to Programming and Technical Discussions

Who is online

Users browsing this forum: No registered users and 44 guests