Page 1 of 1

Opening Book Considerations

PostPosted: 08 Aug 2006, 15:50
by Tom Cant
I'v implemented a simple opening book for my engine (Klimax), and was wondering if there were any draw-backs to the way I'v done it. Basically, the book file looks like this:
Code: Select all
e2e4 c7c5 g1f3 b8c6 d2d4 c5d4 f3d4 g8f6 b1c3 d7d6 c1g5 e7e6 d1d2 f8e7 e1c1 e8g8
e2e4 c7c5 g1f3 b8c6 d2d4 c5d4 f3d4 g8f6 b1c3 d7d6 c1g5 e7e6 d1d2 a7a6 e1c1 h7h6
.
.
.
etc


I was about to write an app to convert openings in a PGN file to this format. Would it be a waste of time if I did so? Are there any major pit-falls here?

Thanks for your input!
Tom

Re: Opening Book Considerations

PostPosted: 08 Aug 2006, 16:59
by Alvaro Begue
Actually, that's the exact format that I used in Ruy-L?pez. Well, I also allowed the usage of "-" instead of " " between two moves, which means that the previous move should be avoided. This way, you can easily say "don't play the Sicilian" by entering this line somewhere in your opening book (I usually use the first lines for this):
e2e4 c7c5-

It has drawbacks (not very compact, much slower to access than using a hash, hard to encode exact probabilities for moves...), but it is really easy to edit, which is why I used it.

You will need some utilities to keep the file sorted, to remove repeated games and to verify that all the moves are valid. These are all easy to write.

Also, I recommend using separate files for white and black.

Re: Opening Book Considerations

PostPosted: 08 Aug 2006, 17:47
by YvesLejeail
One important point if you want to create a pgn reader inside your chess prog is to clean the pgn files with a special programme (like pgntrim5), i.e. removes all the dubious notations, comments, .... With this programme you can download on the web, it is also possible to format your pgn file with a constant number of moves,
which makes easier all the following pgn treatment. The programmation of the pgn treatment is then not so hard, once you have done these 2 steps.
Hope it helps, :wink:
Yves

Re: Opening Book Considerations

PostPosted: 08 Aug 2006, 18:03
by Tom Cant
Would someone be able to direct me to a tutorial/article for hashing the opening book?

Thanks for the replies, I really appreciate it!
Tom


[edit]:
Alvaro Begue wrote:Also, I recommend using separate files for white and black.

I'm not sure I understand the point of this. When my engine looks for a move in the book, it doesn't matter who the side-to-move is, since I know what the current move number is. Could you elaborate?

Re: Opening Book Considerations

PostPosted: 08 Aug 2006, 19:20
by Leen Ammeraal
For example, the book for White contains only good moves for White,
but may contain bad moves for Black. So White can also find a
good move in positions that occur only if Black had made a
bad move. Such a list of moves should obviously not be used
for Black to pick a move from.
I am not sure if that was Alvaro's reason for using separate
books for White and Black, but I do it in this way.
Leen Ammeraal

Re: Opening Book Considerations

PostPosted: 08 Aug 2006, 20:06
by Alvaro Begue
Leen Ammeraal wrote:For example, the book for White contains only good moves for White,
but may contain bad moves for Black. So White can also find a
good move in positions that occur only if Black had made a
bad move. Such a list of moves should obviously not be used
for Black to pick a move from.
I am not sure if that was Alvaro's reason for using separate
books for White and Black, but I do it in this way.
Leen Ammeraal

Yes, that is exactly why I keep them separated.

Re: Opening Book Considerations

PostPosted: 09 Aug 2006, 05:44
by Ron Murawski
Tom Cant wrote:Would someone be able to direct me to a tutorial/article for hashing the opening book?

Thanks for the replies, I really appreciate it!
Tom


Hi Tom,

Here's an easily understood format:
rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -
d2d4 37802/ e2e4 43715/ c2c4 23456/

The first line is the fen position, the second line contains the book's possible moves and the amount of times each move was played (perhaps from scanned pgn files).

The additional information of side-to-move, castling rights, and possible ep square ensure that illegal opening moves don't get played and permit all the information to be held in a single book file. This is a nice, human-readable format that can be edited by hand. The downside, due to the long fen description, is its non-compact-ness. But, since this format also handles all opening transpositions so that you do not have to repeat the entire move-sequence leading up to the current position, it is probably more compact than your format.

The fen position can easily be replaced by a 64-bit Zobrist hash value.
See: http://www.brucemo.com/compchess/progra ... obrist.htm
The resulting format would be called a hashed opening book and is a very compact way to store the opening moves. Hashed opening books can never be edited by hand because of the impossibility of starting with a 64-bit number and attempting to create the position that caused it. Zobrist hashing is called a "one-way function".
See: http://en.wikipedia.org/wiki/One-way_function
There is an extremely small (but measurable) possibility that a hashed opening book could cause an opening book error if two hashed positions happen to resolve to the same 64-bit number.

Edited to add:
All opening books should be pre-sorted (by fen or by Zorbist values) so you can use a fast binary search to fetch your moves.

Best regards,
Ron

Re: Opening Book Considerations

PostPosted: 09 Aug 2006, 19:38
by Tom Cant
Thanks everyone! That really helped. :)

Re: Opening Book Considerations

PostPosted: 12 Aug 2006, 11:37
by Nicolai Czempin
Ron Murawski wrote:
Tom Cant wrote:Would someone be able to direct me to a tutorial/article for hashing the opening book?

Thanks for the replies, I really appreciate it!
Tom


Hi Tom,

Here's an easily understood format:
rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -
d2d4 37802/ e2e4 43715/ c2c4 23456/

The first line is the fen position, the second line contains the book's possible moves and the amount of times each move was played (perhaps from scanned pgn files).

The additional information of side-to-move, castling rights, and possible ep square ensure that illegal opening moves don't get played and permit all the information to be held in a single book file. This is a nice, human-readable format that can be edited by hand. The downside, due to the long fen description, is its non-compact-ness. But, since this format also handles all opening transpositions so that you do not have to repeat the entire move-sequence leading up to the current position, it is probably more compact than your format.

The fen position can easily be replaced by a 64-bit Zobrist hash value.
See: http://www.brucemo.com/compchess/progra ... obrist.htm
The resulting format would be called a hashed opening book and is a very compact way to store the opening moves. Hashed opening books can never be edited by hand because of the impossibility of starting with a 64-bit number and attempting to create the position that caused it. Zobrist hashing is called a "one-way function".
See: http://en.wikipedia.org/wiki/One-way_function
There is an extremely small (but measurable) possibility that a hashed opening book could cause an opening book error if two hashed positions happen to resolve to the same 64-bit number.

Edited to add:
All opening books should be pre-sorted (by fen or by Zorbist values) so you can use a fast binary search to fetch your moves.

Best regards,
Ron


I also have separate files for white and for black. Instead of storing the Zobrist key, I give lines of the form "e2e4.e7e5 g1f3.b8c6"; that's only to make it easier to maintain.

I read the files on startup (before the game starts), and put them into the openings hashtable. When the game starts, I ask this hashtable for moves.

So the answer is almost instantaneous, no searching required.

This works for now because my book is relatively small (750 positions). But all I'd need to do once it gets significantly big is to make sure to free the memory once I'm out of the book, to make more space for the real hashtables.

I don't offer more than one move per position, although I will support that in a future release.