Fast(er) bitboard move generator
Posted: 14 Jun 2006, 15:05
Hi!
I read some of the posts here about deBruijn key hashing in bitboards, and decided to make a new bitboard engine without rotated boards.
First, my results using perft and comparing to crafty 20.1
My platform is Fedora FC5 x86_64 on an AMD64 3000+ 512k L2, 1Gbyte ram.
crafty: Starting position perft 6 uses 11.16s, mine uses 7.43s, 50%faster
Int the well known position "r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w QKqk -\n"
crafty uses 17.57s on perft 5. Mine uses 11.12s, 58% faster.
For fun i tested this "q3k3/6q1/4q3/2q5/5Q2/3Q4/1Q6/3K3Q w - -\n".
Crafty perft 5, 150.36s, mine 91.27s, 64.7% faster.
(yes, I have tested the number of moves, the move generator should be error free).
I made two tables: Attacksrook[64][8192] and AttacksBishop[64][8192] and
found a hash key (deBruijn key ?) for each square of the tables, so that
Pattern=RookMask[square]&(AllWPieces | AllBPieces);
Address= (Pattern*key[square])>>(64-13);
Attacks = AttacksRook[square][Address];
This reduses the number of instructions to make the combined rank+file attacks, by cost of 8Mbyte and cache pollution. Its faster for me, I used rotated bitboards till now
I dont expect my code to be optimal, I use no assembly inlined (actually I find deBruijn bitscanforward to be faster than the bsfq assembly instruction, even as I use H8 as bit 0 to avoid subtraction as used in crafty inlineamd.h).
If someone wants the source code give a message.(It was much easier to make than the rotated bitboards code).
Regards,
Lasse Hansen
I read some of the posts here about deBruijn key hashing in bitboards, and decided to make a new bitboard engine without rotated boards.
First, my results using perft and comparing to crafty 20.1
My platform is Fedora FC5 x86_64 on an AMD64 3000+ 512k L2, 1Gbyte ram.
crafty: Starting position perft 6 uses 11.16s, mine uses 7.43s, 50%faster
Int the well known position "r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w QKqk -\n"
crafty uses 17.57s on perft 5. Mine uses 11.12s, 58% faster.
For fun i tested this "q3k3/6q1/4q3/2q5/5Q2/3Q4/1Q6/3K3Q w - -\n".
Crafty perft 5, 150.36s, mine 91.27s, 64.7% faster.
(yes, I have tested the number of moves, the move generator should be error free).
I made two tables: Attacksrook[64][8192] and AttacksBishop[64][8192] and
found a hash key (deBruijn key ?) for each square of the tables, so that
Pattern=RookMask[square]&(AllWPieces | AllBPieces);
Address= (Pattern*key[square])>>(64-13);
Attacks = AttacksRook[square][Address];
This reduses the number of instructions to make the combined rank+file attacks, by cost of 8Mbyte and cache pollution. Its faster for me, I used rotated bitboards till now
I dont expect my code to be optimal, I use no assembly inlined (actually I find deBruijn bitscanforward to be faster than the bsfq assembly instruction, even as I use H8 as bit 0 to avoid subtraction as used in crafty inlineamd.h).
If someone wants the source code give a message.(It was much easier to make than the rotated bitboards code).
Regards,
Lasse Hansen