Winboard Forum

Posted: **04 Jan 2011, 20:46**

I am a total newbie at coding a chess engine -- I just started a few days ago -- and I am a little at a loss about a few things:

After using bitwise operations to generate the bitboard of squares that a piece can move to (e.g.: whiteKnight & KNIGHT_ATTACKS[currSquare] & ~whitePiecesBitboard), do I then have to loop through it and extract the moves? Isn't this kind of wasteful?
With rotating bitboard, how do you rotated the board by 45 degrees (for bishop and queen moves)?
Instead of rotating the board, can't you just have a whole bunch of lookup tables for horizontal, vertical, and diagonal moves?

Thanks in advance!

- Moshe

Posted: **04 Jan 2011, 23:39**

Indeed, extracting the moves is an expensive part of bitboard move generation, but you only have to loop over bits that are set, as there are tricks to get the least-significant 1 bit making use of the CPU carry logic.

One does not rotate the bit board in rotated bitboard, but one keeps 4 (incrementally updated) boards in different oriantation.

The trick is to collect the relevant occupancy bits to make a table index from it. Only in one of the orientations the moves along a given ray form a contigous group of bits. But 'magic bitboards' use a 64-bit multiplication to collect the bits irrespective of orientation, and thus only need to store a single board orientation.

Posted: **05 Jan 2011, 14:08**

Instead of rotating the board, can't you just have a whole bunch of lookup tables for horizontal, vertical, and diagonal moves?

I would strongly suggest using magic bitboards. They are _much_ easier to use than rotated bitboards.

Posted: **09 Jan 2011, 14:16**

Another approach is called shifted bitboards - that's what I use in my engine.

Shifted bitboards are described on NagaSkaki web page. NagaSkaki is rated around 2000 ELO so it's quite good. My engine is rated around 1600 ELO so it's quite bad

GCC provides built-in functions for finding index of LSB etc.

Posted: **09 Jan 2011, 18:06**

BitSet wrote:Another approach is called shifted bitboards - that's what I use in my engine.

Shifted bitboards are described on NagaSkaki web page. NagaSkaki is rated around 2000 ELO so it's quite good. My engine is rated around 1600 ELO so it's quite bad

GCC provides built-in functions for finding index of LSB etc.

56 operations for rook or bishop looks a bit too expensive for my taste. Despite magics with rather huge tables has only three operations, and-mul-shift-lookup, I suggest to have a closer look to the similar working Classical Approach in one run with 13 64-bit operations (including four bitscans).

Posted: **10 Jan 2011, 18:54**

Is Classical Approach (NOT in one run) faster than Shifted Bitboards?

Posted: **10 Jan 2011, 19:37**

BitSet wrote:Is Classical Approach (NOT in one run) faster than Shifted Bitboards?

Depends on the 64-bit hardware, with recent >= core2 and K10 with fast bitscan instruction, yes. On old AMD K8 Athlons with dead slow 10/11 cycles vector path bsf/bsr probably not. In 32-bit mode so many 64-bit shifts are horror of course.

Per ray it is more or less 11 instructions (with some parallel gain, which otoh costs some registers), D is compile time constant

Code: Select all: (x << D) | (x << 2D) | (x << 3D) | (x << 4D) | (x << 5D) | (x << 6D)

versus one or-bitscan-L1lookup, the or "of the blocker" is needed for a branchless approach:

Code: Select all: raydirLookup[bitscan(X | ENDBIT)]

Since the non-rotated attack-getter interface is all equal, one may play with different implementations and approaches like Kindergarten BBs, Obstruction Difference, Hyperbola Quintessence, or finally Magic BBs later, if one has nothing else to do ;-)

One possible improvement of Shifted BBs would be to use parallel prefix shifts with 6 instead of 11 instructions (but more dependencies) :

Code: Select all: x |= (x << D); x |= (x << 2D); x |= (x << 4D);

but still ...

Posted: **10 Jan 2011, 21:58**

So in conclusion: there is no sense in replacing Shifted Bitboards with Classic Approach (even all in one) because on my K7 it won't work faster? All non-rotated approaches are like O(n^2) sorting algorithms compared to O(n*log(n)) QuickSort - magic boards? Am I right?

Posted: **11 Jan 2011, 00:29**

BitSet wrote:So in conclusion: there is no sense in replacing Shifted Bitboards with Classic Approach (even all in one) because on my K7 it won't work faster? All non-rotated approaches are like O(n^2) sorting algorithms compared to O(n*log(n)) QuickSort - magic boards? Am I right?

I fear not. I don't get your Big O thing, what n? There are no loops in getting attacks in all the mentioned methods, thus O(1) like an array access or two. Magic Bitboards is non-rotated as well (requires only one occupied bitboard) and it is all about Space-Time tradeoff, i.e. Kindergarten with tables in L1-cache versus Magics with the risk of cache misses, depending on the other cache- and memory footprint of the program.

On a K7 I would neither use Shifted nor Classical, which are Imho both overkill for a 32-bit machine and likely to hurt bitboard reputation compared to mailbox or 0x88 board arrays with offset move generation ;-)

Posted: **11 Jan 2011, 15:02**

OK, never mind, I won't go into details because most people on forum don't understand what was in my mind when I was typing. I only wanted to know if classic and shifted are equivalent in the sense of efficiency. And because they are, there is no sense in replacing shifted with classic. Magic are too complicated and I guess benefit in playing strength will be almost none.

Posted: **11 Jan 2011, 22:39**

BitSet wrote:OK, never mind, I won't go into details because most people on forum don't understand what was in my mind when I was typing. I only wanted to know if classic and shifted are equivalent in the sense of efficiency. And because they are, there is no sense in replacing shifted with classic. Magic are too complicated and I guess benefit in playing strength will be almost none.

If you have 10,000 cycles per node anyway and Shifted BB < 5% in total runtime of your program, yes. You are right for that specific 32-bit Athlon K7, that to replace Shifted by Classical is not the best choice. It is far less code and registers, but still too slow due to the K7 vector path 32-bit bsr/bsf. Both methods may end in the 100 cycle range or worse, while one can have it with 20 or less cycles with other non-rotated 32-bit methods, not to mention Rotated.

If a small nested dir/inc loop to traverse the squares on a board or mailbox array, even to set up an attack bitbord with BitSet, might become close in speed to a bitboard attack-getter, there seems something wrong with the choice of the board representation ;-)

Cheers,
Gerd

Winboard Forum

Bitboard Move Generation

Bitboard Move Generation

Re: Bitboard Move Generation

Re: Bitboard Move Generation

Re: Bitboard Move Generation

Re: Bitboard Move Generation

Re: Bitboard Move Generation

Re: Bitboard Move Generation

Re: Bitboard Move Generation

Re: Bitboard Move Generation

Re: Bitboard Move Generation

Re: Bitboard Move Generation