Hello,
[edit] : a small edit as I missed out something. The main eval() of Snailchess is disabled as it brings down nps by about 55%. Also the best move is not "best" in anyway as my move-ordering is half-done and has no meaning for almost all the moves.
I tested the magic codes and it happen to have TSCP's board mapping so it is easy for me to slot in different codes to compare 3 bitboard move generations 1) bitwise 2) rotated 3) magic. As I have incremental updates of sliding attacks, the partial updates of a single direction due to the positions of from/to all uses the same bitwise codes. Only when a sliding piece itself moves when a new generation of the attack map is needed would the relevant codes be called. Also whenever the king moves as Snailchess incrementally updates the "king-as-queen" attacks .
My hardware is a P4 1.4 Ghz 128mb ram, win98 and compiles with Visual C++ 4.02, blend of pentium/x86 codes, with "only _inline" option, ie if specified. Snailchess is non-recursive with a stack of structs. It has 2 copies of make(), same copy for for bitwise and magic. The rotated version is basically the same, with incremental codes for 2 rotated bishop BB + 1 rotated file BB. Also when on bitwise/magic, the rotated variables are deleted from the stack and the memory alignment happen to fit nicely in 128 byte. With rotated, which has 3 extra u64, it probably need some bad struct padding which may be a slight disadvantage.
1) magic -
As I don't know much about configuration, the magic codes is used
w/o any changes. A small test is done and it seems the fastest is with the following defines untouched, ie minimize + inline
#define MINIMIZE_MAGIC
#define USE_INLINING
2) rotated-
My rotated lookup is almost identical to Crafty's, the only difference is Crafty seperates the bs-lookup for 2 directions with total memory
2 X 32k bishop + 1 x 32k file + 1 x 32k rank. Mine has a total of 3 x 32k as the bs-lookup are ORed together and masking extracts the proper diretion. My incremental codes for updates of the rotated BB seems to have negligible overhead.
3) bitwise -
Although the codes don't seem to look efficient, it happens to do fairly well "on average". The reason probably is analagous to bitscan forward/ reverse. One of the up/down direction probably is like bitscan forward and looping terminates much faster than the other bad direction.
3 positions are tested with fixed-depth search and the the searches are done in the order as given below:-
pos 1
r4rk1/3b3p/pqn1pbp1/3p1n2/1P3P2/3B4/PPNBQ1PP/R2NKR2 w Q - 0 21
bitwise +6.3%
1 a1b1 ( 94) depth( 7) pvL( 8) maxPly(32) nps(148059) sec(116.27 ) ftime(0) Poll(262)
Probe (9426618) HitGE_D( 8.0% ) HitL23/L1( 4.2% ) HitL2( 10271 ) HitL3( 20082 )
Result 1/2-1/2{Analysis Mode}
rotate 0%
1 a1b1 ( 94) depth( 7) pvL( 8) maxPly(32) nps(139233) sec(123.64 ) ftime(0) Poll(262)
Probe (9426618) HitGE_D( 8.0% ) HitL23/L1( 4.2% ) HitL2( 10271 ) HitL3( 20082 )
Result 1/2-1/2{Analysis Mode}
magic -0.6%
1 a1b1 ( 94) depth( 7) pvL( 8) maxPly(32) nps(138382) sec(124.40 ) ftime(0) Poll(262)
Probe (9426618) HitGE_D( 8.0% ) HitL23/L1( 4.2% ) HitL2( 10271 ) HitL3( 20082 )
Result 1/2-1/2{Analysis Mode}
bitwise +6.6%
1 a1b1 ( 94) depth( 7) pvL( 8) maxPly(32) nps(148403) sec(116.00 ) ftime(0) Poll(262)
Probe (9426618) HitGE_D( 8.0% ) HitL23/L1( 4.2% ) HitL2( 10271 ) HitL3( 20082 )
Result 1/2-1/2{Analysis Mode}
end test1
pos 2
rnb1kbnr/ppp1pppp/2q5/3N4/2pPP3/5N2/PP1B1PPP/R2QKB1R b KQkq e3 0 1
bitwise +1.9%
1 b7b5 ( -70) depth( 8) pvL( 9) maxPly(33) nps(140466) sec(59.76 ) ftime(0) Poll(128)
Probe (4642192) HitGE_D( 10.4% ) HitL23/L1( 4.1% ) HitL2( 8533 ) HitL3( 10459 )
Result 1/2-1/2{Analysis Mode}
rotate 0%
1 b7b5 ( -70) depth( 8) pvL( 9) maxPly(33) nps(137791) sec(60.92 ) ftime(0) Poll(128)
Probe (4642192) HitGE_D( 10.4% ) HitL23/L1( 4.1% ) HitL2( 8533 ) HitL3( 10459 )
Result 1/2-1/2{Analysis Mode}
magic +3.3%
1 b7b5 ( -70) depth( 8) pvL( 9) maxPly(33) nps(142299) sec(58.99 ) ftime(0) Poll(128)
Probe (4642192) HitGE_D( 10.4% ) HitL23/L1( 4.1% ) HitL2( 8533 ) HitL3( 10459 )
Result 1/2-1/2{Analysis Mode}
bitwise +2.7%
1 b7b5 ( -70) depth( 8) pvL( 9) maxPly(33) nps(141508) sec(59.32 ) ftime(0) Poll(128)
Probe (4642192) HitGE_D( 10.4% ) HitL23/L1( 4.1% ) HitL2( 8533 ) HitL3( 10459 )
Result 1/2-1/2{Analysis Mode}
end test2
pos 3
1r4k1/5ppp/p7/P2pp2q/1r4n1/N4QBP/2P3P1/R4K1R b - - 0 28
bitwise +1.4%
1 b4e4 ( -115) depth( 8) pvL(10) maxPly(31) nps(151595) sec(39.27 ) ftime(0) Poll(90)
Probe (3385317) HitGE_D( 15.4% ) HitL23/L1( 4.3% ) HitL2( 8179 ) HitL3( 13427 )
Result 1/2-1/2{Analysis Mode}
rotate 0%
1 b4e4 ( -115) depth( 8) pvL(10) maxPly(31) nps(149464) sec(39.83 ) ftime(0) Poll(90)
Probe (3385317) HitGE_D( 15.4% ) HitL23/L1( 4.3% ) HitL2( 8179 ) HitL3( 13427 )
Result 1/2-1/2{Analysis Mode}
magic +2.4%
1 b4e4 ( -115) depth( 8) pvL(10) maxPly(31) nps(153076) sec(38.89 ) ftime(0) Poll(90)
Probe (3385317) HitGE_D( 15.4% ) HitL23/L1( 4.3% ) HitL2( 8179 ) HitL3( 13427 )
Result 1/2-1/2{Analysis Mode}
bitwise +1.4%
1 b4e4 ( -115) depth( 8) pvL(10) maxPly(31) nps(151595) sec(39.27 ) ftime(0) Poll(90)
Probe (3385317) HitGE_D( 15.4% ) HitL23/L1( 4.3% ) HitL2( 8179 ) HitL3( 13427 )
Result 1/2-1/2{Analysis Mode}
end test 3
As magic seems to fair badly in pos1, a repeat is done for pos1 again and showed test 1 ok
The probable reason is that the best move happen to be a rook move which is bad for magic
pos 1
bitwise
1 a1b1 ( 94) depth( 7) pvL( 8) maxPly(32) nps(148749) sec(115.73 ) ftime(0) Poll(262)
Probe (9426618) HitGE_D( 8.0% ) HitL23/L1( 4.2% ) HitL2( 10271 ) HitL3( 20082 )
Result 1/2-1/2{Analysis Mode}
magic
1 a1b1 ( 94) depth( 7) pvL( 8) maxPly(32) nps(137586) sec(125.12 ) ftime(0) Poll(262)
Probe (9426618) HitGE_D( 8.0% ) HitL23/L1( 4.2% ) HitL2( 10271 ) HitL3( 20082 )
Result 1/2-1/2{Analysis Mode}
Best Regards
Rasjid
magic codes:-
- Code: Select all
void getBAttackBB_SQ64(brMove_t * pMS, const u64 all){
static const u64 diagonal[2][16] = {
0x0000000000000001, 0x0000000000000102, 0x0000000000010204, 0x0000000001020408,
0x0000000102040810, 0x0000010204081020, 0x0001020408102040, 0x0102040810204080,
0x0204081020408000, 0x0408102040800000, 0x0810204080000000, 0x1020408000000000,
0x2040800000000000, 0x4080000000000000, 0x8000000000000000, 0000000000000000,
0x0000000000000080, 0x0000000000008040, 0x0000000000804020, 0x0000000080402010,
0x0000008040201008, 0x0000804020100804, 0x0080402010080402, 0x8040201008040201,
0x4020100804020100, 0x2010080402010000, 0x1008040201000000, 0x0804020100000000,
0x0402010000000000, 0x0201000000000000, 0x0100000000000000, 0000000000000000
};
//work with local dir[2], better then pMS->bb[]
const u64 dir[2] = {
diagonal[0][RANK64(pMS->from) + FILE64(pMS->from)],
diagonal[1][7 + RANK64(pMS->from) - FILE64(pMS->from)]
};
u64 u = Bmagic(pMS->from, all) | (dir[0] & dir[1]);
pMS->bb[0] = ( pMS->dir[0] = dir[0]) & u;
pMS->bb[1] = ( pMS->dir[1] = dir[1]) & u;
pMS->bal = pMS->bb[0] ^ pMS->bb[1];
assert( (pMS->bb[0] | pMS->bb[1]) ==
getBAttackBB_Shift(pMS->bb[0] & pMS->bb[1], all));
return;
}
bitwise
- Code: Select all
void getBAttackBB_SQ64(brMove_t * pMS, const u64 all){
static const u64 diagonal[2][16] = {
0x0000000000000001, 0x0000000000000102, 0x0000000000010204, 0x0000000001020408,
0x0000000102040810, 0x0000010204081020, 0x0001020408102040, 0x0102040810204080,
0x0204081020408000, 0x0408102040800000, 0x0810204080000000, 0x1020408000000000,
0x2040800000000000, 0x4080000000000000, 0x8000000000000000, 0000000000000000,
0x0000000000000080, 0x0000000000008040, 0x0000000000804020, 0x0000000080402010,
0x0000008040201008, 0x0000804020100804, 0x0080402010080402, 0x8040201008040201,
0x4020100804020100, 0x2010080402010000, 0x1008040201000000, 0x0804020100000000,
0x0402010000000000, 0x0201000000000000, 0x0100000000000000, 0000000000000000
};
//work with local dir[2], better then pMS->bb[]
const u64 dir[2] = {
diagonal[0][RANK64(pMS->from) + FILE64(pMS->from)],
diagonal[1][7 + RANK64(pMS->from) - FILE64(pMS->from)]
};
u64 pos = dir[0] & dir[1];
assert(pos & all);
for (int ii = 2; ii--; ){
u64 lb = 1;
assert(pos & dir[ii]);
u64 i;
if ( !(i = dir[ii] & all & BITS_L(pos)) );
else{
while ( i & i - 1){
i &= i - 1;
}
lb = i & 0-i;
}
i = dir[ii] & all & BITS_G(pos);
pMS->bb[ii] = ( pMS->dir[ii] = dir[ii]) & (((i &= 0 - i) - lb) | i);
}
pMS->bal = pMS->bb[0] ^ pMS->bb[1];
assert( (pMS->bb[0] | pMS->bb[1]) == getBMapSq64(pMS->from, all));
return;
}
rotated:
- Code: Select all
void getBMapRtd(const board_t * board, stack_t * pS, brMove_t * pElem){
static const u64 diagonal[2][16] = {
0x0000000000000001, 0x0000000000000102, 0x0000000000010204, 0x0000000001020408,
0x0000000102040810, 0x0000010204081020, 0x0001020408102040, 0x0102040810204080,
0x0204081020408000, 0x0408102040800000, 0x0810204080000000, 0x1020408000000000,
0x2040800000000000, 0x4080000000000000, 0x8000000000000000, 0000000000000000,
0x0000000000000080, 0x0000000000008040, 0x0000000000804020, 0x0000000080402010,
0x0000008040201008, 0x0000804020100804, 0x0080402010080402, 0x8040201008040201,
0x4020100804020100, 0x2010080402010000, 0x1008040201000000, 0x0804020100000000,
0x0402010000000000, 0x0201000000000000, 0x0100000000000000, 0000000000000000
};
pElem->dir[0] = diagonal[0][RANK64(pElem->from) + FILE64(pElem->from)];
pElem->dir[1] = diagonal[1][7 + RANK64(pElem->from) - FILE64(pElem->from)];
pElem->bb[0] = lookupRtdDiag( pS, pElem->from, 0) & pElem->dir[0];
pElem->bb[1] = lookupRtdDiag( pS, pElem->from, 1) & pElem->dir[1];
pElem->bal = pElem->bb[0] ^ pElem->bb[1];
return;
}
#define lookupRtdDiag(pS, pos, dir) (rtdDiagLookup[(pos)] \
[(int)((pS)->stack->bbRtdDiag[(dir)] >> rtdDiagShift[(dir)][(pos)]) & 63])
#define lookupFile(pS, pos) (ftrLookup[(pos)] \
[(int)((pS)->stack->bbFTR >> rtdFileShift[(pos)]) & 63])
#define lookupRank(pS, pos) (rankLookup[(pos)] \
[(int)((pS)->all >> rankShift[(pos)]) & 63])