Winboard Forum

by **Pradu** » 09 Nov 2006, 21:49

bob wrote:Otherwise the advantage over rotated bitboards seems to be pretty well zero...

There is one advantage, even if it isn't in speed. Magic move bitboard generators are independant of the position and this allows you to do some really neat tricks such as creating xray-attacks and finding pins in all directions simultaneously.

by **smcracraft** » 10 Nov 2006, 06:18

Pradu wrote:
bob wrote:Otherwise the advantage over rotated bitboards seems to be pretty well zero...

There is one advantage, even if it isn't in speed. Magic move bitboard generators are independant of the position and this allows you to do some really neat tricks such as creating xray-attacks and finding pins in all directions simultaneously.

How would I do that with this?

Code: Select all: return ( (GameBoard.p[side][pawn] & MoveArray[ptype[side^1]][tsq]) | (Knight[tsq] & GameBoard.p[side][knight]) | (Bmagic(tsq^63, GameBoard.Occupied) & (GameBoard.p[side][bishop]|GameBoard.p[side][queen])) | (Rmagic(tsq^63, GameBoard.Occupied) & (GameBoard.p[side][rook]|GameBoard.p[side][queen])) | (King[tsq] & GameBoard.p[side][king]) );

This expression returns bits of all attackers on the board
for the given side.

I'd like to use your idea to obtain Pinned pieces
as well as X-ray for uses in other ways...

but have no idea how to pursue it.

Stuart

by **Pradu** » 10 Nov 2006, 07:58

smcracraft wrote:
Pradu wrote:
bob wrote:Otherwise the advantage over rotated bitboards seems to be pretty well zero...

There is one advantage, even if it isn't in speed. Magic move bitboard generators are independant of the position and this allows you to do some really neat tricks such as creating xray-attacks and finding pins in all directions simultaneously.

How would I do that with this?

Stuart

Code: Select all: attacks=Rmoves(square,occupancy); xrayattacks=Rmoves(square,(occupancy&~attacks)) & ~attacks

Code: Select all: attacks=Bmoves(square,occupancy); batteryattacksB=Bmoves(square,(occupancy&~Queens[myside])) & ~attacks

Code: Select all: Finding diagonal pinners //Diagonal pins possiblePinned=Bmoves(pos->KingPos[side],pos->AllPieces)&pos->PiecesSide[side]; pinners=Bmoves(pos->KingPos[side],pos->AllPieces^possiblePinned) & ((pos->Pieces[B]|pos->Pieces[Q])&pos->PiecesSide[!side]);

by **bob** » 10 Nov 2006, 14:45

Pradu wrote:
bob wrote:Otherwise the advantage over rotated bitboards seems to be pretty well zero...

There is one advantage, even if it isn't in speed. Magic move bitboard generators are independant of the position and this allows you to do some really neat tricks such as creating xray-attacks and finding pins in all directions simultaneously.

I'm not sure I follow. I've always done x-ray attacks using rotated bitboards with no effort. Ditto for pins, where I need that in GenerateCheckEvasions()... And even doing all that, I got a zero speed increase...

by **smcracraft** » 10 Nov 2006, 15:46

Pradu wrote:
smcracraft wrote:
Pradu wrote:
bob wrote:Otherwise the advantage over rotated bitboards seems to be pretty well zero...

There is one advantage, even if it isn't in speed. Magic move bitboard generators are independant of the position and this allows you to do some really neat tricks such as creating xray-attacks and finding pins in all directions simultaneously.

How would I do that with this?

Stuart

Code: Select all
attacks=Rmoves(square,occupancy); xrayattacks=Rmoves(square,(occupancy&~attacks)) & ~attacks

Code: Select all
attacks=Bmoves(square,occupancy); batteryattacksB=Bmoves(square,(occupancy&~Queens[myside])) & ~attacks

Code: Select all
Finding diagonal pinners //Diagonal pins possiblePinned=Bmoves(pos->KingPos[side],pos->AllPieces)&pos->PiecesSide[side]; pinners=Bmoves(pos->KingPos[side],pos->AllPieces^possiblePinned) & ((pos->Pieces[B]|pos->Pieces[Q])&pos->PiecesSide[!side]);

Thanks much. This is one set of code I won't be seeking to
provide conditional compilation #ifdef's for with non-magic!

I will be spending some time this weekend to verify the
above does indeed find x-rays, diagonals, and batteries
and then thinking for quite a while on how best to use
them in the program.

Magic is *exactly* the kind of library-support I had hoped would
be provided some day when I read Frey/Slate/Atkin bemoan
the lack of tools for chess debugging.

Magic goes further than any tool could, by providing a
significant subset of the hardest part of the move generator
to do efficiently, and provide a method where simple table
lookup can be used across all pieces, not just non-sliders.

Stuart

by **Pradu** » 10 Nov 2006, 18:46

bob wrote:I'm not sure I follow. I've always done x-ray attacks using rotated bitboards with no effort. Ditto for pins, where I need that in GenerateCheckEvasions()... And even doing all that, I got a zero speed increase...

Well it is atleast as fast as rotated according to your tests. Can you post a quick profile of Crafty with magics here? It could be that generating move bitboards isn't done much timewise in Crafty. As for comparing versitality of rotated and magic approaches:

Let me guess that you have extra rotated xray-databases (RmovesXray[64][64], BmovesXray[64][64]) to do fast xrays? Even if one can do xrays with rotated, it isn't versitle. One can probably get away with it for checking pins but for anything complex it won't work because you can't xray selectively.

Some things you can do faster with magics are like finding pinners to your pieces (not opponents pieces as well).

Code: Select all: possiblePinned=Bmoves(pos->KingPos[side],pos->AllPieces)&pos->PiecesSide[side]; pinners=Bmoves(pos->KingPos[side],pos->AllPieces^possiblePinned) & ((pos->Pieces[B]|pos->Pieces[Q])&pos->PiecesSide[!side]);

Take the example of battery attacks which you can't do with rotated at all.

Code: Select all: //Battery attacks including regular attacks batteryattacksB=Bmoves(square,occupancy&~diagonalSliders[myside])

Tell me how one would do battery with rotated bitboards without any expensive LastOne/FirstOne calls and without 6 additional rotated bitboards for bishops+queens and rooks+queens. (You'll need to add 3 extra rotated bitboards for every case to generate selective xray attacks).

Or how about multiple xrays which might be required in SEE?

The thing is, one can't avoid FirstOne/LastOne calls completely to do some basic things with rotated bitboards...unless you can show me otherwise.

In any case, anything you can do with rotated, you can do with magics -- and then some. :mrgreen:

by **bob** » 10 Nov 2006, 21:10

Pradu wrote:
bob wrote:I'm not sure I follow. I've always done x-ray attacks using rotated bitboards with no effort. Ditto for pins, where I need that in GenerateCheckEvasions()... And even doing all that, I got a zero speed increase...

Well it is atleast as fast as rotated according to your tests. Can you post a quick profile of Crafty with magics here? It could be that generating move bitboards isn't done much timewise in Crafty. As for comparing versitality of rotated and magic approaches:

Let me guess that you have extra rotated xray-databases (RmovesXray[64][64], BmovesXray[64][64]) to do fast xrays? Even if one can do xrays with rotated, it isn't versitle. One can probably get away with it for checking pins but for anything complex it won't work because you can't xray selectively.

No, I just look at the diagonals to see if a sliding piece is beyond the piece in question (see swap.c and SwapXray() in it to see what I mean)...

Some things you can do faster with magics are like finding pinners to your pieces (not opponents pieces as well).
Code: Select all
possiblePinned=Bmoves(pos->KingPos[side],pos->AllPieces)&pos->PiecesSide[side]; pinners=Bmoves(pos->KingPos[side],pos->AllPieces^possiblePinned) & ((pos->Pieces[B]|pos->Pieces[Q])&pos->PiecesSide[!side]);

Take the example of battery attacks which you can't do with rotated at all.
Code: Select all
//Battery attacks including regular attacks batteryattacksB=Bmoves(square,occupancy&~diagonalSliders[myside])

Tell me how one would do battery with rotated bitboards without any expensive LastOne/FirstOne calls and without 6 additional rotated bitboards for bishops+queens and rooks+queens. (You'll need to add 3 extra rotated bitboards for every case to generate selective xray attacks).

In Swap() for example, After I "use" a queen or bishop to capture on a diagonal, I just look the other direction from that queen/bishop to see if there is another one back there. Very cheap...

Or how about multiple xrays which might be required in SEE?

See above. Once I use a piece, I do a very quick (one line) check to see if there is another slider behind it. And once I use that one, I look again. I don't really need to find all battery pieces at once, although I suppose I might find a way to use that reasonably in the SEE code. However, SEE is not a major part of my total time anyway...

The thing is, one can't avoid FirstOne/LastOne calls completely to do some basic things with rotated bitboards...unless you can show me otherwise.

At least in my SEE stuff I don't need any MSB/LSB stuff to find batteries at all... I do use MSB/LSB to pick off each individual piece as it participates in the capture "party"...

In any case, anything you can do with rotated, you can do with magics -- and then some.

There are complications. Sometimes I only want a specific diagonal. With rotated I can look that up. With magic I have to compute both and then remove one, which is not so easy with diagonals. I also use rank attacks or file attacks for things like asking if two rooks are connected on the 7th... Again I have to and off 1/2 the data the magic stuff gives me...

Here's my latest testing going back and modifying things to use your smaller tables:

32bit:
log.001: time=1:45 mat=0 n=93585150 fh=90% nps=883K
log.002: time=1:47 mat=0 n=93585150 fh=90% nps=869K

so there magic is actually slower, probably those 64 bit multiples that are killers on a 32 bit machine...

64bit:
log.001: time=1:19 mat=0 n=93585150 fh=90% nps=1.2M
log.002: time=1:19 mat=0 n=93585150 fh=90% nps=1.2M

No difference there at all. Looks like the 2mb version is actually faster in 32 bit mode and about equal on the 64 bit xeons we have. I'll try to run some opteron tests later...

by **Zach Wegner** » 11 Nov 2006, 03:31

bob wrote:At least in my SEE stuff I don't need any MSB/LSB stuff to find batteries at all... I do use MSB/LSB to pick off each individual piece as it participates in the capture "party"...

But the thing is, you don't need to do this, because you don't need to know the actual square of the attacker. You know the piece type, so you can just do an x ^= x & -x. Actually it is a little more complicated because you don't necessarily strip the LSB: y = x & piece_bb; x ^= y & -y. But this is still going to be faster than an LSB.

There are complications. Sometimes I only want a specific diagonal. With rotated I can look that up. With magic I have to compute both and then remove one, which is not so easy with diagonals. I also use rank attacks or file attacks for things like asking if two rooks are connected on the 7th... Again I have to and off 1/2 the data the magic stuff gives me...

I agree, and this is one of the reasons I don't use magic. However anti-rotated is the best of both worlds, easily modified occupied sets and separated directions.

Here's my latest testing going back and modifying things to use your smaller tables:

32bit:
log.001: time=1:45 mat=0 n=93585150 fh=90% nps=883K
log.002: time=1:47 mat=0 n=93585150 fh=90% nps=869K

so there magic is actually slower, probably those 64 bit multiples that are killers on a 32 bit machine...

64bit:
log.001: time=1:19 mat=0 n=93585150 fh=90% nps=1.2M
log.002: time=1:19 mat=0 n=93585150 fh=90% nps=1.2M

No difference there at all. Looks like the 2mb version is actually faster in 32 bit mode and about equal on the 64 bit xeons we have. I'll try to run some opteron tests later...

That's rather strange. I've never seen results where the large tables are faster. Perhaps you might want to run the tests multiple times, as there is some variation, at least in the 32 bit (1:45 and 1:47 for non-magic).

by **smcracraft** » 11 Nov 2006, 03:32

Pradu wrote:
smcracraft wrote:
Pradu wrote:
bob wrote:Otherwise the advantage over rotated bitboards seems to be pretty well zero...

There is one advantage, even if it isn't in speed. Magic move bitboard generators are independant of the position and this allows you to do some really neat tricks such as creating xray-attacks and finding pins in all directions simultaneously.

How would I do that with this?

Stuart

Code: Select all
attacks=Rmoves(square,occupancy); xrayattacks=Rmoves(square,(occupancy&~attacks)) & ~attacks

Code: Select all
attacks=Bmoves(square,occupancy); batteryattacksB=Bmoves(square,(occupancy&~Queens[myside])) & ~attacks

Code: Select all
Finding diagonal pinners //Diagonal pins possiblePinned=Bmoves(pos->KingPos[side],pos->AllPieces)&pos->PiecesSide[side]; pinners=Bmoves(pos->KingPos[side],pos->AllPieces^possiblePinned) & ((pos->Pieces[B]|pos->Pieces[Q])&pos->PiecesSide[!side]);

Hi,

I tried your code but am getting nowhere finding xrays, battery
and pins.

Please see:

Code: Select all: ShowXrayBatteryPins(void) { int square, side; BitBoard attacks, xrayattacks,batteryattacksB,pinners,possiblePinned, x; for (side = white; side <= black; side++) { x = GameBoard.AllPieces[side]; while (x) { square = NEXTBIT(x); CLEARBIT(x,square); if (square == c6 || square == c3 || square == f3 || square == f6) { printf("woops\n"); } attacks=Rmagic(square^63,GameBoard.Occupied); xrayattacks=Rmagic(square^63,(GameBoard.Occupied&~attacks)) & ~attacks; attacks=Bmagic(square^63,GameBoard.Occupied); batteryattacksB=Bmagic(square^63,(GameBoard.Occupied&~GameBoard.p[side][queen])) & ~attacks; possiblePinned=Bmagic(GameBoard.kingloc[side],GameBoard.AllPieces[side]); pinners=Bmagic(GameBoard.kingloc[side],GameBoard.Occupied^possiblePinned) & ((GameBoard.p[side][bishop]|GameBoard.p[side][queen]| GameBoard.p[side^1][bishop]|GameBoard.p[side^1][queen])&GameBoard.AllPieces[side^1]); if (xrayattacks) { printf("xrayattacks:\n"); ShowBitBoard(&xrayattacks); } if (batteryattacksB) { printf("batteryattacksB:\n"); ShowBitBoard(&batteryattacksB); } if (pinners) { printf("pinners:\n"); ShowBitBoard(&pinners); } } } }

by **bob** » 11 Nov 2006, 03:39

Zach Wegner wrote:
bob wrote:At least in my SEE stuff I don't need any MSB/LSB stuff to find batteries at all... I do use MSB/LSB to pick off each individual piece as it participates in the capture "party"...
But the thing is, you don't need to do this, because you don't need to know the actual square of the attacker. You know the piece type, so you can just do an x ^= x & -x. Actually it is a little more complicated because you don't necessarily strip the LSB: y = x & piece_bb; x ^= y & -y. But this is still going to be faster than an LSB.

Actually you do. Take the case of a bishop with a queen behind it and on another diagonal you have something like a pawn with a bishop behind it. As I use a piece in the Swap() calculation, I need to replace that piece with the piece that is behind it. Therefore I need to know _specifically_ which piece is in back. Knowing that there are lots of pieces in battery with other attackers is not important right now, I just need to know what is behind the piece I just used to make a capture...

There are complications. Sometimes I only want a specific diagonal. With rotated I can look that up. With magic I have to compute both and then remove one, which is not so easy with diagonals. I also use rank attacks or file attacks for things like asking if two rooks are connected on the 7th... Again I have to and off 1/2 the data the magic stuff gives me...
I agree, and this is one of the reasons I don't use magic. However anti-rotated is the best of both worlds, easily modified occupied sets and separated directions.

Here's my latest testing going back and modifying things to use your smaller tables:

32bit:
log.001: time=1:45 mat=0 n=93585150 fh=90% nps=883K
log.002: time=1:47 mat=0 n=93585150 fh=90% nps=869K

so there magic is actually slower, probably those 64 bit multiples that are killers on a 32 bit machine...

64bit:
log.001: time=1:19 mat=0 n=93585150 fh=90% nps=1.2M
log.002: time=1:19 mat=0 n=93585150 fh=90% nps=1.2M

No difference there at all. Looks like the 2mb version is actually faster in 32 bit mode and about equal on the 64 bit xeons we have. I'll try to run some opteron tests later...
That's rather strange. I've never seen results where the large tables are faster. Perhaps you might want to run the tests multiple times, as there is some variation, at least in the 32 bit (1:45 and 1:47 for non-magic).

I did run it multiple times with multiple positions. It caught my eye too... Remember my test is not just move generation, it is a complete search, where those 64 bit multiplies might get pipelined OK, but the extra memory reference can be a killer since cache is getting romped on pretty hard by everything and an extra L2 miss can be expensive... And with random accessing into the tables, fetching 128 bytes at a chunk (my L2) that extra reference might hurt more than expected.

by **smcracraft** » 11 Nov 2006, 04:24

Pradu,

Here's a rewrite that I think is more inline with your original
note, though it still doesn't find pinned pieces at c3, c6, f3, f6
after e4, e5, Nf3, Nf6, Nc3, Nc6, Bb5, Bb4, d3, d6, Bg5 Bg4.

Code: Select all: void /* * Show x-ray, battery, and pins */ ShowXrayBatteryPins(void) { int square, side; BitBoard attacks, xrayattacks,batteryattacksB,pinners,possiblePinned, x; for (side = white; side <= black; side++) { xrayattacks = batteryattacksB = pinners = NULLBITBOARD; x = GameBoard.p[side][bishop]|GameBoard.p[side][rook]; while (x) { square = NEXTBIT(x); CLEARBIT(x,square); if (BitPosArray[square] & GameBoard.p[side][bishop]) { attacks=Bmagic(square^63,GameBoard.Occupied); batteryattacksB=Bmagic(square^63,(GameBoard.Occupied&~GameBoard.p[side][queen])) & ~attacks; } else if (BitPosArray[square] & GameBoard.p[side][rook]) { attacks=Rmagic(square^63,GameBoard.Occupied); xrayattacks=Rmagic(square^63,(GameBoard.Occupied&~attacks)) & ~attacks; } if (xrayattacks) { printf("xrayattacks:\n"); ShowBitBoard(&xrayattacks); } if (batteryattacksB) { printf("batteryattacksB:\n"); ShowBitBoard(&batteryattacksB); } } possiblePinned=Bmagic(GameBoard.kingloc[side]^63,GameBoard.AllPieces[side]); pinners=Bmagic(GameBoard.kingloc[side]^63,GameBoard.Occupied^possiblePinned) & ((GameBoard.p[side][bishop]|GameBoard.p[side][queen]| GameBoard.p[side^1][bishop]|GameBoard.p[side^1][queen])&GameBoard.AllPieces[side^1]); if (pinners) { printf("pinners:\n"); ShowBitBoard(&pinners); } } }

by **Zach Wegner** » 11 Nov 2006, 04:33

bob wrote:Actually you do. Take the case of a bishop with a queen behind it and on another diagonal you have something like a pawn with a bishop behind it. As I use a piece in the Swap() calculation, I need to replace that piece with the piece that is behind it. Therefore I need to know _specifically_ which piece is in back. Knowing that there are lots of pieces in battery with other attackers is not important right now, I just need to know what is behind the piece I just used to make a capture...

I don't follow. Looking at the Crafty Swap() function, you can replace square with a bitboard that is set to WhitePawns & attacks, for example, and set attacked_piece to pawn or whichever piece you find. Then later you can find the LSB of this bitboard and clear it from attacks. You keep a bitboard with these LSBs ored into it, to represent what pieces have been used. The SwapXray function then uses this bitboard to calculate new attackers: Bmagic(square, Occupied & ~used) & (BishopsQueens & ~used), and likewise for rook directions (not sure if these names are the same as in Crafty). You can pass in the piece type to find which directions need to be recalculated (pawns, bishops, queens=diagonals; rooks, queens=orthogonals).

Of course whether this is faster than using bitscans, I don't know. I currently don't use xrays and I've been meaning to get rid of SEE as we know it altogether (I will post details if and when I can work it out).

by **Pradu** » 11 Nov 2006, 14:03

I did run it multiple times with multiple positions. It caught my eye too... Remember my test is not just move generation, it is a complete search, where those 64 bit multiplies might get pipelined OK, but the extra memory reference can be a killer since cache is getting romped on pretty hard by everything and an extra L2 miss can be expensive... And with random accessing into the tables, fetching 128 bytes at a chunk (my L2) that extra reference might hurt more than expected.

You are using MINIMIZE_MAGIC right? It only makes 1 L2 reference but the PERFECT_HASH thing uses 2 L2 references. I think most of us determined that PERFECT_HASH was a slower method, try MINIMIZE_MAGIC instead.

by **Gerd Isenberg** » 11 Nov 2006, 18:02

bob wrote:There are complications. Sometimes I only want a specific diagonal. With rotated I can look that up. With magic I have to compute both and then remove one, which is not so easy with diagonals. I also use rank attacks or file attacks for things like asking if two rooks are connected on the 7th... Again I have to and off 1/2 the data the magic stuff gives me...

Here's my latest testing going back and modifying things to use your smaller tables:

32bit:
log.001: time=1:45 mat=0 n=93585150 fh=90% nps=883K
log.002: time=1:47 mat=0 n=93585150 fh=90% nps=869K

so there magic is actually slower, probably those 64 bit multiples that are killers on a 32 bit machine...

64bit:
log.001: time=1:19 mat=0 n=93585150 fh=90% nps=1.2M
log.002: time=1:19 mat=0 n=93585150 fh=90% nps=1.2M

No difference there at all. Looks like the 2mb version is actually faster in 32 bit mode and about equal on the 64 bit xeons we have. I'll try to run some opteron tests later...

May be due to L2/L3 size the intel chips are faster with huge arrays, saving two additional L1 accesses for the variable shift and the pointer to the square-array. What imul performance does those 64-bit xeons have? Guess a lot more than the 4 K8 cycles. Let's see how crafty will perform on an amd-box. For 32-bit mode - 64bit imul is a three 32-bit imul call, i already found your 32-bit result stunning.

For distinct rank-, file- and diagonal-attacks you should really consider the 9.5K approach. I like to call it kindergarten bitboards now, to distinguish it from Lasse's or Pradu's magic bitboards. The multiplication of a file-pattern with a diagonal-, antidiagonal- or rank-pattern, with bits a-h are either zero or one, is rather easy to understand - even for preschool children i guess ;-)

Code: Select all: +- -+ +- -+ +- -+ | h 0 0 0 0 0 0 0 | | 0 0 0 0 0 0 0 0 | | 0 0 b c d e f g | | g 0 0 0 0 0 0 0 | | 0 0 1 0 0 0 0 0 | | 0 0 a b c d e f | | f 0 0 0 0 0 0 0 | | 0 0 0 1 0 0 0 0 | | 0 0 0 a b c d e | | e 0 0 0 0 0 0 0 | * | 0 0 0 0 1 0 0 0 | = | 0 0 0 0 a b c d | | d 0 0 0 0 0 0 0 | | 0 0 0 0 0 1 0 0 | | 0 0 0 0 0 a b c | | c 0 0 0 0 0 0 0 | | 0 0 0 0 0 0 1 0 | | 0 0 0 0 0 0 a b | | b 0 0 0 0 0 0 0 | | 0 0 0 0 0 0 0 1 | | 0 0 0 0 0 0 0 a | | a 0 0 0 0 0 0 0 | | 0 0 0 0 0 0 0 0 | | 0 0 0 0 0 0 0 0 | +- -+ +- -+ +- -+ +- -+ +- -+ +- -+ | 0 1 0 0 0 0 0 0 | | 0 0 0 0 0 0 0 0 | | 0 0 b c d e f g | | 0 1 0 0 0 0 0 0 | | 0 b 0 0 0 0 0 0 | | 0 0 b c d e f g | | 0 1 0 0 0 0 0 0 | | 0 0 c 0 0 0 0 0 | | 0 0 0 c d e f g | | 0 1 0 0 0 0 0 0 | * | 0 0 0 d 0 0 0 0 | = | 0 0 0 0 d e f g | | 0 1 0 0 0 0 0 0 | | 0 0 0 0 e 0 0 0 | | 0 0 0 0 0 e f g | | 0 1 0 0 0 0 0 0 | | 0 0 0 0 0 f 0 0 | | 0 0 0 0 0 0 f g | | 0 1 0 0 0 0 0 0 | | 0 0 0 0 0 0 g 0 | | 0 0 0 0 0 0 0 g | | 0 1 0 0 0 0 0 0 | | 0 0 0 0 0 0 0 0 | | 0 0 0 0 0 0 0 0 | +- -+ +- -+ +- -+

by **bob** » 11 Nov 2006, 22:46

Pradu wrote:
I did run it multiple times with multiple positions. It caught my eye too... Remember my test is not just move generation, it is a complete search, where those 64 bit multiplies might get pipelined OK, but the extra memory reference can be a killer since cache is getting romped on pretty hard by everything and an extra L2 miss can be expensive... And with random accessing into the tables, fetching 128 bytes at a chunk (my L2) that extra reference might hurt more than expected.

You are using MINIMIZE_MAGIC right? It only makes 1 L2 reference but the PERFECT_HASH thing uses 2 L2 references. I think most of us determined that PERFECT_HASH was a slower method, try MINIMIZE_MAGIC instead.

I am not actually using either. I started playing with this by using code posted here... The "minimize" version is what I am testing, not what you later released as "perfect".... Here is my AttacksRook() macro that I am testing with. I see four memory accesses there, the indices array, the mask array, the magic array and the shift array. Any or all of those could be out of L1/L2...

#define AttacksRook(square) *(magic_rook_indices[square]+((((tree->pos.w_occupied|tree->pos.b_occupied)&magic_rook_mask[square])*magic_rook[square])>>magic_rook_shift[square]))

Is that the version you are thinking about???

by **Pradu** » 12 Nov 2006, 10:15

bob wrote:#define AttacksRook(square) *(magic_rook_indices[square]+((((tree->pos.w_occupied|tree->pos.b_occupied)&magic_rook_mask[square])*magic_rook[square])>>magic_rook_shift[square]))
Is that the version you are thinking about???

Yes this is the correct one.

Winboard Forum

Fastest Magic Move Bitboard Generator ready to use

Re: Fastest Magic Move Bitboard Generator ready to use

Re: Fastest Magic Move Bitboard Generator ready to use

Re: Fastest Magic Move Bitboard Generator ready to use

Re: Fastest Magic Move Bitboard Generator ready to use

Re: Fastest Magic Move Bitboard Generator ready to use

Re: Fastest Magic Move Bitboard Generator ready to use

Re: Fastest Magic Move Bitboard Generator ready to use

Re: Fastest Magic Move Bitboard Generator ready to use

Re: Fastest Magic Move Bitboard Generator ready to use

Re: Fastest Magic Move Bitboard Generator ready to use

Re: Fastest Magic Move Bitboard Generator ready to use

Re: Fastest Magic Move Bitboard Generator ready to use

Re: Fastest Magic Move Bitboard Generator ready to use

Re: Fastest Magic Move Bitboard Generator ready to use

Re: Fastest Magic Move Bitboard Generator ready to use

Re: Fastest Magic Move Bitboard Generator ready to use

Who is online