Winboard Forum

by **H.G.Muller** » 08 Sep 2006, 09:48

This stuff is really fascinating. 8-)

I just can't help wasting time on it. (I already have a very fast 0x88/mailbox-based move generator, and it is about the only thing I have, so I should really start working on search and evaluation... But for a mathemagician this stuff is almost irrisistable! :wink:

)

I have a pretty good idea why minimal-bit magics aren't possible even with redundant bits. XORing the redundancy bits for these sparse bitsets is really the same as addition, so setting redundancy bits is equivalent to adding a constant to the product. This can already be acheived quite effectively by havibng only a single redundancy bit that is always set. Without any computational overhead you could always leave the bit for one of the corner squares set (in both the masks and the AllPieces board). If you take a1 for that (the lsb!), it can be moved into the region of the key bits through the high bits of the magic, independent of anything else.

But it doesn't help much. The Bishop occupancy set has upto 9 bits, (in the center), and the Rook occupancy set upto 12 bits (for the corners). To make keys smaller than this, the magic multiplication would have to provide not only shuffling and shifting of bits to the high end of the bitboard. It would also have to do some sort of compression, by conveniently mapping occupancies that result in the same attack set onto the same key.

This now, can't be done. Multiplication with a constant (magic or otherwise) is a linear process, so it basically produces a linear combination of the occupancy bits occ[k]:

product = SUM w[k]*occ[k] + offset

Even in the ideal case where we could set the weights w[k] totally independently (because the occ[k] bits were spread out over the word farther apart than the key length), the weight of one bit is always the same, and not dependent on the presence or absence of other bits. (i.e. there are no terms like occ[k]*occ[m] contributing to the product). Rounding (clipping off the low bits) can introduce a very mild non-linear interaction between the bits (e.g. if they are both 1 the sum just exceeds the boundary between two 'quantization levels', but if only one or non are set, it is just below it). But you can use this trick only ones, by cleverly scaling the weights and chosing an offset (from the redundancy bits). And the type of compression we need is highly non-linear:

Suppose we are looking at 3 occupncy bits on one ray. They can have 8 states, that result in 4 different attack sets. If I write the first square on the ray (nearest to the attacker) left, these would be:

000: 4 moves
001: 3 moves
010: 2 moves
011: 2 moves
100: 1 move
101: 1 move
110: 1 move
111: 1 move

In theory the number of moves can be represented by 2 bits, one bit less than the occupancy. We see that the rightmost occupancy bit only makes a difference once (3 or 4 moves). This can be achieved by giving it a weight that is smaller than the quantization step, and tuning the offset such that this small step just exceeds the first threshold, but doesn't make any significant contribution when any of the other bits are set (i.e. any combination of their weights always stay clear of other thresholds.)

But then you have used up all your freedom: even if 000 is just below a threshold, 100 and 010 must fall in different 'bins' as 001, so the weight of one of them would have to be at least one quantum plus what you need to push the offset over the threshold, and the other still one quantum more. But that means that 110 will add at least three quanta plus twice the 'offset push', i.e. 110 can never be in the same bin as 100. So you cannot do with a two-bit key for this.

You will run into this problem every time. To have the farther pieces make a difference if the closer piece is absent, (as they must), they will make an unwanted difference if the closer piece is there. Because the difference they make to the product is independent on the presence or absence of other pieces. You will never be able to compress the key length by even one full bit.

Although this holds per ray, and this ray could be compressed a fractional bit (e.g. 5 states in stead of 8), you might hope that the presence of multiple rays (two for the Rook corners) might at least give you a full bit. But you can use the offset+rounding trick only once, you don't want the rounding in one ray to affect the rounding of the other, because the rays are truly independent.

Only if you would do more rounding (clip off not only the bits on the right, but also clear some bits in the center of the key) you will create independent thresholds. But this would be kind of defeating the purpose, you would be creating a key with holes in it.

So I think minimal magics just don't exist. For Rooks I find magics for 12-bit keys very easily. For 11-bit keys I find them easily, except for the corner squares (which I don't find at all, even at thousand times more searching) For 10-bit keys, I find them easily except for the edges...

by **Pradu** » 08 Sep 2006, 11:40

H.G.Muller wrote:So I think minimal magics just don't exist. For Rooks I find magics for 12-bit keys very easily. For 11-bit keys I find them easily, except for the corner squares (which I don't find at all, even at thousand times more searching) For 10-bit keys, I find them easily except for the edges...

I guess this is the common consensus, Lasse couldn't find all 11 bit keys either. There are 11-bit keys possible for the corners but only some of them (ask Grant Osborne if you want them). Anyways, it really dosen't matter that we need all 11 bit keys because it is only for a few squares. By the way, there isn't much math other than using ring theory and breaking down multiplication into shift-adds -- it really isn't that complex (hopefully people don't get daunted by something simple). I'm using magics because it is simpler conceptually than rotated for me, not just expressly for speed. :mrgreen:

It is true that if you add redundant bits it is the same as adding a constant in the end but...the index keys WILL change and all the lower bits after the multiply will matter too because of carrys. I haven't tried adding redundant bits to the occupancy bitbiard yet so I'm not sure if minimal bit magics exist if you add redundant bits. Also you can also have inversions when using redundant bits. Notice I used ^ instead of | when talking about them. So it isn't really just an addition at the end if you use ^ to add redundant bits. So I guess the possiblity of minimal bit magics is still alive ... but I'm too lazy to follow through with it.

I had some ideas about factoring out the magic number/key to break things down even more (just an idea in case your still interested in finding minimal bit magics). Please post if you do find them!

Also other than adding redundant bits, there might be other techniques to change the mapping of your occupancy. But the ^ is the cheapest I think.

by **Pradu** » 09 Dec 2006, 01:31

I have an idea for the generation of the best magic keys, but I am unable to implement it in code right now because of other things keeping me busy for the next month so I will post the idea here and hopefully someone will write some code for it and generate the best magic keys possible.

From here on out when I say n-bit magic I refer to a magic that will create an index that is n-bits.

The problem we are trying to solve is how to create 10-bit, 11-bit magics for 12 relevant bits in the occupancy. I believe one can figure this out by simplifying the problem by first assuming that we can create any index mapping we desire then figure out if the conditions are possible. I'll go through a quick example here:

Ok say we have a rank:
xabc-
With are rook placed on the x and a, b, and c as our relevant occupancy bits and - the edge square.

We know that there are 4 moves possible so it might be possible to have a 2-bit index (which I shall show not-possible later).

First lets try to create a 3-bit index. This is easy-each mapping maps to each bit in the index:

a maps to 001
b maps to 010
c maps to 100

And we will have a table with 8 entries. Now say we wanted to reduce this to two bit indecies. Lets try out an incorrect set of mappings for each bit:

a maps to 01
b maps to 10
c maps to 01

We can see clearly that this will not work because a=1,b=0,c=0 will produce the same bitboard as a=0,b=0,c=1, which is certainly not true. From this we can see a necessary condition:

All mappings for each relevant bit must be unique.

Ok.. lets try this then

a maps to 01
b maps to 10
c maps to 11

Oh but this does not work! if a=1,b=0,c=1 then our combined mapping is 01+11=0, but if a=0,b=0,c=0 the combined mapping is also 0. And we know for certain the moves of an empty occupancy is not the same as as the moves for one of the bits of the occupancy being on. So now we have proven any indecies less than 3-bits are not possible for 3 relevant bits. But this does not mean it isn't possible at all as it has been shown by 11-bit rook magics for 12 relevant bits and for many of the bishop magics as well. The above discussion leads us to the general condition for a working index mapping created by a magic:

The combinations (additions for the case of our magic shift-adds) of colliding mappings of the relevant bits onto the index must be unique.

Let me explain what the above means:

Lets expand our example to 6 bits like what is required for rooks

xabcdef-

a-f are our relevant occupancy bits (don't confuse with the file-letter).

So we can group our relevant bits into the colliding mappings:

if a=1 all values of bcdef will collide
if b=1 all values of cdef will collide
if c=1 all values of def will collide
ect.

Now we what we do to figure out our index mapping is to recursively try out sets of mappings until one of the mappings do not meet the criterion (it combines with a previous mappings to cause a transposition with a non-colliding mapping). After we figure out all the possible mappings, we go backwards and try to see if any multiplication of the actual placement of the bits in the bitboard can achieve that mapping, and if it can we have the best magic key possible. That's it, now we just have to write a program to do the above...which will take a lot of time, but we will go through a very small subset of 64 bit numbers and figure out the best magic keys possible in seconds instead of a million years :mrgreen:

. Once we figure out what changes we need to make to make the mappings work, we use the carry mappings to change the original mappings.

by **H.G.Muller** » 09 Dec 2006, 11:51

I want to add one remark:

Although I found it very hard to reduce the number of bits in a key by one below the obviously sufficient minimum, (which would reduce the table size for that square by a factor of 2), it seemed very easy to reduce the the table size by nearly a factor 2. It might thus pay to not only optimize on the number of bits, but also on the range spanned by the key. If a certain 12-bit magic is guaranteed to only produce a key between 0 and 2200, rather than between 0 and 4095, you could pack the tables correspondingly closer, and it is in practice nearly as good as an 11-bit key (which would require 2048 in stead of 2200 entries in the table.

Why do I think this is might be a lot easier?

Well, a multiplication of the occupancy bits abc... will give you a number A*a + B*b + C*c + .... The right-shift you do later can be seen as a division scaling the multipliers ABC... so that they need no longer be integers, but have an integer part of (say) 12 bits, and 20 bits of fraction.

Unfortunately the multipliers can't be chosen completely independently, because of the mapping of the squares in the bitboard. But suppose that they approximately can (most favorable case).

Now suppose a is the occupancy at the end of a long ray. (In particular the squares that require the most bits always have long rays!) Take the multiplier A of that occupancy equal to a very small negative (fractional) number -u. If no other squares of that ray are unoccupied, the contribution of that ray to the index would be 0 if a=0, and -u, which is rounded (chopped by the shift-right) to -1, if a=1. So a makes a difference, like it should, no matter how small u is, because of the rounding. Occupancy of the other ray(s) doesn't alter that, because we take their weights all integer.

The other 5 squares of the ray we give weights 1+u, 2+u, 4+u, 8+u, 16+u. If any of these are occupied, so the contribute to the key, addition of A*a does not have an effect after rounding. You thus automatically collapse states with a occupied/unoccupied to the same key in all cases were access to a is blocked. u must be small enough that even if all those bits are set, to give 31+5*u, the 5*u < 1, so it does not increase the key range. Not a very severe restriction, which is good, since many bits of the multipliers might not be freely selectable, due to the fact that the A, B, C, ... are in practice dependent on each other.

The range spanned is thus {-1, ... 31}, or 33 possibilities rather than the 64 you would expect for 6 bits. (Just to big to be encodeble in 5 bits, though...) The weights of the squares of the other ray could thus be taken 33, 66, 132, ... You safe a factor 33/64 on the size of the table.

by **Pradu** » 09 Dec 2006, 13:06

H.G.Muller wrote:I want to add one remark:

Although I found it very hard to reduce the number of bits in a key by one below the obviously sufficient minimum, (which would reduce the table size for that square by a factor of 2), it seemed very easy to reduce the the table size by nearly a factor 2. It might thus pay to not only optimize on the number of bits, but also on the range spanned by the key. If a certain 12-bit magic is guaranteed to only produce a key between 0 and 2200, rather than between 0 and 4095, you could pack the tables correspondingly closer, and it is in practice nearly as good as an 11-bit key (which would require 2048 in stead of 2200 entries in the table.

Why do I think this is might be a lot easier?

Well, a multiplication of the occupancy bits abc... will give you a number A*a + B*b + C*c + .... The right-shift you do later can be seen as a division scaling the multipliers ABC... so that they need no longer be integers, but have an integer part of (say) 12 bits, and 20 bits of fraction.

ABC is the factored form of the magic? I usually factor it by powers of 2 to make the* a <<. Btw a right-shift is more accurate

. A division by a value other than a power of 2 is far different than a right-shift by a power of 2. For example if it wasn't a power of 2 it would have a factor like 3 or something. Division by non-power of 2s isn't as clear because one cannot use the distributive property. For multiplication it was simply (A + B)( bit1 + bit2 + bit3) and we can get our shift-adds, but for division it is (1/(A+B))(bit1 + bit2 + bit3) so the mapping cannot be split up into independent factors of the magic.

Unfortunately the multipliers can't be chosen completely independently, because of the mapping of the squares in the bitboard. But suppose that they approximately can (most favorable case).

This is true, we must also remember the carry mappings into the index from the lower bits.

Now suppose a is the occupancy at the end of a long ray. (In particular the squares that require the most bits always have long rays!) Take the multiplier A of that occupancy equal to a very small negative (fractional) number -u.

Why would it matter if it was negative or not? What do you mean by fractional number?

If no other squares of that ray are unoccupied, the contribution of that ray to the index would be 0 if a=0, and -u, which is rounded (chopped by the shift-right) to -1, if a=1. So a makes a difference, like it should, no matter how small u is, because of the rounding. Occupancy of the other ray(s) doesn't alter that, because we take their weights all integer.

The other 5 squares of the ray we give weights 1+u, 2+u, 4+u, 8+u, 16+u.

Oh I see now, you mean u=MSB.

If any of these are occupied, so the contribute to the key, addition of A*a does not have an effect after rounding. You thus automatically collapse states with a occupied/unoccupied to the same key in all cases were access to a is blocked. u must be small enough that even if all those bits are set, to give 31+5*u, the 5*u < 1, so it does not increase the key range. Not a very severe restriction, which is good, since many bits of the multipliers might not be freely selectable, due to the fact that the A, B, C, ... are in practice dependent on each other.

The range spanned is thus {-1, ... 31}, or 33 possibilities rather than the 64 you would expect for 6 bits. (Just to big to be encodeble in 5 bits, though...) The weights of the squares of the other ray could thus be taken 33, 66, 132, ... You safe a factor 33/64 on the size of the table.

The explanation isn't clear, can you give an example and generate a magic here for square A1 for rooks?

I seriously doubt though you can save a half without a lower bit key. For this to happen you must make sure that when a=1 or a=0 all combinations of the other bits must be a low number, and since your method by hand has each one pretty much mapping to it's own bit, this is very difficult.

Give an example magic for A1 and I'll agree completely it is possible.

by **H.G.Muller** » 09 Dec 2006, 13:22

Well, perhaps it is better to use a more pragmatic approach, and simply search for a magic that uses as small a key range as possible for the given number of bits, rather than just taking the first one that works, without worrying about the theory too much. Since the keys always start at 0 (for an empty board), that means keeping track of the maximum key a magic produces, and make that as small as possible. If you can make it less than half of the available range, you know you have found a sub-minimal key, which can be encoded with one bit less. But the number of bits really isn't that important. It is the space taken by the tables that counts, and any savings will help, even if they don't reduce the number of bits.

by **Michael Sherwin** » 09 Dec 2006, 16:03

H.G.Muller wrote:Well, perhaps it is better to use a more pragmatic approach, and simply search for a magic that uses as small a key range as possible for the given number of bits, rather than just taking the first one that works, without worrying about the theory too much. Since the keys always start at 0 (for an empty board), that means keeping track of the maximum key a magic produces, and make that as small as possible. If you can make it less than half of the available range, you know you have found a sub-minimal key, which can be encoded with one bit less. But the number of bits really isn't that important. It is the space taken by the tables that counts, and any savings will help, even if they don't reduce the number of bits.

Excuse me for repeating myself from another thread. If memory usage is the main concerne here as that is what is slowing down the magic method the most then going to a split index would solve that. Just for example, have the magic pack the 12 bits for a corner square after shifting back, into the right most bits of ranks 1 & 2. Then the split index would require [2 ^ 6] + [2 ^ 6] = 128 memory cells rather than [2 ^ 12] = 4096 memory cells! We need to find out what the trade off is for huge memory reduction at the cost of a couple of extra instructions. This will have to be done for a fully functional program with a large positional hash, so the drag on the system by the move generators can be accuratly measured in true real world circumstances.

by **H.G.Muller** » 09 Dec 2006, 16:13

I think you are right in this respect, but because the index splitting gives such a huge memory saving, I set my aim higher: I would like it to work for all Queen directions simultaneously, rather than just Rook or Bishop separately. In that case the normal minimal key goes up to 19 bits, so 10+9 after splitting. Although that is already better than a Rook in unsplit magic, the memory saving you would get from making that the 10-part doesn't use the full 1024 entries but below 600 can still be substantial. So it is still worth thinking along these lines.

by **Michael Sherwin** » 09 Dec 2006, 17:20

H.G.Muller wrote:I think you are right in this respect, but because the index splitting gives such a huge memory saving, I set my aim higher: I would like it to work for all Queen directions simultaneously, rather than just Rook or Bishop separately. In that case the normal minimal key goes up to 19 bits, so 10+9 after splitting. Although that is already better than a Rook in unsplit magic, the memory saving you would get from making that the 10-part doesn't use the full 1024 entries but below 600 can still be substantial. So it is still worth thinking along these lines.

Agreed!

However, we still need the analysis of a one to one comparison b & r vs. b & r, before we will know just where we stand with, q vs. b & r.

by **Tord Romstad** » 11 Dec 2006, 18:53

H.G.Muller wrote:Well, perhaps it is better to use a more pragmatic approach, and simply search for a magic that uses as small a key range as possible for the given number of bits, rather than just taking the first one that works, without worrying about the theory too much. Since the keys always start at 0 (for an empty board), that means keeping track of the maximum key a magic produces, and make that as small as possible.

I had nothing important to do this evening, so just for fun I hacked up this magic rook number generator. It asks the user for a square index, a desired key size, and the number of magic numbers wanted, and spits out randomly generated magic numbers for the square. Here's an example for the h8 square:

Code: Select all: Enter a square index: 56 Enter a number of bits: 12 How many magic rook numbers do you want? 10 84a4104021008001 1810200402492 8000218008410053 4402980010013 e406234080001101 824800040201301 202d001048800025 250220308000c103 c01810422004012 2000514080010021

I used this program to generate 1000 12-bit keys for the a1 square, and checked the maximum key size for each of them. For all 1000 magic numbers, the maximum key size turned out to be exactly 4095. Based on this simple experiment, I doubt that it is easy to get close to 2000 by brute force alone.

Tord

by **Grant Osborne** » 11 Dec 2006, 21:35

Hi Tord

The 11-bit magic for A1 is 0x9CFFFCDDFCED714A (well it is for my board orientation) :wink:

Grant

by **Tord Romstad** » 12 Dec 2006, 00:26

Grant Osborne wrote:Hi Tord

The 11-bit magic for A1 is 0x9CFFFCDDFCED714A (well it is for my board orientation)

Hi Grant,

Thanks for the information! Yes, I remember you have posted this before. Unfortunately I can't use it, because using keys found by somebody else is cheating. :wink:

Tord

by **Grant Osborne** » 12 Dec 2006, 09:56

Tord

Maybe there are more 11-bit keys for this square, I have not looked. If you use your program to find them, try using much denser numbers. The same for the H1 square.

Grant

by **Tord Romstad** » 12 Dec 2006, 12:10

Grant Osborne wrote:Maybe there are more 11-bit keys for this square, I have not looked. If you use your program to find them, try using much denser numbers. The same for the H1 square.

Hello Grant,

Thanks for the suggestion. Perhaps I'll try it some time later, but I think my current magic number generator is too stupid for the task. It uses brute force and random numbers, which is sufficient for finding normal 10 (centre), 11 (edge) and 12 (corner) bit magic numbers for all squares in a second or so, but probably isn't enough for finding better numbers within reasonable time.

In any case, 11 bit keys for corner squares wouldn't help me much. I currently use a non-homogeneous attack array and an additional array to reduce the table size, like this:

Code: Select all: inline Bitmap rook_attacks(Square s, Bitmap blockers) { Bitmap b = blockers & RMask[s]; return RAttacks[RAttackIndex[s] + ((b * RMult[s]) >> RShift[s])]; }

Because there are only 4 corner squares, bringing them down to 11 bits would hardly reduce the table size at all.

10 bit magic numbers for all squares, on the other hand, would be very interesting, because they would make it possible to use a homogeneous array without wasting too much space. Unfortunately, finding such numbers seems to be very hard.

Tord

Winboard Forum

Generating Magic Keys for Bitboard Move Generation

Re: Generating Magic Keys for Bitboard Move Generation

Re: Generating Magic Keys for Bitboard Move Generation

Re: Generating Magic Keys for Bitboard Move Generation

Re: Generating Magic Keys for Bitboard Move Generation

Re: Generating Magic Keys for Bitboard Move Generation

Re: Generating Magic Keys for Bitboard Move Generation

Re: Generating Magic Keys for Bitboard Move Generation

Re: Generating Magic Keys for Bitboard Move Generation

Re: Generating Magic Keys for Bitboard Move Generation

Re: Generating Magic Keys for Bitboard Move Generation

Re: Generating Magic Keys for Bitboard Move Generation

Re: Generating Magic Keys for Bitboard Move Generation

Re: Generating Magic Keys for Bitboard Move Generation

Re: Generating Magic Keys for Bitboard Move Generation

Who is online