Winboard Forum

by **Alessandro Scotti** » 17 Jun 2005, 13:14

Hi,
I am adding bitbases to Kiwi based on the Heinz "knowledgeable" approach (see http://www.aarontay.per.sg/Winboard/egtb.html for references to Heinz paper) and it looks like each bitbase requires quite a bit of work. I would like to start with the most important 4-piece endings... can you tell me what are the most common and/or useful in your experience? For now I have KPK, KPPK and KBPK, but that's just because I've personally seen Kiwi blunder on those! :-)

Also, I wonder whether all of the non-trivial bitbases are really needed. For example how important is KQKQ?
The latter question is also important for size, as I would like the entire bitbase set to take as little space as possible so skipping some can be very useful.

by **Tony van Roon-Werten** » 17 Jun 2005, 14:34

Alessandro Scotti wrote:Hi,
I am adding bitbases to Kiwi based on the Heinz "knowledgeable" approach (see http://www.aarontay.per.sg/Winboard/egtb.html for references to Heinz paper) and it looks like each bitbase requires quite a bit of work. I would like to start with the most important 4-piece endings... can you tell me what are the most common and/or useful in your experience? For now I have KPK, KPPK and KBPK, but that's just because I've personally seen Kiwi blunder on those!
Also, I wonder whether all of the non-trivial bitbases are really needed. For example how important is KQKQ?
The latter question is also important for size, as I would like the entire bitbase set to take as little space as possible so skipping some can be very useful.

Fortunately, the unimportant ones are mostly the smaller ones.

KPKP is important, then KPKB , KPKN , KPKR, KPKQ then equals without pawns then unequals without.

But best is to add them all, or else take special care for search instabilities.
When fe your bitbase tells the pawn wins in KPKQ but the score is < beta (because you already failed high and are now resolving) you will continue to search, promote the pawn, hit KQKQ (wich you didn't want to implement) and evaluation will return about 0, wich will most likely fail low again.

Tony

by **Alessandro Scotti** » 17 Jun 2005, 18:05

Hi Tony,
you are right, I'll have to implement them all or almost all (and in the proper order)!
Maybe there is another way to save space. I could include some bitbases only for the stronger side on move, then extend one ply if I detect an hit when the other side is on move. At the next ply I should always get a result from the bitbase and return immediately, so possibly that's not too expensive. What do you think?

by **Anonymous** » 17 Jun 2005, 20:04

Alessandro Scotti wrote:Hi Tony,
you are right, I'll have to implement them all or almost all (and in the proper order)!
Maybe there is another way to save space. I could include some bitbases only for the stronger side on move, then extend one ply if I detect an hit when the other side is on move. At the next ply I should always get a result from the bitbase and return immediately, so possibly that's not too expensive. What do you think?

It might be even better, to not extend at all, in case you got no bitbase hit (but you would expect one). Just search on - at the next search depth, you will find all those moves you would have extended anyway, and now have a hit, because side to move changed.

The same method also works very well for TBs - having only white to move TBs will be sufficient - in the worst case, you will need one depth more to find the correct move. OTOH, you will save lots of resources, caches will work more efficiently, ... Needs a bit more logic for the probing, however. With all TBs you only need to probe after capture or conversion. Now you have to also check if one more earlier was a capture or conversion.

For Bitbases, this logic can (and probably should) be left out anyway. They are so cheap to access. Also, when you have some bitbases, that are encoded by knowlege (like Heinz suggests it), you will have some cases, that you cannot decide. For example KBKN - almost always draw, but your knowledge will detect the possible exceptions easily (when one K is in the corner, and ...). Deciding if this is a real exception, however, will be difficult. In Yace, I just return "BB_DOES_NOT_KNOW". One or several plies deeper, it will know, usually. I see no efficient way, to implement this in another way.

If you only want 4-men bitbases, it might not be worth the trouble, to try to save some RAM. IIRC, yace needs about 18 MB for those. For all 3/4 men + kppkp it needs about 50 MB. Probably most of this can always be kept swapped out in tight memory situations. (When you have a g and h pawn, the part with c and d pawns will not be accessed).

Regards,
Dieter

by **Anonymous** » 17 Jun 2005, 20:21

>Fortunately, the unimportant ones are mostly the smaller ones.

Tony, I agree with most you have said - but is the above really true? Seems not too much difference to me - at least no big factor.

Code: Select all: 26.01.2005 00:05 660.997 kbkp.bbb 26.01.2005 00:05 660.997 kbkp.bbw 26.01.2005 00:05 218.411 kbnk.bbb 26.01.2005 00:05 218.411 kbnk.bbw 26.01.2005 00:05 660.997 kbpk.bbb 26.01.2005 00:05 660.997 kbpk.bbw 26.01.2005 00:05 660.997 knkp.bbb 26.01.2005 00:05 660.997 knkp.bbw 26.01.2005 00:05 660.997 knpk.bbb 26.01.2005 00:05 660.997 knpk.bbw 26.01.2005 00:05 72.240 kpblk.bbw 26.01.2005 00:05 86.688 kpk.bbb 26.01.2005 00:05 86.688 kpk.bbw 26.01.2005 00:05 814.868 kpkp.bbw 26.01.2005 00:05 254.647 kppk.bbb 26.01.2005 00:05 254.647 kppk.bbw 26.01.2005 00:06 18.741.946 kppkp.bbb 26.01.2005 00:06 18.741.946 kppkp.bbw 26.01.2005 00:05 28.644 kqk.bbb 26.01.2005 00:05 28.644 kqk.bbw 26.01.2005 00:05 218.411 kqkb.bbb 26.01.2005 00:05 218.411 kqkb.bbw 26.01.2005 00:05 218.411 kqkn.bbb 26.01.2005 00:05 218.411 kqkn.bbw 26.01.2005 00:05 1.057.594 kqkp.bbb 26.01.2005 00:05 1.057.594 kqkp.bbw 26.01.2005 00:05 349.457 kqkq.bbw 26.01.2005 00:05 349.457 kqkr.bbb 26.01.2005 00:05 349.457 kqkr.bbw 26.01.2005 00:05 28.644 krk.bbb 26.01.2005 00:05 28.644 krk.bbw 26.01.2005 00:05 218.411 krkb.bbb 26.01.2005 00:05 218.411 krkb.bbw 26.01.2005 00:05 218.411 krkn.bbb 26.01.2005 00:05 218.411 krkn.bbw 26.01.2005 00:05 1.057.594 krkp.bbb 26.01.2005 00:05 1.057.594 krkp.bbw 26.01.2005 00:05 349.457 krkr.bbw

You might wonder, why I have kbnk bitbase, and not implemented it by knowlege. I got frustrated, when I tried to code all exceptions and checked the code against Nalimov TBs to verify it. So I just included it - the 440 kB will not hurt.

Regards,
Dieter

by **Alessandro Scotti** » 17 Jun 2005, 21:31

Hi Dieter,
thanks for the info... that's good news too! :-)

I'm also not concerned by RAM usage, but I'm trying to compress the bitbases in order to save space on disk and there has been a lot of difference between having only one side or both in my experiments. For example KPPK shrinks to 9K when the strong side is on move, but takes 40K for the other side. Or, KBPK is 14K versus 34K.
I've also manually coded KBBK and KNNK (all cases) but like you did I'll probably use a bitbase for KBNK, that should compress a lot anyway.
I would like to store all bitbases in a few hundreds K's if possible. I spoke with Fabio Cavicchio during the last italian championship and he said Delfi takes about 200K to store all of its bitbases, but then Delfi is Delfi... :-)

by **Dann Corbit** » 18 Jun 2005, 00:06

I am wondering about Dieter's bitbase system.

Why did you not create every 5 man file? It seems a small cost.

What have your experiments shown for Elo difference?

Is the format of your bitbase files public?

I am guessing you just use the nalimov index scheme and the object of the lookup is 2 bits wide. Is that right?

Do you memory map the bitbase files? Do you load them into ram?

by **Tony van Roon-Werten** » 18 Jun 2005, 10:23

Dieter B?r?ner wrote:
>Fortunately, the unimportant ones are mostly the smaller ones.

Tony, I agree with most you have said - but is the above really true? Seems not too much difference to me - at least no big factor.

You might wonder, why I have kbnk bitbase, and not implemented it by knowlege. I got frustrated, when I tried to code all exceptions and checked the code against Nalimov TBs to verify it. So I just included it - the 440 kB will not hurt.

Regards,
Dieter

I was mostly speaking about KQKQ and KRKB and the stuff. They are among the smaller ones and IMO less important because when 1 side wins, it's mostly by an immediate capture wich you quiescence would return without bitbases as well.

All in all, it doesn't really matter how big the 4pieces are once you try to implement the 5 pieces

I recognize the feeling about the more trivial bases. Just implement them like the other ones and be done with it. The most trivial however can be encoded easily. (ie KQRK and KRRK) Winner to move, always win, looser to move and not stalemate always loss.

Tony

by **Anonymous** » 18 Jun 2005, 12:35

Dann Corbit wrote:I am wondering about Dieter's bitbase system.

Why did you not create every 5 man file? It seems a small cost.

What have your experiments shown for Elo difference?

Is the format of your bitbase files public?

I am guessing you just use the nalimov index scheme and the object of the lookup is 2 bits wide. Is that right?

Do you memory map the bitbase files? Do you load them into ram?

Dann, answers to your questions in the order they appear, and at the
end a sample, that shows how I implemented it.

I created kppkp and krpkr. kppkp needs "only" 2*18MB, krpkr needs
1806*48*61*60/5*2 = ~120MB already.

Loading all 5 men bitbases into RAM is too much for current desktop systems.
An idea would be to dynamically load interesting ones, depending on the
current board position.

No experiments about the Elo difference done.

The format is not public - the problem is, that it would be lots of work,
to document it. It is only documented by the actual implementation with
basically no comments :-(

The implementation was more or less "ad hoc".

I am using an indexing scheme rather similar to the scheme of Nalimov. But they
differ in details - for example in the major to minor/pawn ordering of
the pieces for index calculation. Also I have no ep positions stored
(the probe function will just return BB_WORTHLESS in case, ep is set).
Nalimov shrinks the index space a bit, by avoiding to store some
"contact checks" - I don't use this idea. Would be a lot of work
to implement it, and make indexing functions much more complicated.

Space requirements can easily be calculated. For kxky and kxyk, x,y no pawn:

462*62*61

For kxkp

1806*48*61

etc. For win/draw devide by 8, win/draw/loss devide by 5 (see below).
462 is the number of legal K-K positions, where the white K is always
mirrored to the a1-d1-d4 triangle, and when on the diagonal a1-h8, the black
K is mirrored to the a1-h1-h8 triangle. 1806 is the number of legal
K-K positions, where the white K is always mirrored to the left half
of the board. That part of the index is calculated by a simple lookup
to a precalculated table (kk_array[wk][bk]).

For win/draw/loss tables, I store 5 positions in each byte (Heinz
stores 4 positions). Lookup works by:

Code: Select all: static unsigned wdl_tab[5] = {1, 3, 9, 27, 81}; /* Careful: index evaluated twice! (Could be easily avoided) */ #define RES_WDL_POS(table, index, res) \ do \ { \ unsigned ti = (index)/5; \ unsigned tim = (index)%5; \ res = (table[ti]/wdl_tab[tim])%3; \ } while(0)

3^^5=243, so 5 positions fit in one byte.

I load them into memory (no memory mapping, which would not port easily).

I am using a different method for "finding" the correct bitbase/probing function,
than the one described by Heinz. I basiscally use some tables indexed by the
the men on the board. I have pawn=1, knight=2, ...
So to access the kxky bitbases, I just use irec_kxky_tab[piece1-1][piece2-1].probefunc().
The caller will take care to get piece1 and piece2 into the right order. if piece1>piece2,
colors/board/stm will be mirrored (for positions without pawns, board
mirroring will not be needed). For example for kxky, I have:
(Some lines are a bit long - I fear the formatting will be f**ked up when displayed.)

Code: Select all: typedef struct irec4_tab { int (*probe_func)(unsigned char **, int, int, int, int, int, int, int, int *); const char *name; unsigned char *bb_table[2]; unsigned long crc32[2]; unsigned flags; unsigned len; } IREC4_TAB; /* [5][5] would be sufficient */ static IREC4_TAB irec_kxky_tab[5][8] = { /* Pawn */ { {probe_kpkp, "kpkp", {NULL, NULL},{0x1b2bedd1UL, 0}, WDL, WDL_SIZE(KPKP_SIZ)}, /* kpkp */ {NULL, NULL, {NULL, NULL},{0, 0}, 0, 0}, /* kpkn */ {NULL, NULL, {NULL, NULL},{0, 0}, 0, 0}, /* kpkb */ {NULL, NULL, {NULL, NULL},{0, 0}, 0, 0}, /* kpkr */ {NULL, NULL, {NULL, NULL},{0, 0}, 0, 0} /* kpkq */ }, /* Knight */ { {probe_kmkp, "knkp", {NULL, NULL},{0x0043461bUL, 0xff21e06bUL}, 0, WD_SIZE(KXKP_SIZ)}, /* knkp */ {probe_kmkm, NULL, {NULL, NULL},{0, 0}, NOTB, 0}, /* knkn */ {NULL, NULL, {NULL, NULL},{0, 0}, 0, 0}, /* knkb */ {NULL, NULL, {NULL, NULL},{0, 0}, 0, 0}, /* knkr */ {NULL, NULL, {NULL, NULL},{0, 0}, 0, 0} /* knkq */ }, /* Bishop */ { {probe_kmkp, "kbkp", {NULL, NULL},{0xfd331caeUL, 0x43c588e9UL}, 0, WD_SIZE(KXKP_SIZ)}, /* kbkp */ {probe_kmkm, NULL, {NULL, NULL},{0, 0}, NOTB, 0}, /* kbkn */ {probe_kmkm, NULL, {NULL, NULL},{0, 0}, NOTB, 0}, /* kbkb */ {NULL, NULL, {NULL, NULL},{0, 0}, 0, 0}, /* kbkr */ {NULL, NULL, {NULL, NULL},{0, 0}, 0, 0} /* kbkq */ }, /* Rook */ { {probe_khkp, "krkp", {NULL, NULL},{0xf75df9cfUL, 0xd8ac90daUL}, WDL, WDL_SIZE(KXKP_SIZ)}, /* krkp */ {probe_khkm, "krkn", {NULL, NULL},{0xa3546575UL, 0x24edf19aUL}, 0, WD_SIZE(KXKY_SIZ)}, /* krkn */ {probe_khkm, "krkb", {NULL, NULL},{0xe6c2486eUL, 0xe7e1f204UL}, 0, WD_SIZE(KXKY_SIZ)}, /* krkb */ {probe_khkh, "krkr", {NULL, NULL},{0xa9d9fc99UL, 0}, WDL, WDL_SIZE(KXKY_SIZ)}, /* krkr */ {NULL, NULL, {NULL, NULL},{0, 0}, 0, 0} /* krkq */ }, /* Queen */ { {probe_khkp, "kqkp", {NULL, NULL},{0x0ddf01e2UL, 0x09bc5e65UL}, WDL, WDL_SIZE(KXKP_SIZ)}, /* kqkp */ {probe_khkm, "kqkn", {NULL, NULL},{0x34954cd1UL, 0x2a1882fcUL}, 0, WD_SIZE(KXKY_SIZ)}, /* kqkn */ {probe_khkm, "kqkb", {NULL, NULL},{0x1993ac58UL, 0xae6a84d1UL}, 0, WD_SIZE(KXKY_SIZ)}, /* kqkb */ {probe_khkh, "kqkr", {NULL, NULL},{0x6c9d0eeaUL, 0x1d46e24dUL}, WDL, WDL_SIZE(KXKY_SIZ)}, /* kqkr */ {probe_khkh, "kqkq", {NULL, NULL},{0x06dcf330UL, 0}, WDL, WDL_SIZE(KXKY_SIZ)} /* kqkq */ } }; For example the line /* Knight */ { {probe_kmkp, "knkp", {NULL, NULL},{0x0043461bUL, 0xff21e06bUL}, 0, WD_SIZE(KXKP_SIZ)}, /* knkp */

means, for knight vs. pawn, use the probe_kmkp (minor vs. pawn) function, "knkp"
means that the files on disk will be knkp.bbw and knkp.bbb; the two
NULL pointers will become dynamically allocated memory to the actual
data for wtm and btm when loaded from disk to RAM; then two CRC
checksums, to validate, that the files are not corrupted; the 0 means this is a win draw table -
for win/draw/loss there would appear the constant WDL, WD_SIZE() is a macro, that calculates
the size in bytes from the number of positions - it is the number of entries devided
by 8 rounded up.

The actual probing function then looks like

Code: Select all: /* code assumes A1 is 0, H1 is 7 and H8 is 63 */ /* Nalimov's trick for saving a bit of space */ #define EXCLUDE3(sq, sq1, sq2, sq3) (sq-((sq>sq1)+(sq>sq2)+(sq>sq3))) #define KXKP_INDEX(wk, bk, x, p) \ (kkpos[wk][bk]*61*48 + EXCLUDE3(x,wk,bk,p)*48 + (p)-8) #define RES_WD_POS(table, index, res) \ do \ { \ unsigned ti = index/8; \ unsigned tim = index%8; \ res = (table[ti]>>tim)%2; \ } while(0) #define MIRROR_LEFTRIGHT(p) ((p)^7) int probe_kmkp(unsigned char **table, int side, int piece1, int piece2, int wk, int bk, int x, int p, int *score) { unsigned index; int bit; int res; if (!table[side]) return BB_WORTHLESS; if (col(wk) >= 4) { /* mirror the board left to right */ wk = MIRROR_LEFTRIGHT(wk); bk = MIRROR_LEFTRIGHT(bk); x = MIRROR_LEFTRIGHT(x); p = MIRROR_LEFTRIGHT(p); } index = KXKP_INDEX(wk, bk, x, p); RES_WD_POS(table[side], index, bit); if (bit == BB_DRAW) { *score = 0; res = BB_EXACT; } else { /* P-side wins, look for exceptions */ if (piece1 == bishop) { if (bk == A1 && p == A2 && (wk == C1 || wk == C2 || wk == C3)) return BB_WORTHLESS; if (bk == A8 && p == A7 && (wk == C8 || wk == C7 || wk == C6)) return BB_WORTHLESS; } else { /* KNKP: Many exceptions */ if ((bk == A1||bk == A2) && (p==A2||p==A3||p==A4) && Distance(wk,bk) <= 3) return BB_WORTHLESS; if ((bk == A8||bk == A7) && (p==A7||p==A6||p==A5) && Distance(wk,bk) <= 3) return BB_WORTHLESS; } if (side == black) { *score = 500; res = BB_LOWER_BOUND; } else { *score = -500; res = BB_UPPER_BOUND; } } return res; }

Note, that I always return the same score above, when won. This means, that
there might be no progress. But it does not matter, really - when the position
is at the root of the search, lookup will be done in the real TBs. Heinz
uses some scoring heuristics, that will guarantee, that there will be progress.
I was too lazy to implement it. It would not have any advantage either, when
you have TBs available at the root - just use some more time.

All of above is pure easy technics. Delfi seems to do wizzard things, however :-)

Regards,
Dieter

by **Alessandro Scotti** » 18 Jun 2005, 14:20

Hi Dieter,
I don't use a special indexing scheme because there's so much going on at the moment that I can't possibly manage another level of complexity. Code is designed so that it can be put in at a later time though.
At some time Kiwi sources will be released under the GPL and I can't use EGTB probing code, so I must score positions (a la Heinz) and also embed my own bitbases in the engine, or at least make them as small a download as possible. That's why I'm interested in compression.
The method I'm using now is performing reasonably well in practice, but requires some work to be done for each bitbase. It takes three steps to compress a bitbase:
- a "predictor" function that predicts the outcome of each position;
- preliminary RLE compression (optional);
- ZIP compression.
Let me present an example with KBPK, strong side on move. In its raw form the bitbase would take 64*64*48*64 / 2 bits or 768K. Since KBP almost always win, actually 97.22% of those bits will have the same value (assuming invalid positions are properly set). This compress very well of course and actually zips to about 18K. I've found that in such cases adding a preliminary layer of RLE compression helps a lot, and here it will take the final size to 15K, that's more than 15% smaller.
So far so good. Now the problem is: can we improve the 97.22% figure? Yes, and here enters the predictor. This function encodes the bitbase by guessing the value of each position, then replacing the bitbase entry with a 0 if the guess was correct, or 1 if it was wrong. In the KBPK predictor I've put this simple line:

Code: Select all: if( FileOfSquare(wp) == 0 && ! is_light_square(wb) ) { if( distance( bk, A8 ) < (distance( wk, A8 )-1) && distance(bk,A8) < distance( wp,A8) ) { predicted_value = 0; } }

This says that when the weak king is closer to the promotion square of a rook pawn and the bishop cannot control that square, the position is draw. It doesn't have to be 100% accurate, all that's matter is that it's right more than it's wrong! :-)

With this one liner we now have 97.63% zero's in the bitbase, and it compresses to 14K.
That's the idea... very similar to what you said about KBNK: your code knews most positions but not all. So it wasn't a perfect recognizer, but it will probably make for an extremely good predictor, allowing for great compression of the corresponding bitbase.
In my code I take advantage of existing bitbases too, so I use KPK in the KPPK predictor, here's the full code:

Code: Select all: int bbAdjustKPPK( PackedArray * pa, int wtm, int op ) { printf( "Adjusting KPPK/%d...", wtm ); // Predictions are based on the KPK bitbases, get them PackedArray * kpk_wtm = getBitBase( bb_KPK, bb_WtM ); PackedArray * kpk_btm = getBitBase( bb_KPK, bb_BtM ); if( kpk_wtm == 0 || kpk_btm == 0 ) { printf( "*** Warning: KPPK adjuster can't find KPK bitbase\n" ); return -1; } // Use also the KPPK table with black to move PackedArray * kppk_btm = 0; if( wtm ) { kppk_btm = getBitBase( bb_KPPK, bb_BtM ); } PositionEnumerator posEnum; posEnum.addPiece( WhitePawn ); posEnum.addPiece( WhitePawn ); unsigned good_predictions = 0; while( posEnum.hasMorePositions() ) { if( FileOfSquare( posEnum.getWhiteKingPos() ) >= 4 ) { // Skip this position (simmetry) posEnum.gotoNextPosition(); continue; } if( ! posEnum.isValidPosition( wtm ? true : false ) ) { good_predictions++; posEnum.gotoNextPosition(); continue; } int index = getBitBaseIndex( bb_KPPK, posEnum ); // Predict and fix the bitbase int wk = posEnum.getWhiteKingPos(); int bk = posEnum.getBlackKingPos(); int p1 = posEnum.getPiecePos( 2 ); int p2 = posEnum.getPiecePos( 3 ); if( p1 > p2 ) { int x = p1; p1 = p2; p2 = x; } unsigned predicted_value = kpk_wtm->get( getBitBaseIndex(bb_KPK,wk,bk,p1) ) | kpk_wtm->get( getBitBaseIndex(bb_KPK,wk,bk,p2) ) | kpk_btm->get( getBitBaseIndex(bb_KPK,wk,bk,p1) ) | kpk_btm->get( getBitBaseIndex(bb_KPK,wk,bk,p2) ); if( RankOfSquare(p1) == 1 || RankOfSquare(p2) == 1 ) { predicted_value = 1; } if( FileOfSquare(p1) < 7 && p2 == (p1+1) ) predicted_value = 1; if( FileOfSquare(p1) < 6 && p2 == (p1+2) ) predicted_value = 1; if( FileOfSquare(p1) < 5 && p2 == (p1+3) ) predicted_value = 1; if( kppk_btm != 0 ) { predicted_value |= kppk_btm->get(index); } if( predicted_value == pa->get(index) ) { good_predictions++; } pa->set( index, pa->get(index) ^ predicted_value ); posEnum.gotoNextPosition(); } printf( " done, accuracy = %05.2f\n", (good_predictions * 100.0) / getBitBaseIndexRange(bb_KPPK) ); return 0; }

This was easy and fun to write, and made KPPK compress to 9K.
The bitbases for the weaker side do not compress so well, so I'm a bit concerned about those, but if it is actually possible to use bitbases for only one side then probably the few hundreds K target is feasible.

by **Anonymous** » 18 Jun 2005, 15:17

Cool ideas, Alessandro! Thanks for sharing them.

Delfi uses bitbases compressed in RAM - yes? Do you have much expertize in compression? What would be a fast method, that is easily understandable, but still compresses reasonably? Your smart suggestions sound a bit too heave for "on the way" processing. Or do you intend to use this during probing? Just RLE will for general files probably not compress very well.

I can believe, you had fun with your compression experiments. You might also try the following idea for your RLE phase: First enlarge the file, by using one more state for each position: instead of WDL, WDL + broken. Now during the RLE phase, you can just assume the state, that fits better for any broken position (that give a longer run). Perhaps even the more sophisticated compression algorithms could be tuned, to use those "does not matter to what it will decompress" information.

BTW. You seem to have a much better zip than me :-)

My kbpk files compress an order of magnitude worse, than yours. How is this explainable? Perhaps because I use 0 for broken positions, while most of the remaining file uses 1 for "won"? OTOH, I don't have that many broken positions - index space is 1806*48*61.

Code: Select all: C:\yacet>zip x.zip kbpk.* adding: kbpk.bbb (208 bytes security) (deflated 75%) adding: kbpk.bbw (208 bytes security) (deflated 82%) C:\yacet>unzip -v x Archive: x.zip Length Method Size Ratio Date Time CRC-32 Name -------- ------ ------- ----- ---- ---- ------ ---- 660997 Defl:N 163541 75% 25.01.05 23:05 52d702e8 kbpk.bbb 660997 Defl:N 122037 82% 25.01.05 23:05 6c3276de kbpk.bbw -------- ------- --- ------- 1321994 285578 78% 2 files

~120 kB compared the 18 kB you mentioned.

You could release a bitboard generation program without GPL and the engine under GPL.

Cheers,
Dieter

by **Alessandro Scotti** » 18 Jun 2005, 19:44

Hi Dieter,
I have coded almost all of the most important compression algorithms in the past, but currently I feel the biggest issue are patents. ZIP as implemented in the zlib library is supposedly patent free (author said he spent more time searching for patents than coding the library) and compresses well, so it's a good candidate. (I've also tried WinRAR and it compresses much better with no need for the RLE step, but I don't know the details of its compression.)
Decompression time for ZIP and RLE is almost negligible but the predictor takes some time, probably because of the slow access to the bit array. At any rate it takes maybe half a second to process a bitbase, and I think there is room for improvement. That excludes decompressing on the fly though, so I think it must be done at startup or at some later moment (e.g. when pondering).
The compression figure you mention for KBPK is similar to what I got at the beginning. So either I improved a lot or I have a bug! :-)

My program reports the following stats for KBPK:

Code: Select all: White to move: failed = 0 won = 8283622 = 65.83% (95.95% of valid) draw = 349608 = 02.78% (04.05% of valid) lost = 0 = 00.00% (00.00% of valid) invalid= 3949682 = 31.39% Black to move (score is relative to black): failed = 0 won = 0 = 00.00% (00.00% of valid) draw = 1719846 = 13.67% (16.78% of valid) lost = 8529618 = 67.79% (83.22% of valid) invalid= 2333448 = 18.54%

So in my index space there are 31.39% invalid positions: a huge number! Of those, about 18.5% are caused by broken positions where the pieces overlap and so on (those should be already out from your index space), and the rest is due to black being in check. I hope you can verify this with Yace, that would also decide how much I will sleep this night! :-)

At any rate for good compression it is absolutely important that these "bad" positions are scored as wins, so they contribute to make longer runs of the most probable value. When I did this change, I got the order of magnitude better compression.

by **Anonymous** » 18 Jun 2005, 20:42

Hi Alessandro,
after many stupid posts to CCC, I fear I should not post anymore, today.
(I am not drunk - not yet, I could assume :-)

But when it could help you sleeping, output of Yace for kbpk:

[/code]
kbpk.bbw
Try to create KBPK white to move bitbase in W/D format
Try to allocate space for 5287968 positions 660997 bytes: ok

npos 5124732 legal 4316615, broken 0 won 4141811 draw 174804 lost 0
Draw: 174804
M1 : 3118
M2 : 9838
M3 : 29622
M4 : 68023
M5 : 157067
M6 : 303115
M7 : 406369
M8 : 436186
M9 : 440135
M10 : 442601
M11 : 460888
M12 : 406655
M13 : 283840
M14 : 209994
M15 : 186502
M16 : 150419
M17 : 85135
M18 : 36103
M19 : 17489
M20 : 6110
M21 : 1917
M22 : 476
M23 : 167
M24 : 25
M25 : 4
M26 : 11
M28 : 1
M31 : 1
max 32736 (mate in 31)
8/3P4/KBk5/8/8/8/8/8 w - - 0 1
kbpk.bbw: length 660997 CRC32 = 0x6c3276deUL
kbpk.bbb
Try to create KBPK black to move bitbase in W/D format
Try to allocate space for 5287968 positions 660997 bytes: ok

npos 5124732 legal 5124732, broken 0 won 0 draw 859923 lost 4264809
-M31 : 2
-M28 : 24
-M27 : 48
-M26 : 129
-M25 : 384
-M24 : 944
-M23 : 2492
-M22 : 4168
-M21 : 7010
-M20 : 13627
-M19 : 33990
-M18 : 70856
-M17 : 137192
-M16 : 209974
-M15 : 243794
-M14 : 282260
-M13 : 338834
-M12 : 394004
-M11 : 422716
-M10 : 430805
-M9 : 447616
-M8 : 455096
-M7 : 415728
-M6 : 226693
-M5 : 84157
-M4 : 29591
-M3 : 8892
-M2 : 2793
-M1 : 873
-M0 : 117
Draw: 859923
min -32735 (mated in 31)
8/3P4/KB6/2k5/8/8/8/8 b - - 0 1
kbpk.bbb: length 660997 CRC32 = 0x52d702e8UL
[/code]

You just show the double of these, it seems (most easily spotted by looking at draws).

I'll try to set all broken positions to won here and see, how it will compress, then. But not today anymore.

I think, I can still have overlap of squares with this material configuration. Kings can overlap with pawns. Seems this is not easily avoidable - even Nalimov has this (?). Reason is - we prefer to only use 48 squares for pawns and not 64. This does not go well with the idea for having a lookup table for legal K-K positions.

In the past, I compared zip with datacomp (Andrew Kadatch's compression used in Nalimov's TBs). IIRC, datacomp was significantly better.

In TB code (accessing from disk), decompression time does not matter much (I have measured it in the past). You expect slow access anyway. For bitbases, that you want to use even in the deepest qsearch nodes, it could matter much. At least, when not an efficient caching scheme is used, too. Does not look as too much fun, to implement such a caching scheme. So for bitbases intended for use from RAM and for really fast access. demands on decompression times would be much higher, than for TB access from disk (that typically will not be done too close to the leafs).

Regards,
Dieter

PS. My wish for this forum, have optionally a very traditional quoting mechanism, like it is used in usenet news groups. Or similar to Parsimony (which IMHO is a bit worse than news groups type).

by **Daniel Shawul** » 21 Jun 2005, 14:06

Hi all
I want to add only KPK bitbase. If i understand correctly i have to
1) get results of positions from Nalimove tb's
2) put it in an array
3) calculate index and fetch result
how to calculate index?? Or do i have to use hashing?
daniel

by **Tord Romstad** » 21 Jun 2005, 14:34

Daniel Shawul wrote:Hi all
I want to add only KPK bitbase. If i understand correctly i have to
1) get results of positions from Nalimove tb's
2) put it in an array
3) calculate index and fetch result
how to calculate index?? Or do i have to use hashing?
daniel

For simple endgames like KPK, I wouldn't bother to use Nalimov TBs. It is only slightly more difficult, and much more fun, to generate the bitbase yourself. Because performance is not important, you don't have to do it in C/C++, but work with your favorite high-level programming language instead. I used Common Lisp when generating Glaurung's kpk.bin file.

My indexing scheme is simple and straightforward. First, I orient the board in such a way that white has the pawn and the white pawn is located on the left half of the board. This leaves us with 24 possible squares for the pawn, and 64 squares for each king. From these three numbers and the side to move I obtain an index in the range 0 ... (24*64*64*2 - 1).

Here is the complete code for probing the KPK bitbase in Glaurung:

Code: Select all: uint8 KPK_bitbase[24576]; void init_kpk(void) { FILE *f = fopen("kpk.bin", "rb"); int i; if(f == NULL) { printf("kpk.bin not found!\n"); quit(); } for(i = 0; i < 24576; i++) KPK_bitbase[i] = fgetc(f); fclose(f); } int probe_kpk(int wksq, int wpsq, int bksq, int side) { int index; wksq = COMPRESS(wksq); bksq = COMPRESS(bksq); wpsq = (wpsq & 3) + ((wpsq / 16) - 1) * 4; index = side + 2*bksq + 128*wksq + 8192*wpsq; return (KPK_bitbase[index/8] & (1<<(index&7))); }

The probe_kpk() function assumes that the white pawn is located somewhere on the left half of the board, and that all square indices are for a 0x88 board (a1=0, b1=1, ..., a2=16, ...). The COMPRESS macro simply converts 0x88 board indices to the range 0..63 (a1=0, b1=1, ..., a2=8, ...). If you use a 64-square board, just delete the line with COMPRESS.

probe_kpk() returns 0 if the position is a draw, and a nonzero value if white wins.

Feel free to steal Glaurung's kpk.bin file and the above code if you don't want or don't have time to do this yourself.

Tord

by **Daniel Shawul** » 22 Jun 2005, 06:05

Hi Tord
Thanks for the explanation!
I have plan to implement 3-4 piece table base access in RAM(E.Heinz paper) in the very very long term. The one i need most right now is
KPK bitbases. Others can be covered by a rough endgame eval, or are not that important.
I will try to generate KPK bitbase them my self (i might need more of those later),
if i fail i will steal Glaurung's :wink:

best
daniel

by **Fabien Letouzey** » 22 Jun 2005, 08:51

Dieter B?r?ner wrote:Delfi uses bitbases compressed in RAM - yes? Do you have much expertize in compression? What would be a fast method, that is easily understandable, but still compresses reasonably? Your smart suggestions sound a bit too heave for "on the way" processing. Or do you intend to use this during probing? Just RLE will for general files probably not compress very well.

Hi Dieter,

I have some expertise in data compression, although I have not experimented with win/loss/draw TB compression yet. I think Shaeffer's approach in the last Chinook paper (I think the one on 10-man DB) is in the right direction. I call this kind of algorithm "RLH" = RLE + Huffman.

Of course for quick experiments one can start with zlib on small blocks. I'd be curious what the overhead is.

Fabien.

by **Anonymous** » 22 Jun 2005, 23:26

Fabien Letouzey wrote:I have some expertise in data compression, although I have not experimented with win/loss/draw TB compression yet. I think Shaeffer's approach in the last Chinook paper (I think the one on 10-man DB) is in the right direction. I call this kind of algorithm "RLH" = RLE + Huffman.

Of course for quick experiments one can start with zlib on small blocks. I'd be curious what the overhead is.

Fabien, thanks for the hint. I will try to read Schaeffer's article soon. I guess it is online available (as many of his papers).

For usage of bitbases in RAM, I want very fast access, because I obviously want to use it also inside the deepest qsearch nodes. One could perhaps use a block size of 1 kB (not much less). One would hope, that uncompressing that block (compressed it might be only 100 bytes) should not take much longer than a typical call to eval. Of course, adding caching mechanisms will avoid to uncompress a block at each access. But it does not sound really attractive to implement.

With disk based TBs, things are very different. In general, you won't expect them to be usable inside the deepest leafs. Decompression time is much faster than disk access.

Another reason for my interest - I have my own TB generator (but not yet integrated inside the engine for searching). The easiest 7-men TBs can already be generated (Mark Bourzutschky has shown it - I probably have misspelled his name - sorry). It turns out, that KNNNNKQ just has about as many positions, as there are bits in memory, that can be adressed by a 32-bit computer. Positions with lower "symmetry" will fail here. It might be possible, to still generate them with compression (again Mark suggested this). Here, too, one would need very fast compression/decompression. (For accessing bitbases from RAM, only decompression times are important - compression time could be very slow without any real problem).

Regards,
Dieter

by **Dann Corbit** » 22 Jun 2005, 23:52

64 bit computers will make addressing of 7 man bitbase files simple.

Assume that you have a very large address space and also the memory to address. It will be true before long.

by **diepeveen** » 03 Jul 2005, 18:45

Dann Corbit wrote:64 bit computers will make addressing of 7 man bitbase files simple.

Assume that you have a very large address space and also the memory to address. It will be true before long.

Who is going to rewrite Kadatch code to 64 bits?

Winboard Forum

Endgame bitbases

Endgame bitbases

Re: Endgame bitbases

Re: Endgame bitbases

Re: Endgame bitbases

Re: Endgame bitbases

Re: Endgame bitbases

Dieter questions about bitbase

Re: Endgame bitbases

Re: Dieter questions about bitbase

Re: Endgame bitbases

Re: Endgame bitbases

Re: Endgame bitbases

Re: Endgame bitbases

Re: Endgame bitbases

Re: Endgame bitbases

Re: Endgame bitbases

Re: Endgame bitbases

Re: Endgame bitbases

Re: Endgame bitbases

Re: Endgame bitbases

Who is online