Winboard Forum

by **Reinhard Scharnagl** » 08 Dec 2004, 13:33

Hi Uri,

The point is that it is possible that some variables are updated not correctly during the process of making moves so the program can generate correct moves in the first time that it get position X but generate not correct moves in the second time that it gets it.

the situation you are describing is NOT related to a specific position. Thus the bad payload will get concrete something later anywhere. And so it will work even with caching. That is, a failure will happen, but probably not at the same position within the tree walk.

Reinhard.

by **milix** » 08 Dec 2004, 13:36

The point is that it is possible that some variables are updated not correctly during the process of making moves so the program can generate correct moves in the first time that it get position X but generate not correct moves in the second time that it gets it.
If you skip the second time you may not discover bug.

Note that I expect in this case to have errors also with caching because I cannot think of a bug that is relevant only when you go back to position that you visited but the point is that if you got correct result for perft(n) with caching you generated and made less moves so you have smaller probability that some bug will cause wrong results.

Exactly my point. Suspicious moves are promotions, ep-captures, castles. I want to generate, make, unmake all of them in perft(n) and in perft(n-1).

Reinhard, to make myself more clear, I'll give a second example:
Suppose we have exactly one pawn promoting in the last ply of perft(4). Our code for promotions is ok but unmake() has a bug and doesn't restore the situation properly when we promote to a knight. The board is ok but some promotions flags are not restored correctly. Maybe with your code of unmake() there isn't a possibility of such a bug that will influence the next generate_promotions() but lets assume that in my code or Uri's code there is. The next perft(5) will use the result of perft(4) and will not generate promotions, so the bug will not be noticied. I hope this helps!

by **Reinhard Scharnagl** » 08 Dec 2004, 13:51

Hi milix,

the type of errors you try to describe is such, that they even might not be detected in a conventional perft. It would be a kind of random behaviour. Then a cached Perft would not be able to secure its detection whereas a conventional Perft itself would not be also. Indeed because of a cached Perft is examining less nodes each ply and will be finished earlier with that level, such errors could be overseen. But having a cached Perft run about the same time, it will face a lot more of distinct positions than the conventional approach. So the chances will be even greater to detect such randomizing errors during the same calculating time than within the traditional approach.

Reinhard.

by **Uri Blass** » 08 Dec 2004, 17:51

Reinhard Scharnagl wrote:Hi Brian,
Is your definition of a collision a duplicate hash table index (subset of full key) or the full hash key (typically 64 bits)?

my position identification consists of two 32 bit numbers. One is generated to compute the storing address, the other combined with the unused part of the first is stored within the cached data as a verification key. Thus the total length of my position identification is 64 bit.

Hi Reinhard,

If I understand correctly you have solution that is good when the number of hash entries is not bigger than 2^32.

Of course in the near future there will be no need of hash tables with more than 2^32 entries.

I use different hash in movei and I store 64 bit number as the key in my hash tables and not only 32 bit number.

part of these 64 bit are not needed because I store the address as part of the hash key.

The way that I use hash is in theory more safe but I need more memory.

In case that I have 2^22 entries I can practically save 22 bits with no price by storing only 42 bits as the key but saving 32 bits mean some loss in safety that I do not know if it is important.

Uri

by **Sune Fischer** » 08 Dec 2004, 17:53

But having a cached Perft run about the same time, it will face a lot more of distinct positions than the conventional approach. So the chances will be even greater to detect such randomizing errors during the same calculating time than within the traditional approach.

Yes precisely. If there is a bug there is a much higher probability of finding it with hashing because instead of a normal perft 5-6 you would generally be running perft 6-7 in the same time and see a much more varied tree instead of testing the same nodes again and again.

For a hashed perft to work you can't have zobrist errors either, so that's a bit of added debugging for free.

I can certainly understand why some wouldn't want to implement a hashed perft since it's kinda overkill for a debug routine, but to say it is worse than a non-hashed perft is outright false, IMO.

-S.

by **Reinhard Scharnagl** » 08 Dec 2004, 20:00

Hi Uri,

If I understand correctly you have solution that is good when the number of hash entries is not bigger than 2^32

there seems to be a misunderstanding. Probably our identification method might be very similar. The position identification has a total length of 64 bit. But because of some reasons (one is that I am still using a 32 bit system) that identification consists of two 32 bit parts. From one the (initial!) storing address is derived and the non used part of that plus the other 32 bit word are stored within the cached data as a verificaton stuff. Thus the identification is potentially 64 bit.

But I use the cache multi-associative. So the storage place could vary. That tactic costs me about two bits. Additionally (as in my cached Perft example) there are more than one entry used for a single position, e.g. to store data for distinct calculation levels.

Reinhard.

by **Dan Honeycutt** » 08 Dec 2004, 21:04

Reinhard Scharnagl wrote:It has always been said that hash collisions would be generated only rarely and therefore simply could be ignored.

Hi Reinhard:

I'd like veer from the perft hashing and go back to your statement above at the start of the thread. If I'm testing a replacement strategy, like you said, I don't want to ignore hash collisions. But for normal play is there any reason not to ignore them?

I play the hash move, if I have one, before I generate moves. I used to worry about the consequences of playing an invalid move, say "empty square captures king" To that end I put in some debug code to look up the hash move once I generated moves and write to an error file if it was not found. From hundreds of games I never got an error. Then I thought, what's the worst thing that could happen?

To be sure, I'd likely get some strange scores searching the branch after "empty square captures king". No big deal - worst that could lead to is I play a stupid move. Somewhat worse is if I am unable to correctly unmake the move. Then I end up with a corrupt position, 2 kings for one side or some such. In that case it's certain that I will soon play an illegal move or my opponent will play a move that appears illegal to me. We get a dispute about legality which the TD will have to resolve. But I couldn't see it causing the program to crash.

So I removed the debugging code and went on my merry way. I've got enough to worry about with things that happen all the time to worry about things that happen once in a blue moon.

Dan H.

by **Reinhard Scharnagl** » 08 Dec 2004, 21:29

Hi Dan,

you are arguing for simply ignoring hash collisions during a normal game. My opinion to that is: it depends ...

a) I do not recommend to regard moves taken from cache as always existing. I still use that data to enforce the selection of fresh generated moves.

b) Hash collisions seems to happen only very rarely, so ignoring them could be a "solution". But is that really true?. The probability of them is tightly related to the quality of used random numbers to encode piece positions into an accumulated position key. Where ever I have read about that, there has been stated that conventional random numbers would be sufficient. I disagree absolutely with that assumption.

c) To verify that assumption a cached Perft will do a good service. As you can see from my calculated data, there is no such collision at all (in a practical move distance to the initial position). Avoiding hash collisions of course will not be worse to any program. So why should I use the traditional method of generating key-elements?

Reinhard.

by **Dan Honeycutt** » 08 Dec 2004, 22:44

Hi Reinhard:

I still use that data to enforce the selection of fresh generated moves.

If you do that a collision should cause no problem at all. I'm of course trying to save a bit of time by not generating moves if I can get a cutoff from the hash move.

The probability of them is tightly related to the quality of used random numbers

I've read a lot of debate on this, including the inadequacy of rand() but I must still confess ignorance. Bruce Moreland, who has a lot more experience in this than I, says he has always used rand() with no problem.

So why should I use the traditional method of generating key-elements?

You lost me here. Are you talking about generating random numbers for keys or generating the keys themselves from the random numbers?

What do you think about using true random numbers? When I converted Bruja's book to hash key based I needed fixed random numbers. I went to some site where you can download true random numbers that they generate with a lava lamp or whatever. I wrote a little script to format them into a C array (they came as bytes) and in 15 minutes I had true random numbers. I figured, since I had them for the book I might as well use them for my position keys as well. It was quick and easy - if true random numbers have an advantage over psudo-random why doesn't everybody use them?

Dan H.

by **Reinhard Scharnagl** » 08 Dec 2004, 23:03

Hi Dan,

You lost me here. Are you talking about generating random numbers for keys or generating the keys themselves from the random numbers?

probably I expressed myself again that it could be misunderstood. I make that experience when ever I do things probably another way than been done by many others. Why is that so? It seems to be strange and not to be avoided.

The point is not the way of generating pseudorandom numbers. Here I of course prosume that a stable method would be used. I moreover have written a 32 bit pseudo random number generator myself, when I saw that even good routines are only 31 bit generators.

The point is not to use each generated random number but a special subset of all generated candidates. The "switching quality" is not independent from the binary structure of the created numbers. Some extrem examples would be a generated zero or a generated -1. Would you want to use such numbers?

Reinhard.

by **Dan Honeycutt** » 08 Dec 2004, 23:48

Reinhard Scharnagl wrote:Some extrem examples would be a generated zero or a generated -1. Would you want to use such numbers?

Hi Reinhard.

No. But why would I ever expect to generate a 0 or -1?

Dan H.

by **Uri Blass** » 09 Dec 2004, 00:36

Hi Dan,

The point of Reinhard is that real random numbers mean that you may have some bad luck with part of them and it is better to used fixed number when there is no luck that is involved.

I totally agree with him but I simply did not do the effort of trying to think about generating better numbers when it probably worth less than 1 elo.

My opinion that it is not worth much is based on experiments when programs found the best move inspite of significant number of hash collisions.

Uri

by **Dan Honeycutt** » 09 Dec 2004, 04:52

Uri Blass wrote:The point of Reinhard is that real random numbers mean that you may have some bad luck with part of them...

Hi Uri:

I don't know. It sounded to me like he is talking about data structures or some means to detect if you generate a bad number (whether using true or psudo-random numbers). Reinhard has many interesting ideas but it can take some work to coax them out of him.

I'm sure when he reads this he will weigh in and add some clarification.

Dan H.

by **Reinhard Scharnagl** » 09 Dec 2004, 10:37

Hi Dan,

... but it can take some work to coax them out of him

well, it will need only to supply some comparable testing results here. It does not make much sense to talk about details and hear then that it would be regarded to be a irrelevant idea (though it is indeed not at all an ingenious approach).

Reinhard.

by **Uri Blass** » 09 Dec 2004, 11:10

Reinhard Scharnagl wrote:Hi Uri,
If I understand correctly you have solution that is good when the number of hash entries is not bigger than 2^32

there seems to be a misunderstanding. Probably our identification method might be very similar. The position identification has a total length of 64 bit. But because of some reasons (one is that I am still using a 32 bit system) that identification consists of two 32 bit parts. From one the (initial!) storing address is derived and the non used part of that plus the other 32 bit word are stored within the cached data as a verificaton stuff. Thus the identification is potentially 64 bit.

But I use the cache multi-associative. So the storage place could vary. That tactic costs me about two bits. Additionally (as in my cached Perft example) there are more than one entry used for a single position, e.g. to store data for distinct calculation levels.

Reinhard.

Hi again Reinhard.

I started to write my hash data and
I think that we need the same structure for storing hash in order to compare with your results.

You say that there is more than one entry for different calculation levels but how exactly you do it.

I can think about more than one way to do it.

Suppose for the discussion that you have 2^22 entries.

I can think about the following definition to store only one perft calculation per position and not more than perft 7
perftnums may include all the 4 numbers that you store in the relevant depth and the only condition is that perft<65536

typedef unsigned __int64 BitBoard;
typedef struct tagperftHASHE
{
BitBoard key:42;
int depth:3;
BitBoard perftnums
} perftHASHE;

If I want to store more than one perft calculation then there is some ways to do it.

Way 1:
storing perft 1 perft 2 perft 3 in the same hash key and you have

typedef unsigned __int64 BitBoard;
typedef struct tagperftHASHE
{
BitBoard key:42;
BitBoard perftnum1;
BitBoard perftnum2;
BitBoard perftnum3;
} perftHASHE;

The problem is that you waste memory when you store only one number.

Way 2:storing the depth inside the address.

In that case you may have longer key because instead of using 22 bit of the key position to calculate the storing address you use 3 bits of the calculation level and 19 bits of the key position so you are left with 45 unused bits that you can use and you have the following structure

typedef struct tagperftHASHE
{
BitBoard key:45;
BitBoard perftnums;
} perftHASHE;

The disadvantage is that you can store only 2^19 different positions instead of being able to stroring 2^22 positions.

practically in my chess program when I use hash not to calculate perft I chose the solution of storing all the 64 bits inspite of knowing that
I spend bits that are not needed because the alternatives meant different structure depends on the number of entries in my hash tables and I do not like to release different version for every size of hash.

There may be a better programming solution that will define different structure dependent on the hash size but I do not know how to do it.

Uri

by **Uri Blass** » 09 Dec 2004, 11:18

Uri Blass wrote:
Reinhard Scharnagl wrote:Hi Uri,
If I understand correctly you have solution that is good when the number of hash entries is not bigger than 2^32

there seems to be a misunderstanding. Probably our identification method might be very similar. The position identification has a total length of 64 bit. But because of some reasons (one is that I am still using a 32 bit system) that identification consists of two 32 bit parts. From one the (initial!) storing address is derived and the non used part of that plus the other 32 bit word are stored within the cached data as a verificaton stuff. Thus the identification is potentially 64 bit.

But I use the cache multi-associative. So the storage place could vary. That tactic costs me about two bits. Additionally (as in my cached Perft example) there are more than one entry used for a single position, e.g. to store data for distinct calculation levels.

Reinhard.

Hi again Reinhard.

I started to write my hash data and
I think that we need the same structure for storing hash in order to compare with your results.

You say that there is more than one entry for different calculation levels but how exactly you do it.

I can think about more than one way to do it.

Suppose for the discussion that you have 2^22 entries.

I can think about the following definition to store only one perft calculation per position and not more than perft 7
perftnums may include all the 4 numbers that you store in the relevant depth and the only condition is that perft<65536

typedef unsigned __int64 BitBoard;
typedef struct tagperftHASHE
{
BitBoard key:42;
int depth:3;
BitBoard perftnums
} perftHASHE;

If I want to store more than one perft calculation then there is some ways to do it.

Way 1:
storing perft 1 perft 2 perft 3 in the same hash key and you have

typedef unsigned __int64 BitBoard;
typedef struct tagperftHASHE
{
BitBoard key:42;
BitBoard perftnum1;
BitBoard perftnum2;
BitBoard perftnum3;
} perftHASHE;

The problem is that you waste memory when you store only one number.

Way 2:storing the depth inside the address.

In that case you may have longer key because instead of using 22 bit of the key position to calculate the storing address you use 3 bits of the calculation level and 19 bits of the key position so you are left with 45 unused bits that you can use and you have the following structure

typedef struct tagperftHASHE
{
BitBoard key:45;
BitBoard perftnums;
} perftHASHE;

The disadvantage is that you can store only 2^19 different positions instead of being able to stroring 2^22 positions.

practically in my chess program when I use hash not to calculate perft I chose the solution of storing all the 64 bits inspite of knowing that
I spend bits that are not needed because the alternatives meant different structure depends on the number of entries in my hash tables and I do not like to release different version for every size of hash.

There may be a better programming solution that will define different structure dependent on the hash size but I do not know how to do it.

Uri

Note that the disadvantage that I mentioned of being able to store only 2^19 positions is not exactly correct because with the same size of hash you may be able to have more entries but still there is disadvantage of storing the same hash key of the position many times relative to storing the hash key only once and after it storing many number for perft depths
and if you often get many perft result positions very often then the first way is better.

Uri

by **Reinhard Scharnagl** » 09 Dec 2004, 12:19

Hi Uri,

so I will give you the struct of my cache-entry as used in the cached Perft routine of Smirf:

Code: Select all: struct TBlock { unsigned Key; // hash-key for to verify (4 bytes) ushort TopShort; // partially unused part of the hash-address // also to be verified (2 bytes) ubyte Level; // quality / importance of information ubyte Mix; // special purpose flags union { struct TPerft { ushort Cnt[4]; // Perft counters (2 bytes): // moves, captures, checks, castlings } P; }; };

Reinhard

by **Reinhard Scharnagl** » 09 Dec 2004, 13:19

Hi again Uri,

it might be, my posting would not answer all of your questions.

The "Level" in my struct is NOT used for the identification of cache entries of matching level or depth. Its purpose simply is to tune the replacement scheme.

To create distinct entries for different level data concerning one position I modify the hash-address by XORing a predefined (for each level) 32-bit (special) random number.

I hope these informations would be sufficient now.

Reinhard.

by **Reinhard Scharnagl** » 09 Dec 2004, 18:26

Hi again all,

was there something in my last postings, which would need more explanation?

Reinhard.

by **Uri Blass** » 09 Dec 2004, 18:58

Reinhard Scharnagl wrote:Hi again all,

was there something in my last postings, which would need more explanation?

Reinhard.

Hi Reinhard,

Yes

If I understand correctly the only place that you can store the depth is in the special purpose flags.

1)Do the special purpose flag contain more information except the question if you store perft 1 or perft 2 or perft 3(I guess that with only 16 bits you usually cannot store perft 4)?

2)Does the level only tell you the importance of the data (I guess that you have bigger level for bigger perfts because it is more important to store bigger perfts)?

Note that I did not start my perft calculation with hash and at this moment I also did not implement perft that counts number of checks(it is on my todo list but did not get a priority until now).

I hope that some perft that also count all the data that you calculate will be ready next week but no promise about it.

Uri

Winboard Forum

The Quality of Caching and Hash Keys

Re: The Quality of Caching and Hash Keys

Re: The Quality of Caching and Hash Keys

Re: The Quality of Caching and Hash Keys

Re: The Quality of Caching and Hash Keys

Re: The Quality of Caching and Hash Keys

Re: The Quality of Caching and Hash Keys

Re: The Quality of Caching and Hash Keys

Re: The Quality of Caching and Hash Keys

Re: The Quality of Caching and Hash Keys

Re: The Quality of Caching and Hash Keys

Re: The Quality of Caching and Hash Keys

Re: The Quality of Caching and Hash Keys

Re: The Quality of Caching and Hash Keys

Re: The Quality of Caching and Hash Keys

Re: The Quality of Caching and Hash Keys

Re: The Quality of Caching and Hash Keys

Re: The Quality of Caching and Hash Keys

Re: The Quality of Caching and Hash Keys

Re: The Quality of Caching and Hash Keys

Re: The Quality of Caching and Hash Keys

Who is online