Good Zorbist keys set

Programming Topics (Computer Chess) and technical aspects as test techniques, book building, program tuning etc

Moderator: Andres Valverde

Re: Good Zorbist keys set

Postby Reinhard Scharnagl » 13 Jul 2005, 07:31

Sune,
The question is how do we figure out which has the better properties ...
I repeatedly pointed to the idea of using a Perft variant which makes use of the transposition table to store accumulated sums and comparing results to traditional Perft. If there were differences, the transposition table is to be corrected or there have been some hard collisions. Until now there have not been comparable attempts or results presented here.

Reinhard.
Reinhard Scharnagl
 
Posts: 608
Joined: 01 Oct 2004, 08:36
Location: Klein-Gerau, Germany

Re: Good Zorbist keys set

Postby Anonymous » 14 Jul 2005, 23:30

A better test for the quality of the numbers constructing the hash key could be:

Store the whole board in your HT
Store only a small part of the hash key in the HT - perhaps 20 bits
Do some searches, don't use the HT to cutoff. At each probe access, where the stored small key fits, compare with the whole board. If the board is not the same, you got a collision. Note the rate of collisions. Compare different sets of "random numbers".

When I did such a test (cannot remember all details), I did not find any differences between differently generated random numbers. But I did not try to maximize "signal distance".

Reinhard, your definition is still not enough. It only defines the distance of 2 numbers. But we need many. There could still be various definitions, of how to maximize the distances for the whole set. For example, trying to keep the minimum distance as high as possible. Or trying to keep the average distance as high as possible. Or trying to get a rather high average, but also having a small variance. And probably more ...

This would be a much better test, IMHO, than the perft test you suggest. The perft test with 64 bit stored key will never show a collision (when everything is slightly ok). I tried it. It is possible to estimate the collision rate (Dennis Breuker's thesis shows it) - and with todays computers we just won't get any.

Regards,
Dieter
Anonymous
 

Re: Good Zorbist keys set

Postby Sune Fischer » 15 Jul 2005, 01:31

I think there are really two things that should be tested.

One is if the collision frequency is acceptable, which it probably always will be even with half bad 64 bit keys.

More importantly is if the keys manage to distribute themself evenly in space. Imagine for instance if bit 4 has only 1% chance of being set, the hash would function badly with half the entries so seldomly used.
That is why I personally prefer to use random numbers, I fear otherwise to introduce some hard to see artifact.

BTW, an easy test to measure collision frequency with just a few lines of code, is to compare the full 64 bit key with a shorter n-bit key. If the first 20 bits match but not the full 64 bits, then there would have been a collision with a 20 bit key.

IIRC I found that collisions became rare around 40 bits (it will depend on hashsize). Fifty bits is already pretty good and adding 14 bits on top of that will make collisions something like 2^14 more unlikely, which is why we practicly never see 64 bit collisions.

-S.
User avatar
Sune Fischer
 
Posts: 126
Joined: 07 Oct 2004, 11:12
Location: Denmark

Re: Good Zorbist keys set

Postby Reinhard Scharnagl » 15 Jul 2005, 06:47

Hi Sune,

define a test and publish results.

Reinhard.
Reinhard Scharnagl
 
Posts: 608
Joined: 01 Oct 2004, 08:36
Location: Klein-Gerau, Germany

Re: Good Zorbist keys set

Postby Sune Fischer » 15 Jul 2005, 11:07

Ok, here is a quick test to get some feel for the frequency of collisions.

Declare a global array

int gCollisions[64];

Insert the following in your ProbeHash function just before you normally check the keys for a hashhit.

Code: Select all
      for (int i=1;i<64;i++) {
         uint64 bits = (((uint64)1)<<i)-1;
         if ( (current_key & bits) == (hash_key & bits) )  {
            if (current_key != hash_key)  {
               gCollisions[i]++;
            }
         }
      }



Reset the collisions counter before search and print it out afterwards.
I get the following:

Code: Select all
1 bit collisions = 30358654
2 bit collisions = 23861378
3 bit collisions = 20609741
4 bit collisions = 18977098
5 bit collisions = 18164480
6 bit collisions = 17759254
7 bit collisions = 17554819
8 bit collisions = 17451913
9 bit collisions = 17401356
10 bit collisions = 17376486
11 bit collisions = 17363591
12 bit collisions = 17357351
13 bit collisions = 17354028
14 bit collisions = 17352395
15 bit collisions = 17351608
16 bit collisions = 17351103
17 bit collisions = 17350899
18 bit collisions = 17350803
19 bit collisions = 8672683
20 bit collisions = 4337032
21 bit collisions = 2168784
22 bit collisions = 1081667
23 bit collisions = 540813
24 bit collisions = 270064
25 bit collisions = 133985
26 bit collisions = 65665
27 bit collisions = 33160
28 bit collisions = 16940
29 bit collisions = 8239
30 bit collisions = 4304
31 bit collisions = 2235
32 bit collisions = 1098
33 bit collisions = 568
34 bit collisions = 272
35 bit collisions = 154
36 bit collisions = 87
37 bit collisions = 50
38 bit collisions = 15
39 bit collisions = 10
40 bit collisions = 4
41 bit collisions = 1
42 bit collisions = 1
43 bit collisions = 1
44 bit collisions = 1
45 bit collisions = 1
46 bit collisions = 0
47 bit collisions = 0
48 bit collisions = 0
49 bit collisions = 0
50 bit collisions = 0
51 bit collisions = 0
52 bit collisions = 0
53 bit collisions = 0
54 bit collisions = 0
55 bit collisions = 0
56 bit collisions = 0
57 bit collisions = 0
58 bit collisions = 0
59 bit collisions = 0
60 bit collisions = 0
61 bit collisions = 0
62 bit collisions = 0
63 bit collisions = 0

total nodes searched 11029444


So it seems around 50 bits is enough to stop collisions.

-S.
User avatar
Sune Fischer
 
Posts: 126
Joined: 07 Oct 2004, 11:12
Location: Denmark

Re: Good Zorbist keys set

Postby Zach Wegner » 15 Jul 2005, 19:31

Hello Sune,

Seems like a good test. Just a quick comment--these numbers seem pretty close:
Sune Fischer wrote:
Code: Select all
7 bit collisions = 17554819
8 bit collisions = 17451913
9 bit collisions = 17401356
10 bit collisions = 17376486
11 bit collisions = 17363591
12 bit collisions = 17357351
13 bit collisions = 17354028
14 bit collisions = 17352395
15 bit collisions = 17351608
16 bit collisions = 17351103
17 bit collisions = 17350899
18 bit collisions = 17350803


Perhaps bits 8-18 aren't set as much (or are set much more) in your random numbers?

Regards,
Zach
User avatar
Zach Wegner
 
Posts: 182
Joined: 26 Sep 2004, 22:02
Location: Austin, Texas, USA

Re: Good Zorbist keys set

Postby Zach Wegner » 15 Jul 2005, 19:33

Ah, I just realized why. So your hash is 2^19 entries, I assume?
User avatar
Zach Wegner
 
Posts: 182
Joined: 26 Sep 2004, 22:02
Location: Austin, Texas, USA

Re: Good Zorbist keys set

Postby Sune Fischer » 15 Jul 2005, 20:10

Hi Zach

Yeah I noticed that as well after I posted, and I've come to the same conclusion - it's the hash size of course because the bits are also used for indicing.

It fits rather well actually, because if I use a larger hash more bits seem to be needed than with a smaller size hash.

I ran that test too :)

Very small hash (2 MB) ~40 bits needed

Code: Select all
1th bit collisions = 84036096
2th bit collisions = 83825685
3th bit collisions = 83718933
4th bit collisions = 83664898
5th bit collisions = 83638613
6th bit collisions = 83625887
7th bit collisions = 83618982
8th bit collisions = 83615744
9th bit collisions = 83613960
10th bit collisions = 83613175
11th bit collisions = 83612718
12th bit collisions = 83612561
13th bit collisions = 83612449
14th bit collisions = 41776602
15th bit collisions = 20889139
16th bit collisions = 10440363
17th bit collisions = 5225158
18th bit collisions = 2622300
19th bit collisions = 1313023
20th bit collisions = 658057
21th bit collisions = 328700
22th bit collisions = 163969
23th bit collisions = 81900
24th bit collisions = 41173
25th bit collisions = 20766
26th bit collisions = 10125
27th bit collisions = 5241
28th bit collisions = 2710
29th bit collisions = 1308
30th bit collisions = 587
31th bit collisions = 290
32th bit collisions = 111
33th bit collisions = 61
34th bit collisions = 29
35th bit collisions = 18
36th bit collisions = 7
37th bit collisions = 3
38th bit collisions = 1
39th bit collisions = 1
40th bit collisions = 1
41th bit collisions = 0
42th bit collisions = 0
...
total nodes 20424661


Very large hash (200 MB) ~44 bits needed

Code: Select all
1th bit collisions = 68011053
2th bit collisions = 47220104
3th bit collisions = 36825586
4th bit collisions = 31614457
5th bit collisions = 29016929
6th bit collisions = 27717794
7th bit collisions = 27067777
8th bit collisions = 26740790
9th bit collisions = 26579642
10th bit collisions = 26499701
11th bit collisions = 26459037
12th bit collisions = 26438309
13th bit collisions = 26427893
14th bit collisions = 26422708
15th bit collisions = 26420243
16th bit collisions = 26418793
17th bit collisions = 26418202
18th bit collisions = 26417908
19th bit collisions = 26417722
20th bit collisions = 26417644
21th bit collisions = 13227232
22th bit collisions = 6612346
23th bit collisions = 3306734
24th bit collisions = 1654428
25th bit collisions = 824725
26th bit collisions = 410876
27th bit collisions = 205487
28th bit collisions = 103168
29th bit collisions = 51017
30th bit collisions = 25410
31th bit collisions = 12781
32th bit collisions = 6026
33th bit collisions = 3056
34th bit collisions = 1552
35th bit collisions = 787
36th bit collisions = 390
37th bit collisions = 206
38th bit collisions = 89
39th bit collisions = 50
40th bit collisions = 27
41th bit collisions = 8
42th bit collisions = 4
43th bit collisions = 3
44th bit collisions = 1
45th bit collisions = 1
46th bit collisions = 0
47th bit collisions = 0
...
total nodes 26776742


Extrapolating from this I found, that at current CPU speeds and hash sizes we can expect it to take about 60-66 days for one 64 bit collision to occur.

But I don't think 64 bits will ever be insuffcient for this. Even if computers were 1000 times faster than today and had a handful of collisions an hour, the ratio of colliding nodes / good nodes would still be a very small number.

-S
User avatar
Sune Fischer
 
Posts: 126
Joined: 07 Oct 2004, 11:12
Location: Denmark

Previous

Return to Programming and Technical Discussions

Who is online

Users browsing this forum: No registered users and 11 guests