Testing and debugging chess engines

Programming Topics (Computer Chess) and technical aspects as test techniques, book building, program tuning etc

Moderator: Andres Valverde

Testing and debugging chess engines

Postby Patrice Duhamel » 03 Dec 2006, 14:54

Hello,

I'm new to chess programming, and I'm working on my own chess engine,
I'm using bitboards, and already have alpha-beta with transposition table (zobrist hash key using mersenne twister),
move ordering (hash move, killer move, MVV/VLA), opening book,
UCI + Winboard engine (but have my own interface).

I'm looking for methods, advices, to test if my engine works and help to find bugs.

I found the perft function, and the results on Reinhard Scharnagl website very usefull,
(I have the same result now, except for mate count on depth 7 for the default position)
but I don't know how to test if there are bugs in alpha-beta, move ordering, or transposition table...

my engine is very slow compared to other engines, I have 350 Kn/s where other engines show at least 1000 Kn/s :?
and I don't see a big change using transposition table and move ordering.

I compared the number of nodes searched at each depth (output by UCI engines) and it seems I'm searching a lot more nodes than other engines.


Thanks.
User avatar
Patrice Duhamel
 
Posts: 49
Joined: 03 Dec 2006, 14:27
Location: France

Re: Testing and debugging chess engines

Postby Reinhard Scharnagl » 03 Dec 2006, 20:43

Hi Patrice,

there are modifications of Perft which enumerate the partial counts for every possible move. That could help to navigate to errors.

Move ordering has no relation to Perft. Nevertheless using a cache and a transposition table could be tested by comparing results to an uncached Perft, if intermediate results are stored and reused at repeated positions.

Reinhard.
Reinhard Scharnagl
 
Posts: 608
Joined: 01 Oct 2004, 08:36
Location: Klein-Gerau, Germany

Re: Testing and debugging chess engines

Postby Patrice Duhamel » 05 Dec 2006, 08:56

Thanks for your answer,

I have a "divide" function to enumerate the partial counts for each moves, and I solved some bugs with it.

But, how could I check that transposition table and other engine features are working ?
User avatar
Patrice Duhamel
 
Posts: 49
Joined: 03 Dec 2006, 14:27
Location: France

Re: Testing and debugging chess engines

Postby Reinhard Scharnagl » 05 Dec 2006, 13:19

Simply compare your results. One calculated by counting each node, the other by calculating and storing once if a node is unknown or reusing the stored information else.

Reinhard.
Reinhard Scharnagl
 
Posts: 608
Joined: 01 Oct 2004, 08:36
Location: Klein-Gerau, Germany

Re: Testing and debugging chess engines

Postby H.G.Muller » 05 Dec 2006, 18:17

Unfortunately there are no standard node counts available for an alpha-beta tree. The problem is that the number of nodes visited depends on the evaluation function and on the move ordering.

Perft can also be used to test the hashing, just store the perft count in stead of the score. That will tell you if you properly include castling rights and e.p. captures in the key. (I found out that some of my Zobrist randoms were initialized to 0, that way...)

To test the alpha-beta logic and corresponding score bounds in hashing, I usually try deep searches in end-games like KPK or KRK. If it smoothly plays those endings, I trust that the hashing works OK.
User avatar
H.G.Muller
 
Posts: 3453
Joined: 16 Nov 2005, 12:02
Location: Diemen, NL

Re: Testing and debugging chess engines

Postby Tord Romstad » 05 Dec 2006, 19:22

Patrice Duhamel wrote:Thanks for your answer,

I have a "divide" function to enumerate the partial counts for each moves, and I solved some bugs with it.

But, how could I check that transposition table and other engine features are working ?

Hello Patrice,

Here are a few suggestions:
  • Extend the UCI/XBoard command set with a few commands of your own for use in debugging. In particular, it is useful to have a command for looking up the current position in the hash table and printing the information (score, score type, best move, etc). to the standard output. You can use this to browse the search tree after a search is finished. When you want to know why the program discarded some move, you can make the move and inspect the hash table entry for the corresponding position to find the score and refutation. I've found this to be a very valuable debugging technique, and even have a simple GUI app for browsing the tree (the GUI app communicates with the engine through pipes connected to the standard input and output).
  • The technique above can be further enhanced by including lots of additional information in the hash table when debugging the program. I sometimes store complete move lists with information about extension, reduction and pruning decisions for each move in every transposition table entry. Of course this makes each entry huge and greatly slows down the search, but it can be useful when chasing bugs or looking for ways to make the search more efficient.
  • Implement an MTD(f) search, even if you intend to use PVS. MTD(f) is great for debugging the hash table; all sorts of obscure bugs which are very tricky to find in PVS or other conventional searches suddenly become easy to spot.
  • Whenever you add some non-trivial new function to your program, try to write two versions: One which is very slow and stupid, but almost certainly correct, and one which is highly optimised. Verify on a huge number of positions that they give the same results. Remove the slow version only when you feel 100% sure that the fast version is correct.
  • Always make symmetry tests when you add a new term to your evaluation function.
  • Run through a simple tactical test like WAC at 5 seconds/move every time you change something important in your search. Don't try to optimize the results, but just make sure that the score doesn't suddenly drop dramatically.
  • Check the quality of your move ordering by measuring how often a beta cutoff occurs on the first move, and the frequencies with which the 1st, 2nd, 3rd, ... move turns out to be best at PV nodes.


Tord
User avatar
Tord Romstad
 
Posts: 639
Joined: 09 Oct 2004, 12:49
Location: Oslo, Norway

Re: Testing and debugging chess engines

Postby Reinhard Scharnagl » 05 Dec 2006, 23:03

H.G.Muller wrote:Unfortunately there are no standard node counts available for an alpha-beta tree. The problem is that the number of nodes visited depends on the evaluation function and on the move ordering. ...

Thus I suggest that procedure not for such trees, instead I did it myself for an unpruned Perft tree. It is very unimportant, which results are stored for each node (I did it for several move type counts), but it is important to compare the results to direct sums not built by using values stored in the cache.

Reinhard.
Reinhard Scharnagl
 
Posts: 608
Joined: 01 Oct 2004, 08:36
Location: Klein-Gerau, Germany

Re: Testing and debugging chess engines

Postby Ryan Benitez » 06 Dec 2006, 07:33

Tord Romstad wrote:[*]Extend the UCI/XBoard command set with a few commands of your own for use in debugging.


Yes, this has been very helpful to me. I also think every program should have a showboard command.

Tord Romstad wrote:[*] Whenever you add some non-trivial new function to your program, try to write two versions: One which is very slow and stupid, but almost certainly correct, and one which is highly optimised. Verify on a huge number of positions that they give the same results. Remove the slow version only when you feel 100% sure that the fast version is correct.


What I do is have both functions trigger but the value of only the stable version is used in the program and the differences (hopefully none!) in the two functions gets logged.

Thanks,
Ryan
Ryan Benitez
 
Posts: 19
Joined: 03 Nov 2005, 23:27

Re: Testing and debugging chess engines

Postby Tord Romstad » 06 Dec 2006, 09:34

Ryan Benitez wrote:
Tord Romstad wrote:[*]Extend the UCI/XBoard command set with a few commands of your own for use in debugging.


Yes, this has been very helpful to me.


Recently I have been thinking about taking this one step further, and embed a small Forth or Scheme interpreter in my program, and to use this interpreter instead of the regular UCI interface during development. Such an interpreter would be useful not only when debugging, but also when experimenting and trying out new algorithms. I could prototype all new algorithms in my Forth/Scheme without suffering through the C/C++ edit-compile-run-debug loop, and write the actual C code only when I am satisfied with the design and am 100% sure there are no bugs.

I also think every program should have a showboard command.


Yes, either that, or a separate GUI debugging utility (I have both). Commands for printing other information about the current position (hash keys, piece lists, bitboards, incrementally updated material scores, piece counts, legal move list, checking move list, etc.) are also useful.

Tord
User avatar
Tord Romstad
 
Posts: 639
Joined: 09 Oct 2004, 12:49
Location: Oslo, Norway

Re: Testing and debugging chess engines

Postby Patrice Duhamel » 15 Dec 2006, 11:54

I compare the hashkey with the current board, in the alphabeta function, is this sufficient to test the transposition table ?

Where could I find an example of perft that uses a transposition table ?
I'm trying to add the transposition table to my perft function,
but I have wrong result after depth 3.


Patrice.
User avatar
Patrice Duhamel
 
Posts: 49
Joined: 03 Dec 2006, 14:27
Location: France

Re: Testing and debugging chess engines

Postby H.G.Muller » 16 Dec 2006, 09:45

Depth 3 is the first depth where you have transpositions, so obviously your transposition table is not OK. There are many things that could go wrong besides differentially updating the hash key.

The best way to zoom in on the error is by making MakMove() keep track of the current path (by something like Path[Ply] = Move;), and just before the return of every perft call put a print statement to print the perft count of that subtree under an if() with the condition

if(Ply==1 || Ply==N && Path[1]==M1 && Path[2]==M2 && ... && Path[N]==MN) ...

where N is the depth for which you want to see the breakdown per move, and M1, ... MN the moves that lead to the node with the error. By comparing with the version that doesn't hash you can zoom in on a node that gives a wrong count, and this wrong count will probably have been taken from the hash table. You can then print the number of the entry and its contents, and rerun the program with a print statement that executes when the entry with this number is filled.

By the way, did you make sure that you only count hash hits that match both in position and depth? Unlike in search, a larger depth is not upward compatible in perft!
User avatar
H.G.Muller
 
Posts: 3453
Joined: 16 Nov 2005, 12:02
Location: Diemen, NL

Re: Testing and debugging chess engines

Postby Patrice Duhamel » 16 Dec 2006, 11:25

H.G.Muller wrote:By the way, did you make sure that you only count hash hits that match both in position and depth? Unlike in search, a larger depth is not upward compatible in perft!


I don't know if my problem is with the transposition table,
or in the use of the transposition table in my perft function.

I'm not sure where I should add the count in the transposition table for perft,

someone can tell me what's wrong in my code ?
(perft function gives good results without the transposition table)

Code: Select all
//////////////////////////////////////////////////////////////////////////////////
int findPerftHashNode(HashKey key, int depth, ChessPerftHashNode *pn)
{   
   ChessPerftHashNode *hash = &perftHashTable[key % HASH_TABLE_SIZE];

   if (hash->key == key) {      
      if (hash->depth >= depth) {
         pn = hash;
         return 1;      
      }   
   }

   return 0;
}

//////////////////////////////////////////////////////////////////////////////////
void addPerftHashNode(HashKey key, unsigned char depth)
{
   ChessPerftHashNode *hash = &perftHashTable[key % HASH_TABLE_SIZE];
            
   hash->depth     = depth;
   hash->key       = key;
   hash->node      = node;
}


//////////////////////////////////////////////////////////////////////////////////
void ChessBoard::perft(int depth, int startmove)
{
   // find score in hash table
   ChessPerftHashNode   phnode;

   if (findPerftHashNode(_board.hashKey, depth, &phnode)) {
      perftNode[depth].node += phnode.node;
      return;
   }

   // generate moves
   unsigned int      nbMoves = 0;   
   _board.generateMoves(possibleMoves + startmove, &nbMoves);      
      
   for (int n=0; n<nbMoves; n++) {         
   
      // make move
      if (_board.makeMove(possibleMoves[startmove + n])) {
            
         perftNode[depth].node++;            

         if (depth > 1) {
            perft(depth - 1, startmove + nbMoves);
         }       
      }
      
      // unmake move
      _board.unMakeMove(possibleMoves[startmove + n]);
   }

   // add score in hash table
   addPerftHashNode(_board.hashKey, depth, perftNode[depth].node);
}
User avatar
Patrice Duhamel
 
Posts: 49
Joined: 03 Dec 2006, 14:27
Location: France

Re: Testing and debugging chess engines

Postby Tord Romstad » 16 Dec 2006, 11:37

Patrice Duhamel wrote:
H.G.Muller wrote:By the way, did you make sure that you only count hash hits that match both in position and depth? Unlike in search, a larger depth is not upward compatible in perft!


I don't know if my problem is with the transposition table,
or in the use of the transposition table in my perft function.

I'm not sure where I should add the count in the transposition table for perft,

someone can tell me what's wrong in my code ?
(perft function gives good results without the transposition table)

Code: Select all
//////////////////////////////////////////////////////////////////////////////////
int findPerftHashNode(HashKey key, int depth, ChessPerftHashNode *pn)
{   
   ChessPerftHashNode *hash = &perftHashTable[key % HASH_TABLE_SIZE];

   if (hash->key == key) {      
      if (hash->depth >= depth) {
         pn = hash;
         return 1;      
      }   
   }

   return 0;
}


In the code above, you do exactly the mistake HGM warns you against in the quote at the top of this post: You use "if(hash->depth >= depth) ..." instead of the correct "if(hash->depth == depth) ...".

Tord
User avatar
Tord Romstad
 
Posts: 639
Joined: 09 Oct 2004, 12:49
Location: Oslo, Norway

Re: Testing and debugging chess engines

Postby Patrice Duhamel » 16 Dec 2006, 12:48

Sorry, I had not understood.

but I still have wrong result.
User avatar
Patrice Duhamel
 
Posts: 49
Joined: 03 Dec 2006, 14:27
Location: France

Re: Testing and debugging chess engines

Postby H.G.Muller » 16 Dec 2006, 13:24

It seems that you store in the hash table not only the count for the subtree from the current node, but for the total of all subtrees from nodes at that same level visited so far.

You should remember perftNode[depth].node when you enter the perft routine, and only store in the hash how much it was increased when returning.

[edit: actually your way of counting nodes for each depth separately makes the hashing hard: In a subtree the counts of all depths larger than the current depth increase. So if you want to prune that subtree and make the same change to the perft counts from the hash, the hash table would have to store the counts for all depths! I only count leaf nodes while perfting, so my hash table only stores the increase of the leaf-node counter.]
User avatar
H.G.Muller
 
Posts: 3453
Joined: 16 Nov 2005, 12:02
Location: Diemen, NL

Re: Testing and debugging chess engines

Postby Patrice Duhamel » 20 Dec 2006, 08:44

I solved my problem, storing in the hash table after the recursive call, when depth = 0, now I have good results.

Thanks for your help.
User avatar
Patrice Duhamel
 
Posts: 49
Joined: 03 Dec 2006, 14:27
Location: France

Re: Testing and debugging chess engines

Postby H.G.Muller » 20 Dec 2006, 11:43

I am not sure I completely understand what you are saying. Is it that you only store in the hash table in calls to perft with a remaining depth of 0? It would give you a much larger speedup if you would store in every node (so that you could also cut off very large subtrees if you have a hit later), but you would have to store the count increase for depth=0 since the start of the node. But perhaps this is what you are already doing.
User avatar
H.G.Muller
 
Posts: 3453
Joined: 16 Nov 2005, 12:02
Location: Diemen, NL

Re: Testing and debugging chess engines

Postby Patrice Duhamel » 20 Dec 2006, 12:42

You are right, I thought it was working,
but I'm only storing at depth 0, and it never use values stored in the hash...
User avatar
Patrice Duhamel
 
Posts: 49
Joined: 03 Dec 2006, 14:27
Location: France

Re: Testing and debugging chess engines

Postby H.G.Muller » 20 Dec 2006, 18:36

What you need is something like:
Code: Select all
u64 count;

void perft(int depth)
{
    u64 oldcount=count;

    if(HASH_HIT && HASH->depth == depth)
    {count += HASH->count; return;}
    for(ALL_MOVES)
    {
        if(depth==1) count++; else
        {
            Make();
            perft(depth-1);
            UnMake();
        }
    }
    HASH->count = count - oldcount;
    HASH->depth = depth;
}
User avatar
H.G.Muller
 
Posts: 3453
Joined: 16 Nov 2005, 12:02
Location: Diemen, NL

Re: Testing and debugging chess engines

Postby Patrice Duhamel » 05 Jan 2007, 11:25

Thank you, With this function, it works.

I found the random numbers I'm using for the hash keys were bad,
I'm using a version of mersenne twister found on the book "Game Programming Gems 4", and I had to call the regenerate function
many times before generating my hash keys to have a good result.
User avatar
Patrice Duhamel
 
Posts: 49
Joined: 03 Dec 2006, 14:27
Location: France

Next

Return to Programming and Technical Discussions

Who is online

Users browsing this forum: No registered users and 41 guests