difference in structure of chess engines

Programming Topics (Computer Chess) and technical aspects as test techniques, book building, program tuning etc

Moderator: Andres Valverde

difference in structure of chess engines

Postby Uri Blass » 25 Jun 2005, 13:53

When I look at the structure of fruit and glaurung I find one interesting difference.
Glaurung has only one .h file glaurung.h
Fruit has a lot of .h file and for every cpp file except main.cpp it has .h file
with the same name.

In Movei I have similiar structure to tscp and I have only
data.h defs.h and protos.h (and some tablebase code from nalimov that is also .h) when data.h is for global variables,defs.h for definitions of constants and structures and protos.h for functions.
I also have few classes because Dann Corbit adviced me to do it but I did not divide most of the variables to classes and I think that it was probably not a good idea to have classes because I find that fruit has no classes.

The fact that fruit has no classes suggest that using classes is not a good idea for chess programming because I guess that Fabien is the best programmer and he knows what he is doing.

I am considering to modify Movei and write it in the fruit way so every file knows only varaibles that are relevant to it by the right include.
First step in that case is to mark for every variable or definition or function in which files it is used.
Later I can decide to divide the definitions, variables and functions to different files

What is your opinion.
Is it a good idea?

Uri
User avatar
Uri Blass
 
Posts: 727
Joined: 09 Oct 2004, 05:59
Location: Tel-Aviv

Re: difference in structure of chess engines

Postby Sven Schüle » 25 Jun 2005, 14:56

Hi Uri,

first of all, you should not focus on the "physical" partitioning of your program into source files with highest priority but start with "logical" partitioning. As soon as you are done with it, moving code into separate files will be most obvious in many cases.

As always, there is no general rule telling what's best. Professional software developers will of course suggest to use classes to obtain a well-structured system for which it is possible to understand parts of it independent from other parts.

You can achieve the same goal partly with the classical "structured programming" approach where one of the key points is to divide the system into entities, say "modules". Each entity consists of functions and/or data belonging conceptually together. The object-oriented programming approach, often based on classes, serves for the same target but has the advantage that you are talking about "things" (objects) having a "state" (data, represented by variables valid only for one type of objects) and a set of operations defined on each object's data.

In chess programming, there are surely a couple of concepts for which classes may be useful. Of course there are many ways of designing, I can only tell you how I did it in Surprise. This is an extract of all classes from Surprise (only to give you an impression); note that this design does not (yet) support using multiple processors:

a) Classes with one instance only (or sometimes two):
WinBoardEngine
Game
Board
PieceList (part of Board, two instances)
Evaluator
MoveGenerator
Searcher
SearchAssistant (maintains PV for each depth)
SearchInfo (maintains node counting etc.)
MainHash
OpeningBook
ChessClock (two instances)

b) Classes with an arbitrary number of instances:
Position (represents a node, is stored on the Searcher's position stack)
MoveList (part of Position, represents segment of Searcher' move stack)
Move
MoveListEntry (a Move with additional data: flags and value)
Square (part of Board, all operations "inline" for efficiency)

Movei is much, much stronger than Surprise (roughly 2500 ELO against only 2000, just to give any numbers), but I guess that this is completely unrelated to the fact that Surprise uses more classes than Movei. Movei definitely has better algorithms, better pruning tricks and probably also better data structures than Surprise.

But I expect that you wouldn't have too big problems to understand my source code.

It would be interesting to know if there are other chess programmers using an approach similiar to that of Surprise.

Regards,
Sven
Last edited by Sven Schüle on 25 Jun 2005, 18:16, edited 1 time in total.
User avatar
Sven Schüle
 
Posts: 240
Joined: 26 Sep 2004, 20:19
Location: Berlin, Germany

Re: difference in structure of chess engines

Postby Reinhard Scharnagl » 25 Jun 2005, 16:19

Hi Uri,

sorry, I cannot tell a lot about other's chess engines, simply on mine.

In Smirf I use some classes, but mostly to encapsulate static routines.
Most of my time critic constructs fit into a 32-Bit integer. I use a lot of
separated files to make changings easier. Actually I am intending also
to split again some of my current bigger files.

The Smirf sources moreover do not contain any prepared tables. All is done
from the scratch, when the engine starts its work. Smirf's size still < 54K.

Reinhard.
Reinhard Scharnagl
 
Posts: 608
Joined: 01 Oct 2004, 08:36
Location: Klein-Gerau, Germany

Re: difference in structure of chess engines

Postby Volker Böhm » 25 Jun 2005, 19:28

Hi Uri,

I don?t know if you whant to produce a "deep movei" sometimes. If yes, then organize your source to be able to do so.

To gain the maximal amount of speed make every variable global. You can seperate them in different files if you like. You can use classes with static variables an static methods - this is equivalent to a structure with every variable made global. It does?nt matter if you seperate the variables and code in different .h and .cpp files. Modern compilers have the availiability to "late-compile". They can put all .cpp and .h together and generate the same code as if it never had been seperated.

If you whant to get a deep movei later, this is not a good idea. Take a look at crafty. It puts all variables in one big structure and passes a pointer to this structure to most functions. With this "trick" he can use different copies of this structure for different threads.
In my opignion classes will help here very much. If you use non static mebers and methods in classes you can create an instance of this classes for every thread.

Example for clearity: a board


1. Possibility, global variables
int board[BOARDSIZE];
int movelist[MAXMOVES];
void GenerateMoves()
{
// access board for current position, add moves to movelist
...
}

2. Crafy - way
struct Global
{
int board[BOARDSIZE];
int movelist[MAXMOVES];
}

void GenerateMoves(Global* pGlobal)
{
// access pGlobal->board and store moves to pGlobal->movelist
...
}

3. With objects equivalent to 1:
class Board
{
static int board[BOARDSIZE];
static int movelist[MAXMOVES];
static void GenerateMoves()
{
// access board and store moves to movelist
}
};

4. With objects equivalent to 2:
Class Board()
{
int board[BOARDSIZE];
int movelist[MAXMOVES];
void GenerateMoves()
{
// access board and store moves to movelist
}
}

Spike is using the approach 4. You?ll be able to create one board for every thread.

Another thing you can think about:
In IceSpell I used a class for every primitive data structure. I had a "piece" class, a "move" class, a "position" class, ...
This is a good object oriented design, but not good for speed. We have another approch in Spike. There is one basic "Anything" class that contains many constants and basic methods. This "Anything" has no own members thus you can derivate every class from it without any speed loss. Every primitive data structure is now an integer and no longer a class. A move is an integer, a position is an integer a piece is an integer, ...
The "Anything" class contains the access functions for these "integer-structures". Examples:
GetFrom(int aMove);
GetTo(int aMove);
...
I am very happy with this approach. It prevents the Macros many engines have without loosing speed. And it doesn?t prevent Spike to go to a parallel engine in future.

With this the board class will get to something like:

class Board public Anything
{
int board[BoardSize];
};

In a discussion thread I read that crafty looses about 10% of speed because of this pointer to a global structure instead of accessing global variables directly. I believe that the non-static-objecet approach will loose the same speed compared to the fully static approach.


Greetings Volker
Volker B?hm, Spike Team
Volker Böhm
 
Posts: 66
Joined: 27 Sep 2004, 18:52
Location: Germany

Re: difference in structure of chess engines

Postby Sven Schüle » 25 Jun 2005, 23:10

Mangar wrote:In IceSpell I used a class for every primitive data structure. I had a "piece" class, a "move" class, a "position" class, ...
This is a good object oriented design, but not good for speed.

Hi Volker,

defining classes for primitive data structures is not at all equivalent to losing speed. It's just a matter of how you implement it. If you use "inline", take care about the size of your class instances (e.g. there will be absolutely no relevant speed problems with objects of 32 bits size), and avoid to have classes with only one small data member, then speed will be comparable to an implementation without a class.

I'm not convinced at all that this "speed hunting" is productive for more than 5% of all existing chess engines (the "world class", where 20% speed improvements really make a difference).

For example, take my lower-class engine Surprise. It has about 2000 ELO only and searches about 500 knps on a 2,4 GHz PIV. It is quite fast (despite its object-oriented design ;-) ) but has defects in positional evaluation and misses some better pruning techniques. If I take my code and heavily optimize it for speed, I will perhaps be able to double the speed. Throw away some classes and return to global variables here and there, and so on.

It would be good luck if this would take me one full ply deeper. Perhaps this tuned engine would have 2100-2150 ELO. But then I would stop and throw it away because the code would have become unreadable and error prone.

Recently I added the formerly missing nullmove implementation to Surprise. This was a big improvement, and it was nothing about speed but about number of nodes needed for a search. That's the area where time should be spent for, and perhaps also for improvements of the positional evaluation.

Just my 0,02? ...

Regards,
Sven
User avatar
Sven Schüle
 
Posts: 240
Joined: 26 Sep 2004, 20:19
Location: Berlin, Germany

Re: difference in structure of chess engines

Postby Volker Böhm » 25 Jun 2005, 23:37

Hi Sven,

I fully agree in your statement "clarity" before speed and bugfixing and clever algorithm does more than speed does. Nevertheless speed adds some elo.

But I don?t aggree on your statement that classes that encapsulate integers don?t loose speed. At last my compilter (msvc) often wouldn?t inline simple methods in very long codings. Even if I use __forceinline. Take a look at the assembly code generated.

I saw myself to reorganize my code in IceSpell to make my compiler to inline the methods.

Greetings Volker
Volker B?hm, Spike Team
Volker Böhm
 
Posts: 66
Joined: 27 Sep 2004, 18:52
Location: Germany

Re: difference in structure of chess engines

Postby Dann Corbit » 27 Jun 2005, 20:08

Uri Blass wrote:When I look at the structure of fruit and glaurung I find one interesting difference.
Glaurung has only one .h file glaurung.h
Fruit has a lot of .h file and for every cpp file except main.cpp it has .h file
with the same name.

In Movei I have similiar structure to tscp and I have only
data.h defs.h and protos.h (and some tablebase code from nalimov that is also .h) when data.h is for global variables,defs.h for definitions of constants and structures and protos.h for functions.
I also have few classes because Dann Corbit adviced me to do it but I did not divide most of the variables to classes and I think that it was probably not a good idea to have classes because I find that fruit has no classes.


It is trivial to change your classes back into functions.
Fabian is using C++ as "A better C". You will find that he does make extensive use of structs, and a struct is the same thing as a class, but it has different default visibility (public). Also, he does not use member functions at all.

C++ classes will not make your code faster. They will make it easier to understand and also much easier to write a SMP version.

The fact that fruit has no classes suggest that using classes is not a good idea for chess programming because I guess that Fabien is the best programmer and he knows what he is doing.

I am considering to modify Movei and write it in the fruit way so every file knows only varaibles that are relevant to it by the right include.
First step in that case is to mark for every variable or definition or function in which files it is used.
Later I can decide to divide the definitions, variables and functions to different files

What is your opinion.
Is it a good idea?

Uri


I cannot comment on your idea because I do not understand it.

If there are variables needed only in a single file, they should not be put into any include file. It is clearly a mistake to do it in that case.

So what you are trying to do is unclear to me.
Dann Corbit
 

Re: difference in structure of chess engines

Postby Anonymous » 27 Jun 2005, 21:11

Dann Corbit wrote:If there are variables needed only in a single file, they should not be put into any include file. It is clearly a mistake to do it in that case.


Sure! Perhaps Uri is also thinking of the typical "Crafty-layout" with all (global) data in one file. And of course one include file, with all those "extern ..." declarations.

IMHO, it is not trivial, to make a chess engine rather modular. It is not difficult either, but may be lots of work.

Once, I thought to have a seperate "book utiltities" program, for generating the opening book, adding moves to it, and more such things. Sounds easy - but I soon came to the conclusion that is (by a huge margine) too much work. I won't doubt, that with more disciplined programming, it should be easy. What was the case here: book generation code must read PGN. PGN reading code needs a move generator. For storing games in "internal format", I reuse structures actually used in searching. While generating the book, I want to give some progress indicators. These must be displayed. At some time, I was too lazy, to communicate all things that have to be displayed to the actual display routines (say middle level display routines - something like show_pv()). Instead, they will use some global vars like the node counter, all sorts of counters about hash statistics, TB accesses, the current position, .... . It turns out, that I practically would need functions from every module of my program.

I believe, I am aware, how to fix all this, and really make things modular in practice. I fear it comes with a rather high prize (for a hobby programmer). Instead of just adding a new variable unsigned long new_counter in some C and some header file, and just adding "... %lu ..." to some printing code, some new_counter=0 probably at the start of the search, and some new_counter++ at some appropriate place, you would additionally need to pass down this new_counter to many functions. (Classes and inheritance may help, to do it behind the scenes). For the hobbyist, that want's to try something new fast (inside an older, and probably not too well thought through design), just adding a global var will be much easier. The size of the typical chess engine might be just small enough, that such programming will work.

OTOH: after a while I saw, that what I thought of ages ago was a nice and clear design, basically becomes broken. Too many dependencies between different modules/variable are visable. When you work on the code on a daily basis, a chess engine will still be small enough - and all the programmer can be very efficient. When you visit your code after some weeks or months again, you easily forget about some dependencies, and my have a hard time.

Certainly, many have designed their engines better than me. One little challenge: How long would you need, to write a standalone program (with more or less minimal code), that can do a simple task, like reading a PGN and write a score sheet in coordinate notation? When I can use all of the engines framework, I am conficent I can do it in 15 minutes (I would add a command to yace - something like convert_pgn_to_coordinate input.pgn output.txt). For a stand-alone program, that would not include most code of the engine (but could use any code, it wants), it probably would take several days. To clarify the challenge: use as much code as you want, that you already have written. But don't use too much (one or the other routine included, that will never be called will not matter) of it, or even all the code in your engine.

What I have in mind: Can do it in a way like this?

I write a routine, that outputs a game in coordinate notation
I add some user interface to call this

I need PGN parsing module
I need move generator
I need basic input output routines
... some more
I don't need my search module
I don't need my eval module
....

Of those modules I need, I just throw out many things.


Regards,
Dieter
Anonymous
 

Re: difference in structure of chess engines

Postby Uri Blass » 27 Jun 2005, 23:29

Dann Corbit wrote:
Uri Blass wrote:When I look at the structure of fruit and glaurung I find one interesting difference.
Glaurung has only one .h file glaurung.h
Fruit has a lot of .h file and for every cpp file except main.cpp it has .h file
with the same name.

In Movei I have similiar structure to tscp and I have only
data.h defs.h and protos.h (and some tablebase code from nalimov that is also .h) when data.h is for global variables,defs.h for definitions of constants and structures and protos.h for functions.
I also have few classes because Dann Corbit adviced me to do it but I did not divide most of the variables to classes and I think that it was probably not a good idea to have classes because I find that fruit has no classes.


It is trivial to change your classes back into functions.
Fabian is using C++ as "A better C". You will find that he does make extensive use of structs, and a struct is the same thing as a class, but it has different default visibility (public). Also, he does not use member functions at all.

C++ classes will not make your code faster. They will make it easier to understand and also much easier to write a SMP version.

The fact that fruit has no classes suggest that using classes is not a good idea for chess programming because I guess that Fabien is the best programmer and he knows what he is doing.

I am considering to modify Movei and write it in the fruit way so every file knows only varaibles that are relevant to it by the right include.
First step in that case is to mark for every variable or definition or function in which files it is used.
Later I can decide to divide the definitions, variables and functions to different files

What is your opinion.
Is it a good idea?

Uri


I cannot comment on your idea because I do not understand it.

If there are variables needed only in a single file, they should not be put into any include file. It is clearly a mistake to do it in that case.

So what you are trying to do is unclear to me.


The point is that there are global variables that are used in few files but not in all the files.

Example:In fruit the variables in pawn.h are not global variables but they are used only in part of the files and only part of the files include pawn.h

Uri
User avatar
Uri Blass
 
Posts: 727
Joined: 09 Oct 2004, 05:59
Location: Tel-Aviv

Re: difference in structure of chess engines

Postby Dann Corbit » 27 Jun 2005, 23:35

Uri Blass wrote:The point is that there are global variables that are used in few files but not in all the files.

Example:In fruit the variables in pawn.h are not global variables but they are used only in part of the files and only part of the files include pawn.h

Uri


Yes, then that is a very good idea.
Dann Corbit
 

Re: difference in structure of chess engines

Postby diepeveen » 29 Jun 2005, 18:22

[quote="Uri Blass"]When I look at the structure of fruit and glaurung I find one interesting difference.
Glaurung has only one .h file glaurung.h
Fruit has a lot of .h file and for every cpp file except main.cpp it has .h file
with the same name.

In Movei I have similiar structure to tscp and I have only
data.h defs.h and protos.h (and some tablebase code from nalimov that is also .h) when data.h is for global variables,defs.h for definitions of constants and structures and protos.h for functions.
I also have few classes because Dann Corbit adviced me to do it but I did not divide most of the variables to classes and I think that it was probably not a good idea to have classes because I find that fruit has no classes.

The fact that fruit has no classes suggest that using classes is not a good idea for chess programming because I guess that Fabien is the best programmer and he knows what he is doing.

I am considering to modify Movei and write it in the fruit way so every file knows only varaibles that are relevant to it by the right include.
First step in that case is to mark for every variable or definition or function in which files it is used.
Later I can decide to divide the definitions, variables and functions to different files

What is your opinion.
Is it a good idea?

Uri[/quote]

Uri, the best idea IMHO is use a clear structure.
For 100 person projects using a thousand C files and many directories is a good idea.

For 1 person projects like a chess engine i feel by far best is limited amount of files. Like 25 files at most or so if you hit a few megabytes in size.

The entire data typedef/struct declaration in 1 generic header file that all include (in diep case: diep.h)

then further function declaration i put at top of same C file. function extern declaration in 1 generic header file (function.h)

The data allocation you can put in C file where item gets declared as long as the EXTERN declaration of it you put in a generic header file (which in diep i call data.h). So that is not allocation but extern declaration.

So every C file after the non-diep includes starts with:

#include "diep.h"
#include "function.h"
#include "data.h"

then possibly if there is some datastructure that just has a local meaning i declare it *here*.

Result of this is of course you get C files of pretty big size.

Some C files in diep are hundreds of kilobytes. This is no problem at all as long as you neatly also prototype them at top of file. Most people never prototype their functions. Bad idea.

I am not a big fan of putting all declarations in 1 file at all.

Note in past it was not possible even to do it, you could not allocate too much RAM within 1 file :)
diepeveen
 
Posts: 116
Joined: 28 Jun 2005, 01:09
Location: Netherlands


Return to Programming and Technical Discussions

Who is online

Users browsing this forum: No registered users and 4 guests