Page 1 of 2
NPS
Posted:
27 Aug 2005, 12:27
by David Weller
Hi all - long time since my last post ...
I have recently written a second chess engine: Xpdnt [about 130Knps]
Assuming I am not doing anything exceedingly stupid [programming wise or data-structure choices] does this seem exceedingly slow for 700 Mhz Pentium3? gcc 3.4.3
PVS AlphaBeta with Iterative deepening and Hashing
Check, OneReply, MateThreat, and PawnTo7th extensions
Quiesent search
NullMove R=2
IID
Killers
SE = material, KS(pawn shield), pawn structure, a few patterns[development - nothing too fancy]
no SEE, no Piece lists, no Pawn Hash [all on todo list]
My branching factor is about 2-3 [iff I am measuring it right - AVG( Iteration N time / Iteration N-1 time)]
I just cant figure out how/why my engine is 1/3 to 1/2 the speed of others
My question is: Does this seem suspiciously slow, or is it just me?
Should I be looking for serious search bugs?
Thanks, and look forward to all your comments.
Re: NPS
Posted:
27 Aug 2005, 15:00
by Gian-Carlo Pascutto
It's very hard to compare NPS number sensibly since they depend on so many things.
You can check the speed of your movegen/makemove & core datastructures by comparing your "preft" numbers against other programs.
If they are a manitude slower than most other programs, you have a problem.
Re: NPS
Posted:
27 Aug 2005, 17:19
by David Weller
Thanks,
Good idea.
Turns out, I broke perft some place - am trying to fix it ....
Perft 6 was taking like 5 minutes! but then returned ~4x the correct node count
Re: NPS
Posted:
27 Aug 2005, 17:57
by David Weller
That was wierd.
My source for perft was mysteriuosly re-arranged ???
anyway,
perft 6
119060324 81070
81 secs for p3 700 [with no hashing or anything] seems ok
Any other ideas?
Re: NPS
Posted:
27 Aug 2005, 18:14
by Roman Hartmann
Maybe you should have a look at it with a profiler to see in which parts your program spends most of its time. The one from AMD (Codeanalyst) which is free available is pretty good IMO.
Roman
Re: NPS
Posted:
27 Aug 2005, 20:49
by David Weller
Thanks Roman.
I have been profiling and the standard hot spots are:
eval() and attack()
Its true, that if I swap out my eval() for material + PST version, the NPS is 2.5x faster
I tried doing more pointer stuff [rather than indexing arrays] and it actually got worse.
Do the compilers these days take care of most of that sort of thing?
I will have a look at codeanalyst - thanks again
Re: NPS
Posted:
27 Aug 2005, 20:59
by Peter Eizenhammer
Hello David,
nice too see you back
btw: is GES a finished project now as you have started a new engine?
Peter
Re: NPS
Posted:
27 Aug 2005, 22:18
by Gian-Carlo Pascutto
There's no real advantage to using pointers. In fact it will tend to introduce aliasing problems for the compiler.
Re: NPS
Posted:
27 Aug 2005, 23:10
by Uri Blass
Are you sure that there is no real advantage in using pointers(or maybe you mean only relative to indexing arrays)?
I am not sure about nothing when we talk about speed but
I will give an example from Movei's code when pointers seems to do the code faster but the speed improvement was small so I am not sure if pointers were the reason because I can get 1% speed improvement by random changes.
In movei's evaluation I had something in the following structure.
Note that phasewhite and phaseblack are not changed during the executation of the functions.
int calckingsafe2(int phasewhite,int phaseblack)
{
...
}
int evalrest(int phasewhite,int phaseblack)
{
....
score+=calckingsafe2(phasewhite,phaseblack);
...
}
I changed to the following structure because I think that without pointers the compiler needs to create copies of the phasewhite and phaseblack and it is a bad idea (unless the compiler is smart enough to understand that it does not need to generate copies of phasewhite and phaseblack
even if I do not use pointers).
int calckingsafe2(int* phasewhite,int* phaseblack)
{
...
}
int evalrest(int* phasewhite,int* phaseblack)
{
...
score+=calckingsafe2(phasewhite,phaseblack);
}
Re: NPS
Posted:
28 Aug 2005, 02:03
by David Weller
Thanks Peter,
I had 'realized' a couple of things when writing Xpdnt which would probably help GES a lot, and have intended to make maybe one more release, but havent got 'a-round-tuit' yet. Maybe some day i'll get one:)
Gian-Carlos & Uri, I remember [hearing at least] in days past, that pointer arithmetic and so on was so much better/faster than array indexing, but suspected that [after trying it] that the compiler was smarter than me!
Uri, I suspect your 1% estimate is even higher. that 'practical no-op' thing that Tord talked about was pretty wierd, but somehow it doesnt suprise me. Yet another reason that testing is so hard: you never can tell from preliminary tests that something isnt just some sort of fluke
Wouldnt it just be the greatest, if we had some sort of perfect oracle to test againts! I have been using Fruit 2.1
Re: NPS
Posted:
28 Aug 2005, 08:13
by mridul
I dont think I understand what Uri is takling about here ... usually address size == sizeof int ... so amount of 'work' you do for both is same in both cases.
The usual idea behind passing a ptr are (from the top of my head) :
1) Passing a block of mem - the obvious use
2) to avoid a huge block push/pop - usually relevent for struct.
3) Simulating multi valued return.
Indexing array as pointer is faster (slightly) for other reasons - more so in our (chess) case since we can use shifts to calculate the offset faster (a good compiler will do the same anyway).
If you are using pointer increment/decrement style stuff - then you could become faster (leverage instruction pipelining , etc).
Are you sure of the 1% gain/loss uri ? Could be noise.
In your particular case , it should be theoretically worse to use pointer (you have to dereference it to get the value).
Maybe I am missing something here ..
Regards,
Mridul
Re: NPS
Posted:
28 Aug 2005, 09:00
by Uri Blass
mridul wrote:I dont think I understand what Uri is takling about here ... usually address size == sizeof int ... so amount of 'work' you do for both is same in both cases.
The usual idea behind passing a ptr are (from the top of my head) :
1) Passing a block of mem - the obvious use
2) to avoid a huge block push/pop - usually relevent for struct.
3) Simulating multi valued return.
Indexing array as pointer is faster (slightly) for other reasons - more so in our (chess) case since we can use shifts to calculate the offset faster (a good compiler will do the same anyway).
If you are using pointer increment/decrement style stuff - then you could become faster (leverage instruction pipelining , etc).
Are you sure of the 1% gain/loss uri ? Could be noise.
In your particular case , it should be theoretically worse to use pointer (you have to dereference it to get the value).
Maybe I am missing something here ..
Regards,
Mridul
Thanks for your comments.
I checked again and it seems that you are right and code without pointers here is slightly faster(51.67 seconds is faster than 51.79 seconds).
I think that the 1% speed improvement was result of another change that I made at the same time.
I still do not understand the reason that pointers are supposed to be slower but I guess that you are right.
If I understand you correctly
you claim that in theory it is worse because
score=*x+ is more expensive than score=x+ but I do not understand why it should be the case.
The reason that I thought that pointers should be faster is the following reason:
Suppose function A calls to function B and give it a variable.
The computer needs to remember 2 copies of the same variable(one inside function A and one inside function B because changing the variable inside function B does not change the value of the variable inside function A.
Suppose function A calls function B and give it a pointer.
The computer needs to remember only one copy of the variable so it seems faster.
Uri
Re: NPS
Posted:
28 Aug 2005, 11:46
by Gian-Carlo Pascutto
The problems of pointers is "aliasing": for example, you are inside a function that gets an int* pointer passed. If you change the value the pointer points to, the compiler must reload all int values you use from memory again, since it does not know what the pointer points to and hence any int value could be changed.
It's very hard for compilers to deal with.
Without pointers, figuring out that one function doesn't change a value and that hence no extra copy is needed is trivial.
Re: NPS
Posted:
28 Aug 2005, 19:37
by Chan Rasjid
The problems of pointers is "aliasing": for example, you are inside a function that gets an int* pointer passed. If you change the value the pointer points to, the compiler must reload all int values you use from memory again, since it does not know what the pointer points to and hence any int value could be changed.
It's very hard for compilers to deal with.
Gian,
This is an advanced topic unknown to me.
Say I have global variables:-
- Code: Select all
int myInt1, myInt2,.... etc.
void anycall(int *ptr){
*ptr = 1;
}
The probem is that what ptr points to can only be kwown at runtime
and so at compile time such situations require special treatment - like
extra code, extra copy of "all int in the program..." and this may cause great runtime inefficiency.
Is there a simple /rediculous example to illustrate a worst case scenario.
Thanks
Rasjid
Re: NPS
Posted:
28 Aug 2005, 20:15
by Gerd Isenberg
Chan Rasjid wrote:The problems of pointers is "aliasing": for example, you are inside a function that gets an int* pointer passed. If you change the value the pointer points to, the compiler must reload all int values you use from memory again, since it does not know what the pointer points to and hence any int value could be changed.
It's very hard for compilers to deal with.
Gian,
This is an advanced topic unknown to me.
Say I have global variables:-
- Code: Select all
int myInt1, myInt2,.... etc.
void anycall(int *ptr){
*ptr = 1;
}
The probem is that what ptr points to can only be kwown at runtime
and so at compile time such situations require special treatment - like
extra code, extra copy of "all int in the program..." and this may cause great runtime inefficiency.
Is there a simple /rediculous example to illustrate a worst case scenario.
Thanks
Rasjid
Hi Rasjid,
this one is horrible with msc2005 beta 64-bit.
The
this pointer is passed to the virtual member "deBruijnFound".
Even declared as const, the compiler considers the routine may change
this object and saves a lot of registers - and therefor is not able to keep members in registers. Inspect the horrorfull assembly.
The class implements a De Bruijn generator.
You may overload deBruijnFound in a derived class, to look for special properties of De Bruijn constants for your purpose. One interesting application is to hash bitboards with two bits set.
Gerd
- Code: Select all
BitBoard pow2[64] =
{
0x0000000000000001,0x0000000000000002,0x0000000000000004,0x0000000000000008,
0x0000000000000010,0x0000000000000020,0x0000000000000040,0x0000000000000080,
0x0000000000000100,0x0000000000000200,0x0000000000000400,0x0000000000000800,
0x0000000000001000,0x0000000000002000,0x0000000000004000,0x0000000000008000,
0x0000000000010000,0x0000000000020000,0x0000000000040000,0x0000000000080000,
0x0000000000100000,0x0000000000200000,0x0000000000400000,0x0000000000800000,
0x0000000001000000,0x0000000002000000,0x0000000004000000,0x0000000008000000,
0x0000000010000000,0x0000000020000000,0x0000000040000000,0x0000000080000000,
0x0000000100000000,0x0000000200000000,0x0000000400000000,0x0000000800000000,
0x0000001000000000,0x0000002000000000,0x0000004000000000,0x0000008000000000,
0x0000010000000000,0x0000020000000000,0x0000040000000000,0x0000080000000000,
0x0000100000000000,0x0000200000000000,0x0000400000000000,0x0000800000000000,
0x0001000000000000,0x0002000000000000,0x0004000000000000,0x0008000000000000,
0x0010000000000000,0x0020000000000000,0x0040000000000000,0x0080000000000000,
0x0100000000000000,0x0200000000000000,0x0400000000000000,0x0800000000000000,
0x1000000000000000,0x2000000000000000,0x4000000000000000,0x8000000000000000
};
/* usage:
DeBruijnGenerator db;
db.genDeBruijn(6); // 1..6 2**i-i sequences
*/
class DeBruijnGenerator
{
protected:
BitBoard lck, seq;
unsigned int depth, mask;
int count;
void searchDeBruijn(unsigned int i) { // i = (int)(seq >> depth) & mask;
if ( depth > 1 ) {
unsigned int j = (i+i)&mask; // next even index
lck ^= pow2[i]; // lock current index
depth--; // sequence index right
if ( (lck & pow2[j]) == 0 ) { // next even index unlocked?
seq &= ~pow2[depth]; // "append" zero
searchDeBruijn(j); // do recursive search with next even index
}
if ( (lck & pow2[j+1]) == 0 ) { // next odd index unlocked?
seq |= pow2[depth]; // "append" one
searchDeBruijn(j+1); // do recursive search with next odd index
}
depth++; // sequence index left
lck ^= pow2[i]; // unlock current index
} else if ( (lck & pow2[(i+i+1)&mask]) == 0 ) {
deBruijnFound(seq); // bit zero was already set during initialization
count++;
}
}
virtual void deBruijnFound(BitBoard deBruijn) const {};
public:
DeBruijnGenerator() {count = 0;}
int getNoofFound() const {return count;}
void genDeBruijn(unsigned int expOfTwo) {
if ( expOfTwo > 6 ) {
printf ("%d exceeds max exponent 6 for 64/6 sequence\n", expOfTwo);
expOfTwo = 6;
}
unsigned int powOfTwo = 1 << expOfTwo;
seq = 1; // the initial sequence
lck = pow2[powOfTwo/2]; // lock last index for performance
depth = powOfTwo-expOfTwo; // initial depth for expOfTwo leading zeros
mask = powOfTwo-1; // mask
searchDeBruijn(0); // start with index 0 due to expOfTwo leading zeros
}
};
Re: NPS
Posted:
29 Aug 2005, 05:42
by Chan Rasjid
Gerd,
I barely know assembly so I don't expect to know the problem as others do.
I have some idea that in your example, it is a sort of bug or inefficiency as you mentioned mscv2005beta64, etc. different compilers and option flags produce better or worst assembly codes.
From a layman's way of understanding, it is like the hardware processor people who make things too good to be true and therefore the designer of compilers have to invent better and more generic technique in compiler design.
By the way, is there some sort of simple rule of thumb that can alert
non expert to such problems.
I can mention one case with no problem - passing a pointer to a unique (even huge) array of large structure.
Thanks
Rasjid
Re: NPS
Posted:
29 Aug 2005, 09:18
by Gian-Carlo Pascutto
Chan Rasjid wrote:The problems of pointers is "aliasing": for example, you are inside a function that gets an int* pointer passed. If you change the value the pointer points to, the compiler must reload all int values you use from memory again, since it does not know what the pointer points to and hence any int value could be changed.
It's very hard for compilers to deal with.
Gian,
This is an advanced topic unknown to me.
Say I have global variables:-
- Code: Select all
int myInt1, myInt2,.... etc.
void anycall(int *ptr){
*ptr = 1;
}
The probem is that what ptr points to can only be kwown at runtime
and so at compile time such situations require special treatment - like
extra code, extra copy of "all int in the program..." and this may cause great runtime inefficiency.
Is there a simple /rediculous example to illustrate a worst case scenario.
Thanks
Rasjid
In your example, the calling code may know what the ptr passed to anycall points to, so it's not so evil that everything is affected. But for example.
- Code: Select all
int myInt1, myInt2,.... etc.
void anycall(int *ptr){
int a, b;
a = myInt1;
*ptr = 1;
b = myInt1;
}
The b = myInt1 statement requires a memory load, and cannot be done from a register.
Re: NPS
Posted:
01 Sep 2005, 14:06
by David Weller
I forgot to mention I am using int _board[256] with int * board = _board+64
and apply 0x88 stuff with 'border' squares
Any way, I am wondering maybe I should withdraw my original assumptions
But even if I were not using the most efficient approaches to things, it still didnt seem I should suffer 100-200% slow-down.
Am I wrong?
If I could have managed to spend 3x longer than others in my prob-hash() routine, eg., that is still relatively insignificant [ie., not near 100-200%]
My InCheck() routine is similar in kind to TSCP's.
Now, InCheck() is in the top 2 routines when profiling [ IIRC 30-40%]
But even this, If I were doing it less than perfectly, couldnt account for 100-200% slow down in over-all NPS
I have tried rewriting these, but it hardly seems to matter.
InCheck() Gen() Eval()
Just looking for some help as to where I might look next. I ran out of ideas.....
I dont even collect attack information!
Thanks
Re: NPS
Posted:
01 Sep 2005, 14:20
by Richard Pijl
My InCheck() routine is similar in kind to TSCP's.
Now, InCheck() is in the top 2 routines when profiling [ IIRC 30-40%]
There's your problem. TSCP's InCheck() is highly inefficient. IIRC it will loop over all pieces to find out whether it attacks the king square.
An easy improvement is to start with the square of the king and find out if there is an enemy piece that attacks that square. So that would mean generating the rays from the king's square, not from the pieces squares.
Richard.
Re: NPS
Posted:
01 Sep 2005, 17:21
by David Weller
Hi Richard,
Thank you.
But I must be mistaken again. I thought for some reason, that that was how he did it.
I _DO_ trace out from square seeking for pieces which attack it
Darn! When you said, 'there is your problem', I got so excited
Thanks for trying though.