NPS

Programming Topics (Computer Chess) and technical aspects as test techniques, book building, program tuning etc

Moderator: Andres Valverde

NPS

Postby David Weller » 27 Aug 2005, 12:27

Hi all - long time since my last post ...

I have recently written a second chess engine: Xpdnt [about 130Knps]

Assuming I am not doing anything exceedingly stupid [programming wise or data-structure choices] does this seem exceedingly slow for 700 Mhz Pentium3? gcc 3.4.3

PVS AlphaBeta with Iterative deepening and Hashing
Check, OneReply, MateThreat, and PawnTo7th extensions
Quiesent search
NullMove R=2
IID
Killers

SE = material, KS(pawn shield), pawn structure, a few patterns[development - nothing too fancy]

no SEE, no Piece lists, no Pawn Hash [all on todo list]

My branching factor is about 2-3 [iff I am measuring it right - AVG( Iteration N time / Iteration N-1 time)]

I just cant figure out how/why my engine is 1/3 to 1/2 the speed of others

My question is: Does this seem suspiciously slow, or is it just me?

Should I be looking for serious search bugs?

Thanks, and look forward to all your comments.
User avatar
David Weller
 
Posts: 135
Joined: 26 Sep 2004, 20:30
Location: USA

Re: NPS

Postby Gian-Carlo Pascutto » 27 Aug 2005, 15:00

It's very hard to compare NPS number sensibly since they depend on so many things.

You can check the speed of your movegen/makemove & core datastructures by comparing your "preft" numbers against other programs.

If they are a manitude slower than most other programs, you have a problem.
Gian-Carlo Pascutto
 
Posts: 42
Joined: 04 Jul 2005, 13:24
Location: Belgium

Re: NPS

Postby David Weller » 27 Aug 2005, 17:19

Thanks,

Good idea.

Turns out, I broke perft some place - am trying to fix it ....

Perft 6 was taking like 5 minutes! but then returned ~4x the correct node count
User avatar
David Weller
 
Posts: 135
Joined: 26 Sep 2004, 20:30
Location: USA

Re: NPS

Postby David Weller » 27 Aug 2005, 17:57

That was wierd.

My source for perft was mysteriuosly re-arranged ???

anyway,

perft 6
119060324 81070

81 secs for p3 700 [with no hashing or anything] seems ok

Any other ideas?
User avatar
David Weller
 
Posts: 135
Joined: 26 Sep 2004, 20:30
Location: USA

Re: NPS

Postby Roman Hartmann » 27 Aug 2005, 18:14

Maybe you should have a look at it with a profiler to see in which parts your program spends most of its time. The one from AMD (Codeanalyst) which is free available is pretty good IMO.

Roman
User avatar
Roman Hartmann
 
Posts: 155
Joined: 11 Oct 2004, 14:21

Re: NPS

Postby David Weller » 27 Aug 2005, 20:49

Thanks Roman.

I have been profiling and the standard hot spots are:

eval() and attack()

Its true, that if I swap out my eval() for material + PST version, the NPS is 2.5x faster


I tried doing more pointer stuff [rather than indexing arrays] and it actually got worse.

Do the compilers these days take care of most of that sort of thing?

I will have a look at codeanalyst - thanks again
User avatar
David Weller
 
Posts: 135
Joined: 26 Sep 2004, 20:30
Location: USA

Re: NPS

Postby Peter Eizenhammer » 27 Aug 2005, 20:59

Hello David,

nice too see you back :D

btw: is GES a finished project now as you have started a new engine?

Peter
Peter Eizenhammer
 
Posts: 63
Joined: 28 Sep 2004, 14:36

Re: NPS

Postby Gian-Carlo Pascutto » 27 Aug 2005, 22:18

There's no real advantage to using pointers. In fact it will tend to introduce aliasing problems for the compiler.
Gian-Carlo Pascutto
 
Posts: 42
Joined: 04 Jul 2005, 13:24
Location: Belgium

Re: NPS

Postby Uri Blass » 27 Aug 2005, 23:10

Are you sure that there is no real advantage in using pointers(or maybe you mean only relative to indexing arrays)?

I am not sure about nothing when we talk about speed but
I will give an example from Movei's code when pointers seems to do the code faster but the speed improvement was small so I am not sure if pointers were the reason because I can get 1% speed improvement by random changes.

In movei's evaluation I had something in the following structure.
Note that phasewhite and phaseblack are not changed during the executation of the functions.

int calckingsafe2(int phasewhite,int phaseblack)
{
...
}
int evalrest(int phasewhite,int phaseblack)
{
....
score+=calckingsafe2(phasewhite,phaseblack);
...
}

I changed to the following structure because I think that without pointers the compiler needs to create copies of the phasewhite and phaseblack and it is a bad idea (unless the compiler is smart enough to understand that it does not need to generate copies of phasewhite and phaseblack
even if I do not use pointers).


int calckingsafe2(int* phasewhite,int* phaseblack)
{
...
}
int evalrest(int* phasewhite,int* phaseblack)
{
...
score+=calckingsafe2(phasewhite,phaseblack);
}
User avatar
Uri Blass
 
Posts: 727
Joined: 09 Oct 2004, 05:59
Location: Tel-Aviv

Re: NPS

Postby David Weller » 28 Aug 2005, 02:03

Thanks Peter,

I had 'realized' a couple of things when writing Xpdnt which would probably help GES a lot, and have intended to make maybe one more release, but havent got 'a-round-tuit' yet. Maybe some day i'll get one:)

Gian-Carlos & Uri, I remember [hearing at least] in days past, that pointer arithmetic and so on was so much better/faster than array indexing, but suspected that [after trying it] that the compiler was smarter than me!

Uri, I suspect your 1% estimate is even higher. that 'practical no-op' thing that Tord talked about was pretty wierd, but somehow it doesnt suprise me. Yet another reason that testing is so hard: you never can tell from preliminary tests that something isnt just some sort of fluke

Wouldnt it just be the greatest, if we had some sort of perfect oracle to test againts! I have been using Fruit 2.1 :)
User avatar
David Weller
 
Posts: 135
Joined: 26 Sep 2004, 20:30
Location: USA

Re: NPS

Postby mridul » 28 Aug 2005, 08:13

I dont think I understand what Uri is takling about here ... usually address size == sizeof int ... so amount of 'work' you do for both is same in both cases.

The usual idea behind passing a ptr are (from the top of my head) :

1) Passing a block of mem - the obvious use :)
2) to avoid a huge block push/pop - usually relevent for struct.
3) Simulating multi valued return.

Indexing array as pointer is faster (slightly) for other reasons - more so in our (chess) case since we can use shifts to calculate the offset faster (a good compiler will do the same anyway).
If you are using pointer increment/decrement style stuff - then you could become faster (leverage instruction pipelining , etc).

Are you sure of the 1% gain/loss uri ? Could be noise.
In your particular case , it should be theoretically worse to use pointer (you have to dereference it to get the value).
Maybe I am missing something here ..

Regards,
Mridul
mridul
 
Posts: 48
Joined: 09 Dec 2004, 11:34
Location: Bangalore , India

Re: NPS

Postby Uri Blass » 28 Aug 2005, 09:00

mridul wrote:I dont think I understand what Uri is takling about here ... usually address size == sizeof int ... so amount of 'work' you do for both is same in both cases.

The usual idea behind passing a ptr are (from the top of my head) :

1) Passing a block of mem - the obvious use :)
2) to avoid a huge block push/pop - usually relevent for struct.
3) Simulating multi valued return.

Indexing array as pointer is faster (slightly) for other reasons - more so in our (chess) case since we can use shifts to calculate the offset faster (a good compiler will do the same anyway).
If you are using pointer increment/decrement style stuff - then you could become faster (leverage instruction pipelining , etc).

Are you sure of the 1% gain/loss uri ? Could be noise.
In your particular case , it should be theoretically worse to use pointer (you have to dereference it to get the value).
Maybe I am missing something here ..

Regards,
Mridul


Thanks for your comments.

I checked again and it seems that you are right and code without pointers here is slightly faster(51.67 seconds is faster than 51.79 seconds).

I think that the 1% speed improvement was result of another change that I made at the same time.

I still do not understand the reason that pointers are supposed to be slower but I guess that you are right.

If I understand you correctly
you claim that in theory it is worse because
score=*x+ is more expensive than score=x+ but I do not understand why it should be the case.

The reason that I thought that pointers should be faster is the following reason:
Suppose function A calls to function B and give it a variable.

The computer needs to remember 2 copies of the same variable(one inside function A and one inside function B because changing the variable inside function B does not change the value of the variable inside function A.

Suppose function A calls function B and give it a pointer.
The computer needs to remember only one copy of the variable so it seems faster.

Uri
User avatar
Uri Blass
 
Posts: 727
Joined: 09 Oct 2004, 05:59
Location: Tel-Aviv

Re: NPS

Postby Gian-Carlo Pascutto » 28 Aug 2005, 11:46

The problems of pointers is "aliasing": for example, you are inside a function that gets an int* pointer passed. If you change the value the pointer points to, the compiler must reload all int values you use from memory again, since it does not know what the pointer points to and hence any int value could be changed.

It's very hard for compilers to deal with.

Without pointers, figuring out that one function doesn't change a value and that hence no extra copy is needed is trivial.
Gian-Carlo Pascutto
 
Posts: 42
Joined: 04 Jul 2005, 13:24
Location: Belgium

Re: NPS

Postby Chan Rasjid » 28 Aug 2005, 19:37

The problems of pointers is "aliasing": for example, you are inside a function that gets an int* pointer passed. If you change the value the pointer points to, the compiler must reload all int values you use from memory again, since it does not know what the pointer points to and hence any int value could be changed.

It's very hard for compilers to deal with.


Gian,

This is an advanced topic unknown to me.

Say I have global variables:-
Code: Select all
int  myInt1, myInt2,.... etc.
void anycall(int  *ptr){
*ptr = 1;
}


The probem is that what ptr points to can only be kwown at runtime
and so at compile time such situations require special treatment - like
extra code, extra copy of "all int in the program..." and this may cause great runtime inefficiency.

Is there a simple /rediculous example to illustrate a worst case scenario.

Thanks
Rasjid
Chan Rasjid
 
Posts: 73
Joined: 23 Feb 2005, 16:36
Location: Singapore

Re: NPS

Postby Gerd Isenberg » 28 Aug 2005, 20:15

Chan Rasjid wrote:
The problems of pointers is "aliasing": for example, you are inside a function that gets an int* pointer passed. If you change the value the pointer points to, the compiler must reload all int values you use from memory again, since it does not know what the pointer points to and hence any int value could be changed.

It's very hard for compilers to deal with.


Gian,

This is an advanced topic unknown to me.

Say I have global variables:-
Code: Select all
int  myInt1, myInt2,.... etc.
void anycall(int  *ptr){
*ptr = 1;
}


The probem is that what ptr points to can only be kwown at runtime
and so at compile time such situations require special treatment - like
extra code, extra copy of "all int in the program..." and this may cause great runtime inefficiency.

Is there a simple /rediculous example to illustrate a worst case scenario.

Thanks
Rasjid


Hi Rasjid,
this one is horrible with msc2005 beta 64-bit.
The this pointer is passed to the virtual member "deBruijnFound".
Even declared as const, the compiler considers the routine may change this object and saves a lot of registers - and therefor is not able to keep members in registers. Inspect the horrorfull assembly.

The class implements a De Bruijn generator.
You may overload deBruijnFound in a derived class, to look for special properties of De Bruijn constants for your purpose. One interesting application is to hash bitboards with two bits set.

Gerd


Code: Select all

BitBoard pow2[64] =
{
  0x0000000000000001,0x0000000000000002,0x0000000000000004,0x0000000000000008,
  0x0000000000000010,0x0000000000000020,0x0000000000000040,0x0000000000000080,
  0x0000000000000100,0x0000000000000200,0x0000000000000400,0x0000000000000800,
  0x0000000000001000,0x0000000000002000,0x0000000000004000,0x0000000000008000,
  0x0000000000010000,0x0000000000020000,0x0000000000040000,0x0000000000080000,
  0x0000000000100000,0x0000000000200000,0x0000000000400000,0x0000000000800000,
  0x0000000001000000,0x0000000002000000,0x0000000004000000,0x0000000008000000,
  0x0000000010000000,0x0000000020000000,0x0000000040000000,0x0000000080000000,
  0x0000000100000000,0x0000000200000000,0x0000000400000000,0x0000000800000000,
  0x0000001000000000,0x0000002000000000,0x0000004000000000,0x0000008000000000,
  0x0000010000000000,0x0000020000000000,0x0000040000000000,0x0000080000000000,
  0x0000100000000000,0x0000200000000000,0x0000400000000000,0x0000800000000000,
  0x0001000000000000,0x0002000000000000,0x0004000000000000,0x0008000000000000,
  0x0010000000000000,0x0020000000000000,0x0040000000000000,0x0080000000000000,
  0x0100000000000000,0x0200000000000000,0x0400000000000000,0x0800000000000000,
  0x1000000000000000,0x2000000000000000,0x4000000000000000,0x8000000000000000
};



/* usage:

   DeBruijnGenerator db;
   db.genDeBruijn(6); // 1..6 2**i-i sequences
*/

class DeBruijnGenerator
{
protected:
  BitBoard lck, seq;
  unsigned int depth, mask;
  int count;

  void  searchDeBruijn(unsigned int i) { // i = (int)(seq >> depth) & mask;
    if ( depth > 1 ) {
      unsigned int j = (i+i)&mask; // next even index
      lck ^= pow2[i];   // lock current index
      depth--;              // sequence index right
      if ( (lck & pow2[j]) == 0 ) { // next even index unlocked?
        seq &= ~pow2[depth]; // "append" zero
        searchDeBruijn(j);       // do recursive search with next even index
      }
      if ( (lck & pow2[j+1]) == 0 ) { // next odd index unlocked?
        seq |=  pow2[depth];   // "append" one
        searchDeBruijn(j+1);   // do recursive search with next odd index
      }
      depth++;           // sequence index left
      lck ^= pow2[i];  // unlock current index
    } else if ( (lck & pow2[(i+i+1)&mask]) == 0 ) {
      deBruijnFound(seq); // bit zero was already set during initialization
      count++;
    }
  }
 
  virtual void deBruijnFound(BitBoard deBruijn) const {};

public:

  DeBruijnGenerator() {count = 0;}
   
  int getNoofFound() const {return count;}

  void genDeBruijn(unsigned int expOfTwo) {
    if ( expOfTwo > 6 ) {
      printf ("%d exceeds max exponent 6 for 64/6 sequence\n", expOfTwo);
      expOfTwo = 6;
    }
    unsigned int powOfTwo = 1 << expOfTwo;
    seq  = 1;                      // the initial sequence
    lck   = pow2[powOfTwo/2];      // lock last index for performance
    depth = powOfTwo-expOfTwo;  // initial depth for expOfTwo leading zeros
    mask  = powOfTwo-1;       // mask
    searchDeBruijn(0);       // start with index 0 due to expOfTwo leading zeros
  }

};

Gerd Isenberg
 
Posts: 285
Joined: 31 Jan 2005, 20:31
Location: Hattingen, Germany

Re: NPS

Postby Chan Rasjid » 29 Aug 2005, 05:42

Gerd,

I barely know assembly so I don't expect to know the problem as others do.

I have some idea that in your example, it is a sort of bug or inefficiency as you mentioned mscv2005beta64, etc. different compilers and option flags produce better or worst assembly codes.

From a layman's way of understanding, it is like the hardware processor people who make things too good to be true and therefore the designer of compilers have to invent better and more generic technique in compiler design.

By the way, is there some sort of simple rule of thumb that can alert
non expert to such problems.

I can mention one case with no problem - passing a pointer to a unique (even huge) array of large structure.

Thanks
Rasjid
Chan Rasjid
 
Posts: 73
Joined: 23 Feb 2005, 16:36
Location: Singapore

Re: NPS

Postby Gian-Carlo Pascutto » 29 Aug 2005, 09:18

Chan Rasjid wrote:
The problems of pointers is "aliasing": for example, you are inside a function that gets an int* pointer passed. If you change the value the pointer points to, the compiler must reload all int values you use from memory again, since it does not know what the pointer points to and hence any int value could be changed.

It's very hard for compilers to deal with.


Gian,

This is an advanced topic unknown to me.

Say I have global variables:-
Code: Select all
int  myInt1, myInt2,.... etc.
void anycall(int  *ptr){
*ptr = 1;
}


The probem is that what ptr points to can only be kwown at runtime
and so at compile time such situations require special treatment - like
extra code, extra copy of "all int in the program..." and this may cause great runtime inefficiency.

Is there a simple /rediculous example to illustrate a worst case scenario.

Thanks
Rasjid


In your example, the calling code may know what the ptr passed to anycall points to, so it's not so evil that everything is affected. But for example.

Code: Select all
int  myInt1, myInt2,.... etc.
void anycall(int  *ptr){
int a, b;
a = myInt1;
*ptr = 1;
b = myInt1;
}


The b = myInt1 statement requires a memory load, and cannot be done from a register.
Gian-Carlo Pascutto
 
Posts: 42
Joined: 04 Jul 2005, 13:24
Location: Belgium

Re: NPS

Postby David Weller » 01 Sep 2005, 14:06

I forgot to mention I am using int _board[256] with int * board = _board+64

and apply 0x88 stuff with 'border' squares

Any way, I am wondering maybe I should withdraw my original assumptions

But even if I were not using the most efficient approaches to things, it still didnt seem I should suffer 100-200% slow-down.

Am I wrong?

If I could have managed to spend 3x longer than others in my prob-hash() routine, eg., that is still relatively insignificant [ie., not near 100-200%]

My InCheck() routine is similar in kind to TSCP's.

Now, InCheck() is in the top 2 routines when profiling [ IIRC 30-40%]

But even this, If I were doing it less than perfectly, couldnt account for 100-200% slow down in over-all NPS

I have tried rewriting these, but it hardly seems to matter.

InCheck() Gen() Eval()

Just looking for some help as to where I might look next. I ran out of ideas.....

I dont even collect attack information!

Thanks
User avatar
David Weller
 
Posts: 135
Joined: 26 Sep 2004, 20:30
Location: USA

Re: NPS

Postby Richard Pijl » 01 Sep 2005, 14:20

My InCheck() routine is similar in kind to TSCP's.

Now, InCheck() is in the top 2 routines when profiling [ IIRC 30-40%]

There's your problem. TSCP's InCheck() is highly inefficient. IIRC it will loop over all pieces to find out whether it attacks the king square.
An easy improvement is to start with the square of the king and find out if there is an enemy piece that attacks that square. So that would mean generating the rays from the king's square, not from the pieces squares.
Richard.
User avatar
Richard Pijl
 
Posts: 105
Joined: 26 Sep 2004, 21:09
Location: Minderhout, Belgium

Re: NPS

Postby David Weller » 01 Sep 2005, 17:21

Hi Richard,

Thank you.

But I must be mistaken again. I thought for some reason, that that was how he did it.

I _DO_ trace out from square seeking for pieces which attack it

Darn! When you said, 'there is your problem', I got so excited :)

Thanks for trying though.
User avatar
David Weller
 
Posts: 135
Joined: 26 Sep 2004, 20:30
Location: USA

Next

Return to Programming and Technical Discussions

Who is online

Users browsing this forum: No registered users and 28 guests