OliThink 5

Programming Topics (Computer Chess) and technical aspects as test techniques, book building, program tuning etc

Moderator: Andres Valverde

OliThink 5

Postby OliverBr » 29 Oct 2004, 23:05

Perhaps this could interest someone, so here the readme file:

=======================================================
OliThink 5 preAlpha Version. (c) 2004 Oliver Brausch

Do not expect a well playing engine yet!
The focus on this realease was to produce a move generator
most compact and very fast.

It has only the most needed features to play chess,
a movegenerator, a search and a evaluation based just
on material.
It plays almost complete chess, except of castling and enpassant.
Even there is no tree cutting at all, so *all* moves
are searched. This yields in an average of 20 million nodes
in a search depth of 5 halfmoves.

The perfomance of 4 million nodes per second on a 2 GHz Pentium 4
is the impressive aspect of it. On a modern PC it should reach
8 million nodes, on a 64bit machine even 15 million nodes per second.

It will be interesting to see in the further development how this
number will decrease by implementing stuff like castling, tree cutting,
complex evaluation....etc.
Again the code is quite short, hardly 500 lines of code .

For comments etc: ob112@web.de
Download: http://home.arcor.de/dreamlike


Chess - OliThink 5
feature done=1

-----------------
|r|n|b|q|k|b|n|r|
-----------------
|p|p|p|p|p|p|p|p|
-----------------
| | | | | | | | |
-----------------
| | | | | | | | |
-----------------
| | | | | | | | |
-----------------
| | | | | | | | |
-----------------
|P|P|P|P|P|P|P|P|
-----------------
|R|N|B|Q|K|B|N|R|
-----------------0
e2e4

-----------------
|r|n|b|q|k|b|n|r|
-----------------
|p|p|p|p|p|p|p|p|
-----------------
| | | | | | | | |
-----------------
| | | | | | | | |
-----------------
| | | | |P| | | |
-----------------
| | | | | | | | |
-----------------
|P|P|P|P| |P|P|P|
-----------------
|R|N|B|Q|K|B|N|R|
-----------------0
1. ... b7-b6
100 nodes: 10565822, Time: 2734, knps: 3864

-----------------
|r|n|b|q|k|b|n|r|
-----------------
|p| |p|p|p|p|p|p|
-----------------
| |p| | | | | | |
-----------------
| | | | | | | | |
-----------------
| | | | |P| | | |
-----------------
| | | | | | | | |
-----------------
|P|P|P|P| |P|P|P|
-----------------
|R|N|B|Q|K|B|N|R|
-----------------0
OliverBr
 

Re: OliThink 5

Postby Zach Wegner » 29 Oct 2004, 23:30

That is an impressive speed. Your make and unmake functions are very elegant, although I would not like to have 4 bitboards for each piecetype, and no board[] array.

Regards,
Zach
User avatar
Zach Wegner
 
Posts: 182
Joined: 26 Sep 2004, 22:02
Location: Austin, Texas, USA

Re: OliThink 5

Postby OliverBr » 30 Oct 2004, 01:40

Thank you for looking such fast.
In OliThink 4 I had the board[] array and it was always bothering
me as this is not a clean bitboard implementation. I got rid off it
and I think I won speed and better code structure.

-Oliver

PS: preAlpha2 is finished, with enpassant, but this costed
already 3% of speed :( not to think about the castlestuff...

Zach Wegner wrote:That is an impressive speed. Your make and unmake functions are very elegant, although I would not like to have 4 bitboards for each piecetype, and no board[] array.

Regards,
Zach
OliverBr
 

Re: OliThink 5

Postby OliverBr » 30 Oct 2004, 02:47

Another thing:
Like my search is perft style, I did it now exactly a perft
after e2e4 for depth 6 (means no moving and eval at all
at leaf)

value: 0 nodes: 335995874, Time: 20797, knps: 16155

the number 335 million is slightly to big because I allow
king captures ;o) but nevertheless I like the 16 Mnps on
a 2GHz P4


Zach Wegner wrote:That is an impressive speed.
OliverBr
 

Re: OliThink 5

Postby Uri Blass » 30 Oct 2004, 09:24

I think that with king captures 16M nodes in perft are not a lot.

Movei in perft mode with only legal move get the result 309478263 in 8.188 seconds on A3000(2.001 ghz).

Uri
User avatar
Uri Blass
 
Posts: 727
Joined: 09 Oct 2004, 05:59
Location: Tel-Aviv

Something to challenge thinker, maybe

Postby Dann Corbit » 30 Oct 2004, 09:26

Not yet, of course.

But ultra-compact and very strong. Maybe there can be a contest of Elo/Kb of binary.

;-)

I got 5202 KNps on 64 bit Athlon, but with a 32 bit compile.

I suspect that you can get a very large increase with a true 64 bit compiler on a 64 bit OS.
Dann Corbit
 

Re: OliThink 5

Postby OliverBr » 30 Oct 2004, 09:48

Movei is based on bitboards, right? And the A3000 is 64bit?
Ok, but your 37 Mnps is astonishing, did I miss a trick doing perft?


Uri Blass wrote:I think that with king captures 16M nodes in perft are not a lot.

Movei in perft mode with only legal move get the result 309478263 in 8.188 seconds on A3000(2.001 ghz).

Uri
OliverBr
 

Re: Something to challenge thinker, maybe

Postby OliverBr » 30 Oct 2004, 09:50

Hi Dann, nice to talk with you again :D
I am wondering what a 64bit compile brings... a factor 2 should be
the result

Dann Corbit wrote:Not yet, of course.

But ultra-compact and very strong. Maybe there can be a contest of Elo/Kb of binary.

;-)

I got 5202 KNps on 64 bit Athlon, but with a 32 bit compile.

I suspect that you can get a very large increase with a true 64 bit compiler on a 64 bit OS.
OliverBr
 

Re: OliThink 5

Postby Uri Blass » 30 Oct 2004, 10:25

OliverBr wrote:Movei is based on bitboards, right? And the A3000 is 64bit?
Ok, but your 37 Mnps is astonishing, did I miss a trick doing perft?


Uri Blass wrote:I think that with king captures 16M nodes in perft are not a lot.

Movei in perft mode with only legal move get the result 309478263 in 8.188 seconds on A3000(2.001 ghz).

Uri


Movei is not based on bitboard but it has a very big move generator(some thousands of lines of C).

Maybe the trick is to write long code because with less than hundreds of lines you cannot be very fast.


I also use legal move generator and attack tables that helped me to get things faster but Smirf shows that it is possible to get even better results than movei without attack tables.

Note that for me makemove is expensive because it updates information like attack tables so I do not make the last move.

Uri
User avatar
Uri Blass
 
Posts: 727
Joined: 09 Oct 2004, 05:59
Location: Tel-Aviv

Re: OliThink 5

Postby Reinhard Scharnagl » 30 Oct 2004, 10:25

Hi OliverBr,

if you want to compare Perft results, Smirf gets (on P4, 2.8 GHz, 32-Bit):
Code: Select all
FEN: rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1

  +-*--b--c--d--*--f--g--*-+ MS Vis. Studio C++ Vers. 13.10
8 |[r][n][b][q][k][b][n][r]| (Compilation: Oct 27 2004)
7 |[p][p][p][p][p][p][p][p]|
6 |   :::   :::   :::   :::| Perft Testseries
5 |:::   :::   :::   :::   |
4 |   :::   :::   :::   :::| (without caching)
3 |:::   :::   :::   :::   |
2 |<P><P><P><P><P><P><P><P>| Smirf Test No.: 00
1 |<R><N><B><Q><K><B><N><R>|
=>+-*--b--c--d--*--f--g--*-+ Break Time 75.0 Sec.

Ply     Nodes    all (x)   (e.p.)    all (+) all (++)    Promot.    Castl. Sec.
-------------------------------------------------------------------------------
1          20          0        0          0        0          0         0    0
2         400          0        0          0        0          0         0    0
3        8902         34        0         12        0          0         0    0
4      197281       1576        0        469        0          0         0    0
5     4865609      82719      258      27351        0          0         0  0.2
6   119060324    2812008     5248     809099       46          0         0  4.8
7  3195901860  108329926   319617   33103848     1628          0    883453  122
-------------------------------------------------------------------------------

(Legal moves only, no BitBoards, supporting a lot of information (slightly more than shown in the statistic above)).

Regards from Munich, too,

Reinhard.
Reinhard Scharnagl
 
Posts: 608
Joined: 01 Oct 2004, 08:36
Location: Klein-Gerau, Germany

Re: OliThink 5

Postby Uri Blass » 30 Oct 2004, 10:35

Hi Reinhard,

What is the size of the functions that are needed to calculate your perft
information?

I cannot give it in kbytes because movei in perft mode supports also a lot of irrelevant stuff like the winboard commands but I can say that I have thousands of lines of C that are only used for updating my attack tables.

Uri
User avatar
Uri Blass
 
Posts: 727
Joined: 09 Oct 2004, 05:59
Location: Tel-Aviv

Re: OliThink 5

Postby Reinhard Scharnagl » 30 Oct 2004, 10:52

Hi Uri,

well, the lines of my C++ code (containing about 1/3 comments) mostly are broken not to extend 50 to 60 chars a line. Having such lines the move generator has about 2000 lines, the move and remove part about 700 lines (because a lot of additional calculations is added there as: keeping my double linked piece lists actual, updating hash key and verification code, material balance, handling special moves like e.p. and castling (FRC also), being aware for 8x8 and 10x8 boards).

Regards, Reinhard.

P.S.: handling FRC castling rights during move and remove is a lot more of work than in traditional chess game because the positions of king and rook being able to castle are dynamicly defined then.
Reinhard Scharnagl
 
Posts: 608
Joined: 01 Oct 2004, 08:36
Location: Klein-Gerau, Germany

Re: OliThink 5

Postby Tord Romstad » 30 Oct 2004, 11:28

Hi Oliver,

I tested the new OliThink on a 64-bit computer; a G5 1.8 GHz. A 32-bit executable gave the following numbers:
Code: Select all
100 nodes: 10565822, Time: 2793, knps: 3782

The 64-bit executable was very slightly faster:
Code: Select all
100 nodes: 10565822, Time: 2529, knps: 4177

The 32-bit binary was compiled with 'gcc -O3 -funroll-loops -fomit-frame-pointer -o olithink', while the 64-bit binary was compiled with 'gcc -O3 -funroll-loops -fomit-frame-pointer -fast -mcpu=G5 -mpowerpc64 -o olithink64'.

Thanks a lot for the new version. The source code looks really neat, as always in OliThink. It is definitely worth a closer look for me, as I started experimenting with bitboards in my own program this morning. The first results are rather disappointing, but there are certainly lots of things to improve.

From my perspective, the timing for the release of this new version of my favorite bitboard chess engine couldn't have been better! :D

Tord
User avatar
Tord Romstad
 
Posts: 639
Joined: 09 Oct 2004, 12:49
Location: Oslo, Norway

Re: OliThink 5

Postby Uri Blass » 30 Oct 2004, 11:49

Reinhard Scharnagl wrote:Hi Uri,

well, the lines of my C++ code (containing about 1/3 comments) mostly are broken not to extend 50 to 60 chars a line. Having such lines the move generator has about 2000 lines, the move and remove part about 700 lines (because a lot of additional calculations is added there as: keeping my double linked piece lists actual, updating hash key and verification code, material balance, handling special moves like e.p. and castling (FRC also), being aware for 8x8 and 10x8 boards).

Regards, Reinhard.

P.S.: handling FRC castling rights during move and remove is a lot more of work than in traditional chess game because the positions of king and rook being able to castle are dynamicly defined then.


Hi Reinhard.

It seems that you also have a big move generator.

I have in most cases less than 50-60 chars in a line and it could be even less than it in case of choosing smaller names for my varaibles

Here is a typical line
squaresee[target][2]=see;

There are of course longer lines but they are also shorter lines like the following 3 lines
target+=1;
else
break;

Uri
User avatar
Uri Blass
 
Posts: 727
Joined: 09 Oct 2004, 05:59
Location: Tel-Aviv

Re: OliThink 5

Postby Reinhard Scharnagl » 30 Oct 2004, 12:04

Hi Uri,
It seems that you also have a big move generator.

like you would have recognized one has to chose either to consume time using node expansion to legalize moves or to invest some thoughts into code to get the same effect more productive.
I have in most cases less than 50-60 chars in a line and it could be even less than it in case of choosing smaller names for my varaibles

Thus the coding styles seem not to differ much (I am counting also spaces), even when not knowing your commenting rate.

Is your A3000/2 GHz a 32 bit (I prosume) or 64 bit CPU? The programming style of my solution would highly profit from the additionally present CPU registers of a 64 bit CPU. Though this moment it is not that important it could become relevant some day.

Reinhard.
Reinhard Scharnagl
 
Posts: 608
Joined: 01 Oct 2004, 08:36
Location: Klein-Gerau, Germany

Re: OliThink 5

Postby Uri Blass » 30 Oct 2004, 12:37

64 bits-CPU.
It is not athlon XP that is a slower athlon.

Uri
User avatar
Uri Blass
 
Posts: 727
Joined: 09 Oct 2004, 05:59
Location: Tel-Aviv

Re: OliThink 5

Postby OliverBr » 30 Oct 2004, 12:45

Hello Tord,

Tord Romstad wrote:Hi Oliver,
The 64-bit executable was very slightly faster:


This is too strange. If I look at my compiled 32bit machine
code, I see a huge amount of instruction twice, because
of the 32 <-> 64 "myregistersaretooshort" - problem.

Compiled for 64bit there should be half instructions, so half
time, thus double speed. Have you looked to the compiled
code, if it is done correctly?

BTW, I am happy that the Persians when they invented chess,
did consider the problem of alignement with computer registers,
so they choose wisely a 8x8 board.

Tord Romstad wrote:Thanks a lot for the new version. The source code looks really neat, as always in OliThink.
Tord


Great that I found the right time. It is more the weather and
the season do find the right reasons to stay at home and do some
chess programming.
The challeng for olithink will again be a short and compact code
with best performance. From Olithink 3 to 4 it improves 200 elo
points. I hope this step again, even this will be very hard.

-Oliver
OliverBr
 

Re: OliThink 5

Postby Michael Yee » 30 Oct 2004, 13:22

OliverBr wrote:Ok, but your 37 Mnps is astonishing, did I miss a trick doing perft?

One trick you can use for perft() is to not make/undo the last ply of moves. That is, instead of the base case for perft() being :

when depth == 0, return 1

it becomes

when depth == 1, return number of legal moves

This could save a lot if your make/unmake are complex.
Michael Yee
 
Posts: 51
Joined: 26 Sep 2004, 19:49

Re: OliThink 5

Postby OliverBr » 30 Oct 2004, 13:36

Look at the significant part of my current PreAlpha Version:

/* if (depth == 0) return eval(on_move); // Non Perft */
if (depth == 1) { // Perft
nodes += generate(on_move, ply);
return 0;
}

With this I get 16 Mnps on a 32bit 2Ghz P4.

When the 37 Mnps are really true, I have to give up
programming... :-(

Michael Yee wrote:One trick you can use for perft() is to not make/undo the last ply of moves. That is, instead of the base case for perft() being :

when depth == 0, return 1

it becomes

when depth == 1, return number of legal moves

This could save a lot if your make/unmake are complex.
OliverBr
 

Re: OliThink 5

Postby Tord Romstad » 30 Oct 2004, 13:39

OliverBr wrote:Compiled for 64bit there should be half instructions, so half time, thus double speed. Have you looked to the compiled code, if it is done correctly?

Unfortunately, I understand almost nothing of PowerPC assembly language. The length of the assembly language output for the 32-bit and the 64-bit compile is similar, but the code looks quite different. Here is the getShiftMask function, first in the 32-bit compile, then in the 64-bit compile:
Code: Select all
_getShiftMask:
   cmpwi cr7,r3,1
   mflr r2
   bcl 20,31,"L00000000006$pb"
"L00000000006$pb":
   mflr r10
   mtlr r2
   beq- cr7,L788
   bgt- cr7,L795
   cmpwi cr0,r3,0
   rlwinm r3,r4,0,26,28
   beqlr- cr0
   b L786
L795:
   cmpwi cr1,r3,2
   beq- cr1,L789
   cmpwi cr6,r3,3
   beq- cr6,L791
   b L786
L788:
   addis r3,r10,ha16(_r045shift-"L00000000006$pb")
   slwi r6,r4,2
   la r5,lo16(_r045shift-"L00000000006$pb")(r3)
   lwzx r3,r6,r5
   blr
L789:
   nor r2,r4,r4
   rlwinm r3,r2,3,26,28
   blr
L791:
   nor r12,r4,r4
   rlwinm r11,r4,0,26,28
   rlwinm r9,r12,0,29,31
   addis r8,r10,ha16(_r045shift-"L00000000006$pb")
   add r7,r11,r9
   la r4,lo16(_r045shift-"L00000000006$pb")(r8)
   slwi r0,r7,2
   lwzx r3,r4,r0
   blr
L786:
   li r3,0
   blr

Code: Select all
_getShiftMask:
   cmpwi cr7,r3,1
   beq cr7,L20
   bgt cr7,L27
   cmpwi cr0,r3,0
   rlwinm r3,r4,0,26,28
   beqlr cr0
   b L18
   .p2align 4,,15
L27:
   cmpwi cr0,r3,2
   beq cr0,L21
   cmpwi cr1,r3,3
   beq cr1,L23
   b L18
   .p2align 4,,15
L20:
   lis r3,ha16(_r045shift)
   slwi r5,r4,2
   la r4,lo16(_r045shift)(r3)
   lwzx r3,r5,r4
   blr
   .p2align 4,,15
L21:
   nor r2,r4,r4
   rldicl r8,r4,61,61
   rlwinm r7,r2,3,26,28
   add r6,r7,r8
   rlwinm r3,r6,0,26,28
   blr
L23:
   nor r5,r4,r4
   rlwinm r9,r4,0,26,28
   lis r12,ha16(_r045shift)
   rlwinm r4,r5,0,29,31
   la r10,lo16(_r045shift)(r12)
   add r11,r9,r4
   slwi r0,r11,2
   lwzx r3,r10,r0
   blr
   .p2align 4,,15
L18:
   li r3,0
   blr

The most striking difference is that the 32-bit version contains lines like
Code: Select all
   addis r3,r10,ha16(_r045shift-"L00000000006$pb")

scattered throughout the program. Such lines do not occur anywhere in the 64-bit version.

Tord
User avatar
Tord Romstad
 
Posts: 639
Joined: 09 Oct 2004, 12:49
Location: Oslo, Norway

Next

Return to Programming and Technical Discussions

Who is online

Users browsing this forum: No registered users and 31 guests