Winboard Forum

by **BrettVsop** » 07 Nov 2007, 17:11

I have been reading alot lately about bitboards x88 boards and others. I am writing my first chess engine, and I have tried all of them except the bitboard so far.

My engine is not fast. Even with the x88 board, i could not look 4 moves ahead (2 for each side) without having to wait ten or twenty seconds. Is this normal or incredibly slow?

I am confused because my code is not that bad, or at leasts looks pretty decent. I know areas that I can improve, but I don't think the improvement would be substantial. I am using a alpha-beta search algorithm too, btw.

I am now coding a bitboard to test the speed changes on that, but I am starting to think that it will make it slower. I do not do a very complicated evaluate, and I do not have 64bit registers. So the benefits will be minimal.

I am just looking for some feedback or ideas to speed this engine up. I am trying to avoid anything too complicated like hashtables, killer moves, and thinking on opponents time. If thats what I need to do, well then that might be what I will do, but I am sure there are simpler things I can improve first.

There is one thing about my engine which is probably slowing it down a little. I used alot of object oriented programming. I have all of my code separated into about five or ten classes. I did this because it greatly improves readability, but there are function calls, which could be, instead, memory references.

Thanks,
Brett

by **Roman Hartmann** » 07 Nov 2007, 17:51

Hi,
what kind of positions are you talking about when you have to wait 10 to 20 seconds till it finishes a 8 ply search? If you mean a position with a lot of pieces on the board with a lot of captures it seems not completely out of order if your move ordering isn't that great yet.

You don't mention a quiescent search. Is there any? Out of the blue I would guess that your move ordering isn't very good. A cheap way to correct this would be to add IID.
And what I'm also not sure is what you mean with using function calls instead of memory references? If you call alphabeta recursively without using pointers you will get a huge slowdown.

Roman

by **H.G.Muller** » 07 Nov 2007, 19:23

You should count the number of nodes per second that you search. Pretty sure there is something wrong with your alpha-beta implementation. 4 ply in 10 seconds is ridiculously slow, by about a factor 1000 from what it should be!

by **Aleks Peshkov** » 07 Nov 2007, 20:38

H.G.Muller wrote:You should count the number of nodes per second that you search. Pretty sure there is something wrong with your alpha-beta implementation. 4 ply in 10 seconds is ridiculously slow, by about a factor 1000 from what it should be!

Perft 4 in 10 seconds is very slow too. It looks like allocating and resizing std::vector of moves each time. Or passing large object copies as arguments instead of references.

by **Roman Hartmann** » 07 Nov 2007, 20:56

H.G.Muller wrote:You should count the number of nodes per second that you search. Pretty sure there is something wrong with your alpha-beta implementation. 4 ply in 10 seconds is ridiculously slow, by about a factor 1000 from what it should be!

I thought he talked about 4 moves which would mean 8 ply and not 4 ply only. 4 ply would indeed be very low.

Roman

by **BrettVsop** » 07 Nov 2007, 21:35

I am talking about 4ply search. I am not using quiescent search. I am not sure how to do that yet, although I saw it mentioned many times. I will look into that.

As far as the positions, I am talking about from the opening position. I don't plan on using an opening book at the moment.

My move ordering is not great, but I am doing minor ordering. I move all captures to the start of the list.

I am only passing a pointer to the board each time. I could remove that without much trouble, if that will improve it. I am not sure if that would make a big difference. I figured you might be interested in my code. So below is the alphabeta and evaluate function that I am using.

One thing that I found very confusing is when this function searches all legal moves, it runs faster then when it checks illegal moves as well. The function that checks for all moves is significantly faster then the function that looks for legal moves only, and a four ply deep search should not reveal too many illegal positions (from the starting position). So I have no idea why that is. I intend to look into that further this weekend.

Code: Select all: int CAlphaBeta::AlphaBeta(CReferee *chessboard, int iTime, int iAlpha, int iBeta) { CEval HndEval; CMovement64 HndMoves; SMove current_move; CMoveList tmp_move_list; int iCurrentValue; if (iTime == 0) return HndEval.Evaluate(chessboard,chessboard->IsWhitesTurn()); tmp_move_list.ClearAllMoves(); HndMoves.FindAllLegalMoves(chessboard,&tmp_move_list); int iMoveCount = tmp_move_list.GetMoveCount(); int i = 0; while (i < iMoveCount) { current_move = tmp_move_list.GetMove(i); chessboard->Move_Piece(current_move); iCurrentValue = -AlphaBeta(chessboard,iTime - 1,-iBeta,-iAlpha); chessboard->Undo_Move(); if (iCurrentValue > iAlpha) { iAlpha = iCurrentValue; } if (iCurrentValue >= iBeta) { return iBeta; } i++; } if (iMoveCount == 0) { int iStatus = chessboard->GetGameStatus(); if (iStatus == STALEMATE) return 0; return NEG_INFINITY; } return iAlpha; }

Code: Select all: int CEval::Evaluate(CReferee *chessboard, bool check_white) { CMovement64 HndMoves; CMoveList our_move_list; CMoveList opp_move_list; int iOurMaterialValue = chessboard->pieces.GetMaterialCount(check_white); int iOpponentsMaterialValue = chessboard->pieces.GetMaterialCount(!check_white); bool bWhitesTurn = chessboard->bWhitesTurn; chessboard->bWhitesTurn = check_white; HndMoves.FindAllMoves(chessboard,&our_move_list); chessboard->bWhitesTurn = !check_white; HndMoves.FindAllMoves(chessboard,&opp_move_list); chessboard->bWhitesTurn = bWhitesTurn; int iOurMoveCount = our_move_list.GetMoveCount(); int iOppMoveCount = opp_move_list.GetMoveCount(); return (iOurMaterialValue - iOpponentsMaterialValue) + (iOurMoveCount - iOppMoveCount); }

[/code]

by **BrettVsop** » 07 Nov 2007, 21:45

Roman Hartmann wrote:And what I'm also not sure is what you mean with using function calls instead of memory references?

Everything is an object, and I restricted the access of variables. Here is an example snippet

Code: Select all: if (chessboard->board.GetPieceType(iSquare) == N_KNIGHT)

Here the function does not have access to the board array, so instead it must ask the chessboard object to retrieve the value for it, by calling GetPieceType. If I did not limit the access, then I could do something like this.

Code: Select all: if ((chessboard->board.board[square] & N_KNIGHT) != 0)

I mention this, because I have looked through several popular chess program's code, and I noticed that most of them did it differently then I did.

by **Aleks Peshkov** » 08 Nov 2007, 00:45

Do not know how you implemented CMoveList, but your Evaluate is slow because of it. You create empty CMoveList, let it shrink in size and delete it bazillions of time. Test the speed of search with only material evaluation.

by **Dann Corbit** » 08 Nov 2007, 01:21

If I understand your eval() correctly, it is wood count + tempo.

It seems to me that tempo calculation would be easier by just using who's turn it is to move.

by **Dann Corbit** » 08 Nov 2007, 01:23

Dann Corbit wrote:If I understand your eval() correctly, it is wood count + tempo.

It seems to me that tempo calculation would be easier by just using who's turn it is to move.

I guess that maybe you subtract a tempo when the opponent backtracks?

by **Dann Corbit** » 08 Nov 2007, 01:25

When you run your code through a profiler, which functions dominate?

by **BrettVsop** » 08 Nov 2007, 01:46

Aleks Peshkov wrote:Do not know how you implemented CMoveList, but your Evaluate is slow because of it. You create empty CMoveList, let it shrink in size and delete it bazillions of time. Test the speed of search with only material evaluation.

I have already done that. Much faster. The move quality suffers a little though.

As far as the CMoveList Implementation, It never shrinks. The only time moves are removed are when they are found to be illegal. Of course though it is deallocated everytime AlphaBeta returns.

The code for CMoveList is not complicated. In fact it doesn't need a class. I hold the moves in a vector. The code gets several warnings whenever I compile, so I put it into a class. This way I only hear the warnings when I change CMoveList, which I never do.

Aleks Peshkov wrote:You create empty CMoveList, let it shrink in size and delete it bazillions of time

Is there another way? Should I only use one move list in the evaluate?

Dann Corbit wrote:If I understand your eval() correctly, it is wood count + tempo.

It seems to me that tempo calculation would be easier by just using who's turn it is to move.

No not tempo, the number of moves. I am looking at which side has a greater number of moves. That person has greater mobility and has an advantage.
The Eval is

Wood Count Advantage + Move Count Advantage.

So if we are losing on material, the first term will be negative.

Dann Corbit wrote:When you run your code through a profiler, which functions dominate?

I don't use a profiler. I can't say I really know anything about them. They sound useful though.

by **Aleks Peshkov** » 08 Nov 2007, 06:08

BrettVsop wrote:The code for CMoveList is not complicated. In fact it doesn't need a class. I hold the moves in a vector.

Standard library vector implementation is to allocate a small memory chunk for several elements, wait when it overflows then allocate another bigger one, copy all previous vector data to a new location and delete an old memory allocation... You can save some overhead if reserve a space for say 50 moves before filling move vector inside AlphaBeta(). Most programs store moves in a single huge shared array together and never allocate any dynamic memory during search.

Aleks Peshkov wrote:You create empty CMoveList, let it shrink in size and delete it bazillions of time
Is there another way? Should I only use one move list in the evaluate?

It will be better to precreate a single vector, and reuse it in every evaluation (vector will grow quickly to sufficient size and not overflow later). Faster to implement a special function in the move generator, that just counts number of moves it generates without any move list creation.

by **BrettVsop** » 08 Nov 2007, 06:40

Thats curious. One object to hold all of the moves?

I guess if I made a multi dimension array it would work. I would need the second dimension to separate move lists of different depths. I didn't think of that.

How big of an array do you use? I am thinking 150 for each depth.

If I were to stick with vectors and use your idea. Would the clear function deallocate memory? Or would that still keep a reserve for when I fill it?

by **Aleks Peshkov** » 08 Nov 2007, 08:26

BrettVsop wrote:One object to hold all of the moves? I guess if I made a multi dimension array it would work. I would need the second dimension to separate move lists of different depths.

Move lists are always sequential, you only need to keep end (move count, array index or iterator) of each depth move list -- the next list will begin just after the end of the previous.

How big of an array do you use? I am thinking 150 for each depth.

There is a known legal chess position with 218 legal moves, but allocating more then possibly needed in average chess game will create unused holes in memory that is generally not recommended for performance, but you may try. Hardware cache efficiency is a main reason to glue different move arrays together. Well, you have to think about target computer architecture when design a competitive chess program.

If I were to stick with vectors and use your idea. Would the clear function deallocate memory? Or would that still keep a reserve for when I fill it?

Clear function only claim allocated space empty without any memory operations, so it is very fast. If you decide to use dynamic sized lists I suggest you to try std::deque instead of std::vector.

by **Sven Schüle** » 08 Nov 2007, 14:46

BrettVsop wrote:I am talking about 4ply search. [...] My move ordering is not great, but I am doing minor ordering. I move all captures to the start of the list.

Hi,

your case does not sound like an optimization issue for me, so I would postpone the whole speed stuff until you got your algorithms right. All comments on speed are still valid, of course, but I think the key point is different because even with slow data structures you should not get these results.

As already proposed, I would add a node counter and print number of nodes, elapsed time and nodes per second at the end of the search. This may help to understand your search better. For a 4-ply search from the initial position without quiescence search, even 10000 nodes would be very much but I expect your engine does not need more than 0.5 seconds for 10000 nodes, even with an object-oriented approach which is IMO far from being the root cause of your problems. (Less than 0.1 seconds would be normal, though.) If you need much more nodes, something is wrong.

Also I would have a very close look on the move ordering. Have you implemented iterative deepening, or do you call CAlphaBeta::AlphaBeta() just once with the desired final search depth (e.g. iTime=4)? In the latter case, your results are not unexpected for me when searching the initial position because there are no legal captures during the first two (in most variations three) plies, so you have a very high branching factor, probably more than 15.

Iterative deeping will reduce the number of nodes in most cases. Without it, even with AlphaBeta you search a huge amount of useless nodes if the first moves at root are bad according to your eval. I would not be surprised if your search from the initial position would do it like this:

Code: Select all: Na3 Na6 ... Na3 Nc6 ... ... Nc3 ... ... Nf3 ... ... Nh3 ... ... a3 Na6 ... a3 Nc6 ... ... h4 h6 ... h4 h5 ...

But your eval after 1 ply would prefer 1.e4 due to additional moves for Qd1 and Bf1. So starting a 2-ply search with 1.e4 will result in more cutoffs than starting with 1.Na3, for example.

In case you already knew this, please apologize for my long explanations :-)

Sven

by **Sven Schüle** » 08 Nov 2007, 16:02

BrettVsop wrote:
Code: Select all
chessboard->Undo_Move();

One more note, I assume that an instance of your CReferee class stores the whole game history, is that right? Otherwise my question would be how you retrieve the information to undo the last move that happened.

Sven

by **YvesLejeail** » 08 Nov 2007, 18:15

Another ideas :
- how good is you move ordering? If it is very bad, your search result
would be at the speed of minimax (so that 4 plies is slow, but not so slow at this stage). In that case, having the best move ordering would give you a x2 in depth for the same time (depth 8 in 10 seconds).
- note that adding the quiescence would increase the nps of your engine. Cause it will also help in the move ordering in iterative deepening
Hope it helps,
Yves

by **BrettVsop** » 09 Nov 2007, 08:25

Sven Schüle wrote:
BrettVsop wrote:
Code: Select all
chessboard->Undo_Move();
One more note, I assume that an instance of your CReferee class stores the whole game history, is that right? Otherwise my question would be how you retrieve the information to undo the last move that happened.

Sven

It does. It contains additional information, such as captured piece type as well.

by **BrettVsop** » 09 Nov 2007, 08:30

YvesLejeail wrote:Another ideas :
- how good is you move ordering? If it is very bad, your search result
would be at the speed of minimax (so that 4 plies is slow, but not so slow at this stage). In that case, having the best move ordering would give you a x2 in depth for the same time (depth 8 in 10 seconds).
- note that adding the quiescence would increase the nps of your engine. Cause it will also help in the move ordering in iterative deepening
Hope it helps,
Yves

Well I guess you will need to judge for yourself how good it is. I do not know enough about move ordering to compare it to an average. I consider captures, then non captures. I am really interested in storing the principal variation after this conversation. I plan on adding that information into the move ordering.

It seems to do well though, I see early cutoffs. I am looking into the Quiescence ordering as well.

Winboard Forum

Speed and Board Storing Choices

Speed and Board Storing Choices

Re: Speed and Board Storing Choices

Re: Speed and Board Storing Choices

Re: Speed and Board Storing Choices

Re: Speed and Board Storing Choices

Re: Speed and Board Storing Choices

Re: Speed and Board Storing Choices

Re: Speed and Board Storing Choices

Re: Speed and Board Storing Choices

Re: Speed and Board Storing Choices

Re: Speed and Board Storing Choices

Re: Speed and Board Storing Choices

Re: Speed and Board Storing Choices

Re: Speed and Board Storing Choices

Re: Speed and Board Storing Choices

Re: Speed and Board Storing Choices

Re: Speed and Board Storing Choices

Re: Speed and Board Storing Choices

Re: Speed and Board Storing Choices

Re: Speed and Board Storing Choices

Who is online