Winboard Forum

by **Reinhard Scharnagl** » 30 Jun 2005, 09:55

There have been discussions switching to about this theme.

So by what could a good data structure be proved, without making the source code public? I always thought, the speed of generating legal moves with additional information whether the move is a check threat, capture, promotion or not could be a representative figure.

Reinhard.

by **Stan Arts** » 30 Jun 2005, 10:26

Hi Reinhard,

Speed is an important factor. But imho, not as important as a clear and easy to understand and use datastructure.
A difficult to understand and improve datastructure will give a difficult to understand and improve program, and having the fastest datastructure is only usefull if you managed to write a 2600 Elo program in the first place.

Otherwise, I figure you should always go for the simplest, and a simple 8x8 or 10x12 array like I use, (Oooh, I gave in for speed.) does just fine.

Stan

by **Reinhard Scharnagl** » 30 Jun 2005, 11:35

Hi Stan,

because of special reasons I use a linear structure interpreted as a 14x15 geometry e.g. to support 10x8 as 8x8 chess as well in one engine.

Though I have not yet implemented all of my ideas, especially a performant passed pawn detection, I am convinced this would not be slower than a bitboard approch, but much clearer, which is of course important.

Reinhard.

by **Uri Blass** » 30 Jun 2005, 14:17

Reinhard Scharnagl wrote:There have been discussions switching to about this theme.

So by what could a good data structure be proved, without making the source code public? I always thought, the speed of generating legal moves with additional information whether the move is a check threat, capture, promotion or not could be a representative figure.

Reinhard.

I agree that the speed of generating information can be representive figure but I think that we can add information in the additional information.

possible additional information:
1)number of open pawns(pawns that there is no opponent pawns that block them to promote in the same file).

2)number of passed pawns(open pawns that cannot be captured by a pawn in their path to promotion).

3)number of isolated pawns(pawns that have no friendly pawn in a file near them like pawn at b file when there is no pawns in a or c files).

4)number of pinned pieces for both sides(pinned to the king).

5)number of good captures(captures that the opponent cannot reply by capturing a bigger piece).

Unfortuantely I do not expect many engines to have functions to generate all the information in perft and I even do not know about a single engine that generates the information that smirf generates today(including the information if the move is a check).

My legal move generator today mark every move if it is a capture but does not give the information if the move is check and I do not know about a single move generator that does it except smirf(it is probably a good idea but it will not be implemented in movei in the near future and counting checks by making moves and undoing it is slow so I am not interested in implementing it in that way).

Uri

by **Tord Romstad** » 30 Jun 2005, 14:27

Hi Reinhard!

Reinhard Scharnagl wrote:There have been discussions switching to about this theme.

So by what could a good data structure be proved, without making the source code public? I always thought, the speed of generating legal moves with additional information whether the move is a check threat, capture, promotion or not could be a representative figure.

In my opinion, there is no general answer to this question. It all depends on what you want to do in your engine, and in particular on what kind of questions you want to ask in your evaluation function. A data structure which is elegant and efficient in engine A can often be clumsy and impractical in engine B.

Tord

by **Reinhard Scharnagl** » 30 Jun 2005, 14:40

Hi Uri,

it does of course make sense to also show the ability of the data structure to support detail evaluating of positions. This is a part of Smirf currently being miles behind. Thus my detail evaluation is very weak and slow. In fact it is the worst part of the whole Smirf package.

What already is supporting evaluation within my data structure is, that there is no need to find pieces on the board to enumerate, because all pieces automatically constitute two sorted double linked lists by themself.

What will be done next is establishing an efficient way of supporting cheap passed pawn assisting information, to become able to decide a pawn to be a passed one by checking a single byte only. There already is a concept, which will work that way - I am convinced. But it is necessary to compatibly extend my data structure, which usually is based on 32 Bit integers.

Pinned pieces in Smirf actually are detected during move generation. There is no assisting table yet. But I am about to think it over, because it might also support evaluation. On the other side it is very easy to compute pinning information for a piece because of my data structure, thus the benefit might be irrelevant, same is valid for detecting check threats.

But for all this it is difficult to compare its efficiency.

Reinhard.

by **Richard Pijl** » 30 Jun 2005, 14:47

Every datastructure (board represenatation?) has its strong points. Basically, there is no real need to restrict yourself to only one internal board in a program. In the Baron I have several bitboards alongside an 8x8 array. For some tasks the bitboards are convenient, for others the 8x8 array. An additional board structure just makes making/unmaking a move slightly more costly (as the logic to update the board should be roughly the same).

So if the performance of different structures is really different, just implement both and use the most convenient one with whatever you're doing.
Richard.

by **Reinhard Scharnagl** » 30 Jun 2005, 15:42

Hi Richard,

... just implement both and use the most convenient one ...

hmm... , I think, this would be very dangerous. Over the time you would have spread some information into different best fitting representations. The actuality of changes will hardly to be kept synchroneous. So during the time you might have an intransparent hybrid solution, making it much harder to perform substantial changes.

Thus I intend to keep my data concept clean. This is leading to difficulties now, when I try to extend the model ... but they have to be solved, not be delegated to a concurrent model.

Reinhard.

by **Michael Yee** » 01 Jul 2005, 01:33

Hi Reinhard,

As others have said in the past, having fast move generation doesn't buy you much if you determine (using profiling) that only a small percentage of time is spent generating moves anyway.

So I agree with the other posters that the goals of the data structures should be either:

(1) transparency and easy maintainability

(2) efficiency in areas where your program spends the most time (perhaps static eval)

or a mixture of both

Michael

by **Dann Corbit** » 01 Jul 2005, 03:14

(Agreeing with Michael Yee)
Look at what Fabian did.

Lots of clear, easy to understand data structures.

Notice that there is not a single line of assembly in Fruit. According to Vincent, he could get 2x speed (and hence 50-70 Elo according to some estimates). Actually, I think that [except for tiny snippets] only a superb expert will get better results than a C compiler using assembly and so when used as a last resort, it should be confined to tiny hot spots.

Every big project I have ever heard of that was written in assembly language aged very poorly. Look at the trauma you would go through if you had 100K lines in assembly langauge and then a marvelous new chip comes out (which happens about once a year, every year).

Anyway, I don't think that Fabian has spent very much time making things fast. He has spent a lot of time making things clever. He spent even more time making things correct (these are my estimates). At any rate, the general rule of thumb for all programming projects is:
1. Make it right
2. Make it fast
A. Profile
B. Improve the algorithm
C. If {and only if} desperation warrants, resort to tweaky hacks for the tiny hot spots where it matters. These places must be determined by profile.

Now, if step 1 is never achieved, we should never attempt step 2. And if you do advance to step 2, part C is pure desperation.

A few years ago Assembly language had more importance than it does now. It can also foil the compiler, so don't just benchmark the little snippet against a tiny bit of C you want to replace, but benchmark the speedup for the whole program once it is in place.

Wirth's old equation:
Data Structures + Algorithms = Programs
still works well for the C paradigm.

by **Dan Honeycutt** » 01 Jul 2005, 03:56

Reinhard Scharnagl wrote:There have been discussions switching to about this theme.

So by what could a good data structure be proved, without making the source code public? I always thought, the speed of generating legal moves with additional information whether the move is a check threat, capture, promotion or not could be a representative figure.

Reinhard.

Hi Reinhard

I think you're out of luck. Some move generators only generate moves. Others, like yours if I understand correctly, also generate additional information in the process. Unless you happen to find a program that does exactly the same work that yours does, you have nothing to compare with.

Of course, you would need the source code to determine this

Best
Dan H.

by **Reinhard Scharnagl** » 01 Jul 2005, 06:33

Hi Dan,

... Unless you happen to find a program that does exactly the same work that yours does, you have nothing to compare with. ...

comparing to programs doing not that much and nonetheless being faster would establish a result.

Of course, you would need the source code to determine this.

It would be sufficient for me to believe in the autor's statement on that.

But, as me and others already have mentioned, move generation is only one indicator of data structure's quality and potential. Anyway it would be interesting to know, whether a kind of hitlist would exist for that.

Reinhard.

by **Dan Honeycutt** » 02 Jul 2005, 01:57

Reinhard Scharnagl wrote:comparing to programs doing not that much and nonetheless being faster would establish a result.

Hi Reinhard

Yes, if you can do more in the same time that is a good indication.

Another thought - have a means to switch off the extra information your move generator produces. You can the compare perft results with any other engine and yours should be as fast or faster if your structure is efficient.

Best
Dan H.

by **Reinhard Scharnagl** » 02 Jul 2005, 08:26

Hi Dan,

concerning Perft results, there already has been a silly discussion. The point has been, whether (when the speed is relevant and not the correctness of the results) it would be cheating to skip a final expansion because you already have sufficient legality information.

I have argued, that it does not matter, when and where you perform those validy checks. And Smirf does it during the move generation. Moreover than this it produces additional results and is busy with adapting to 8x8 or 10x8 geometry requests, and covering two more piece types: Archbishop and Chancellor (even in 8x8). And without using other (more specialized) routines than in normal engine use, I am already very happy with my Perft results, even with generating all those additional informations.

Moreover for testing purposes - concerning the effectivity of the transposition table - I also implemented a Perft variant using cache, which works fine. I recommend it for to check out the quality and stability of an engine's hash key generating.

Reinhard.

by **Sune Fischer** » 02 Jul 2005, 09:28

Hi Reinhard

One way to make a chess engine is to break it into seperate tasks, there is a move generator, a search and an evaluation.
Then code these to run as independently as possible in a modular fasion.

This is good for perft, and this design makes it relative easy to compare e.g speed of move generators.

But a completely different approach is also possible.

The guiding principle here is to never do anything that might be wasted, you do everything "on the fly" when and where you need it.

If you use this principle the individual tasks might take a bit longer, but you make up for it by cutting corners and saving work elsewhere.
I.e. don't generate more moves than you need, don't do legality checks on moves until you needed them, don't do a full eval if a lazy eval is enough to enure the value falls outside the window.

Like I wrote in a different thread, you could see the speed drop by 50% in tests that require full info and where no corners can be cut, like in perft, but if it saves you 85% of the work in the real program it is still a good deal

I'm not saying which is a better design, because there are other and more important things to consider than just speed, but the "generate all info all the time" method is never going to be the fastest in raw nps.

-S.

by **Reinhard Scharnagl** » 02 Jul 2005, 10:28

Hi Sune,

... but the "generate all info all the time" method is never going to be the fastest in raw nps.

I hope you will be not too sorry, when I cannot share this opinion that generally.

I originally (when not working on a really playing engine) have started with a sequential move generator on demand, just because of very similar thoughts. But I have learned, that you cannot separate it from probability estimations. Creation of information could be cheaper doing it at once.

If it would cost 33% to generate all moves at once compared to do this sequentially and the probability to use that information (in this case the move) would be 50%, then 33%*100% < 100%*50%. In that example you would win by creating all moves at once, especially when additional information could lead to a better move preordering. Moreover it enables me to tune some pruning methods like nullmove by knowing the exact number of existing moves.

A comparable situation is by the decision whether to sort preveluated moves at once or search the remaining optimum by demand. My decision has been now to sort at once using a performant shell sort derived method. Generating moves on demand will hardly create them all in a good order.

Always there, where a doing-all-at-once method is far more productive than the sequential task on demand, probability calculations should apply and, if the whole information at hand would additionally allow more heuristics to be applied.

Reinhard.

by **Sune Fischer** » 02 Jul 2005, 13:39

Reinhard Scharnagl wrote:Hi Sune,
... but the "generate all info all the time" method is never going to be the fastest in raw nps.
I hope you will be not too sorry, when I cannot share this opinion that generally.

I originally (when not working on a really playing engine) have started with a sequential move generator on demand, just because of very similar thoughts. But I have learned, that you cannot separate it from probability estimations. Creation of information could be cheaper doing it at once.

If it would cost 33% to generate all moves at once compared to do this sequentially and the probability to use that information (in this case the move) would be 50%, then 33%*100% < 100%*50%. In that example you would win by creating all moves at once, especially when additional information could lead to a better move preordering. Moreover it enables me to tune some pruning methods like nullmove by knowing the exact number of existing moves.

Hmm I do not see where you get your 33% and 50% from.

The overhead in generating all 40 moves compared to generating only 3-5 captures (if that gives you a cutoff) is a lot more expensive than 33%.

It is also more expensive than 33% to afterwards sort 40 moves compared to sorting 5.

The 50% doesn't sound right either. If you include qsearch its about 80% of the cases where you don't need non-captures.

If you do this probability you will find that in 80% of the cases you can save 80% work. Only in the last 20% of the cases you will have to be 150% inefficient.

If doing everything at once has a "time cost" of 1, then doing it incremented has a relative cost of

0.20*0.80 + 1.50*0.20 = 0.46 < 1

According to this rough estimate it should speed up a factor 2 to do incremented generation.

Of course 2 times faster is not that much in a real program where the profile indicates that generating moves is only 10% of total cpu time (it's 6-8% in Frenzee including making and unmaking), so there could be some other reasons not do it.

Reinhard Scharnagl wrote:A comparable situation is by the decision whether to sort preveluated moves at once or search the remaining optimum by demand. My decision has been now to sort at once using a performant shell sort derived method. Generating moves on demand will hardly create them all in a good order.

Always there, where a doing-all-at-once method is far more productive than the sequential task on demand, probability calculations should apply and, if the whole information at hand would additionally allow more heuristics to be applied.

I search the hash move and null killer move before even generating captures.

If I generated all the moves at once I would still be searching the winning captures next because I have not found any other way to pick a move with higher probability of a cutoff.

If I was able to pick a better third best move by comparing statical info on all the moves, then indeed it might not be worth while to use a incremental generator.
Specialized SEE is hard to beat by generalized statistics though

-S.

by **Reinhard Scharnagl** » 02 Jul 2005, 15:26

Hi Sune,

I search the hash move and null killer move before even generating captures.

that is your philosophy of applying nullmove cut offs, which of course is shared by others, not mine.

Some days again I have had a discussion with Vincent, who has been a little bit upset, when I tried to tell him, that double nullmove would not change that much in my engine concerning Zugzwang. But such wrong conclusions on other chess program's behaviour is proving me, that it is not easy to imagine that other programmers might do things in an other way.

I search the hash move and null killer move before even generating captures.

Even that move might be refusable by a nullmove or by other pruning methods. And one win of having a fast all move generator is to become able to make some decisions cheaper like having nullmove applied or not e.g. to avoid Zugzwang or a nullmove with probably bad (non pruning) result.

So it is does not make sense to see those details isolated.

Reinhard.

by **Sune Fischer** » 02 Jul 2005, 16:31

Reinhard Scharnagl wrote:Hi Sune,
I search the hash move and null killer move before even generating captures.
that is your philosophy of applying nullmove cut offs, which of course is shared by others, not mine.

Yes this is a trick to see what is hanging or threatening.
There are probably other and cheaper ways to get that info if you don't want to nullmove.
The nullkiller is only half good, because it doesn't tell you what to move at the previous ply where you fail-low.
For instance if PxQ is threatening, we should try and move the queen perhaps - but where to, and should we order all queen moves first?
Maybe we should capture the pawn first instead of that tasty rook over there..?

There must be room for some optimization at the previous ply in some way..

Reinhard Scharnagl wrote:Some days again I have had a discussion with Vincent, who has been a little bit upset, when I tried to tell him, that double nullmove would not change that much in my engine concerning Zugzwang. But such wrong conclusions on other chess program's behaviour is proving me, that it is not easy to imagine that other programmers might do things in an other way.

I couldn't get double nullmove to really work, maybe it had a bug but I think losing effectively 6 plies before zugzwang can be detected is too much anyway in most cases.

And one win of having a fast all move generator is to become able to make some decisions cheaper like having nullmove applied or not e.g. to avoid Zugzwang or a nullmove with probably bad (non pruning) result.

So it is does not make sense to see those details isolated.
Reinhard.

Yes if I had to generate all moves I too would try and figure out ways of using the extra info, and this is one of the things that could be handy to know.
I have tried generating all moves at the interior nodes where speed is not important and used this trick as well, I couldn't register any improvement so I guess the effect isn't all that large.
I'll probably keep the all node generator for the interior nodes though, I want to try some ETC as well, especially ETC in the endgames.

A more general nullmove detector is however still needed I think.

-S.

Winboard Forum

How to prove a good data structure?

How to prove a good data structure?

Re: How to prove a good data structure?

Re: How to prove a good data structure?

Re: How to prove a good data structure?

Re: How to prove a good data structure?

Re: How to prove a good data structure?

Re: How to prove a good data structure?

Re: How to prove a good data structure?

Re: How to prove a good data structure?

Re: How to prove a good data structure?

Re: How to prove a good data structure?

Re: How to prove a good data structure?

Re: How to prove a good data structure?

Re: How to prove a good data structure?

Re: How to prove a good data structure?

Re: How to prove a good data structure?

Re: How to prove a good data structure?

Re: How to prove a good data structure?

Re: How to prove a good data structure?

Who is online