Tord Romstad wrote:
Onno Garms wrote: with the release of Onno 1.0 in near future
Cool!
Will it be GPLed?
No. I consider writing chess engines a competitive sport, like playing human chess. Making GPLed releases is like handing all your elaborate preparation to your opponent. It's OK to exchange ideas, just as it's OK to analyze openings with a fellow human player, but releasing the source is too much for me. Sorry.
I will try to go commercial. But I will give out many free copies by some criterion that I do not yet know precisely. Authors of strong open source engines will certainly get one. So will people with whom I had discussions that helped significantly. I don't expect to sell many licenses, but I find that way more tempting.
I'm in more or less the same situation as you: I've made the mistake of optimizing prematurely for 64-bit CPUs by making heavy use of bitboards and burning too many bridges, only to discover too late that performance on 32-bit CPUs is what really matters. I've pretty much concluded that rewriting with a mailbox representation is the way to go. It's technically easy, and not extremely time consuming, but so awfully boring...
But I think rewriting as mailbox will slow down the engine on 64 bit. So this approach would mean writing most parts of your engine twice: mailbox for 32-bit and bitboard for 64-bit computers. And I do think that this would be extremely time consuming, because it's not just rewriting once but maintaining two programs simultaneously.
I haven't tried this, but it looks like it will get very messy in practice: If you want to test for a white king on e1, you will probably also want to test for a black king on e8. With the above approach, how would you do that with the same code for both colors?
When you have the color as a variable, this is indeed a problem. For this reason there were many places where I could not simplify. But, for example, when evaluating trapped bishops (as in 8/B1pk4/1p6/8/8/8/8/7K w - -) I had unrolled the loop over colors anyway, so I could easily use only the relevant parts of the bitboards. The eval code is only marginally obfuscated by that, and all 32/64-bit differences are in the file bitboard.h.
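To illustrate what "only the relevant parts of bitboards" could look like, here is a minimal sketch. It is not taken from Onno's source: the helper names, the square numbering (a1 = 0 ... h8 = 63) and the simplified trap pattern (bishop on a7 locked in by pawns on b6 and c7, mirrored for Black) are all assumptions made for this example. Because the loop over colors is unrolled, each test touches only one 32-bit half of each bitboard.

```c
#include <stdint.h>

typedef uint64_t Bitboard;

/* Assumed square numbering: a1 = 0, b1 = 1, ..., h8 = 63. */
enum { SQ_A2 = 8, SQ_C2 = 10, SQ_B3 = 17,
       SQ_B6 = 41, SQ_A7 = 48, SQ_C7 = 50 };

/* Hypothetical helpers of the kind that could live in a bitboard.h:
   on a 32-bit CPU each one reads a single machine word. */
static inline uint32_t lower_half(Bitboard b) { return (uint32_t)b; }
static inline uint32_t upper_half(Bitboard b) { return (uint32_t)(b >> 32); }

/* Single-bit masks within one half (ranks 1-4 or ranks 5-8). */
#define BIT_LO(sq) (UINT32_C(1) << (sq))
#define BIT_HI(sq) (UINT32_C(1) << ((sq) - 32))

/* White bishop on a7 shut in by black pawns on b6 and c7:
   every square involved is in the upper half. */
int white_bishop_trapped_a7(Bitboard white_bishops, Bitboard black_pawns)
{
    const uint32_t pawns = BIT_HI(SQ_B6) | BIT_HI(SQ_C7);
    return (upper_half(white_bishops) & BIT_HI(SQ_A7)) != 0
        && (upper_half(black_pawns) & pawns) == pawns;
}

/* Mirrored case for Black: every square involved is in the lower half. */
int black_bishop_trapped_a2(Bitboard black_bishops, Bitboard white_pawns)
{
    const uint32_t pawns = BIT_LO(SQ_B3) | BIT_LO(SQ_C2);
    return (lower_half(black_bishops) & BIT_LO(SQ_A2)) != 0
        && (lower_half(white_pawns) & pawns) == pawns;
}
```

With the color as a run-time variable, the half to read would depend on the color, which is exactly the difficulty raised in the quoted question.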
Use piece lists for each piece type and color, to avoid looping through too many bitboards in your move generators and evaluation functions. I've found this to improve the speed even in 64-bit mode, but the difference is bigger in 32-bit.
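For readers unfamiliar with the idea, a piece list per piece type and color might be sketched as follows. This is a generic illustration, not code from either engine: the struct layout, the function names and the limit of 10 pieces per type (two originals plus eight promotions) are assumptions of this example.

```c
#include <stdint.h>

enum { COLORS = 2, PIECE_TYPES = 6, MAX_PER_TYPE = 10, SQUARES = 64 };

typedef struct {
    /* square[c][t][i] is the square of the i-th piece of type t and color c */
    uint8_t square[COLORS][PIECE_TYPES][MAX_PER_TYPE];
    /* count[c][t] is the number of valid entries in square[c][t] */
    uint8_t count[COLORS][PIECE_TYPES];
    /* slot[sq] is the index in its list of the piece standing on sq */
    uint8_t slot[SQUARES];
} PieceLists;

/* Append a piece to the end of its list. */
void pl_add(PieceLists *pl, int color, int type, int sq)
{
    uint8_t n = pl->count[color][type]++;
    pl->square[color][type][n] = (uint8_t)sq;
    pl->slot[sq] = n;
}

/* Remove a piece in O(1) by moving the last list entry into its slot. */
void pl_remove(PieceLists *pl, int color, int type, int sq)
{
    uint8_t i = pl->slot[sq];
    uint8_t last = --pl->count[color][type];
    uint8_t moved = pl->square[color][type][last];
    pl->square[color][type][i] = moved;
    pl->slot[moved] = i;
}

/* Update the square of a piece that moves; its list position is unchanged. */
void pl_move(PieceLists *pl, int color, int type, int from, int to)
{
    uint8_t i = pl->slot[from];
    pl->square[color][type][i] = (uint8_t)to;
    pl->slot[to] = i;
}
```

A move generator can then iterate over `square[c][t][0 .. count[c][t] - 1]` instead of extracting bits from a 64-bit word, at the cost of keeping the lists in sync on every make and unmake.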
When starting my engine I had such lists. But I found them not useful for speed in 64 bit mode, so I eventually removed them. I don't think I will reintroduce them.
Write 32-bit optimized count_1s() functions. I use this code:
Thanks. I have put this on my todo list.
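As a point of reference, one common way to write a 32-bit-friendly population count is to split the bitboard into two halves and count each with word-sized arithmetic. The sketch below is a generic textbook variant, not the code referred to in the quote; the helper name `popcount32` is an assumption of this example.

```c
#include <stdint.h>

/* Parallel bit count of one 32-bit word, using only 32-bit operations. */
static inline uint32_t popcount32(uint32_t x)
{
    x = x - ((x >> 1) & 0x55555555u);                 /* 2-bit sums */
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u); /* 4-bit sums */
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;                 /* 8-bit sums */
    return (x * 0x01010101u) >> 24;                   /* add the four bytes */
}

/* Count the set bits of a 64-bit bitboard as two independent 32-bit halves. */
int count_1s(uint64_t b)
{
    return (int)(popcount32((uint32_t)b) + popcount32((uint32_t)(b >> 32)));
}
```

For bitboards that are known to hold only a few bits, a loop that clears the lowest set bit of each half may be faster; which variant wins is something to measure.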
One more idea I had is a branchless lsb function for 32 bit. Obviously it needs more instructions than an lsb function with a conditional jump, so I'm not sure if it is worth it. Are there any rules of thumb for how many additional instructions equate to a branch misprediction?
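To make the idea concrete, a branchless lsb for 32-bit targets could look roughly like this. It is only a sketch under stated assumptions: the function names are made up, the input must be non-zero, and the 32-bit step uses the well-known de Bruijn multiplication rather than a hardware bit-scan instruction. Whether the compiler really emits branch-free code for the `lo == 0` comparison has to be checked in the generated assembly.

```c
#include <stdint.h>

/* Index of the lowest set bit of a non-zero 32-bit word (de Bruijn method). */
static inline int lsb32(uint32_t x)
{
    static const int table[32] = {
         0,  1, 28,  2, 29, 14, 24,  3, 30, 22, 20, 15, 25, 17,  4,  8,
        31, 27, 13, 23, 21, 19, 16,  7, 26, 12, 18,  6, 11,  5, 10,  9
    };
    /* x & (0 - x) isolates the lowest set bit. */
    return table[((x & (0u - x)) * 0x077CB531u) >> 27];
}

/* Index of the lowest set bit of a non-zero 64-bit bitboard,
   selecting the half to scan with a mask instead of an if. */
int lsb64_branchless(uint64_t b)
{
    uint32_t lo = (uint32_t)b;
    uint32_t hi = (uint32_t)(b >> 32);
    uint32_t mask = 0u - (uint32_t)(lo == 0); /* all ones if lo is empty */
    uint32_t word = lo | (hi & mask);         /* lo if non-zero, else hi */
    return lsb32(word) + (int)(32u & mask);   /* add 32 when hi was used */
}
```

The extra cost compared with the branching version is a handful of ALU instructions (compare, negate, and, or, add), so the comparison in the question comes down to those few instructions against the misprediction rate of the branch in real positions.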