32 bit optimization
Posted: 24 Feb 2009, 17:38
Hello everybody,
with the release of Onno 1.0 in the near future, I'm currently trying to do some 32-bit optimizations. I wrote Onno as a 64-bit engine, without caring about 32-bit performance. The result is a speedup factor (thinking positively) of about 1.8 for 64 bit over 32 bit. Several beta testers have asked for a 32-bit version, so I am now trying to make it faster. The most efficient approach might be to have two versions of the engine: mailbox for 32 bit and bitboard for 64 bit. But since an efficient encapsulation of such different board representations seems to be impossible, this would mean doing most of the work twice. So I won't do that.
What else can be done?
What I am planning to do is:
1. For bitboards that live on only one half of the board, avoid touching the other half. E.g. to test for a king on e1, use
Code:
lo(bb) & 5
instead of
Code:
bb & 5ui64
2. Replace the 64-bit magic multiplication with two 32-bit multiplications, as described in other threads.
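As a rough illustration of idea 1 (a sketch, not Onno's actual code): the helpers `lo` and `hi`, the mask constants, and the square mapping a1 = bit 0 ... h8 = bit 63 are all assumptions made for this example. The point is only that a test confined to ranks 1-4 or ranks 5-8 can be done on a single 32-bit word, so a 32-bit compiler does not need to handle both halves.

```cpp
#include <cstdint>

using Bitboard = std::uint64_t;

// Hypothetical helpers: split a bitboard into its two 32-bit halves.
// Assumed mapping: a1 = bit 0 ... h8 = bit 63, so the low half holds
// ranks 1-4 and the high half holds ranks 5-8.
inline std::uint32_t lo(Bitboard bb) { return static_cast<std::uint32_t>(bb); }
inline std::uint32_t hi(Bitboard bb) { return static_cast<std::uint32_t>(bb >> 32); }

// Masks under the assumed mapping (illustrative values only).
constexpr std::uint32_t E1_MASK = 1u << 4;   // e1 = bit 4 of the low half
constexpr std::uint32_t E8_MASK = 1u << 28;  // e8 = bit 60 = bit 28 of the high half

// Each test reads only one 32-bit half of the bitboard.
inline bool kingOnE1(Bitboard kings) { return (lo(kings) & E1_MASK) != 0; }
inline bool kingOnE8(Bitboard kings) { return (hi(kings) & E8_MASK) != 0; }
```

On a 64-bit build the same helpers cost nothing extra, so this kind of code can be shared between both targets.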
Unfortunately I don't see much potential in these ideas. For 1, there are not many places to apply it; 2 might be a little more effective. Still, according to my profiler output, I would expect an overall speedup of only around 5%. Any more ideas?
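For reference, idea 2 is commonly written along the following lines. This is a minimal sketch under stated assumptions, not the exact scheme from the other threads: the struct `Magic32`, its field names, and the function `magicIndex32` are hypothetical. The magic values must be searched for this folding scheme specifically, because ordinary 64-bit magics will generally not produce collision-free indices here.

```cpp
#include <cstdint>

using Bitboard = std::uint64_t;

// Hypothetical per-square entry; names are for illustration only.
struct Magic32 {
    Bitboard mask;   // relevant occupancy squares for this piece and square
    Bitboard magic;  // low 32 bits multiply the low half, high 32 bits the high half
    unsigned shift;  // 32 minus the number of index bits
};

// Computes the attack-table index with two 32-bit multiplications
// instead of one 64-bit multiplication. Both products wrap modulo 2^32.
inline unsigned magicIndex32(const Magic32& m, Bitboard occupied) {
    const std::uint32_t loOcc = static_cast<std::uint32_t>(occupied) &
                                static_cast<std::uint32_t>(m.mask);
    const std::uint32_t hiOcc = static_cast<std::uint32_t>(occupied >> 32) &
                                static_cast<std::uint32_t>(m.mask >> 32);
    const std::uint32_t loProd = loOcc * static_cast<std::uint32_t>(m.magic);
    const std::uint32_t hiProd = hiOcc * static_cast<std::uint32_t>(m.magic >> 32);
    // Fold both halves into one 32-bit word, then keep the top index bits.
    return (loProd ^ hiProd) >> m.shift;
}
```

The trade-off is that a separate set of magics (and possibly larger tables) is needed for the 32-bit build, so the lookup code differs between targets even though the attack tables have the same meaning.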
Onno