Quicktest with WB. engines' results
Posted: 26 Oct 2004, 22:56
This is about the tactical strength of WinBoard engines and other engines. Tactics is still the main component of a chess program's strength and weakness profile. Nowadays, assuming that an engine has reached a certain level of what I call the "tactical basis", it is not as decisive anymore as it was e.g. in the 1980s (when chess programs were primitive compared to today). But when an engine does not reach at least that "mediocre" level of combinative speed, it cannot prevail, for that reason alone. So, with a tactical test like the Quicktest © M.Scheidl 2002 we may not be able to tell the top engines from the near-top engines, but we can tell whether an engine has the tactical basis required for good to best general performance, or whether there is reason for concern that it may often fail due to combinative weakness even when its knowledge and positional play are good.
I admit that I've often tested the UCI versions of many (WB.) engines.
My observation from testing many engines in the Quicktest is that there is a "typical" number of solutions which many very good freeware engines, and also commercial oldies, reach. On an AthlonXP@1200 MHz that is 15 or 16 solutions out of a total of 24, IOW ~2/3. Engines which do not achieve 14 solutions may still be quite good in general gameplay, but I'd say they would reach the top 20 only in rare exceptions. At the other end of the range, top freeware and commercial engines solved up to 21 on that hardware.
I have a rating system for the QT. which includes the solving times too, but I don't insist that it is perfect or even logical. I'll give these ratings in brackets. I didn't test most of the current top commercial engines, i.e. none of the "8" versions. Of all those I've tested, Hiarcs 9 performed best with 21(434). Among the oldies, Nimzo 8 19(380) and even Hiarcs 7.32 19(366) have good results. The best freeware engine among my results so far is Ruffian 1.0.0 (WB) with 19(373). For comparison, Tiger 14 achieved only 17(333) (not surprisingly, Gambit Tiger 2.0/aggressiv reached a better rating although solving one fewer, 16(363)). Among the newer releases: Pro Deo 1.0 (Rebel style) 18(368), or Pharaon 3.1 15(302). Pharaon solved two more shortly after the time limit, which is 1 minute per position.
An interesting recent result was LiGo Revival 1.0.0.13 and its update ..14, in which a forward pruning bug was fixed. That immediately resulted in two more solutions and +26 QT. rating points. So, a test like this can also be used to tell when the search has been improved (when everything else has remained more or less the same). In that case, more and/or faster solutions will appear while the "solving profile" in general stays very similar, provided the improvement was successful.
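As a rough sketch of such a before/after comparison, the snippet below takes two test runs and lists which positions were gained, lost, or solved faster. The function name `compare_runs`, the data layout (position id mapped to solving time in seconds, or `None` if unsolved) and the numbers are all made up for illustration; they are not real engine results and not part of the Quicktest rating system.

```python
def compare_runs(old, new):
    """Compare two runs; each maps position id -> solving time in seconds, or None if unsolved."""
    gained = sorted(p for p in new if new[p] is not None and old.get(p) is None)
    lost = sorted(p for p in old if old[p] is not None and new.get(p) is None)
    faster = sorted(
        p for p in new
        if old.get(p) is not None and new[p] is not None and new[p] < old[p]
    )
    return {"gained": gained, "lost": lost, "faster": faster}


# Hypothetical data for two versions of the same engine (NOT measured results).
old_run = {"Quick-01": 4.0, "Quick-02": None, "Quick-03": 30.0, "Quick-04": None}
new_run = {"Quick-01": 3.0, "Quick-02": 41.0, "Quick-03": 30.0, "Quick-04": 55.0}

report = compare_runs(old_run, new_run)
print(report)
```

If the "gained" list grows and the "lost" list stays empty, that would match the pattern described above: more and/or faster solutions with an otherwise similar solving profile.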
The complete results are in an XLS file:
Quicktest results (48 kb zipped)
(People without Excel can download a viewer for free, from Microsoft's website)
1n1r1rk1/ppq2ppp/3p2b1/3B1NP1/4PB1R/bP2P2P/P1P5/3KQ1R1 w - - bm Qc3; id Quick-01;
1q6/r4pbk/1r1p2pp/B2Pn3/Q2NP3/1p3P2/6PP/1R3RK1 b - - bm Rxa5; id Quick-02;
3Q4/3p4/P2p4/N2b4/8/4P3/5p1p/5Kbk w - - bm Qa8; id Quick-03;
4qrk1/3nppb1/R1Np2p1/3P2P1/1Pr5/4B3/5Q1P/5R1K w - - bm Ra8; id Quick-04;
r3r3/2R2pk1/p2p1bpp/3P4/q2pQ3/5N1P/5PP1/1R4K1 w - - bm Rxf7+; id Quick-05;
r1b1Rbk1/pp3p2/2npN2p/2qp2p1/8/1QPB3P/PP3PPB/6K1 b - - bm Bxe6; id Quick-06;
r5k1/Rb4p1/2q2pBp/1pp5/1b4QN/1P2P2P/5PP1/6K1 w - - bm Rxb7; id Quick-07;
3R4/5r1p/5ppk/8/1Q3PPq/5P2/6K1/8 w - - bm Rg8; id Quick-08;
2kr3r/ppp3pp/2pbbn2/4N3/3Pp3/2P3Pq/PP1NQP1P/R1B2RK1 w - - am Nxe4; id Quick-09;
2r3k1/pp1bpp1p/3p1npQ/q1r5/4P1P1/2NR1P2/PPP1N3/2K4R w - - bm g5; id Quick-10;
r1b2rk1/pp3p2/2p2bpQ/8/1q1P4/2N2N2/Pn3PPP/1B1RR1K1 w - - bm Bxg6; id Quick-11;
r2qk2r/1p1bbp2/1P2p3/p2pPp2/n2N1N1p/3PB3/5QPP/R4RK1 w kq - bm Rxa4; id Quick-12;
3r1n1r/1p2q1k1/p1p1P1p1/3n4/5Pp1/P5N1/1P3QP1/1BR1R1K1 w - - bm Bxg6; id Quick-13;
r2q1rk1/p1p3pp/b2bp3/2pp4/6p1/2NPPN2/PPP2PP1/R1BQR1K1 w - - bm Ne5; id Quick-14;
r2qr1k1/p2b1ppp/5n2/2pp4/5b2/NP6/PBP1NPPP/R3QRK1 b - - bm Bxh2+; id Quick-15;
3k4/p7/K3BP2/8/7p/8/2P4P/8 w - - bm Kb7; id Quick-16;
rq4k1/pp1nrppp/4bn2/6R1/3QP3/P4PN1/4B1PP/2B2RK1 w - - bm Rxg7+; id Quick-17;
2r4k/pb2q2P/1p6/3Pp3/4p3/1P2R3/PBrQ2PP/5RK1 w - - bm Qb4; id Quick-18;
5k2/6p1/2p2p2/P7/1Q6/2P1pqPP/7K/8 b - - bm c5; id Quick-19;
rnbq1b1r/ppp1p1pp/1n1p2k1/4P1N1/8/5Q2/PPPP1PPP/RNB1K2R b KQ - bm Qe8; id Quick-20;
r1b1kb1r/2q2ppp/p2ppP2/1pn3P1/3NP3/2N2Q2/PPP4P/2KR1B1R w kq - bm Bxb5+; id Quick-21;
4r2k/3n3p/2q3p1/2p1p1Q1/1pP1P3/1P6/5PP1/R2B2K1 b - - am Qxe4; id Quick-22;
r3r1k1/1Bp1qppp/3p1n2/pNb5/2P5/PQ6/1P3PPP/R2R2K1 b - - bm Ng4; id Quick-23;
3B4/1R3p1k/2p4p/2Pp3r/3P4/4Q1K1/6P1/3b1q2 w - - bm Bf6; id Quick-24;
(Two of these are avoid-move positions: Quick-09 and Quick-22, marked "am" instead of "bm".)
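The positions above are EPD records: four position fields followed by operations such as "bm" (best move), "am" (avoid move) and "id". A minimal sketch of reading one record and judging an engine's answer could look like the following. The helper names `parse_epd` and `is_solved` are hypothetical, the 60 second limit is the 1 minute per position mentioned above, and the check is a simplification: it only looks at the final move and the time at which it was found, and it compares moves as plain SAN text with any trailing "+" or "#" ignored.

```python
def parse_epd(line):
    """Split one EPD record into its four position fields and its operations."""
    fields = line.strip().split(None, 4)
    board, side, castling, en_passant = fields[:4]
    ops = {}
    # Operations look like "bm Qc3;" or "id Quick-01;" and end with a semicolon.
    for chunk in fields[4].split(";"):
        chunk = chunk.strip()
        if chunk:
            opcode, _, operand = chunk.partition(" ")
            ops[opcode] = operand.strip()
    return {"board": board, "side": side, "castling": castling,
            "ep": en_passant, "ops": ops}


def _norm(move):
    """Drop check/mate suffixes so 'Rxf7+' and 'Rxf7' compare equal."""
    return move.rstrip("+#")


def is_solved(record, engine_move, seconds, limit=60.0):
    """Assumed rule: a 'bm' move must be played, an 'am' move must be avoided, within the limit."""
    if seconds > limit:
        return False
    ops = record["ops"]
    if "bm" in ops:
        return _norm(engine_move) in {_norm(m) for m in ops["bm"].split()}
    if "am" in ops:
        return _norm(engine_move) not in {_norm(m) for m in ops["am"].split()}
    return False


rec08 = parse_epd("3R4/5r1p/5ppk/8/1Q3PPq/5P2/6K1/8 w - - bm Rg8; id Quick-08;")
rec22 = parse_epd("4r2k/3n3p/2q3p1/2p1p1Q1/1pP1P3/1P6/5PP1/R2B2K1 b - - am Qxe4; id Quick-22;")
print(rec08["ops"]["id"], is_solved(rec08, "Rg8", 12.0))
print(rec22["ops"]["id"], is_solved(rec22, "Qxe4", 5.0))
```

A real test harness would also have to translate the engine's own move notation into SAN before comparing, which this sketch leaves out.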
Quicktest, English description