For my next test I included 12 engines (all of them from AEGT 2 King Class)
and compared ratings after every 11 games (i.e. after each full round robin).
The goal is to find the minimal number of games after which the ratings
become stable.
To imitate AEGT King Class, the same kind of time control (40 moves per time
period) is used, but a much shorter one: 40 moves per 2 minutes.
Hardware: Celeron 567MHz, 128MB RAM.
Program Elo + - Games Score Av.Op. Draws
1 Ruffian 1.0.5 : 2695 204 222 11 77.3 % 2482 27.3 %
2 Pro Deo 1.0 : 2656 211 251 11 72.7 % 2486 18.2 %
3 Thinker 4.6c : 2621 219 276 11 68.2 % 2489 9.1 %
4 Delfi 4.5 : 2529 247 210 11 54.5 % 2497 18.2 %
5 Quark 2.35 : 2529 247 210 11 54.5 % 2497 18.2 %
6 Tao 5.7 b04 : 2471 164 247 11 45.5 % 2503 36.4 %
7 Aristarch 4.50 : 2471 164 247 11 45.5 % 2503 36.4 %
8 Crafty 19.17 : 2442 192 237 11 40.9 % 2505 27.3 %
9 Yace 0.99.87 : 2442 192 237 11 40.9 % 2505 27.3 %
10 WildCat 4 : 2442 192 237 11 40.9 % 2505 27.3 %
11 Green Light Chess 3.00.3.4 : 2442 145 237 11 40.9 % 2505 45.5 %
12 Pharaon 3.00b : 2261 178 197 11 18.2 % 2522 36.4 %
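The Elo column in the list above can be related to the Score and Av.Op. columns. The post does not say which rating program produced the lists, so the sketch below simply assumes the standard logistic Elo curve; the function name `performance_rating` is made up for illustration. Under that assumption the first and last rows of the 11-game list come out within a point of the printed values.

```python
import math


def performance_rating(points, games, avg_opponent):
    """Rating implied by a score against opponents of known average rating.

    Assumption: standard logistic Elo curve (400 * log10 of the odds).
    The rating tool actually used for the lists is not named in the post.
    """
    score = points / games
    if not 0.0 < score < 1.0:
        raise ValueError("undefined for 0 % or 100 % scores")
    return avg_opponent + 400.0 * math.log10(score / (1.0 - score))


# Rows from the 11-game list: 77.3 % is 8.5/11, 18.2 % is 2/11.
ruffian = performance_rating(8.5, 11, 2482)   # list shows 2695
pharaon = performance_rating(2.0, 11, 2522)   # list shows 2261
print(round(ruffian), round(pharaon))
```

With only 11 games a single extra win moves the score by about 9 percentage points, which is one way to see why the early ratings jump around so much.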
(Chng = change in place relative to the previous list)
Chng Program Elo + - Games Score Av.Op. Draws
0 1 Ruffian 1.0.5 : 2673 123 143 22 75.0 % 2483 31.8 %
0 2 Pro Deo 1.0 : 2637 129 182 22 70.5 % 2486 13.6 %
0 3 Thinker 4.6c : 2603 135 170 22 65.9 % 2489 13.6 %
+6 4 WildCat 4 : 2498 136 136 22 50.0 % 2498 27.3 %
0 5 Quark 2.35 : 2498 136 136 22 50.0 % 2498 27.3 %
-2 6 Delfi 4.5 : 2484 128 158 22 47.7 % 2500 22.7 %
0 7 Aristarch 4.50 : 2484 115 158 22 47.7 % 2500 31.8 %
-2 8 Tao 5.7 b04 : 2455 134 149 22 43.2 % 2502 22.7 %
0 9 Yace 0.99.87 : 2440 99 145 22 40.9 % 2504 45.5 %
+1 10 Green Light Chess 3.00.3.4 : 2440 114 145 22 40.9 % 2504 36.4 %
+1 11 Pharaon 3.00b : 2409 119 138 22 36.4 % 2507 36.4 %
-4 12 Crafty 19.17 : 2377 165 132 22 31.8 % 2510 18.2 %
Chng Program Elo + - Games Score Av.Op. Draws
0 1 Ruffian 1.0.5 : 2632 101 114 33 69.7 % 2488 30.3 %
0 2 Pro Deo 1.0 : 2599 106 134 33 65.2 % 2491 15.2 %
+4 3 Aristarch 4.50 : 2579 110 109 33 62.1 % 2493 27.3 %
-1 4 Thinker 4.6c : 2539 118 109 33 56.1 % 2496 21.2 %
+4 5 Yace 0.99.87 : 2519 123 84 33 53.0 % 2498 39.4 %
-2 6 WildCat 4 : 2500 113 113 33 50.0 % 2500 21.2 %
+4 7 Pharaon 3.00b : 2471 89 121 33 45.5 % 2502 36.4 %
-2 8 Delfi 4.5 : 2471 96 121 33 45.5 % 2502 30.3 %
+1 9 Green Light Chess 3.00.3.4 : 2471 103 121 33 45.5 % 2502 24.2 %
-5 10 Quark 2.35 : 2431 111 112 33 39.4 % 2506 24.2 %
-3 11 Tao 5.7 b04 : 2421 109 110 33 37.9 % 2507 27.3 %
0 12 Crafty 19.17 : 2367 150 101 33 30.3 % 2512 12.1 %
After 33 games the migration within the list (+4, +4, +4, -5, -3) is still
too large, because the number of games is small.
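The Chng column can be recomputed from two consecutive lists. Below is a minimal sketch using the 22-game and 33-game orderings; the helper name `place_changes` and the cut-off of 3 places for a "big" move are assumptions made only for this illustration.

```python
def place_changes(previous, current):
    """Return {engine: places gained}; positive means a move up the list."""
    old_place = {name: place for place, name in enumerate(previous, start=1)}
    return {name: old_place[name] - place
            for place, name in enumerate(current, start=1)}


# Orderings copied from the 22-game and 33-game lists.
after_22 = ["Ruffian 1.0.5", "Pro Deo 1.0", "Thinker 4.6c", "WildCat 4",
            "Quark 2.35", "Delfi 4.5", "Aristarch 4.50", "Tao 5.7 b04",
            "Yace 0.99.87", "Green Light Chess 3.00.3.4", "Pharaon 3.00b",
            "Crafty 19.17"]
after_33 = ["Ruffian 1.0.5", "Pro Deo 1.0", "Aristarch 4.50", "Thinker 4.6c",
            "Yace 0.99.87", "WildCat 4", "Pharaon 3.00b", "Delfi 4.5",
            "Green Light Chess 3.00.3.4", "Quark 2.35", "Tao 5.7 b04",
            "Crafty 19.17"]

changes = place_changes(after_22, after_33)

# Hypothetical threshold: call a jump of 3 or more places a big move.
big_moves = {name: c for name, c in changes.items() if abs(c) >= 3}
for name, c in big_moves.items():
    print(f"{c:+d} {name}")
```

The big moves it prints are the same five shifts quoted above.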
Chng Program Elo + - Games Score Av.Op. Draws
0 1 Ruffian 1.0.5 : 2621 86 95 44 68.2 % 2489 31.8 %
0 2 Pro Deo 1.0 : 2605 89 112 44 65.9 % 2490 18.2 %
------------------
0 3 Aristarch 4.50 : 2589 91 96 44 63.6 % 2492 27.3 %
0 4 Thinker 4.6c : 2558 96 101 44 59.1 % 2494 18.2 %
+1 5 WildCat 4 : 2543 99 93 44 56.8 % 2496 22.7 %
-1 6 Yace 0.99.87 : 2507 108 76 44 51.1 % 2499 34.1 %
----------------------
0 7 Pharaon 3.00b : 2463 84 101 44 44.3 % 2503 29.5 %
0 8 Delfi 4.5 : 2463 84 101 44 44.3 % 2503 29.5 %
----------------------
+1 9 Quark 2.35 : 2418 97 92 44 37.5 % 2507 25.0 %
+1 10 Tao 5.7 b04 : 2418 97 92 44 37.5 % 2507 25.0 %
+1 11 Crafty 19.17 : 2411 115 91 44 36.4 % 2508 13.6 %
-3 12 Green Light Chess 3.00.3.4 : 2403 107 90 44 35.2 % 2509 20.5 %
After 44 games, we can differentiate 4 groups (separated by the dashed lines
in the list above).
Note that from now on (from 44 up to 110 games) all shifts occur only
within those groups!
Chng Program Elo + - Games Score Av.Op. Draws
+1 1 Pro Deo 1.0 : 2615 77 98 55 67.3 % 2489 21.8 %
-1 2 Ruffian 1.0.5 : 2589 80 84 55 63.6 % 2492 29.1 %
---------------------
+1 3 Thinker 4.6c : 2546 87 75 55 57.3 % 2496 30.9 %
+2 4 Yace 0.99.87 : 2541 88 73 55 56.4 % 2496 32.7 %
-2 5 Aristarch 4.50 : 2541 88 80 55 56.4 % 2496 25.5 %
-1 6 WildCat 4 : 2529 91 81 55 54.5 % 2497 21.8 %
----------------------
+1 7 Delfi 4.5 : 2465 77 90 55 44.5 % 2503 27.3 %
-1 8 Pharaon 3.00b : 2459 76 88 55 43.6 % 2504 29.1 %
----------------------
+1 9 Tao 5.7 b04 : 2447 85 86 55 41.8 % 2505 21.8 %
-1 10 Quark 2.35 : 2435 88 84 55 40.0 % 2506 21.8 %
0 11 Crafty 19.17 : 2429 95 83 55 39.1 % 2506 16.4 %
0 12 Green Light Chess 3.00.3.4 : 2404 96 79 55 35.5 % 2508 20.0 %
Chng Program Elo + - Games Score Av.Op. Draws
0 1 Pro Deo 1.0 : 2638 67 98 66 70.5 % 2487 19.7 %
0 2 Ruffian 1.0.5 : 2594 72 78 66 64.4 % 2491 28.8 %
--------------------
+1 3 Yace 0.99.87 : 2553 78 71 66 58.3 % 2495 28.8 %
-1 4 Thinker 4.6c : 2534 81 66 66 55.3 % 2497 31.8 %
+1 5 WildCat 4 : 2524 83 75 66 53.8 % 2497 19.7 %
-1 6 Aristarch 4.50 : 2519 84 73 66 53.0 % 2498 21.2 %
---------------------
+1 7 Pharaon 3.00b : 2485 69 85 66 47.7 % 2501 25.8 %
-1 8 Delfi 4.5 : 2461 71 80 66 43.9 % 2503 27.3 %
---------------------
0 9 Tao 5.7 b04 : 2456 73 80 66 43.2 % 2504 25.8 %
0 10 Quark 2.35 : 2426 81 75 66 38.6 % 2506 22.7 %
0 11 Crafty 19.17 : 2411 92 73 66 36.4 % 2508 15.2 %
0 12 Green Light Chess 3.00.3.4 : 2400 87 71 66 34.8 % 2509 21.2 %
So we get the right group differentiation after 44 games, and the
"minimum minimorum" here is 44 games.
In the later rating lists, after more games, shifts occur only within
the groups.
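The "shifts only within groups" check can be done mechanically. This is a sketch only: the group sizes (places 1-2, 3-6, 7-8, 9-12) are read off the dashed separator lines in the 44-game list, and the function names are invented for the example. It compares the 44-game grouping with the 66-game list, and with the 22-game list as a contrast.

```python
def split_by_sizes(ranking, sizes):
    """Cut an ordered list of engines into consecutive groups (as sets)."""
    groups, start = [], 0
    for size in sizes:
        groups.append(set(ranking[start:start + size]))
        start += size
    return groups


def same_groups(reference, other, sizes):
    """True if every engine sits in the same place range in both lists."""
    return split_by_sizes(reference, sizes) == split_by_sizes(other, sizes)


# Group sizes taken from the dashed lines in the 44-game list.
SIZES = [2, 4, 2, 4]

after_22 = ["Ruffian", "Pro Deo", "Thinker", "WildCat", "Quark", "Delfi",
            "Aristarch", "Tao", "Yace", "Green Light Chess", "Pharaon",
            "Crafty"]
after_44 = ["Ruffian", "Pro Deo", "Aristarch", "Thinker", "WildCat", "Yace",
            "Pharaon", "Delfi", "Quark", "Tao", "Crafty",
            "Green Light Chess"]
after_66 = ["Pro Deo", "Ruffian", "Yace", "Thinker", "WildCat", "Aristarch",
            "Pharaon", "Delfi", "Tao", "Quark", "Crafty",
            "Green Light Chess"]

print(same_groups(after_44, after_66, SIZES))  # later list vs 44 games
print(same_groups(after_44, after_22, SIZES))  # earlier list vs 44 games
```

The 66-game list keeps every engine inside its 44-game group, while the 22-game list does not, since several engines had not yet settled by then.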
Chng Program Elo + - Games Score Av.Op. Draws
0 1 Pro Deo 1.0 : 2621 63 84 77 68.2 % 2489 22.1 %
0 2 Ruffian 1.0.5 : 2602 65 75 77 65.6 % 2490 27.3 %
---------------------
0 3 Yace 0.99.87 : 2558 71 70 77 59.1 % 2494 24.7 %
0 4 Thinker 4.6c : 2545 73 65 77 57.1 % 2496 28.6 %
+1 5 Aristarch 4.50 : 2516 78 67 77 52.6 % 2498 22.1 %
-1 6 WildCat 4 : 2516 78 68 77 52.6 % 2498 19.5 %
---------------------
+1 7 Delfi 4.5 : 2479 66 77 77 46.8 % 2502 23.4 %
-1 8 Pharaon 3.00b : 2454 70 73 77 42.9 % 2504 23.4 %
--------------------
0 9 Tao 5.7 b04 : 2445 71 72 77 41.6 % 2505 23.4 %
0 10 Quark 2.35 : 2441 73 71 77 40.9 % 2505 22.1 %
0 11 Crafty 19.17 : 2411 85 67 77 36.4 % 2508 15.6 %
0 12 Green Light Chess 3.00.3.4 : 2411 77 67 77 36.4 % 2508 23.4 %
Chng Program Elo + - Games Score Av.Op. Draws
0 1 Pro Deo 1.0 : 2609 60 77 88 66.5 % 2490 21.6 %
0 2 Ruffian 1.0.5 : 2601 61 69 88 65.3 % 2491 28.4 %
---------------------
+1 3 Thinker 4.6c : 2540 69 59 88 56.2 % 2496 30.7 %
-1 4 Yace 0.99.87 : 2536 69 63 88 55.7 % 2497 25.0 %
0 5 Aristarch 4.50 : 2533 70 63 88 55.1 % 2497 23.9 %
0 6 WildCat 4 : 2522 72 62 88 53.4 % 2498 22.7 %
----------------------
0 7 Delfi 4.5 : 2485 61 73 88 47.7 % 2501 22.7 %
0 8 Pharaon 3.00b : 2456 65 68 88 43.2 % 2504 22.7 %
-----------------------
0 9 Tao 5.7 b04 : 2449 63 67 88 42.0 % 2504 27.3 %
0 10 Quark 2.35 : 2445 66 67 88 41.5 % 2505 23.9 %
+1 11 Green Light Chess 3.00.3.4 : 2426 70 64 88 38.6 % 2507 22.7 %
-1 12 Crafty 19.17 : 2399 81 61 88 34.7 % 2509 17.0 %
Chng Program Elo + - Games Score Av.Op. Draws
0 1 Pro Deo 1.0 : 2607 56 73 99 66.2 % 2490 21.2 %
0 2 Ruffian 1.0.5 : 2599 57 64 99 65.2 % 2491 29.3 %
-----------------
0 3 Thinker 4.6c : 2542 64 56 99 56.6 % 2496 30.3 %
0 4 Yace 0.99.87 : 2535 65 58 99 55.6 % 2497 26.3 %
0 5 Aristarch 4.50 : 2529 66 60 99 54.5 % 2497 22.2 %
0 6 WildCat 4 : 2513 69 57 99 52.0 % 2499 23.2 %
--------------------
+1 7 Pharaon 3.00b : 2477 59 67 99 46.5 % 2502 22.2 %
-1 8 Delfi 4.5 : 2471 60 66 99 45.5 % 2502 22.2 %
--------------------
+1 9 Quark 2.35 : 2441 63 62 99 40.9 % 2505 23.2 %
-1 10 Tao 5.7 b04 : 2438 60 62 99 40.4 % 2505 28.3 %
0 11 Green Light Chess 3.00.3.4 : 2431 64 61 99 39.4 % 2506 24.2 %
0 12 Crafty 19.17 : 2418 71 59 99 37.4 % 2507 18.2 %
Chng Program Elo + - Games Score Av.Op. Draws
0 1 Ruffian 1.0.5 : 2602 54 61 110 65.5 % 2491 29.1 %
0 2 Pro Deo 1.0 : 2595 54 67 110 64.5 % 2491 21.8 %
------------------------
+2 3 Aristarch 4.50 : 2546 60 59 110 57.3 % 2496 21.8 %
0 4 Yace 0.99.87 : 2544 61 56 110 56.8 % 2496 26.4 %
-2 5 Thinker 4.6c : 2541 61 54 110 56.4 % 2496 29.1 %
0 6 WildCat 4 : 2526 63 56 110 54.1 % 2497 22.7 %
--------------------------
+1 7 Delfi 4.5 : 2465 58 62 110 44.5 % 2503 21.8 %
-1 8 Pharaon 3.00b : 2459 60 61 110 43.6 % 2504 20.0 %
--------------------------
+2 9 Green Light Chess 3.00.3.4 : 2447 59 59 110 41.8 % 2505 23.6 %
+2 10 Crafty 19.17 : 2429 66 57 110 39.1 % 2506 18.2 %
-1 11 Quark 2.35 : 2429 61 57 110 39.1 % 2506 23.6 %
-3 12 Tao 5.7 b04 : 2417 62 56 110 37.3 % 2507 25.5 %
Out of curiosity, you may look at the current state of AEGT 2 and compare
the results. For most engines (8) they are similar, but there are 4
exceptions: at the long time control Tao and Delfi are doing better, while
WildCat and Pharaon are doing worse.
AEGT 2, King Class
2004.10.01 - 2004.11.02
Score
------------------------------------------
1: Ruffian 1.0.5 62.0 / 99
2: Pro Deo 1.0 59.5 / 99
3: Aristarch 4.50 56.5 / 99
4: Delfi 4.5 56.0 / 99
5: Tao 5.7 b04 52.5 / 99
6: Thinker 4.6c 52.0 / 99
7: Yace 0.99.87 50.0 / 99
8: Quark 2.35 46.5 / 99
9: WildCat 4 44.0 / 99
10: Green Light Chess 3.00.3.4 41.0 / 99
11: Pharaon 3.00b 40.5 / 99
12: Crafty 19.17 33.5 / 99
------------------------------------------
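One way to put the two sets of results side by side is to convert the AEGT points to percentages and subtract the Score column of the 110-game list. This is only a sketch: the post does not say which measure was used to pick the exceptions, and the function name `score_shift` is made up here.

```python
# Points out of 99 games, copied from the AEGT 2 King Class table.
AEGT_POINTS = {
    "Ruffian 1.0.5": 62.0, "Pro Deo 1.0": 59.5, "Aristarch 4.50": 56.5,
    "Delfi 4.5": 56.0, "Tao 5.7 b04": 52.5, "Thinker 4.6c": 52.0,
    "Yace 0.99.87": 50.0, "Quark 2.35": 46.5, "WildCat 4": 44.0,
    "Green Light Chess 3.00.3.4": 41.0, "Pharaon 3.00b": 40.5,
    "Crafty 19.17": 33.5,
}

# Score column (percent) from the 110-game list at 40 moves per 2 minutes.
SHORT_TC_PERCENT = {
    "Ruffian 1.0.5": 65.5, "Pro Deo 1.0": 64.5, "Aristarch 4.50": 57.3,
    "Yace 0.99.87": 56.8, "Thinker 4.6c": 56.4, "WildCat 4": 54.1,
    "Delfi 4.5": 44.5, "Pharaon 3.00b": 43.6,
    "Green Light Chess 3.00.3.4": 41.8, "Crafty 19.17": 39.1,
    "Quark 2.35": 39.1, "Tao 5.7 b04": 37.3,
}


def score_shift(aegt_points, short_percent, aegt_games=99):
    """Percentage points gained at the long control (negative = lost)."""
    return {name: 100.0 * pts / aegt_games - short_percent[name]
            for name, pts in aegt_points.items()}


shift = score_shift(AEGT_POINTS, SHORT_TC_PERCENT)
for name, delta in sorted(shift.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<28} {delta:+6.1f}")
```

On these figures Tao and Delfi show the largest gains at the long time control and WildCat the largest loss.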
Conclusions:
Rating groups (RG) can be differentiated after 40-50 games. So the minimum
minimorum is once again 40-44 games.
There are two RG criteria, which must be applied together (one of them alone
is not enough!):
1) Small rating difference;
2) Changes in place occur only within a group.
To divide engines into groups, you must look at the dynamics, not at a
static snapshot.
There is little chance of differentiating engines within a group. All kinds
of things can occur, because all engines within a group are very close in
strength. So there is no sense in trying.
Igor