Posted by: Heinz van Kempen on 21 September 2004 08:44:16:
Hi all,
Also in the repeated gauntlet at 40/40 (adapted to a 2 GHz CPU), the new Pharaon beta proves to be a monster: it scored no less than 52.5 points out of 74 games against all engines from the King, Queen and Rook Classes and the two promoters, DanChess and Spike. This puts the new Pharaon at rank 7 in the AEGT rating list, clearly at King Class level.
Games are available on my site under downloads and King/Queen gauntlets.
http://www.husvankempen.de/nunn/
Here are the results
Pharaon against:
King Class (14.5 points out of 24 games = 60.4%)
__________
Ruffian 1.0.5      1.5 - 0.5
List 512           1.0 - 1.0
Aristarch 4.50     1.0 - 1.0
Gothmog 1.0 B10    2.0 - 0.0
Thinker 4.6c       1.0 - 1.0
Crafty 19.15       1.0 - 1.0
SmarThink 0.17a    1.5 - 0.5
Tao 5.7 b04        0.5 - 1.5
Quark 2.35         1.0 - 1.0
Yace 0.99.87       1.0 - 1.0
Delfi 4.5          1.0 - 1.0
AnMon 5.32         2.0 - 0.0
Queen Class (17 points out of 22 games = 77.3%)
___________
SlowChess 2.93a       1.5 - 0.5
Movei 00.8.247s       1.5 - 0.5
Amyan 1.593b          1.5 - 0.5
Fruit 1.5t            0.5 - 1.5
Dragon 4.5 CF         1.5 - 0.5
King of Kings 2.56    2.0 - 0.0
Ufim 5.01             2.0 - 0.0
GLC 3.00.3.4          1.0 - 1.0
WildCat 4             2.0 - 0.0
Amy 0.8.7b            1.5 - 0.5
Jonny 2.64            2.0 - 0.0
Rook Class (19 points out of 24 games = 79.2%)
__________
Comet B.68               1.5 - 0.5
Arasan 7.4               1.5 - 0.5
Terra 3.3B11             2.0 - 0.0
Amateur 2.80             1.5 - 0.5
The Crazy Bishop 0052    1.0 - 1.0
Pepito v1.59             1.5 - 0.5
Frenzee 159              2.0 - 0.0
PostModernist 1010a      2.0 - 0.0
KnightDreamer 3.3        2.0 - 0.0
The Baron 1.4.0 b2       1.5 - 0.5
Naum 1.2                 0.5 - 1.5
Pharaon 2.62             2.0 - 0.0
And finally, against the promoters:
DanChess 1.0.6 DC    0.5 - 1.5
Spike 0.6            1.5 - 0.5
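For anyone who wants to check the class totals and percentages, here is a small Python sketch that simply adds up the results listed above. The scores are copied from the lists; the names `points` and `class_summary` are made up for this illustration, and two games per opponent is assumed from the 2-point pairings.

```python
# Pharaon 3.00b's points per opponent, in the order listed above.
# Assumption: every pairing is exactly two games (each line sums to 2.0).
GAMES_PER_OPPONENT = 2

points = {
    "King Class": [1.5, 1.0, 1.0, 2.0, 1.0, 1.0, 1.5, 0.5, 1.0, 1.0, 1.0, 2.0],
    "Queen Class": [1.5, 1.5, 1.5, 0.5, 1.5, 2.0, 2.0, 1.0, 2.0, 1.5, 2.0],
    "Rook Class": [1.5, 1.5, 2.0, 1.5, 1.0, 1.5, 2.0, 2.0, 2.0, 1.5, 0.5, 2.0],
    "Promoters": [0.5, 1.5],
}


def class_summary(scores):
    """Return (points, games, percentage) for one list of pairing scores."""
    total = sum(scores)
    games = GAMES_PER_OPPONENT * len(scores)
    return total, games, 100.0 * total / games


for name, scores in points.items():
    total, games, pct = class_summary(scores)
    print(f"{name:12s} {total:4.1f} / {games:2d} = {pct:.1f}%")

# Overall result across all four groups.
overall_points = sum(sum(scores) for scores in points.values())
overall_games = GAMES_PER_OPPONENT * sum(len(scores) for scores in points.values())
print(f"{'Overall':12s} {overall_points:4.1f} / {overall_games:2d} = "
      f"{100.0 * overall_points / overall_games:.1f}%")
```

The overall line should agree with the 52.5 points out of 74 games mentioned at the top and with the 70.9 % score shown for Pharaon 3.00b in the rating list.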
The similar performance against Queen and Rook Class engines has brought those two classes closer together in the AEGT rating list, and these ratings already seem quite realistic, if you keep in mind that the engines finishing at the bottom of Rook Class are considerably weaker than those in Queen Class.
What I am still not happy with is the difference in rating between King and Queen Class, because King Class engines are still rated too highly. I will try to explain why:
Before this one, we could run only four gauntlets spanning King and Queen Class to give ELOStat connections between the two. In those, the results were clearly in favour of King Class engines:
Pro Deo versus King = 15 points
Pro Deo versus Queen = 19.5 points
SOS versus King = 10.5 points
SOS versus Queen = 13.5 points
LambChop versus King = 8.5 points
LambChop versus Queen = 12 points
and especially
Little Goliath 1.0.0.14 against King = 9.5 points
Little Goliath 1.0.0.14 against Queen = 15 points
This leads to differences like these:
Quark 2.35  : 2692  49 54  142 44.0 %  2734
Movei 00.8.247s : 2574  44 56  150 48.3 %  2586 
After AEGT 1 we know that Quark is much stronger at long time controls than in blitz (in fact, for me it is the engine where this difference can be seen most clearly), but even so it will not be 118 points stronger than Movei with this strong beta.
or:
Pharaon 3.00b : 2746  62 78 74 70.9 %  2591 
Pharaon 2.62  : 2499  45 46  182 58.5 %  2439
The difference will be large, but not that large, as AEGT 2 will show.
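To put numbers on these two gaps, here is a minimal sketch using the standard logistic Elo formula. Treat it as a yardstick only: it is an assumption that this formula is a fair stand-in, and ELOStat's own calculation may differ in detail. The function names are invented for this example; the figures are copied from the rows quoted above.

```python
import math


def expected_score(rating_diff):
    """Expected score of the higher-rated engine under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


def performance_rating(score_fraction, avg_opponent):
    """Rating at which the logistic model predicts this score against avg_opponent."""
    return avg_opponent + 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


# (name, listed Elo, score fraction, average opponent rating) from the list.
rows = [
    ("Quark 2.35", 2692, 0.440, 2734),
    ("Movei 00.8.247s", 2574, 0.483, 2586),
    ("Pharaon 3.00b", 2746, 0.709, 2591),
    ("Pharaon 2.62", 2499, 0.585, 2439),
]

# What the listed gaps would mean in a direct match, according to the model.
for stronger, weaker in [(rows[0], rows[1]), (rows[2], rows[3])]:
    gap = stronger[1] - weaker[1]
    print(f"{stronger[0]} vs {weaker[0]}: gap {gap}, "
          f"expected score {100.0 * expected_score(gap):.1f}%")

# Cross-check: recompute each rating from score and average opponent alone.
for name, listed, score, avg_opp in rows:
    print(f"{name}: listed {listed}, recomputed {performance_rating(score, avg_opp):.0f}")
```

Under this model a 118-point gap corresponds to roughly a two-thirds expected score, and the 247-point gap between the two Pharaon versions to about 80 %. For these four rows the recomputed values land within a point or two of the listed ratings, which illustrates how strongly each rating depends on the average rating of the opponents an engine happened to meet.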
In my opinion you cannot blame ELOStat: you have to give such a program enough data for comparison to calculate correctly, and the number of games in the connecting gauntlets is so far nowhere near sufficient. So we will give ELOStat more data by running another 8 gauntlets. In progress now is AnMon 5.40, playing against the same engines as Pharaon (results tomorrow).
One thing patient people should wait for is AEGT 2, where all the engines are mixed up and the ratings will finally be realistic.
AEGT rating after the Pharaon 3.00b gauntlet, 21 September:
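As a rough illustration of why the number of games matters, the sketch below estimates how wide the uncertainty on a score is for different sample sizes. It is an assumption-laden simplification, not ELOStat's method: it uses a plain binomial model that treats every game as a win or a loss, so it overstates the spread when many games are drawn, and it converts scores to Elo with the standard logistic formula. The function names are invented for this example.

```python
import math


def score_standard_error(score, games):
    """Binomial standard error of a score fraction (draws ignored, so pessimistic)."""
    return math.sqrt(score * (1.0 - score) / games)


def elo_from_score(score):
    """Rating difference that the logistic Elo model associates with a score fraction."""
    return 400.0 * math.log10(score / (1.0 - score))


def elo_margin_95(score, games):
    """Approximate 95% margin in Elo points as (minus, plus) around the score."""
    half_width = 1.96 * score_standard_error(score, games)
    low = max(score - half_width, 0.001)   # clamp to keep the logarithm finite
    high = min(score + half_width, 0.999)
    centre = elo_from_score(score)
    return centre - elo_from_score(low), elo_from_score(high) - centre


# Game counts that occur in the rating list: 48, 74, 142/150 and 182.
for games in (48, 74, 150, 182):
    minus, plus = elo_margin_95(0.5, games)
    print(f"{games:3d} games at 50%: about -{minus:.0f} / +{plus:.0f} Elo")
```

The margin shrinks only with the square root of the number of games, so an engine with 48 games carries a far wider margin than one with 182, which is the same pattern the + and - columns of the list show.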
Program Elo +  -  Games  Score  Av.Op. Draws
1 Pro Deo 1.0 : 2821  79 118 48 71.9 %  2658  18.8 %
2 Ruffian 1.0.5 : 2814  49 53  142 62.7 %  2724  26.8 %
3 Aristarch 4.50  : 2804  50 52  142 61.3 %  2725  26.8 %
4 Delfi 4.5 : 2774  53 45  142 56.7 %  2727  33.1 %
5 List 512  : 2760  55 43  142 54.6 %  2729  34.5 %
6 Thinker 4.6c  : 2754  56 41  142 53.5 %  2729  38.0 %
7 Pharaon 3.00b : 2746  62 78 74 70.9 %  2591  31.1 %
8 Crafty 19.15  : 2731  46 46  142 50.0 %  2731  39.4 %
9 Tao 5.7 b04 : 2722  46 58  142 48.6 %  2732  25.4 %
 10 SmarThink 0.17a : 2711  42 56  142 46.8 %  2733  35.9 %
 11 Quark 2.35  : 2692  49 54  142 44.0 %  2734  26.1 %
 12 AnMon 5.32  : 2674  49 51  142 41.2 %  2736  28.9 %
 13 Yace 0.99.87  : 2674  44 51  142 41.2 %  2736  37.3 %
 14 Gothmog 1.0 B10 : 2672  52 51  142 40.8 %  2736  23.9 %
 15 LG Revival 1.00.1.4 : 2665 103 68 48 51.0 %  2658  39.6 %
 16 GLC 3.00.3.4  : 2664  48 44  150 62.0 %  2579  37.3 %
 17 SOS 4 : 2658  87 87 48 50.0 %  2658  29.2 %
 18 ElChinito 3.25  : 2631  51 46  148 57.4 %  2579  31.1 %
 19 WildCat 4 : 2624  52 48  150 56.0 %  2582  25.3 %
 20 Fruit 1.5t  : 2615  53 46  150 54.7 %  2583  26.7 %
 21 Patzer 3.61 : 2613  89 100 48 61.5 %  2531  18.8 %
 22 Amyan 1.593b  : 2609  54 45  150 53.7 %  2583  27.3 %
 23 LambChop 10.99  : 2607  80 94 48 42.7 %  2658  31.2 %
 24 SlowChess 2.93a : 2589  57 39  150 50.7 %  2585  37.3 %
 25 Movei 00.8.247s : 2574  44 56  150 48.3 %  2586  27.3 %
 26 Jonny 2.64  : 2564  46 54  150 46.7 %  2587  25.3 %
 27 Abrok 5.0 : 2560  98 88 48 54.2 %  2531  20.8 %
 28 Amy 0.8.7b  : 2559  46 54  150 46.0 %  2587  26.7 %
 29 Dragon 4.5 CF : 2551  41 53  150 44.7 %  2588  37.3 %
 30 The Baron 1.4.0 b2  : 2546  41 53  182 65.4 %  2436  22.0 %
 31 Pepito v1.59  : 2534  42 51  182 63.7 %  2436  23.1 %
 32 King of Kings 2.56  : 2527  47 50  150 41.0 %  2590  30.0 %
 33 Naum 1.2  : 2523  43 46  182 62.1 %  2437  27.5 %
 34 Comet B.68  : 2508  44 48  182 59.9 %  2438  22.0 %
 35 Pharaon 2.62  : 2499  45 46  182 58.5 %  2439  23.6 %
 36 Nejmet 3.07 : 2488  82 95 48 43.8 %  2531  29.2 %
 37 Spike 0.6 : 2484  3&