Winboard Forum

by **Heinz van Kempen** » 10 May 2004, 16:05

Posted by: Heinz van Kempen at 10 May 2004 17:05:56:

Hi :-)

Nunn Blitz Elite is finished, giving 600 games to each engine version.
Conditions:
Athlon 2600+
64 MB Hash
5 men EGTB
all Nunn2 positions
Time control: 4 min. + 2 sec.

1 Ruffian 1.0.5-------------392.5 / 600
2 List 512------------------364.0 / 600
3 Gandalf 4.32h-------------342.5 / 600
4 SOS 4 for Arena-----------325.5 / 600
5 Aristarch 4.41------------324.0 / 600
6 SmarThink 0.17a-----------323.5 / 600
7 El Chinito 3.25-----------299.0 / 600
8 Ktulu 4.2-----------------293.5 / 600
9 Thinker 4.5e--------------290.5 / 600
10 Crafty 19.11-------------285.5 / 600
11 Delfi 4.4----------------285.0 / 600
12 Kaissa 1.7---------------270.0 / 600
13 Green Light Chess 3.00---264.5 / 600
14 Yace Paderborn-----------254.0 / 600
15 Pepito v1.59-------------250.5 / 600
16 Anaconda 1.6.2-----------235.5 / 600

Nunn Top is also coming to an end and Nunn Blitz E will also be finished in a few hours with Arasan as a clear winner and a much improved Booot. Results can be seen here:
http://www.husvankempen.de/nunn/
Nunn Active tournaments are in progress, but will last some time.
Additionally I will start a new Blitz tournament tonight with top engines, as this is the only way to compare different versions with a lot of games.
So far I have the following:
Aristarch 4.50
Thinker 4.6b
Crafty 19.12 Capablanca
Patriot light
Amy 0.8.7b
LambChop 10.99
King of Kings 2.55
PostModernist 1010a
The Baron 1.3.1b1
Terra 3.3B11
Movei 00.8.198
All those who participated in Nunn Elite, Top, B and C and want to test a new beta can send it. Private engines are also welcome, if they are strong enough for this group. If there is not much interest I will add Booot 3.3 and Naum 1.0.
Best Regards
Heinz

by **Robert Allgeuer** » 10 May 2004, 18:03

Posted by: Robert Allgeuer at 10 May 2004 19:03:18:
In reply to: Nunn Blitz Elite finished, posted by: Heinz van Kempen at 10 May 2004 17:05:56:

>1 Ruffian 1.0.5-------------392.5 / 600
>2 List 512------------------364.0 / 600
>3 Gandalf 4.32h-------------342.5 / 600
>4 SOS 4 for Arena-----------325.5 / 600
>5 Aristarch 4.41------------324.0 / 600
>6 SmarThink 0.17a-----------323.5 / 600
>7 El Chinito 3.25-----------299.0 / 600
>8 Ktulu 4.2-----------------293.5 / 600
>9 Thinker 4.5e--------------290.5 / 600
>10 Crafty 19.11-------------285.5 / 600
>11 Delfi 4.4----------------285.0 / 600
>12 Kaissa 1.7---------------270.0 / 600
>13 Green Light Chess 3.00---264.5 / 600
>14 Yace Paderborn-----------254.0 / 600
>15 Pepito v1.59-------------250.5 / 600
>16 Anaconda 1.6.2-----------235.5 / 600
>
I always find it interesting to compare your results with mine: your tests use essentially twice as much CPU (adjusted for clock speed) as my tests. So one can get an estimate of how the various engines scale with longer time controls, faster CPUs, or greater search depth.
From this set of results I would say:
SOS, Aristarch, El Chinito and SmarThink go up in rating when tested with greater search depth
Yace, Pepito and Delfi go down when tested with greater search depth
GLC goes slightly up
Ktulu, Crafty and Thinker probably go slightly down
Totally unproven, just some observations...
Robert

by **Heinz van Kempen** » 10 May 2004, 18:20

Posted by: Heinz van Kempen at 10 May 2004 19:20:42:
In reply to: Re: Nunn Blitz Elite finished, posted by: Robert Allgeuer at 10 May 2004 19:03:18:

Hi Robert,
From my previous tournaments without Nunn positions I can confirm that Green Light Chess, SOS, Aristarch and especially Ruffian do better with more time and/or a faster CPU.
Maybe more can be seen once I have also run my Active tournaments for some months. We have a lot of similar results in our rating lists, except that you have List in front of Ruffian. But you have even more games, so maybe after 400 more games the gap between those two will close. There are still error margins even after 600 games.
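To give a rough feeling for those error margins, here is a minimal sketch in Python. It is not how any of the rating lists in this thread were computed. It assumes the usual logistic Elo model, treats each total as a score against the average of the field, and uses a simple binomial standard error that ignores draws (so it overstates the margin). The function names are made up for this illustration.

```python
import math

def elo_from_score(p: float) -> float:
    """Elo difference implied by a score fraction p (0 < p < 1), logistic model."""
    return -400.0 * math.log10(1.0 / p - 1.0)

def score_interval(points: float, games: int, z: float = 1.96):
    """Score fraction with an approximate 95% range (z = 1.96).

    Assumption: every game is treated as a win or a loss, which gives an
    upper bound on the standard error when many games are drawn.
    """
    p = points / games
    se = math.sqrt(p * (1.0 - p) / games)
    return p, p - z * se, p + z * se

# Totals taken from the Nunn Blitz Elite table in this thread.
for name, points in [("Ruffian 1.0.5", 392.5), ("List 512", 364.0)]:
    p, low, high = score_interval(points, 600)
    print(f"{name}: {elo_from_score(p):+.0f} Elo vs. field "
          f"(approx. 95% range {elo_from_score(low):+.0f} to {elo_from_score(high):+.0f})")
```

Comparing the two printed ranges shows whether a gap between two engines is larger than the uncertainty that remains after 600 games.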
Best Regards
Heinz

by **Robert Allgeuer** » 10 May 2004, 19:00

Posted by: Robert Allgeuer at 10 May 2004 20:00:13:
In reply to: Re: Nunn Blitz Elite finished, posted by: Heinz van Kempen at 10 May 2004 19:20:42:

List is a special case: it depends on how you deal with its results and on whether you play with EGTBs.
It has an endgame bug: it cannot mate a lone king. If a game reaches such a position, it will end in a draw. This creates a problem, because engines that resign lose such games, whereas engines that do not resign and play on until they are mated get a draw.
The result is that such engines score relatively better against List. I decided to adjudicate such games as wins for List (that is, I assume that ALL engines resign in a lost position). I found this preferable to not adjudicating, which would have influenced the ratings of all other engines: engines that do not resign would have gone up in the rating list, because they would then score "well" against a very strong opponent. Of course, my resulting List rating is the rating of List ASSUMING that the opponent always resigns, or ASSUMING the bug is fixed in List, so it is higher than if you just let List play as it is. I believe that the pure playing strength of List is indeed a little higher than that of Ruffian 1.0.1, and I assume it is this bug that pushes it down in many rating lists. The bug, by the way, only manifests itself with tablebases.
It is actually not sufficient to just set the GUI to adjudicate when a score threshold is reached, because various engines (e.g. Yace and Aristarch) output the score in a non-standard form, so the GUI does not necessarily recognise it. So I always go through all of List's games and check for this kind of draw.
This may explain the rating difference between our results for List. In addition, it may very well be the case that List indeed goes slightly down with greater search depth, while Ruffian possibly goes slightly up.
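The adjudication rule described above can be sketched in a few lines of Python. This only illustrates the rule and is not the procedure actually used. The `Game` record, its `bare_king` field (set by hand while reviewing the games) and the `adjudicate` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Game:
    white: str
    black: str
    result: str               # "1-0", "0-1" or "1/2-1/2"
    # Hypothetical manual flag: "white" or "black" if that side ended with a
    # bare king while the other side still had mating material, else None.
    bare_king: Optional[str] = None

def adjudicate(game: Game, engine: str = "List") -> str:
    """Return the result to use for rating: a draw that the named engine
    reached against a bare king is counted as a win for that engine."""
    if game.result != "1/2-1/2" or game.bare_king is None:
        return game.result
    if game.white.startswith(engine) and game.bare_king == "black":
        return "1-0"
    if game.black.startswith(engine) and game.bare_king == "white":
        return "0-1"
    return game.result

# Example: List (White) could not mate Black's lone king, so the game was drawn.
print(adjudicate(Game("List 512", "Yace Paderborn", "1/2-1/2", bare_king="black")))
```

Flagging the games by hand, instead of relying on the engines' reported scores, mirrors the point above that some engines output scores the GUI does not recognise.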
Robert
