Nunn Blitz Muse 0.898 640 games

Archive of the old Parsimony forum. Some messages couldn't be restored.

Nunn Blitz Muse 0.898 640 games

Post by Heinz van Kempen » 06 Sep 2004, 11:28

Posted by: Heinz van Kempen at 06 September 2004 12:28:28:

Hi all,
here is a gauntlet over all 20 Nunn positions with Muse 0.898 on Athlon 2600+ and 3000+. Time control was 4m + 2s.
The results were really good, and Muse missed the Amateur Top 50 only by a narrow margin: its rating is 2458, and 2466 is needed for rank 50. So the rating list will be updated only after the next gauntlet, with Jonny 2.70.


Muse 0.898 - Glaurung 0.1.2                  19.5 - 20.5
Muse 0.898 - Movei 00_8_247s                 11.5 - 28.5
Muse 0.898 - Amy 0.8.7b                      12.5 - 27.5
Muse 0.898 - Nejmet_3.07                     17.0 - 23.0
Muse 0.898 - Dragon 4.5 CF                   17.0 - 23.0
Muse 0.898 - Abrok 5.0                       23.0 - 17.0
Muse 0.898 - Quark 2.35                      11.0 - 29.0
Muse 0.898 - Jonny 2.64                      11.5 - 28.5
Muse 0.898 - King of Kings 2.56              13.5 - 26.5
Muse 0.898 - AnMon5.30                       12.5 - 27.5
Muse 0.898 - Fruit 1.5 t                     09.5 - 30.5
Muse 0.898 - WildCat 4.0                     12.5 - 27.5
Muse 0.898 - Spike 0.7                       12.0 - 28.0
Muse 0.898 - Anaconda 1.6.2                  18.0 - 22.0
Muse 0.898 - Amyan 1.593b                    16.0 - 24.0
Muse 0.898 - LambChop 10.99                  12.5 - 27.5


Games are available for download:
http://www.husvankempen.de/nunn/
Best Regards
Heinz
Heinz van Kempen
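
As a cross-check on the table above, the 16 match results can be totalled and converted into an approximate rating difference. This is only a sketch: it assumes the standard logistic Elo formula and treats all opponents as a single pool, which is not necessarily how the rating list itself is computed. The names `muse_points`, `overall_score` and `elo_difference` are made up for illustration.

```python
import math

# Muse 0.898 points in each 40-game match, copied from the results table.
muse_points = {
    "Glaurung 0.1.2": 19.5,
    "Movei 00_8_247s": 11.5,
    "Amy 0.8.7b": 12.5,
    "Nejmet_3.07": 17.0,
    "Dragon 4.5 CF": 17.0,
    "Abrok 5.0": 23.0,
    "Quark 2.35": 11.0,
    "Jonny 2.64": 11.5,
    "King of Kings 2.56": 13.5,
    "AnMon5.30": 12.5,
    "Fruit 1.5 t": 9.5,
    "WildCat 4.0": 12.5,
    "Spike 0.7": 12.0,
    "Anaconda 1.6.2": 18.0,
    "Amyan 1.593b": 16.0,
    "LambChop 10.99": 12.5,
}
GAMES_PER_MATCH = 40  # every row of the table sums to 40


def overall_score(points, games_per_match=GAMES_PER_MATCH):
    """Return (total points, total games, score fraction)."""
    total = sum(points.values())
    games = games_per_match * len(points)
    return total, games, total / games


def elo_difference(score_fraction):
    """Rating difference implied by a score fraction (logistic Elo model).

    Assumption: the plain logistic curve; negative means weaker than
    the average opponent.
    """
    return -400.0 * math.log10(1.0 / score_fraction - 1.0)


total, games, fraction = overall_score(muse_points)
print(f"{total} / {games} = {fraction:.1%}")
print(f"implied difference to the opposition: {elo_difference(fraction):+.0f} Elo")
```

With these numbers Muse scores 229.5 points out of 640, roughly 36%, which under the assumed formula corresponds to about 100 Elo below the average of its opponents.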
 

Re: Nunn Blitz Muse 0.898 640 games

Post by Uri Blass » 06 Sep 2004, 12:19

Posted by: Uri Blass at 06 September 2004 13:19:49:
In reply to: Nunn Blitz Muse 0.898 640 games, posted by: Heinz van Kempen at 06 September 2004 12:28:28:
I see that Muse lost 15 matches out of 16.
I think a more realistic rating can be obtained if you give programs opponents against which they score near 50%.
It is possible that Fruit 1.5t suffered from playing stronger opponents than Fruit 1.5 did, and that this is why it had a lower rating than Fruit 1.5.
Note that I do not claim that playing stronger programs is bad for the rating, and it may depend on the program, but in general playing only stronger programs or only weaker programs is not good for a realistic rating list.
The best and the worst programs will always suffer from unrealistic ratings, but there is no reason for other programs to suffer from the same problem.
I think it would be a good idea to give some weak opponents to programs that score less than 45% in your list, and some strong opponents to programs that score more than 55%.
Uri
Uri Blass
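
The 45% / 55% rule of thumb from this post can be written down as a small helper. This is a minimal sketch of the suggestion only; the function name `opponent_advice` and the wording of the returned strings are invented for illustration and are not part of any rating tool.

```python
def opponent_advice(points, games, low=0.45, high=0.55):
    """Apply the 45% / 55% rule of thumb to an engine's overall score."""
    fraction = points / games
    if fraction < low:
        return "add some weaker opponents"
    if fraction > high:
        return "add some stronger opponents"
    return "opposition is balanced enough"


# Muse 0.898 scored 229.5 points in 640 games (sum of the results table).
print(opponent_advice(229.5, 640))
```

For Muse's gauntlet the overall score is well below 45%, so the rule would call for adding some weaker opponents.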
 

Re: Nunn Blitz Muse 0.898 640 games

Post by Heinz van Kempen » 06 Sep 2004, 12:54

Posted by: Heinz van Kempen at 06 September 2004 13:54:13:
In reply to: Re: Nunn Blitz Muse 0.898 640 games, posted by: Uri Blass at 06 September 2004 13:19:49:
Hello Uri,
you may be correct, although we have no proof for such statements. By the way, Spike had more or less the same opponents and did better, and I was not certain whether Spike is better than Muse (Muse did very well in the AEGT qualification, with much more time of course). In the running gauntlet with Jonny 2.70 there are now some opponents that are probably better, some about even, and some weaker than Jonny.
For Nunn Top 4 I have to say that there is again a big difference in strength from Pro Deo down to Muse. Would it be better to split this tournament up and run two smaller tournaments with 12 or 14 engines each, divided by strength? Of course this would give only 440 or 520 games to each engine.
Best Regards
Heinz
Heinz van Kempen
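
The figures of 440 and 520 games follow from simple counting, sketched below. The sketch assumes a full round robin in which every pairing plays 40 games, as in the gauntlet (640 games against 16 opponents); the function names are hypothetical.

```python
GAMES_PER_PAIRING = 40  # as in the gauntlet: 640 games / 16 opponents


def games_per_engine(participants, games_per_pairing=GAMES_PER_PAIRING):
    """Games each engine plays in a round robin of the given size."""
    return (participants - 1) * games_per_pairing


def total_games(participants, games_per_pairing=GAMES_PER_PAIRING):
    """Games in the whole tournament: one match per pair of engines."""
    pairings = participants * (participants - 1) // 2
    return pairings * games_per_pairing


for n in (12, 14):
    print(n, "engines:", games_per_engine(n), "games each,",
          total_games(n), "games in total")
```

Under that assumption 12 engines give 440 games each and 14 engines give 520 games each, matching the numbers in the post.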
 

Nunn Top 4 and Nunn Talents start lists

Post by Heinz van Kempen » 06 Sep 2004, 13:37

Posted by: Heinz van Kempen at 06 September 2004 14:37:42:
In reply to: Re: Nunn Blitz Muse 0.898 640 games, posted by: Heinz van Kempen at 06 September 2004 13:54:13:

Hi all,
Uri convinced me, so there will be two smaller tournaments with 14 participants and 520 games for each engine. Each one will include some engines whose rating is already fixed. I will use the latest version I have of each engine. The start date will be Wednesday morning. Maybe I will take two more into each group if some apply.
The tournaments will run over all Nunn positions with time control 4m + 2s on Athlon 2600+ and 3000+.
Start lists:

Nunn Top 4:
Pro Deo
AnMon
LG Revival
Crafty
Gothmog
Tao
Thinker
Jonny
The Baron
Patriot
Movei
Fruit
Ruffian
SOS

Nunn Talents:
Naum
Knight Dreamer
DanChess
Bruja
Trace
Cerebro
Muse
Chispa
Arasan
Snitch
Booot
Spike
Ufim
Fafis

http://www.husvankempen.de/nunn/
Best Regards
Heinz
Heinz van Kempen
 

Re: Nunn Blitz Muse 0.898 640 games

Post by Robert Allgeuer » 06 Sep 2004, 16:08

Posted by: Robert Allgeuer at 06 September 2004 17:08:22:
In reply to: Re: Nunn Blitz Muse 0.898 640 games, posted by: Uri Blass at 06 September 2004 13:19:49:
How do Fruit 1.5 and Fruit 1.5t differ?
Thanks
Robert
Robert Allgeuer
 

Re: Nunn Blitz Muse 0.898 640 games

Post by Heinz van Kempen » 06 Sep 2004, 17:22

Posted by: Heinz van Kempen at 06 September 2004 18:22:47:
In reply to: Re: Nunn Blitz Muse 0.898 640 games, posted by: Robert Allgeuer at 06 September 2004 17:08:22:
Hello Robert,
Fruit 1.5t uses the special settings from Joachim Rang; it is also called Fruit 1.5JR or tralala in some tournaments. I read that you have already experimented a lot with Fruit settings, and I suppose you also tested these.

Pawn = 100
Pawn (Endgame) = 100
Knight = 350
Knight (Endgame) = 320
Bishop = 350
Bishop (Endgame) = 320
Rook = 550
Rook (Endgame) = 520
Queen = 1050
Queen (Endgame) = 1000
Bishop Pair = 15
Bishop Pair (Endgame) = 50
For the moment Fruit 1.5t and Fruit 1.5 have similar ratings in my tournaments: Fruit 1.5t has 2619 after more than 900 games and Fruit 1.5 has 2615 after more than 600 games. So I was not able to demonstrate that the settings are superior, which is of course well possible.
Fruit 1.5t had stronger opposition, although it has now also played against Glaurung, Muse and Spike (I forgot to put the t there). Fabien himself once wrote to me that he does not believe in the Elo formula where weaker opposition is concerned. Another reason might be that you need more than 1000 games to demonstrate Elo differences of 20 points or so.
Best Regards
Heinz
Heinz van Kempen
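
The remark about needing more than 1000 games can be illustrated with a rough error-margin estimate. This is a back-of-the-envelope sketch, not the method used by any rating program: it assumes a score near 50%, independent games, the logistic Elo curve, and a worst-case per-game standard deviation of 0.5 (no draws; with draws the margin is somewhat smaller). The function name `elo_error_margin` is hypothetical.

```python
import math


def elo_error_margin(games, per_game_sd=0.5, z=1.96):
    """Approximate 95% error margin, in Elo, for a score near 50%.

    Assumptions: independent games, logistic Elo model, and a
    per-game standard deviation of 0.5 (worst case, no draws).
    """
    # Standard error of the score fraction after `games` games.
    se_score = per_game_sd / math.sqrt(games)
    # Slope of the logistic Elo curve at a 50% score:
    # d(Elo)/d(score) = 400 / (ln(10) * p * (1 - p)) with p = 0.5.
    elo_per_score = 400.0 / (math.log(10) * 0.25)
    return z * se_score * elo_per_score


for n in (600, 900, 1000, 2000):
    print(n, "games: about +/-", round(elo_error_margin(n), 1), "Elo")
```

Under these assumptions about 1000 games still leave a margin of roughly 20 Elo on a single rating, so a gap of 4 points between 2619 and 2615 is well inside the noise.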
 

Re: Nunn Blitz Muse 0.898 640 games

Post by Uri Blass » 06 Sep 2004, 17:59

Posted by: Uri Blass at 06 September 2004 18:59:16:
In reply to: Re: Nunn Blitz Muse 0.898 640 games, posted by: Heinz van Kempen at 06 September 2004 18:22:47:
I think it may be better simply to try to bring the scores of all players closer to 50%, where that is possible.
You can start with matches between programs that scored less than 50% in the elite tournament and programs that scored more than 50% in the top tournament but have a lower rating.
Uri
Uri Blass
 

Re: Nunn Blitz Muse 0.898 640 games

Post by Robert Allgeuer » 06 Sep 2004, 18:10

Posted by: Robert Allgeuer at 06 September 2004 19:10:40:
In reply to: Re: Nunn Blitz Muse 0.898 640 games, posted by: Heinz van Kempen at 06 September 2004 18:22:47:
In my Fruit testing the two settings (default and the alternative material settings, jr) were within each other's error margins, though the default was a bit ahead of jr. This was tested against exactly identical opponents, with games starting from the Nunn-I positions played with both sides.
Robert
Robert Allgeuer
 

