Posted by: Fabien Letouzey at 22 June 2004 14:56:26:
In reply to: Re: Knockout and Nunn Elite 2 posted by: Heinz van Kempen at 22 June 2004 14:29:34:
Hi Heinz,
my first thought was to have some engines in Elite 2 whose ratings are already known in my list, so that I do not have to run those additional gauntlets. Fruit was the first candidate for this, as I think it is better to have more than 1000 games than "only" 640 games to decrease error margins. It is not that I distrust my own results, as I am seeing great performance elsewhere, but shooting stars are always in danger of being examined closely.
On the other hand, fine-tuning engines is promising. I have run a lot of tests with other CM settings, which gave much better results, and with Crafty personalities, and I have already used other settings for some engines, like the "heavy" settings for Tao.
Do I understand correctly that only the piece values are changed and the pawn (endgame) value remains at 100?
As I presume that Ferret and others will not apply for this "important" tournament, it would be no problem to include Fruit 1.5 and Fruit 1.5s (just an example of another name) in the event. In this strong field Fruit will look worse than it is anyway. The tournament will also not proceed that fast, as I am still experimenting with other engines with the aim of including more weaker ones, as in the knockout tournaments. But with more than 200 engines, some of them updated very frequently, this is almost impossible and does not seem to attract much interest.
Fruit 1.5 won most of the gauntlet matches you ran. IMO games against stronger opponents are necessary for a more accurate rating estimation indeed (I don't trust the Elo formula for large rating differences).
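To put rough numbers on the error-margin point, here is a minimal sketch in Python. It assumes the standard logistic Elo curve and treats games as independent win/loss trials, which ignores draws and therefore overstates the margin somewhat. The function names are made up for illustration and do not come from any rating tool. It does not say whether the formula itself is accurate for large rating differences; it only shows how the margin behaves under that model.

```python
import math

def expected_score(rating_diff):
    """Expected score under the standard logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))

def elo_from_score(score):
    """Inverse of expected_score: rating difference implied by a score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_margin(score, games, z=1.96):
    """Approximate half-width (in Elo) of a ~95% interval around a match score.

    Assumption: independent win/loss games (binomial variance), no draws.
    """
    se = math.sqrt(score * (1.0 - score) / games)
    low = elo_from_score(max(score - z * se, 1e-9))
    high = elo_from_score(min(score + z * se, 1.0 - 1e-9))
    return (high - low) / 2.0

if __name__ == "__main__":
    for games in (640, 1000):
        for score in (0.5, 0.9):
            margin = elo_margin(score, games)
            print(f"{games} games, score {score:.2f}: about +/- {margin:.0f} Elo")
```

Under these assumptions the margin shrinks as the number of games grows, and for the same number of games it is wider when the score is lopsided than when it is near 50%. That is one more reason to include games against stronger opponents.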
OK, I have nothing against it either.
Yes, that was my intention. I felt that the very definition of "centipawn" would be destroyed otherwise.
OK, my main point in the previous post is that a different name (hence a different rating) should be used each time a change is made to any engine (whether binary or configuration). It is obvious to you, but I wanted to state it clearly anyway.
Two more points of mine about making two versions participate in the same tournament are:
1) never do that when the tournament ranking is important (but perfectly OK for rating *each* version as you do)
2) I would prefer another promising engine to participate instead
But my main opinion is that people can do whatever they want with the versions I release (unlike the development versions that participated in some tournaments). So the decision whether to rate new settings is actually yours. I have no doubt that they are better.
Fabien.