If there were a utility it would be a bit less annoying.
How do you check for the underpromotion errors? Do you use a utility?
What I do:
1) I run pgn-extract over the PGN collection to remove duplicates. As a by-product, this also gives me all games containing illegal moves, because pgn-extract does not match such games. This catches the third possibility.
2) For the first two possibilities I simply search for the strings =N, =R and =B in the PGN file and then look at and analyse those games. Particularly suspect are games where, immediately or only a few moves after the underpromotion, a stalemate is claimed or a win is claimed for the side that was not underpromoting.
So sorry, it is more of a legwork approach than a clever tool.
Robert
It is negligible in that it occurs in only 0.01% of your games.
After 760 games Tao 5.6 scored 50 points higher than its predecessor, which is a statistically significant improvement.
However, it still does not support underpromotions, and this turns out to be really annoying, because in such cases Tao
- may claim an incorrect stalemate
- may even claim an incorrect win
- may make incorrect moves
The last is easy to find, but the other two are really annoying: no other engine needs as much checking of game output and adjudicating as Tao.
It is not negligible: in the 760 games of Tao 5.6 there were 6 underpromotions to a knight and 3 underpromotions to a rook, 9 altogether, which means they occurred in more than 1% of the games.
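For anyone who wants to check the percentage, the arithmetic can be written out as below. This is just a sketch of the calculation, assuming (as the post implies) that each of the 9 underpromotions happened in a different game; the function name is made up.

```python
def underpromotion_share(underpromotions, games):
    """Fraction of games affected, assuming at most one underpromotion per game."""
    return underpromotions / games


games = 760
underpromotions = 6 + 3  # 6 to a knight, 3 to a rook

share = underpromotion_share(underpromotions, games)
print(f"{underpromotions} of {games} games = {share:.2%}")  # 9 of 760 games = 1.18%
```

That is about one game in 84. A rate of 0.01% would correspond to one game in 10,000.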
It is a shame that an engine at that level does not implement all rules of chess.
For tools, conditions, time control etc. please refer to the link below. The next engine will be the last remaining candidate for strongest free engine that I have not yet tested: List 5.12.
   Program             Elo    +   -  Games   Score  Av.Op.   Draws
01 Ruffian v2.1.0    : 2679  18  28    778  71.6 %    2518  24.9 %
02 Ruffian v2.0.0    : 2675  17  27    840  71.6 %    2515  25.8 %
03 Ruffian v1.0.1    : 2652  17  24    936  69.7 %    2508  26.7 %
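As a rough cross-check of how the columns in the list relate, the Elo figure can be approximated from Score and Av.Op. with the standard logistic Elo formula. This is a sketch under that assumption only: the rating tool behind the list is not named here and may compute its numbers differently, so small deviations are to be expected. The function name is made up.

```python
import math


def performance_rating(score_fraction, avg_opponent):
    """Average opponent rating plus the Elo gap implied by the score (logistic model)."""
    return avg_opponent + 400 * math.log10(score_fraction / (1 - score_fraction))


# (program, score as a fraction, Av.Op., Elo as listed), taken from the list above
rows = [
    ("Ruffian v2.1.0", 0.716, 2518, 2679),
    ("Ruffian v2.0.0", 0.716, 2515, 2675),
    ("Ruffian v1.0.1", 0.697, 2508, 2652),
]

for name, score, avg_opponent, listed in rows:
    estimate = performance_rating(score, avg_opponent)
    print(f"{name}: estimated {estimate:.1f}, listed {listed}")
```

For these three rows the estimate lands within about one point of the listed rating.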
Too early in the morning, and I had not finished my coffee: it is 1%.
The author is aware of this issue but has given it a low priority because of its insignificance, the main priority being book learning.
Just as an FYI: you state in your conditions that Ruffian 1.0.1 is fixed at 2650 as a reference point, and the list you posted shows this not to be the case.
I hardly think that comparing an amateur chess engine's fault to a wheel falling off a car is any kind of comparison. Would you say that a chess engine that loses 1% of the time is not negligible when looking at a win/loss ratio?
I am not sure whether it is negligible when the problem has a 1% probability, causes undetermined behaviour and essentially means that Tao does not support all the rules of chess. Losing a wheel 3 times a year is not negligible for a car either.
True. I posted some time later that I had changed the reference point: it is now Pharaon 2.62, fixed at 2509 (2509 was chosen to maintain continuity; since then Pharaon has fallen by 2 points relative to Ruffian 1.0.1). The reason was that Ruffian 1.0.1 has now been superseded by several newer versions, while Pharaon has stayed, and I guess will stay, with us for a while without a version change.
I should probably repost the full conditions and reference this new post from then on. But in any case the absolute reference point is arbitrary.
Robert
Such an engine just does not work according to the specs of the game.
In the Tao case it is not only a matter of losing or making mistakes: it incorrectly claims draws when it is in fact in a lost position, and I also saw it claim wins when the game was pretty much even. If it were just losing, OK, but this is simply incorrect behaviour.
Movei is losing less than 1% of its games on time with ponder on.
I think that the problem is not negligible, and I am considering releasing a new version.
I already have a new version without that problem, but hopefully I will make some improvements first.
There is a difference between kinds of mistakes.
Mistakes that cause the program to lose the game through chess errors are different from mistakes that cause the program to lose on time, or mistakes that cause the program not to understand the rules.
Bugs that cause the program to play a stupid tactical mistake in one out of 100 games are also more serious than a lack of endgame knowledge that causes the program to miss a draw in one out of 100 games.
Uri
It is neither a mistake nor a bug that we are talking about; it is the lack of knowledge in Tao of how to handle a pawn promotion by the opponent to anything other than a queen. My point was to put things into perspective: this is NOT significant.
I grant you that it does not understand this one aspect of the game. Let me ask you this: you state that you are testing for the strongest freeware blitz engine. In chess, strength is knowledge. How do you attain knowledge? Through learning. Yet you have disabled the learning features of the engines you are matching. Is this not negligible?
Such a rating list or tournament is nothing other than a measurement. And it depends on what you want to measure; the conditions must then be set accordingly. I do not want to measure how an engine can adapt to the style of another engine through learning, but how an engine would perform against another opponent if it played without any history against this opponent. In order to achieve this, learning must be off, so that each game of a given engine against an opponent is run under identical conditions, regardless of whether it is the first or the twentieth game in a series.
A side effect of leaving learning on is that it distorts ratings, because not all engines support learning. Testing with learning also means that, strictly speaking, each game against a given opponent is run under different conditions, and the results depend in principle on the number of games you play.
If I wanted to test learning functions, I would run series of games with the same engine(s), once with and once without learning, and compare the results.
Robert