Hello HG,
IMO the problem with Numpty described in the thread linked below also shows unwanted behaviour
in WB_F, or rather a bug in the reported result.
http://www.open-aurec.com/wbforum/viewtopic.php?f=2&t=50132
Guenther
H.G.Muller wrote:Well, we should discuss this. From the fact that this message exists it should be obvious that WB exactly knows what is going on, and that ruling a draw in this situation was the intended behavior.
My reasoning was this: There are several weak engines that fail to print RESULT commands, and just exit when they want to end the game for reasons of checkmate, repetition or 50-move draw. Forfeiting engines only for non-compliancy seems a bit harsh. When I am testing an engine I am developing against such a non-compliant engine, the validity of the testing is very much disrupted by such forfeits.
So my policy is to forfeit engines only in case of absolute necessity. If they send a false RESULT or illegal-move claim, I can do little else than forfeit them, as most engines stop after such a claim, and it would be a waste of time to wait them out. (Not to speak of the disaster when autoCallFlag is off...) But if the problem is merely that they omit the RESULT claim, and WinBoard can deduce the result, it uses that result rather than forfeiting them. This is in general much less disruptive for testing when the score for the opponent matters; handing free points to an engine that should have lost or drawn on its own merits has a very distorting effect on rating lists. Forfeiting would basically make the non-compliant engine useless for testing, and stable engines in the Elo range where this is common are already very hard to find. They are usually completely deterministic and often unable to handle an external book, so you need many of them.
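To make the policy above concrete, here is a minimal sketch in Python. It is not WinBoard's actual code: the names `EngineEvent` and `adjudicate` are made up for illustration, and the last branch (an engine that exits where no result can be deduced gets forfeited) is my assumption, as the text above does not spell out that case.

```python
from enum import Enum, auto
from typing import Optional


class EngineEvent(Enum):
    """Hypothetical event types, named only for this illustration."""
    FALSE_RESULT_CLAIM = auto()
    ILLEGAL_MOVE_CLAIM = auto()
    EXITED_WITHOUT_RESULT = auto()


def adjudicate(event: EngineEvent,
               deduced_result: Optional[str],
               offender_is_white: bool) -> str:
    """Return the PGN result string for a game ended by an engine event.

    deduced_result is what the GUI itself can work out from the position
    ("1-0", "0-1", "1/2-1/2"), or None if it cannot deduce a result.
    """
    forfeit = "0-1" if offender_is_white else "1-0"

    # A false claim leaves little choice: the engine usually stops playing.
    if event in (EngineEvent.FALSE_RESULT_CLAIM,
                 EngineEvent.ILLEGAL_MOVE_CLAIM):
        return forfeit

    # The engine merely omitted the RESULT claim: use the deduced result,
    # so the opponent's score is not contaminated by a free point.
    if deduced_result is not None:
        return deduced_result

    # Assumption (not stated in the post): nothing to deduce, so forfeit.
    return forfeit
```

For example, `adjudicate(EngineEvent.EXITED_WITHOUT_RESULT, "1/2-1/2", offender_is_white=True)` gives `"1/2-1/2"`, while the same exit after a false claim gives `"0-1"`.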
Usually it is the quality of the Chess the engine (and its opponent!) produces, not its compliancy with protocol, which one is interested in. The situation is flagged by a unique REASON message, and those mainly interested in compliancy can always scan the PGN for this message and take the appropriate action. The default result now at least does not contaminate the result of the opponent.
Note that the fact that you instruct WB to adjudicate after 6 repetitions in no way voids the right of the engine to claim after 3 repetitions.
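The distinction between an engine's claim and the GUI's own adjudication can be sketched as follows. This is only an illustration in Python, not WinBoard code; the function name `repetition_outcome`, its parameters and the returned labels are hypothetical.

```python
def repetition_outcome(repetitions: int,
                       engine_claims_draw: bool,
                       adjudicate_after: int = 6) -> str:
    """Classify a position that has occurred `repetitions` times.

    adjudicate_after is the user-configured count at which the GUI ends
    the game by itself (6 in the example discussed here).
    """
    claim_threshold = 3  # an engine may claim a draw from the 3rd occurrence

    # A claim is valid from 3 repetitions, whatever the GUI threshold is.
    if engine_claims_draw and repetitions >= claim_threshold:
        return "draw (claimed)"

    # Without a claim, the GUI only steps in at its own threshold.
    if repetitions >= adjudicate_after:
        return "draw (adjudicated)"

    # A claim before the 3rd occurrence is a false claim.
    if engine_claims_draw:
        return "false claim"

    return "continue"
```

With the adjudication threshold at 6, a claim at 3 repetitions is still honoured, while an unclaimed 3-fold repetition simply lets the game go on.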
H.G.Muller wrote:Eden 0.0.11 is one of the engines that, as I recall, failed to send RESULT claims. I am pretty sure this is not an imagined problem, as why else would I have taken the trouble to add the code to intercept this case?
I agree that in general it is not the task of a GUI to supplement the engine, but this is a situation that has a sufficiently large impact on the score of the opponent, and a simple-enough solution, to handle it with care. Penalizing non-compliant engines is one thing, but handing out free points to an opponent is quite another.
This is clearly a situation that can occur only through a protocol violation, and the effects of a protocol violation are in principle undefined. That leaves the GUI the possibility to do whatever it wants. The currently implemented action still seems the least of all evils to me, and the most useful to the average user. I could of course make the behavior dependent on a new command-line option, say /forfeitExitingEngines. This could then even forfeit engines that exit without RESULT claim after checkmating or stalemating.
Note that this is not a private e-mail exchange, and that whatever I post is not meant to only address you, but a general audience who might be less expert than you...
H.G.Muller wrote:Would you also like a /forfeitExitingEngines flag to forfeit an engine that exits after checkmating the opponent? Or just in a draw position?
H.G.Muller wrote:About Toledo nanochess:
There are two development branches of micro-Max; the most recent versions are currently micro-Max 4.8 (optimized for best Elo/char ratio, 1968 characters) and micro-Max 1.6 (optimized for smallest size irrespective of playing strength, 1433 characters). On the Chess War rating scale (which is a bit compressed compared to the true ratings) 4.8 has 1882, 1.6 only 1451.
Even micro-Max 1.6 is significantly stronger than Toledo nanochess (in direct confrontation), at equal thinking time. But the version of micro-Max 1.6 that is on my website is set for blitz TC, while the published version of Toledo nanochess (5 ply) is set for standard TC, and thinks 20 times longer. Oscar Toledo likes to advertise that the published version of Toledo nanochess (slightly) beats the published version of micro-Max 1.6, without mentioning that uMax is facing time odds of a factor 20 in those tests. In the few equal-time tests I conducted, uMax 1.6 beat Toledo nanochess by about 75%.