Robert Allgeuer wrote: The Eigenmann database is of course just a collection of whatever games, so none of your conditions will hold.
YABRL, however, is completely different and very controlled. White and Black average Elo are guaranteed to be identical (because each engine plays each side equally often against identical opponents), there are no duplicates, etc. The maximum Elo difference between opponents is 400, though. A tool that can filter out those games where the Elo difference is > 200 is probably difficult to find, I reckon, but it would be interesting.
Despite the differences in nature between the two databases, the figures for Eigenmann's database and YABRL look very similar, which for me is already interesting to see.
IIRC, with the human games database the performance of White did not drop towards 50% even when having Elo differences > 200 (as is the case with the two computer databases, both of which, however, drop towards 50%). So at a minimum, the human and computer games behave differently in this respect.
Robert
Filtering out games by a fixed Elo difference is too difficult. However, you can set an Elo range 200 points wide and filter out games with players outside that range. That's how I got my sample.
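For anyone who wants to try the range approach outside a database program, here is a minimal sketch. It assumes the game headers have already been read into plain dictionaries keyed by the standard PGN tag names WhiteElo and BlackElo; the function names are made up for illustration and are not from any particular tool.

```python
def in_elo_band(game, low, high):
    """Return True if both players' ratings lie in [low, high].

    `game` is assumed to be a dict of PGN header tags (hypothetical input
    format). Games with missing or non-numeric ratings are rejected.
    """
    try:
        white = int(game["WhiteElo"])
        black = int(game["BlackElo"])
    except (KeyError, TypeError, ValueError):
        return False
    return low <= white <= high and low <= black <= high


def filter_by_band(games, low, high):
    """Keep only games where both players are inside the rating band."""
    return [g for g in games if in_elo_band(g, low, high)]


# Small made-up sample, only to show the behaviour.
sample = [
    {"White": "A", "Black": "B", "WhiteElo": "2450", "BlackElo": "2580"},
    {"White": "C", "Black": "D", "WhiteElo": "2700", "BlackElo": "2300"},
    {"White": "E", "Black": "F", "WhiteElo": "2500", "BlackElo": "?"},
    {"White": "G", "Black": "H", "WhiteElo": "2400", "BlackElo": "2600"},
]
kept = filter_by_band(sample, 2400, 2600)
```

Any two players inside a band of width 200 differ by at most 200 points, so the band also caps the rating difference, although it discards more games than a direct difference filter would.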
How reliable are computer Elos? Human Elos seem to be reliable.
A 400 Elo difference is too much. A 2700 Elo player will probably score about 90% against a 2300 player. Even 200 is too large. I would like to make the max difference about 100 Elo. The key thing is to ensure equality of skill so that the only important difference between the players is the color of the pieces.
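To put rough numbers on those differences, here is a small sketch using the commonly quoted logistic form of the Elo expectation, E = 1 / (1 + 10^(-d/400)). This is an assumption about which formula applies; rating bodies may use slightly different tables, so treat the figures as approximate.

```python
def expected_score(elo_diff):
    """Expected score of the higher-rated player under the logistic Elo model.

    `elo_diff` is (own rating - opponent rating). This is the textbook
    logistic approximation, not necessarily what a given rating list uses.
    """
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


for diff in (100, 200, 400):
    print(f"Elo difference {diff}: expected score {expected_score(diff):.2f}")
```

Under this model a 400-point gap gives roughly 0.91, a 200-point gap roughly 0.76, and a 100-point gap roughly 0.64, so even a 100-point gap is far from equal skill.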
Another problem is that Elo is a fluid stat. Elos change constantly, and a 2530 Elo player one day could be a 2620 Elo player some other day.
Another issue with comparing a sample of computer games to human games is the number of players. My 2nd sample of 30000 games had over 900 players. It is hard to tell exactly, because players' names are spelled in so many different ways that 3 apparent players could actually be the same person. I also think the max number of games played by any one player was around 300, or 1%.
In your samples, how many different chess engines were there? What was the max number of games played by any one engine?
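Both numbers (distinct players or engines, and the most games played by any one of them) can be counted from the game headers. The sketch below again assumes the headers are dictionaries with the standard PGN tags White and Black. The name clean-up is a crude guess at handling spelling variants and will not catch real misspellings or transliterations.

```python
from collections import Counter


def normalize(name):
    """Crude name clean-up (an assumption, not a real matching algorithm):
    lowercase, drop periods and commas, collapse whitespace."""
    cleaned = name.lower().replace(".", " ").replace(",", " ")
    return " ".join(cleaned.split())


def player_stats(games):
    """Return (distinct players, max games by one player, that max as a
    fraction of all games)."""
    counts = Counter()
    for game in games:
        for tag in ("White", "Black"):
            name = game.get(tag)
            if name:
                counts[normalize(name)] += 1
    if not counts:
        return 0, 0, 0.0
    most = max(counts.values())
    return len(counts), most, most / len(games)


# Made-up headers, only to show the behaviour.
sample_games = [
    {"White": "Kasparov, G.", "Black": "Karpov, A"},
    {"White": "kasparov g", "Black": "Anand, V."},
]
distinct, most, share = player_stats(sample_games)
```

With engine games the same count works on engine names, though version strings (for example the same engine at different versions) would need a decision on whether they count as one player.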