A Statistical study of chess results

Postby Norm Pollock » 15 Feb 2005, 15:07

Thanks to the wonderful database tools (Scid and pgn-extract) and the availability of many, many games of master players, I decided to do a very simple statistical look at the question: "Does White's advantage dissipate as the game goes on?"

The surprising answer is NO!

Here is my study. For the database I chose (for obvious reasons) my own database, which by the way, is publicly available at crafty-chess.com. Briefly, this database has 59,249 games between human players rated elo 2400+, at long time controls, played since Jan 1, 2000. The database is filtered to eliminate games of up to 40 plies, Internet games, Blitz, Rapid, Blindfold, Simultaneous games as well as FRC games. Most importantly duplicate and twin games were eliminated using Scid and pgn-extract.

Here are my observations:

59,249 games, 41+ plies, White 56.0%, Black 44.0%
These results include draws.

The next step was to filter out the draws.

34,886 games, 41+ plies, White 60.2%, Black 39.8%
All draws were removed.

The next step was to filter out games of 41-60 plies.

29,818 games, 61+ plies, White 59.5%, Black 40.5%
Surprisingly, White maintains the same advantage over Black.

The next step was to filter out games of 61-80 plies.

18,801 games, 81+ plies, White 59.6%, Black 40.4%
Again, White still maintains the same advantage over Black.

The next step was to filter out games of 81-100 plies.

9,948 games, 101+ plies, White 58.0%, Black 42.0%
A tiny improvement for Black in games over 50 moves, but White still has a commanding advantage.

The next step was to filter out games of 101-120 plies.

4580 games, 121+ plies, White 58.7%, Black 41.3%
White increases its advantage.

The next, and final step, was to filter out games of 121-140 plies.

1899 games, 141+ plies, White 58.7%, Black 41.3%
No relief for Black, White still dominates by the same ratio.

Conclusion:
White maintains the same likelihood of winning over Black (about 59% in non-drawn games) regardless of the length of the game. White's initial advantage does not diminish as the game gets longer.
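
The filtering steps above can be sketched in a few lines. This is a minimal illustration, not the Scid/pgn-extract workflow used for the study: it assumes each game has already been reduced to a hypothetical `(plies, result)` pair, with `result` using the PGN strings `1-0`, `0-1` and `1/2-1/2`.

```python
from typing import Iterable, Optional, Tuple

Game = Tuple[int, str]  # (number of plies, PGN result string) -- hypothetical record layout


def white_share_of_decisive(games: Iterable[Game], min_plies: int) -> Optional[float]:
    """Percentage of decisive games won by White among games with at least min_plies plies."""
    white = black = 0
    for plies, result in games:
        if plies < min_plies:
            continue
        if result == "1-0":
            white += 1
        elif result == "0-1":
            black += 1
        # draws and unknown results are skipped
    decisive = white + black
    return 100.0 * white / decisive if decisive else None


# Tiny made-up sample, only to show the call pattern.
sample = [(45, "1-0"), (52, "1/2-1/2"), (70, "0-1"), (88, "1-0"), (130, "1-0"), (150, "0-1")]
for threshold in (41, 61, 81, 101, 121, 141):
    print(threshold, white_share_of_decisive(sample, threshold))
```

Raising `min_plies` step by step mirrors the successive filters in the post; the function returns `None` when no decisive games remain.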
Norm Pollock
 
Posts: 217
Joined: 27 Sep 2004, 02:52

Re: A Statistical study of chess results

Postby Dann Corbit » 15 Feb 2005, 15:16

That is quite an interesting study.

I admit that the result is somewhat surprising to me after 100+ plies.
Dann Corbit
 

Re: A Statistical study of chess results

Postby Roger Brown » 15 Feb 2005, 16:40

Norm Pollock wrote:Thanks to the wonderful database tools (Scid and pgn-extract) and the availability of many, many games of master players, I decided to do a very simple statistical look at the question: "Does White's advantage dissipate as the game goes on?"

The surprising answer is NO!

[snip]

Conclusion:
White maintains the same likelihood of winning over Black (about 59% in non-drawn games) regardless of the length of the game. White's initial advantage does not diminish as the game gets longer.





Hello Norm,

Interesting.

I have a question. At the level of 2400 and above (that is a GM level is it not), isn't a possible explanation the use of opening systems which ECO shows to be, if not actually favourable to white, certainly even?

That advantage, combined with the right of the first move, should enable white to draw if not win most games.

I am curious to see the makeup of your database - I have it so I am not in any way impugning its quality. I just want to know if we are looking at 10,000 Sicilian variations with an advantage of 0.56 for white.

Also, at the lower levels occupied by the 2000 players, does the ratio increase or worsen? I am expecting that it should even out. At the lower levels they say that it is tactics that decide games....

I am also curious if individual players - say Capablanca, Alekhine, Karpov, Kasparov and Tal - have that ratio.

Korchnoi was reputed to be quite formidable with the black pieces.

Thanks for the study.

Later.

Ps. Do you have an update of your database?
Roger Brown
 
Posts: 346
Joined: 24 Sep 2004, 12:31

Re: A Statistical study of chess results

Postby Norm Pollock » 15 Feb 2005, 17:17

Roger Brown wrote:
Norm Pollock wrote:Thanks to the wonderful database tools (Scid and pgn-extract) and the availability of many, many games of master players, I decided to do a very simple statistical look at the question: "Does White's advantage dissipate as the game goes on?"

The surprising answer is NO!

[snip]

Conclusion:
White maintains the same likelihood of winning over Black (about 59% in non-drawn games) regardless of the length of the game. White's initial advantage does not diminish as the game gets longer.





Hello Norm,

Interesting.

I have a question. At the level of 2400 and above (that is a GM level is it not), isn't a possible explanation the use of opening systems which ECO shows to be, if not actually favourable to white, certainly even?

That advantage, combined with the right of the first move, should enable white to draw if not win most games.

I am curious to see the makeup of your database - I have it so I am not in any way impugning its quality. I just want to know if we are looking at 10,000 Sicilian variations with an advantage of 0.56 for white.

Also, at the lower levels occupied by the 2000 players, does the ratio increase or worsen? I am expecting that it should even out. At the lower levels they say that it is tactics that decide games....

I am also curious if individual players - say Capablanca, Alekhine, Karpov, Kasparov and Tal - have that ratio.

Korchnoi was reputed to be quite formidable with the black pieces.

Thanks for the study.

Later.

Ps. Do you have an update of your database?


Roger,

Almost all openings in the ECO system are used in the database. A00-99 10669 games, B00-99 18985 games (king's pawn), C00-99 9555 games, D00-99 10814 games and E00-99 9226 games. Please download the database from user submissions at crafty-chess.com and use Scid to check it out.

No particular opening(s) were favored in collecting the database. No opening(s) were filtered out.

I recognize that the quality of my database is the crucial issue as to whether or not my study is meaningful. It is a database made up of all the games from twic.com and from chesscollect.com. It was then filtered for the following: no computers, no players with elo under 2400, no games prior to year 2000, no Internet, no ICC-IECC-FICS-playchess, no blitz-rapid-blindfold-simultaneous games (as best as possible because sometimes these games are not denoted properly in the tags), and MOST IMPORTANTLY no duplicate games (which means one game, one vote).

The database was updated thru about Dec 26, 2004.

With players below elo 2400, the statistical results might be different due to greater emphasis on tactical abilities. And likely as you say, things should even out with more moves.

My study here however is based on master level / professional level players who are tactical pros, and therefore tactical skill levels are not an issue in my study.
Norm Pollock
 
Posts: 217
Joined: 27 Sep 2004, 02:52

Re: A Statistical study of chess results

Postby Norm Pollock » 15 Feb 2005, 23:05

Here is a more complete look at the white/black percents, with/without draws:

With drawn games included:

41-60 plies, 12510 games, 55.6% for white, 44.4% for black

61-80 plies, 16479 games, 56.3% for white, 43.7% for black

81-100 plies, 14069 games, 57.1% for white, 42.9% for black

101-120 plies, 8425 games, 54.8% for white, 45.2% for black

121-140 plies, 4372 games, 56.0% for white, 44.0% for black

141+ plies, 3394 games, 54.9% for white, 45.1% for black

Total games with drawn games included, 59249 games, 56.0% for white, 44.0% for black

Notice the consistent values for % white, % black

--------------------------------------

With drawn games removed:

41-60 plies, 5068 games, 63.8% for white, 36.2% for black

61-80 plies, 11017 games, 59.5% for white, 40.5% for black

81-100 plies, 8853 games, 61.3% for white, 38.7% for black

101-120 plies, 5368 games, 57.5% for white, 42.5% for black

121-140 plies, 2681 games, 58.6% for white, 41.4% for black

141+ plies, 1899 games, 58.7% for white, 41.3% for black

Total games with drawn games included, 34886 games, 60.2% for white, 39.8% for black

Notice the consistent values for % white, % black
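
As a consistency check, the per-bucket figures for decisive games can be summed back into the cumulative "N+ plies" figures from the first post. The sketch below uses only the counts and percentages quoted above; because the percentages are rounded to one decimal, the reconstructed win counts are approximate.

```python
# (bucket label, decisive games, White's share in %), copied from the table above
buckets = [
    ("41-60", 5068, 63.8),
    ("61-80", 11017, 59.5),
    ("81-100", 8853, 61.3),
    ("101-120", 5368, 57.5),
    ("121-140", 2681, 58.6),
    ("141+", 1899, 58.7),
]


def cumulative_from_buckets(rows):
    """Return (label, games, white %) for 'this bucket and everything longer'."""
    out = []
    games_tail = 0
    white_tail = 0.0
    for label, games, white_pct in reversed(rows):
        games_tail += games
        white_tail += games * white_pct / 100.0  # approximate White wins in this bucket
        start = label.split("-")[0].rstrip("+")
        out.append((start + "+", games_tail, round(100.0 * white_tail / games_tail, 1)))
    return list(reversed(out))


for label, games, white_pct in cumulative_from_buckets(buckets):
    print(f"{label:>5} plies: {games:6d} games, White {white_pct}%")
```

The game counts match the first post exactly, and the percentages agree to within 0.1, which is what the rounding allows.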
Norm Pollock
 
Posts: 217
Joined: 27 Sep 2004, 02:52

Re: A Statistical study of chess results

Postby Norm Pollock » 15 Feb 2005, 23:07

Correction in the next to last line:

Total games with drawn games REMOVED, 34886 games, 60.2% for white, 39.8% for black
Norm Pollock
 
Posts: 217
Joined: 27 Sep 2004, 02:52

Re: A Statistical study of chess results

Postby Sven Schüle » 15 Feb 2005, 23:53

Hi Norm,

one question first: Why do you think that it is important to remove all drawn games from your statistics? I do not see good reasons for it, so can you explain it, please?

I have thought about your analysis and found out that there is some more to be considered.

It might be interesting to see similar statistics for these "2400+" games, based not on the total number of plies played but on the ply of the last "error" which was responsible for the outcome of the game. An "error" can turn a won position into a draw or a loss, or a drawn position into a loss. It could be recognized (of course with some uncertainty as long as we don't have the perfect chess engine) by tracking the scores returned for each game position by a world class engine's analysis and finding out where the score drops significantly. This would take a lot of time, of course, but I want to ignore performance issues here.
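
One possible reading of this proposal, as a rough sketch: given a list of engine evaluations after each ply (in centipawns, from White's point of view), report the last ply at which the evaluation swings against the side that just moved by at least some threshold. The evaluation list, the 100-centipawn threshold and the function name are all assumptions made for illustration; no particular engine or analysis tool is implied.

```python
from typing import List, Optional, Tuple


def last_error_ply(evals_cp: List[int], threshold_cp: int = 100) -> Optional[Tuple[int, str]]:
    """Find the last ply whose move worsened the mover's evaluation by >= threshold_cp.

    evals_cp[0] is the evaluation of the starting position and evals_cp[k] the
    evaluation after ply k, always from White's point of view.
    Returns (ply, side_that_erred) or None if no such ply exists.
    """
    last = None
    for ply in range(1, len(evals_cp)):
        change = evals_cp[ply] - evals_cp[ply - 1]
        white_moved = ply % 2 == 1  # odd plies are White's moves
        loss_for_mover = -change if white_moved else change
        if loss_for_mover >= threshold_cp:
            last = (ply, "White" if white_moved else "Black")
    return last


# Made-up evaluations: Black slips at ply 4, White gives some back at ply 7.
evals = [20, 25, 20, 30, 180, 175, 170, 40, 45]
print(last_error_ply(evals))
```

Collecting this ply for every game would give the distribution of decisive errors by game stage and by colour, which is the statistic being asked for.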

I want to explain why this approach could be interesting. Your main question was:
"Does White's advantage dissipate as the game goes on?"

Most strong chess players think that optimal play of both sides from the initial chess position leads to a draw. However, statistics of games of top players show a score for white of more than 50 percent. Why?

The only explanation for me is that there is a higher chance for black to choose a non-optimal move than it is for white, at least in the early stages of a game. The advantage of white, as the side being on move when both sides have made the same number of moves so far, seems to be that black is forced to react sometimes, while white has more equivalent choices. And even if black reacts with the second best move and this move still keeps the position balanced assuming optimal play, this second best move narrows the path to a draw.

IMO this is true for openings and early middlegames. Now consider endgames. I do not believe that there is an advantage for white in endgames provided you look at balanced positions. If you carefully choose a number of endgame positions which are "balanced" (as far as you can judge) and let a couple of very strong players play many games from these positions, with equal distribution of colours, I would expect a score of 50% for white.

If you give them some non-balanced endgame positions, the result will be different. In real games, even in those of very strong players, I expect that there is a significant part of unbalanced endgame positions occurring. They are unbalanced due to one or more non-optimal moves which have narrowed the path to a draw. So the key must be somewhere earlier in the game. And this will lead us back to the early stages of the game.

Another point is chess knowledge. Top players have excellent endgame knowledge, and it is a well-known task for them how to play most types of endgames. So I expect that they choose the best move more often than in middlegames where the situation is quite new and complex sometimes, and there is a high chance of error.

This is just my opinion, I might be partially wrong.

Sven
Sven Schüle
 
Posts: 240
Joined: 26 Sep 2004, 20:19
Location: Berlin, Germany

Re: A Statistical study of chess results

Postby Norm Pollock » 16 Feb 2005, 02:38

Sven Schüle wrote:Hi Norm,

one question first: Why do you think that it is important to remove all drawn games from your statistics? I do not see good reasons for it, so can you explain it, please?


I think that both ways (with draws, without draws) are worth looking at. Initially I only looked at the situation without draws because I was working on something else when I stumbled upon the observation that game length is irrelevant to the white/black ratio. A while later I went back and looked at the situation with draws. I found that with draws the same pattern occurs, except the percent difference is a little less.

Without draws the white/black ratio is roughly 59%-41%. With draws the white/black ratio is roughly 56%-44%. Both ratios are independent of the length of the game, as measured in plies (1/2 a move).


I have thought about your analysis and found out that there is some more to be considered.

It might be interesting to see similar statistics for these "2400+" games, based not on the total number of plies played but on the ply of the last "error" which was responsible for the outcome of the game. An "error" can turn a won position into a draw or a loss, or a drawn position into a loss. It could be recognized (of course with some uncertainty as long as we don't have the perfect chess engine) by tracking the scores returned for each game position by a world class engine's analysis and finding out where the score drops significantly. This would take a lot of time, of course, but I want to ignore performance issues here.

I want to explain why this approach could be interesting. Your main question was:
"Does White's advantage dissipate as the game goes on?"

Most strong chess players think that optimal play of both sides from the initial chess position leads to a draw. However, statistics of games of top players show a score for white of more than 50 percent. Why?

The only explanation for me is that there is a higher chance for black to choose a non-optimal move than it is for white, at least in the early stages of a game. The advantage of white, as the side being on move when both sides have made the same number of moves so far, seems to be that black is forced to react sometimes, while white has more equivalent choices. And even if black reacts with the second best move and this move still keeps the position balanced assuming optimal play, this second best move narrows the path to a draw.

IMO this is true for openings and early middlegames. Now consider endgames. I do not believe that there is an advantage for white in endgames provided you look at balanced positions. If you carefully choose a number of endgame positions which are "balanced" (as far as you can judge) and let a couple of very strong players play many games from these positions, with equal distribution of colours, I would expect a score of 50% for white.

If you give them some non-balanced endgame positions, the result will be different. In real games, even in those of very strong players, I expect that there is a significant part of unbalanced endgame positions occurring. They are unbalanced due to one or more non-optimal moves which have narrowed the path to a draw. So the key must be somewhere earlier in the game. And this will lead us back to the early stages of the game.

Another point is chess knowledge. Top players have excellent endgame knowledge, and it is a well-known task for them how to play most types of endgames. So I expect that they choose the best move more often than in middlegames where the situation is quite new and complex sometimes, and there is a high chance of error.

This is just my opinion, I might be partially wrong.

Sven


I don't have a clue why white is so much more successful than black. I'm just the messenger. :D
Norm Pollock
 
Posts: 217
Joined: 27 Sep 2004, 02:52

Re: A Statistical study of chess results

Postby Norm Pollock » 16 Feb 2005, 18:02

Norm Pollock wrote:I don't have a clue why white is so much more successful than black. I'm just the messenger. :D


Yesterday I took the easy way out. Today I'll try to answer the question as to what the stats say regarding the 56.0%-44.0% advantage for white over black.

I think we would all agree that White is entitled to a small advantage for going first. Let me evaluate it as 100 millipawns (1/10 pawn). The stats clearly state that there is a 12% difference in results. So that means that in 12% of the games, the White player is able to parlay that 100 millipawn advantage into at least 1000 millipawns (1 pawn), which is what is needed to produce a win. Sometimes it takes just 40-60 plies (20-30 moves) to achieve that, sometimes it takes over 140 plies (over 70 moves) to achieve it. But it does happen consistently over all game lengths in 12% of the games.

The other 88% of the games are split evenly. So White wins 12%+44%=56% of the time, Black wins 44% of the time. That's what the stats are saying, in my humble opinion.
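
The arithmetic of this model is simple enough to write down. If White converts the first-move edge in a fraction p of games and the remaining games are split evenly, White's overall score is p + (1 - p)/2. The function names below are made up for illustration.

```python
def white_score(conversion_rate: float) -> float:
    """White's overall score if a fraction of games is converted and the rest split evenly."""
    return conversion_rate + (1.0 - conversion_rate) / 2.0


def conversion_rate_from_score(score: float) -> float:
    """Invert the model: conversion rate implied by an observed White score."""
    return 2.0 * score - 1.0


print(round(white_score(0.12), 3))                 # 0.56
print(round(conversion_rate_from_score(0.56), 3))  # 0.12
```

Under this model the "conversion rate" is simply the gap between White's and Black's percentages.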
Norm Pollock
 
Posts: 217
Joined: 27 Sep 2004, 02:52

Re: A Statistical study of chess results

Postby Sven Schüle » 16 Feb 2005, 18:20

Hi Norm,

perhaps my posting was too long to understand my key points :D

Norm Pollock wrote:I don't have a clue why white is so much more successful than black. I'm just the messenger.

It is just that I'm not sure whether your message is really correct. If strong players score about 56% with white in long games as well as in short games, this does not necessarily mean that the white advantage which is present in the starting position does not disappear towards the end of the game.

My proposal instead was to have a look not only at game length but also at the ply where an error occurs which decides the game, and then to see whether black still makes more errors than white in endgames, too.

But I know this may easily turn out to be a huge project, and of course your numbers are quite interesting, too. Thanks for your work, Norm!

Sven
Sven Schüle
 
Posts: 240
Joined: 26 Sep 2004, 20:19
Location: Berlin, Germany

Re: A Statistical study of chess results

Postby Peter Fendrich » 16 Feb 2005, 18:56

Thanks Norm for this interesting post.

One point that I have is about the actual strength balance between the white and black players.
If the white player had the higher ELO in 58% of the games your results are showing something else than if it was in 45% of the games.

I think that this information is important in order to make any further conclusions about the material.
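
To see why this matters, recall the standard Elo expected-score formula, E = 1 / (1 + 10^(-d/400)), where d is the rating difference. A modest average rating edge for the White players would by itself produce a score above 50%, independent of colour. The rating differences below are arbitrary examples, not values from the database.

```python
def elo_expected_score(rating_diff: float) -> float:
    """Expected score for a player rated rating_diff points above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


for diff in (0, 10, 25, 50):
    print(f"{diff:+4d} Elo -> expected score {100 * elo_expected_score(diff):.1f}%")
```

An average edge of a few dozen Elo points is therefore enough to account for a score several points above 50%, which is why the rating balance between the colours needs checking.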

/Peter
Peter Fendrich
 
Posts: 193
Joined: 26 Sep 2004, 20:28
Location: Sweden

Re: A Statistical study of chess results

Postby Norm Pollock » 16 Feb 2005, 21:51

Peter Fendrich wrote:Thanks Norm for this interesting post.

One point that I have is about the actual strength balance between the white and black players.
If the white player had the higher ELO in 58% of the games your results are showing something else than if it was in 45% of the games.

I think that this information is important in order to make any further conclusions about the material.

/Peter


Hi Peter,

The database collection was not filtered in any way to give the better players white more often. However, a check of 10 of the highest elo rated players shows that, surprisingly, they had white more often by 3.1 percentage points (51.56% white, 48.44% black). How that could happen is beyond my understanding.
Are higher rated players given white more often in tournaments?

[The breakdown details are at the bottom of this post.]

On the other hand, the number of games played by the 2700+ players against under 2700 players is probably in the 1500 range which is 2.5% of the 59000 games in the database.

How does this affect my statistical study? Maybe I should not have had so many statistical outliers? Like if I wanted to find average income, I would not want Bill Gates in the sample. A 2700 player is too dominant over a 2400-2699 player and therefore the result masks what I am looking for.

I am leaning towards redoing the stats just using 2450-2650 rated players. This would focus on the white/black difference without the "interference" of a too wide variation in chess talent.

Here is the breakdown of white/black in my database for ten 2700+ players:

Garry Kasparov: 245 games, 127 white, 118 black
Anatoly Karpov: 250 games, 127 white, 123 black
Peter Leko: 344 games, 182 white, 162 black
V Anand: 422 games, 213 white, 209 black
V Kramnik: 294 games, 150 white, 144 black
Michael Adams: 347 games, 173 white, 174 black (only one with more black)
Alex Grischuk: 348 games, 187 white, 161 black
Vassily Ivanchuk: 344 games, 177 white, 167 black
Alex Shirov: 525 games, 272 white, 253 black
Ivan Sokolov: 382 games, 197 white, 185 black
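
The totals behind the 51.56% figure can be recomputed from the list above, together with a rough check of whether such a split could arise by chance. The sketch uses a normal approximation to a fair-coin binomial, which treats every game as an independent 50/50 colour assignment; that is an assumption, since tournament pairing rules do not assign colours independently.

```python
import math

# (player, games as White, games as Black), copied from the list above
players = [
    ("Kasparov", 127, 118), ("Karpov", 127, 123), ("Leko", 182, 162),
    ("Anand", 213, 209), ("Kramnik", 150, 144), ("Adams", 173, 174),
    ("Grischuk", 187, 161), ("Ivanchuk", 177, 167), ("Shirov", 272, 253),
    ("Sokolov", 197, 185),
]

white_games = sum(w for _, w, _ in players)
black_games = sum(b for _, _, b in players)
total = white_games + black_games
white_pct = 100.0 * white_games / total

# Normal approximation to Binomial(total, 0.5)
z = (white_games - total / 2.0) / math.sqrt(total * 0.25)
p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))

print(f"{white_games} White, {black_games} Black, {white_pct:.2f}% White")
print(f"z = {z:.2f}, two-sided p = {p_two_sided:.3f}")
```

With these totals z comes out a little under 2, so under the stated assumption the imbalance is suggestive rather than conclusive.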

-Norm
Last edited by Norm Pollock on 19 Feb 2005, 22:22, edited 1 time in total.
Norm Pollock
 
Posts: 217
Joined: 27 Sep 2004, 02:52

Re: A Statistical study of chess results

Postby Norm Pollock » 17 Feb 2005, 00:54

Thanks to all for your comments. To correct the inadequacies that were pointed out to me, I made a new database (using Scid and pgn-extract). Unfortunately this new database is not publicly available. But I am willing to upload it if anyone wants to host it.

The new database has the same filters but in addition I made 2 changes:
(1) I only included games between players with 2450-2650 elo. The previous database had an elo range of 2400-2800? (whatever Kasparov is);
and (2) I included games from Jan 1, 1997 to Dec 26, 2004. The previous database was from Jan 1, 2000 to Dec 26, 2004. I did this to enlarge the size. Even so it is still only about 50% the size of the first database (30,034 games compared to 59,249).

The new database is more homogeneous in terms of player strength. The elo range was narrowed from 400 to 200.

However, the underlying observation is still intact. In fact, it is even more stable. That is, White's winning advantage over Black remains constant regardless of the length of the game.

I will just give the results of the study where draws are included. The same pattern holds with the draws removed.

Here are my observations:

With Draws:

30,034 games, 41+ plies, White 53.3%, Black 46.7%

23,387 games, 61+ plies, White 53.4%, Black 46.6%

15,004 games, 81+ plies, White 53.6%, Black 46.4%

7,938 games, 100+ plies, White 53.4%, Black 46.6%

3,847 games, 120+ plies, White 54.1%, Black 45.9%

1,658 games, 140+ plies, White 53.4%, Black 46.6%

Conclusion:
White maintains the same winning ratio over Black (about 53.5%-46.5%) regardless of the length of the game. White's advantage does not dissipate as the game gets longer.

Possible Explanation:
White starts with a small advantage, perhaps 1/10 of a pawn. In 7% (1 out of 14) of the games, White is able to build on that advantage to eventually win the game, even if it is a long game. In the remaining 93% of the games, Black equalizes, and the games are split evenly (including draws). As a result, White wins 7 + 46.5 = 53.5% of the time (including draws), while Black only wins 46.5% of the time (including draws).

Note:
With draws removed, White wins consistently at the rate of 56.0%.
Norm Pollock
 
Posts: 217
Joined: 27 Sep 2004, 02:52

Re: A Statistical study of chess results

Postby Roger Brown » 17 Feb 2005, 12:00

Norm Pollock wrote:Thanks to all for your comments. To correct the inadequacies that were pointed out to me, I made a new database (using Scid and pgn-extract). Unfortunately this new database is not publicly available. But I am willing to upload it if anyone wants to host it.




Hello Norm,

What about Peter Skinner, host of your previous database? Your latest offering sounds delicious to a data junkie such as myself.

I am confident that Dann would host it for you if you asked.

I cannot wait for it.

Thanks for the effort Norm. meticulous and precise as always. Interesting thread too.

:D

Later.
Roger Brown
 
Posts: 346
Joined: 24 Sep 2004, 12:31

Re: A Statistical study of chess results

Postby Tony van Roon-Werten » 17 Feb 2005, 15:24

Norm Pollock wrote:Thanks to all for your comments. To correct the inadequacies that were pointed out to me, I made a new database (using Scid and pgn-extract). Unfortunately this new database is not publicly available. But I am willing to upload it if anyone wants to host it.

..

Conclusion:
White maintains the same winning ratio over Black (about 53.5%-46.5%) regardless of the length of the game. White's advantage does not dissipate as the game gets longer.

Possible Explanation:
White starts with a small advantage, perhaps 1/10 of a pawn. In 7% (1 out of 14) of the games, White is able to build on that advantage to eventually win the game, even if it is a long game. In the remaining 93% of the games, Black equalizes, and the games are split evenly (including draws). As a result, White wins 7 + 46.5 = 53.5% of the time (including draws), while Black only wins 46.5% of the time (including draws).

Note:
With draws removed, White wins consistently at the rate of 56.0%.


Might be easier.

Say the first move is worth 10, a game is won when score is >50 or <-50 and a small error is -20 points

White player is allowed to make 3 small errors, black player only 2, therefore white will win more games. Removing draws makes it clearer because they will dampen the effect.
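
This toy model is easy to simulate. The sketch below starts the score at +10, lets a randomly chosen side make a 20-point error on each step, and stops when the score goes above +50 or below -50. The equal error probability for both sides and the cap on the number of errors per game (beyond which the game counts as a draw) are assumptions added for the simulation; they are not part of the post.

```python
import random


def play_game(rng: random.Random, max_errors: int = 6) -> str:
    """Return '1-0', '0-1' or '1/2-1/2' under the toy error model."""
    score = 10  # value of the first move, from White's point of view
    for _ in range(max_errors):
        if rng.random() < 0.5:
            score -= 20  # White errs
        else:
            score += 20  # Black errs
        if score > 50:
            return "1-0"
        if score < -50:
            return "0-1"
    return "1/2-1/2"  # nobody crossed the winning threshold


def simulate(games: int = 20000, seed: int = 1) -> dict:
    """Play many toy games and count the results."""
    rng = random.Random(seed)
    tally = {"1-0": 0, "0-1": 0, "1/2-1/2": 0}
    for _ in range(games):
        tally[play_game(rng)] += 1
    return tally


print(simulate())
```

Because White needs three net errors by Black to win while Black needs four net errors by White, White collects more wins even though both sides err equally often.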

What your numbers basically say is that white has an advantage. It was noticed by several other people in computer chess tournaments as well. And regarded as coincidental by quite a few people too (strangely enough)

Tony
Tony van Roon-Werten
 
Posts: 99
Joined: 02 Oct 2004, 15:31
Location: 's Hertogenbosch, Netherlands

Re: A Statistical study of chess results

Postby Norm Pollock » 17 Feb 2005, 16:14

Tony van Roon-Werten wrote:
Might be easier.

Say the first move is worth 10, a game is won when score is >50 or <-50 and a small error is -20 points

White player is allowed to make 3 small errors, black player only 2, therefore white will win more games. Removing draws makes it clearer because they will dampen the effect.

What your numbers basically say is that white has an advantage. It was noticed by several other people in computer chess tournaments as well. And regarded as coincidental by quite a few people too (strangely enough)

Tony


White's winning advantage has been known a long time. Certainly the advantage by itself at the start does not guarantee victory. Your explanation for White's better results is that White has a bigger margin of error. He can afford one extra minor error. That is a good theory using a negative point of view.

My explanation, which is only a theory as well, uses a positive point of view. White has a tiny advantage to work with at the start, and on occasion, he can build on that advantage to the point of winning. It could be in a few moves, or it could take many, many moves.

We probably are saying the same thing, but from two differing perspectives. Sort of the glass is half empty or half full.

Btw, my study's point is not that White has an advantage. That has been known for centuries. The point I am making is that longer games do not dissipate that advantage.
Norm Pollock
 
Posts: 217
Joined: 27 Sep 2004, 02:52

Re: A Statistical study of chess results

Postby Norm Pollock » 17 Feb 2005, 17:35

I received a request to examine the percent of draws against the length of the game.

I also noticed that a very small number of games (7 out of 30,034) needed to be filtered out because they did not report a result. Fortunately this did not affect the stat percentages that I reported. Nevertheless I will restate those stats as well.

Here are my observations:

With Draws:

30,027 games, 41+ plies, White 53.3%, Black 46.7%, Draws 45.5%

23,381 games, 61+ plies, White 53.4%, Black 46.6%, Draws 39.2%

15,000 games, 81+ plies, White 53.6%, Black 46.4%, Draws 40.7%

7,936 games, 100+ plies, White 53.4%, Black 46.6%, Draws 40.4%

3,846 games, 120+ plies, White 54.1%, Black 45.9%, Draws 41.8%

1,658 games, 140+ plies, White 53.4%, Black 46.6%, Draws 44.5%

Conclusion: With regard to the draws, there seems to be a 5% bump in the percent of draws for short games and for very long games. Quick draws and 50 move draws seem to be the likely causes for these bumps. Otherwise the percentage is close to 40%. But draws do show a lot more variation than the white/black score ratio.
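
The rows above are cumulative, so each one includes all the longer games as well. Differencing consecutive rows gives a rough per-bucket draw rate. The sketch below does this using only the counts and draw percentages quoted above; since those percentages are rounded, the reconstructed draw counts are approximate, and the bucket boundaries simply follow the row labels as posted.

```python
# (minimum plies as labelled, games, draw %), copied from the rows above
cumulative = [
    (41, 30027, 45.5),
    (61, 23381, 39.2),
    (81, 15000, 40.7),
    (100, 7936, 40.4),
    (120, 3846, 41.8),
    (140, 1658, 44.5),
]


def bucket_draw_rates(rows):
    """Turn cumulative 'N+ plies' rows into (low, high, games, draw %) per bucket."""
    out = []
    for i, (low, games, draw_pct) in enumerate(rows):
        draws = games * draw_pct / 100.0
        if i + 1 < len(rows):
            next_low, next_games, next_pct = rows[i + 1]
            games -= next_games
            draws -= next_games * next_pct / 100.0
            high = next_low - 1
        else:
            high = None  # open-ended last bucket
        out.append((low, high, games, round(100.0 * draws / games, 1)))
    return out


for low, high, games, draw_pct in bucket_draw_rates(cumulative):
    label = f"{low}-{high}" if high is not None else f"{low}+"
    print(f"{label:>8} plies: {games:5d} games, about {draw_pct}% draws")
```

On these figures the shortest bucket stands out: roughly two thirds of the 41-60 ply games appear to be draws, which fits the "quick draws" explanation.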
Norm Pollock
 
Posts: 217
Joined: 27 Sep 2004, 02:52

Re: A Statistical study of chess results

Postby Robert Allgeuer » 18 Feb 2005, 22:45

I found this observation so interesting that I have carried out the same analysis for computer games, both for Blitz games (based on 50000 games from my YABRL Blitz rating list) and also for games at longer time controls (based on the database by W. Eigenmann, >130000 games all with a time control >30 min/game).

As can be seen below, computer games - independent of time control - behave as one would have naively expected: with longer games the performance of white falls asymptotically towards 50%, although not reaching it. Likewise, the longer the games, the higher the percentage of games ending in draws.

Generally, the statistics for Blitz and longer time controls (i.e. both draw percentage and performance in both relative and absolute terms) behave remarkably similarly.

It appears that "Pollock's Law" (sounds good I think :) ) must be a human or psychological effect, but it is most probably not an intrinsic property of chess.

Blitz (5min + 2 sec) from YABRL:

>= 20 moves:
Games        :  49195 (finished)

White Wins   :  20164 (41.0 %)
Black Wins   :  16656 (33.9 %)
Draws        :  12375 (25.2 %)
Unfinished   :      0

White Perf.  : 53.6 %
Black Perf.  : 46.4 %

>= 30 moves:
Games        :  48157 (finished)

White Wins   :  19651 (40.8 %)
Black Wins   :  16443 (34.1 %)
Draws        :  12063 (25.0 %)
Unfinished   :      0

White Perf.  : 53.3 %
Black Perf.  : 46.7 %

>= 40 moves:
Games        :  45070 (finished)

White Wins   :  17984 (39.9 %)
Black Wins   :  15479 (34.3 %)
Draws        :  11607 (25.8 %)
Unfinished   :      0

White Perf.  : 52.8 %
Black Perf.  : 47.2 %

>= 50 moves:
Games        :  39042 (finished)

White Wins   :  15008 (38.4 %)
Black Wins   :  13277 (34.0 %)
Draws        :  10757 (27.6 %)
Unfinished   :      0

White Perf.  : 52.2 %
Black Perf.  : 47.8 %

>= 60 moves:
Games        :  30532 (finished)

White Wins   :  11016 (36.1 %)
Black Wins   :   9971 (32.7 %)
Draws        :   9545 (31.3 %)
Unfinished   :      0

White Perf.  : 51.7 %
Black Perf.  : 48.3 %

>= 70 moves:
Games        :  21344 (finished)

White Wins   :   6940 (32.5 %)
Black Wins   :   6340 (29.7 %)
Draws        :   8064 (37.8 %)
Unfinished   :      0

White Perf.  : 51.4 %
Black Perf.  : 48.6 %

>= 80 moves:
Games        :  14151 (finished)

White Wins   :   3952 (27.9 %)
Black Wins   :   3589 (25.4 %)
Draws        :   6610 (46.7 %)
Unfinished   :      0

White Perf.  : 51.3 %
Black Perf.  : 48.7 %

>= 90 moves:
Games        :   9410 (finished)

White Wins   :   2092 (22.2 %)
Black Wins   :   1894 (20.1 %)
Draws        :   5424 (57.6 %)
Unfinished   :      0

White Perf.  : 51.1 %
Black Perf.  : 48.9 %

>= 100 moves:
Games        :   6501 (finished)

White Wins   :   1083 (16.7 %)
Black Wins   :   1004 (15.4 %)
Draws        :   4414 (67.9 %)
Unfinished   :      0

White Perf.  : 50.6 %
Black Perf.  : 49.4 %




Games with time control > 30min (W. Eigenmann):

>= 20 moves:
Games        : 117335 (finished)

White Wins   :  47181 (40.2 %)
Black Wins   :  37596 (32.0 %)
Draws        :  32558 (27.7 %)
Unfinished   :      0

White Perf.  : 54.1 %
Black Perf.  : 45.9 %

>= 30 moves:
Games        : 113895 (finished)

White Wins   :  45465 (39.9 %)
Black Wins   :  36628 (32.2 %)
Draws        :  31802 (27.9 %)
Unfinished   :      0

White Perf.  : 53.9 %
Black Perf.  : 46.1 %

>= 40 moves:
Games        : 104506 (finished)

White Wins   :  40650 (38.9 %)
Black Wins   :  33465 (32.0 %)
Draws        :  30391 (29.1 %)
Unfinished   :      0

White Perf.  : 53.4 %
Black Perf.  : 46.6 %

>= 50 moves:
Games        :  87758 (finished)

White Wins   :  32460 (37.0 %)
Black Wins   :  27436 (31.3 %)
Draws        :  27862 (31.7 %)
Unfinished   :      0

White Perf.  : 52.9 %
Black Perf.  : 47.1 %

>= 60 moves:
Games        :  66066 (finished)

White Wins   :  22541 (34.1 %)
Black Wins   :  19600 (29.7 %)
Draws        :  23925 (36.2 %)
Unfinished   :      0

White Perf.  : 52.2 %
Black Perf.  : 47.8 %

>= 70 moves:
Games        :  45520 (finished)

White Wins   :  13800 (30.3 %)
Black Wins   :  12271 (27.0 %)
Draws        :  19449 (42.7 %)
Unfinished   :      0

White Perf.  : 51.7 %
Black Perf.  : 48.3 %

>= 80 moves:
Games        :  30636 (finished)

White Wins   :   7897 (25.8 %)
Black Wins   :   7098 (23.2 %)
Draws        :  15641 (51.1 %)
Unfinished   :      0

White Perf.  : 51.3 %
Black Perf.  : 48.7 %

>= 90 moves:
Games        :  20846 (finished)

White Wins   :   4379 (21.0 %)
Black Wins   :   3949 (18.9 %)
Draws        :  12518 (60.0 %)
Unfinished   :      0

White Perf.  : 51.0 %
Black Perf.  : 49.0 %

>= 100 moves:
Games        :  14841 (finished)

White Wins   :   2498 (16.8 %)
Black Wins   :   2245 (15.1 %)
Draws        :  10098 (68.0 %)
Unfinished   :      0

White Perf.  : 50.9 %
Black Perf.  : 49.1 %
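
A note on reading the "Perf." lines: the figures in these tables appear consistent with the usual scoring, where a win counts as one point and a draw as half a point for each side. A minimal sketch in Python that reproduces the last group above (the function name `performance` is only for illustration and is not taken from the tool that produced the tables):

```python
def performance(white_wins, black_wins, draws):
    """Return (white_perf, black_perf) in percent, counting a draw as half a point."""
    games = white_wins + black_wins + draws
    if games == 0:
        raise ValueError("no finished games")
    white = 100.0 * (white_wins + 0.5 * draws) / games
    return white, 100.0 - white


# Counts copied from the ">= 100 moves" group of the > 30min table.
white, black = performance(2498, 2245, 10098)
print(f"White Perf.  : {white:.1f} %")
print(f"Black Perf.  : {black:.1f} %")
```

Because draws are split evenly, a rising draw rate pulls both percentages towards 50% even if the ratio of White wins to Black wins stays the same.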


Robert
Robert Allgeuer
 
Posts: 124
Joined: 28 Sep 2004, 19:09
Location: Konz / Germany

Re: A Statistical study of chess results

Postby Norm Pollock » 19 Feb 2005, 00:20

Hi Robert,

Thanks for your kind words and your interest.

For a study of this sort to be strongly persuasive, there have to be controls on the players. First, the players have to be of relatively equal strength in each sample. Second, we have to be sure that the average of the White players' elos is approximately equal to the average of the Black players' elos in each sample. By "sample", I mean each grouping, for example, plies >= 81.

The reason for the first control is sort of obvious. We don't want player dominance to skew the statistics. The reason for the second control is to make sure that White and Black have an equal chance. It could happen that the upper tier (elo 2600-2650 in my study) in our sample gets White more often.

In my study, I limited the players to a range of 2450-2650. If I made it any tighter, then I would not have enough data. What my study lacks is the second control. I do not have proof that the average White elo is approximately equal to the average Black elo in each sample.

And remember that these two controls have to be verified for each sample (for example: plies >= 81) of each study.

With computer engines, there is an elo problem. What is the elo of an engine? How can we be sure that the competing engines are close enough in ability? Whose elo do you go by? And secondly, is there evidence that the White elo average is approximately equal to the Black elo average?

I'm not saying it happened, but this is what could have happened if there was a wide range in chess engine strength: the stronger engines could have finished their games quickly, so they were not around for the games with higher plies, and the stats got skewed.

Now I have to find someone to write a utility that can figure out the average elo for White, and the average elo for Black, taking its data from a pgn file.
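
Such a utility could be sketched roughly as follows in Python. This is only a sketch under assumptions: it assumes every game in the pgn file carries `WhiteElo`, `BlackElo` and `PlyCount` tags (games missing any of them are skipped), it reads only the tag section and never looks at the moves, and the function names and the three sample games are made up for illustration.

```python
import re

# Matches one PGN tag pair line such as [WhiteElo "2600"].
TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')


def read_headers(pgn_text):
    """Yield one dict of tag pairs per game; the movetext is ignored."""
    tags = {}
    in_header = False
    for line in pgn_text.splitlines():
        m = TAG_RE.match(line.strip())
        if m:
            if not in_header and tags:
                # First tag line after movetext: the previous game is complete.
                yield tags
                tags = {}
            in_header = True
            tags[m.group(1)] = m.group(2)
        else:
            in_header = False
    if tags:
        yield tags


def average_elos(pgn_text, min_plies=0):
    """Return (avg White elo, avg Black elo, games used) for games with >= min_plies."""
    white_sum = black_sum = count = 0
    for tags in read_headers(pgn_text):
        try:
            white = int(tags["WhiteElo"])
            black = int(tags["BlackElo"])
            plies = int(tags["PlyCount"])
        except (KeyError, ValueError):
            continue  # skip games with a missing or non-numeric tag
        if plies >= min_plies:
            white_sum += white
            black_sum += black
            count += 1
    if count == 0:
        return None
    return white_sum / count, black_sum / count, count


# Made-up sample data; a real run would read the text from a pgn file instead.
SAMPLE = '''[Event "Example"]
[WhiteElo "2600"]
[BlackElo "2500"]
[PlyCount "90"]

{moves omitted} 1-0

[Event "Example"]
[WhiteElo "2450"]
[BlackElo "2550"]
[PlyCount "50"]

{moves omitted} 1/2-1/2

[Event "Example"]
[BlackElo "2500"]
[PlyCount "120"]

{moves omitted} 0-1
'''

for min_plies in (41, 81):
    print(min_plies, average_elos(SAMPLE, min_plies))
```

Running the same function once per sample (plies >= 41, >= 61, >= 81, and so on) would show whether the White and Black averages stay close in every grouping, which is the second control.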

If we both can get the two controls in place, then it will be interesting to compare results. You may be right in saying that my observation was strictly a "human" phenomenon.

-Norm
Norm Pollock
 
Posts: 217
Joined: 27 Sep 2004, 02:52

Re: A Statistical study of chess results

Postby Robert Allgeuer » 19 Feb 2005, 00:49

The Eigenmann database is of course just a collection of whatever games, so none of your conditions will hold.

YABRL, however, is completely different and very controlled. White and Black average elos are guaranteed to be identical (because each engine plays each side equally often against identical opponents), there are no duplicates, etc. The maximum ELO difference between opponents is 400, though. A tool that can filter out those games where the ELO difference is > 200 is probably difficult to find, I reckon, but it would be interesting.
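
For what it is worth, a filter of that kind can be sketched in a few lines of Python. This is a rough sketch, not an existing tool: it assumes every game starts with an `[Event "..."]` tag and carries numeric `WhiteElo` and `BlackElo` tags, and the function names and sample games are invented for illustration.

```python
import re

# Finds the two rating tags anywhere in the tag section of one game.
ELO_RE = re.compile(r'^\[(WhiteElo|BlackElo)\s+"(\d+)"\]', re.MULTILINE)


def split_games(pgn_text):
    """Split PGN text into games, assuming each game starts with an [Event ...] tag."""
    parts = re.split(r'(?m)^(?=\[Event\s)', pgn_text)
    return [p for p in parts if p.strip()]


def filter_by_elo_diff(pgn_text, max_diff=200):
    """Keep only games whose rating difference is at most max_diff."""
    kept = []
    for game in split_games(pgn_text):
        elos = dict(ELO_RE.findall(game))
        if "WhiteElo" not in elos or "BlackElo" not in elos:
            continue  # games without both ratings cannot be checked, so drop them
        if abs(int(elos["WhiteElo"]) - int(elos["BlackElo"])) <= max_diff:
            kept.append(game)
    return "".join(kept)


# Made-up sample data with rating differences of 100, 350 and 200.
SAMPLE = '''[Event "Example"]
[WhiteElo "2500"]
[BlackElo "2400"]

{moves omitted} 1-0

[Event "Example"]
[WhiteElo "2650"]
[BlackElo "2300"]

{moves omitted} 1-0

[Event "Example"]
[WhiteElo "2300"]
[BlackElo "2500"]

{moves omitted} 0-1
'''

filtered = filter_by_elo_diff(SAMPLE, max_diff=200)
print(len(split_games(filtered)), "of", len(split_games(SAMPLE)), "games kept")
```

Dropping games with a missing rating is a design choice here; for a database where every game has both ratings it makes no difference.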

Despite the differences in nature between the two databases, the figures for Eigenmann's database and YABRL look very similar, which for me is already interesting to see.

IIRC, with the human games database the performance of White did not drop towards 50% even with ELO differences > 200 allowed (as is the case with the two computer databases, both of which, however, do drop towards 50%). So at a minimum, in this respect the human and computer games behave differently.

Robert
Robert Allgeuer
 
