Download > 37000 games CEGT 40/40

Postby Heinz van Kempen » 26 Sep 2005, 12:37

Hi all :) ,

on special request, for game collectors and those who want to apply bayeselo instead of EloStat, there is a monster download file available (12 MB) with all CEGT 40/40 games played up to now by around 17 testers, some using more than one machine.

http://www.husvankempen.de/nunn/downloads.htm

Games are stripped of all comments, duplicates are removed and engine names are unified. This file will only be offered for download for a few days.

The database reflects the current standings of the CEGT ratinglist:

http://www.husvankempen.de/nunn/rating.htm

There is also a smaller updated file for Fritz 9 games (including comments).

We have decided to run the Fritz 9 test in "slow motion" because other testers complained that it removes all the fun for them when CEGT outputs 2000 games within one week for a new interesting engine. The last thing we want is for other testers to feel pushed aside by our efforts to give statistically reliable ratings with a medium time control.

Links to other CEGT testers sites:

http://www.cegt.de (German)

http://uk.geocities.com/bloodhound66@bt ... /index.htm

http://www.beepworld.de/members46/rainerschach/


Best Regards
Heinz
Heinz van Kempen
 
Posts: 160
Joined: 27 Sep 2004, 07:35
Location: Leverkusen, Germany

Re: Download > 37000 games CEGT 40/40

Postby Anonymous » 26 Sep 2005, 14:17

Heinz van Kempen


Wow, that's a great job. Thank you, Heinz van Kempen, for your hard work! :shock: :D
Anonymous
 

Re: Download > 37000 games CEGT 40/40

Postby Heinz van Kempen » 26 Sep 2005, 14:21

Hi Alfred :) ,

work and first of all enthusiasm from all testers of course.

I am very happy that we have brought together such good testers, all reliable and contributing daily with many new games and also interesting proposals and ideas.

Best Regards
Heinz

Re: Download > 37000 games CEGT 40/40

Postby Kirill Kryukov » 27 Sep 2005, 04:46

Very interesting, thanks!

I ran my script on it (which took a while) and uploaded the statistics. It seems the tournament is very unbalanced: some engines have played thousands of games, some just a few. Doesn't that cause any problems?

For example, Hiarcs 9 and Shredder 9 played 176 games against each other. Because of that, the difference in strength between them may get too much weight compared with other pairs. So if Hiarcs 9 is particularly weak against Shredder 9, this weakness will be overestimated, because they have not played as many games against other engines.

Is there any coordination planned to produce more balanced game sets? (For example, running gauntlet tournaments instead of round robins.)

Another observation is that the top of the table is overpopulated with CM10. I don't know if that's good: if a particular engine is strong or weak against CM10, that strength or weakness will be amplified.
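
A minimal sketch of how such a pairwise game-count table could be produced from the PGN headers. This is not the script mentioned above, only an illustration; the function name `pairwise_counts` is hypothetical, and it assumes every game carries both a `White` and a `Black` tag.

```python
import re
from collections import Counter

# Matches PGN tag pairs such as [White "Shredder 9"]
TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')

def pairwise_counts(pgn_text):
    """Count games per unordered engine pairing, reading only the PGN headers."""
    counts = Counter()
    white = black = None
    for line in pgn_text.splitlines():
        m = TAG_RE.match(line.strip())
        if not m:
            continue
        tag, value = m.groups()
        if tag == "White":
            white = value
        elif tag == "Black":
            black = value
        # Once both players of a game are known, record the pairing and reset.
        if white is not None and black is not None:
            counts[tuple(sorted((white, black)))] += 1
            white = black = None
    return counts
```

Calling `pairwise_counts(text).most_common(10)` on the combined file would list the most heavily played pairings, which is where an imbalance like the Hiarcs 9 vs. Shredder 9 case would show up.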
Kirill Kryukov
 
Posts: 127
Joined: 21 Sep 2005, 09:56

Re: Download > 37000 games CEGT 40/40

Postby Heinz van Kempen » 27 Sep 2005, 06:28

Hi Kirill :) ,

well, if you think that this one is unbalanced, just take a look at the popular SSDF list, where you see dozens of Shredder and Fritz versions playing each other. Forget about the perfect tournament and testing method; it simply does not exist, and every approach has advantages and drawbacks.

Of course it could be done even more systematically. I could order each tester to play certain matches, again and again, to reach an ideal number of games against the best-fitting opponents or to give an engine a better average opposition ELO compared to another one.

What would happen then? I can tell you: all testers would run away. In such a team you have to offer freedom for all first: this means everyone contributes whatever he enjoys, for example Ray and Graham playing with CM and now Shredder personalities, Uschi and I trying new kinds of tournaments, etc. Astonishingly, combining the testers' big projects still gives a very good combined list overall.

Secondly, it has to offer fun and suspense. I want to see testers happy and enthusiastic even after months and maybe years, and not feeling bored.

By the way, a lot of matches are run, especially by Michael and Christian, and the tests for Fruit and Fritz 9 are done very systematically. You could do the following: extract the ideal matches with the search function of a database, maybe 50 Shredder games against all other top engines and the same amount against strong amateurs. Do the same with all engines. Then calculate ELO again and see whether you get differences.
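
For those who prefer a script to a database search, the extraction could be sketched roughly as below. This is only one reading of the proposal (a fixed cap of games per pairing); the function name `cap_games_per_pairing` and the `(white, black, result)` tuple layout are assumptions made for the illustration.

```python
from collections import defaultdict

def cap_games_per_pairing(games, max_per_pair=50):
    """Keep at most max_per_pair games for each unordered pairing, in input order.

    games is an iterable of (white, black, result) tuples.
    """
    seen = defaultdict(int)  # pairing -> number of games kept so far
    kept = []
    for white, black, result in games:
        pair = tuple(sorted((white, black)))
        if seen[pair] < max_per_pair:
            seen[pair] += 1
            kept.append((white, black, result))
    return kept
```

The reduced set can then be fed to the rating tool again and the resulting list compared with the one computed from all games.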

Best Regards
Heinz
Last edited by Heinz van Kempen on 27 Sep 2005, 15:21, edited 1 time in total.

Re: Download > 37000 games CEGT 40/40

Postby Robert Allgeuer » 27 Sep 2005, 11:30

Kirill, great statistics, great work,
I wanted to post the bayeselo ratings for CEGT, but there is no need to do so any more....

Observations are:
- for engines with a certain minimum number of games, (relative) differences between bayeselo and EloStat are small/negligible; the variation is within a few ELO points
- bayeselo is much better at estimating the strength of those engines that have played a very small number of games. Here EloStat calculates some astronomical ratings ....
- There is a kind of offset of 30 points or so between the two rating lists (e.g. Fritz 221 vs. 185, Knightdreamer -213 vs. -240), with EloStat giving the lower values.
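
To illustrate the second point: the sketch below uses the standard logistic Elo curve, which is an assumption here and not necessarily what EloStat or bayeselo compute internally. It only shows why a raw score percentage from a handful of games can translate into an extreme rating difference; `elo_diff_from_score` is a hypothetical helper name.

```python
import math

def elo_diff_from_score(points, games):
    """Rating difference implied by a score, using the standard logistic Elo curve."""
    if games <= 0:
        raise ValueError("need at least one game")
    p = points / games
    if p <= 0.0:
        return float("-inf")  # a 0% score implies an unbounded negative difference
    if p >= 1.0:
        return float("inf")   # a 100% score implies an unbounded positive difference
    return 400.0 * math.log10(p / (1.0 - p))
```

With this curve, 9 points out of 10 already corresponds to roughly +382 points over the opposition, and a clean 3/3 has no finite value at all, so a method that takes the score at face value will swing wildly for engines with few games.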

Is your script a tool that takes in a pgn, calculates all the statistics and spits out html on the other side?

Robert
Robert Allgeuer
 
Posts: 124
Joined: 28 Sep 2004, 19:09
Location: Konz / Germany

Re: Download > 37000 games CEGT 40/40

Postby Heinz van Kempen » 27 Sep 2005, 11:37

Hi Kirill and Robert :) ,

first of all I want to ask Kirill whether I may put his fine statistics on my webpage.

Secondly: yes, EloStat tends to overvalue engines with few games. There are numerous cases where an engine starts with a sky-high rating and then drops like a stone.

With more games both rating systems show very few differences.

Thank you both for feedback

Heinz
Last edited by Heinz van Kempen on 27 Sep 2005, 15:21, edited 1 time in total.

Re: Download > 37000 games CEGT 40/40

Postby Uri Blass » 27 Sep 2005, 12:35

Heinz van Kempen wrote:Hi Kirill and Robert :) ,

first of all I want to ask Kirill whether I may put his fine statistics on my webpage.

Secondly: yes, EloStat tends to overvalue engines with few games. There are numerous cases where an engine starts with a sky-high rating and then drops like a stone.

With more games both rating systems show very few differences.

Thank you both for feedback

Heinz


Thanks for the information.

I want to ask whether you have checked that Fritz 9 has no bugs that give it results that are too good, such as using the opponent's time.

Do you check that the opponent gets 100% CPU time?
I think that in games with ponder off it is important to check this for new programs.

Uri
Uri Blass
 
Posts: 727
Joined: 09 Oct 2004, 05:59
Location: Tel-Aviv

Re: Download > 37000 games CEGT 40/40

Postby Heinz van Kempen » 27 Sep 2005, 12:40

Hi Uri :) ,

we expect the rating for Fritz 9 to drop a bit further.

I will forward your question by email to Christian and Michael, although, as already mentioned, they are running it in the old GUI (Fritz 8). At least Michael wrote this to me.

Best Regards
Heinz

Re: Download > 37000 games CEGT 40/40

Postby Christian Koch » 27 Sep 2005, 12:55

Uri Blass wrote:I want to ask whether you have checked that Fritz 9 has no bugs that give it results that are too good, such as using the opponent's time.

Do you check that the opponent gets 100% CPU time?
I think that in games with ponder off it is important to check this for new programs.

Hi Uri

I checked engine matches under the old F8 GUI and under the new F9 GUI. In both cases I noticed no problems with the opponent's CPU time.

best regards,
Christian
Christian Koch
 
Posts: 5
Joined: 27 Sep 2004, 11:43
Location: Holm

Re: Download > 37000 games CEGT 40/40

Postby Kirill Kryukov » 27 Sep 2005, 14:19

Hi everyone, I'm glad my page is useful.. :)

Heinz van Kempen wrote:Secondly it has to be fun and suspense. I want to see testers happy and enthusiastic even after months and maybe years and not feeling bored.

Yeah, this makes sense. :) Fun factor is of course important.

What I was thinking is that if testers can see the pairwise game number table (like the one on my page), they can arrange their tournaments to kind of fill the gaps... If they like to, of course. At least that's what I would do if I ran a CEGT event. :)

Heinz van Kempen wrote:By the way, a lot of matches are run, especially by Michael and Christian, and the tests for Fruit and Fritz 9 are done very systematically. You could do the following: extract the ideal matches with the search function of a database, maybe 50 Shredder games against all other top engines and the same amount against strong amateurs. Do the same with all engines. Then calculate ELO again and see whether you get differences.

I guess there will be difference, but whether it's due to chance or because of some tendency will be hard to say...

Robert Allgeuer wrote:Is your script a tool that takes in a pgn, calculates all the statistics and spits out html on the other side?

Yes, that's what it does. More precisely, it goes into the directory "initial-games" and finds all *.pgn files there. Then it computes everything and writes the result tables, each to its own file. Then I include those tables in the HTML page using Apache's SSI. This way I can change the layout or write my comments in HTML without re-running the script.
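
A rough sketch of that workflow, for readers who want to try something similar. It is not the actual script: the function names, the tab-separated output layout and the `latin-1` file encoding are all assumptions, and it computes only a plain score table rather than the full statistics.

```python
import re
from collections import defaultdict
from pathlib import Path

TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')
POINTS = {"1-0": (1.0, 0.0), "0-1": (0.0, 1.0), "1/2-1/2": (0.5, 0.5)}

def read_games(pgn_text):
    """Yield (white, black, result) for each game, reading only the tag pairs."""
    tags = {}
    for line in pgn_text.splitlines():
        m = TAG_RE.match(line.strip())
        if m:
            tags[m.group(1)] = m.group(2)
        elif tags and line.strip():
            # First non-tag, non-blank line: movetext starts, so the headers are complete.
            if {"White", "Black", "Result"} <= tags.keys():
                yield tags["White"], tags["Black"], tags["Result"]
            tags = {}
    # Handle a trailing game that has headers but no movetext.
    if {"White", "Black", "Result"} <= tags.keys():
        yield tags["White"], tags["Black"], tags["Result"]

def score_table(directory):
    """Sum points and games per engine over every *.pgn file in a directory."""
    totals = defaultdict(lambda: [0.0, 0])  # engine -> [points, games]
    for path in sorted(Path(directory).glob("*.pgn")):
        text = path.read_text(encoding="latin-1")  # assumed encoding
        for white, black, result in read_games(text):
            if result not in POINTS:
                continue  # skip unfinished games ("*")
            w_pts, b_pts = POINTS[result]
            totals[white][0] += w_pts
            totals[white][1] += 1
            totals[black][0] += b_pts
            totals[black][1] += 1
    return totals

def write_table(totals, out_file):
    """Write one tab-separated line per engine, best score percentage first."""
    rows = sorted(totals.items(), key=lambda kv: kv[1][0] / kv[1][1], reverse=True)
    with open(out_file, "w", encoding="utf-8") as fh:
        for engine, (points, games) in rows:
            fh.write(f"{engine}\t{points:.1f}\t{games}\t{100 * points / games:.1f}%\n")
```

For example, `write_table(score_table("initial-games"), "scores.txt")` would produce one table file (the output name is hypothetical) that a web page could then include, keeping the layout separate from the computation.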

The script also computes engine correlation tables, like ones on this page, but that needs chessbase comments in PGN files. I think it would be very interesting to see such data for CEGT too, so I wonder, is it possible to collect and analyze all games with comments?

Heinz van Kempen wrote:first of all I want to ask Kirill whether I may put his fine statistics on my webpage.

Sure, use them however you like! I can also volunteer to re-run the statistics for any updated games. I will release my script for free once it is in a slightly more finished state; for now I am still fixing and adding things...

Re: Download > 37000 games CEGT 40/40

Postby Heinz van Kempen » 27 Sep 2005, 14:46

Hi Kirill :),

this is a great offer to update CEGT data from time to time. From tomorrow onwards I will provide daily combined pgn files, stripped of comments, that only have to be added to the huge file.

Regrettably the same file with comments would be around ten times bigger, so it is impossible to upload it, at least here.

New interesting games like the Fritz 9 test are of course with comments. The big Fruit test should also be still available for download with comments and Fabien would surely be interested to get data out of it.

Michael and Christian are very interested in statistical data (and I also love to play with numbers) and they will care for filling the gaps like you proposed.

Please write to me (click the contact button on my webpage and replace at by @ ) and I will gladly forward your proposals to all testers.

Best Regards
Heinz

Re: Download > 37000 games CEGT 40/40

Postby Heinz van Kempen » 27 Sep 2005, 15:39

Hi Kirill :) ,

thanks again. The stats are uploaded to the ratings page.

http://www.husvankempen.de/nunn/rating.htm

New data and suggestions highly welcome at any time.

Best Regards
Heinz

