A New PGN Collection - CCG.pgn

Posted: 23 Jul 2007, 23:42
by Norm Pollock
My new pgn collection is CCG.pgn. The link is at Jim Ablett's site:

http://homepages.tesco.net/henry.ablett/jims.html

I selected long time-control computer chess games between the best engines in CCRL 40/40 and CEGT 40/120, about 30k games altogether. I made some slight modifications, such as standardizing engine/version names and adding new Elo and ECO values. All games have at least 51 plies. There are no duplicate games. Move-wrapping and excess tags have been removed. The download is 7.9M.
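For anyone who wants to apply similar filters to their own files, here is a minimal Python sketch of the ply-count and duplicate checks. It is not the tool used to build CCG.pgn. It assumes each game's movetext is already available as one string, that variations are absent, and that "duplicate" means an identical move sequence; the names san_moves and select_games are made up for illustration.

```python
import re

RESULTS = {"1-0", "0-1", "1/2-1/2", "*"}

def san_moves(movetext):
    """Return the moves in a movetext string, skipping {comments},
    $NAGs, move numbers and the game result."""
    text = re.sub(r"\{[^}]*\}", " ", movetext)  # drop brace comments
    moves = []
    for tok in text.split():
        if tok in RESULTS or tok.startswith("$"):
            continue
        tok = re.sub(r"^\d+\.+", "", tok)  # strip "12." or "12..." prefix
        if tok:
            moves.append(tok)
    return moves

def select_games(movetexts, min_plies=51):
    """Keep games with at least min_plies plies, dropping any game whose
    move sequence was already seen (assumed definition of a duplicate)."""
    seen = set()
    kept = []
    for mt in movetexts:
        moves = tuple(san_moves(mt))
        if len(moves) < min_plies or moves in seen:
            continue
        seen.add(moves)
        kept.append(mt)
    return kept
```

Comparing move sequences rather than raw text means two copies of a game that differ only in spacing or comments are still treated as duplicates.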

The collection is suitable for making opening books. It complements my collections of human-human games (the Grand pgns), whose links are also available on Jim's site.

Re: A New PGN Collection - CCG.pgn

Posted: 24 Jul 2007, 10:37
by Volker Pittlik
Norm Pollock wrote:...
I selected long time-control computer chess games between the best engines in CCRL 40/40 and CEGT 40/120, ... new elo and eco values... at least 51 plies. No duplicate games...


I made something similar yesterday and had some trouble, especially with the CEGT games. Your tools were very useful. It seems the SCID derivative does not use the ratings file properly, so I used your embed tool for the same purpose.

Is there any way to make it use a BayesElo rating output? The output is very similar to EloStat. The only difference I noticed is a ":".

And (entering impertinent mode): I found more than 500 (!) different "Event" tags in the CEGT 40/20 pgn, not to mention the "Site" tag. I tried to change them to [Event "CEGT 40/20"] and [Site "CEGT"] with an editor, which ended in endless swapping. Is there a way to do that with your tools that I have missed so far?

Volker

Re: A New PGN Collection - CCG.pgn

Posted: 24 Jul 2007, 14:36
by Norm Pollock
Volker Pittlik wrote:
Norm Pollock wrote:...
I selected long time-control computer chess games between the best engines in CCRL 40/40 and CEGT 40/120, ... new elo and eco values... at least 51 plies. No duplicate games...


I made something similar yesterday and had some trouble, especially with the CEGT games. Your tools were very useful. It seems the SCID derivative does not use the ratings file properly, so I used your embed tool for the same purpose.

Is there any way to make it use a BayesElo rating output? The output is very similar to EloStat. The only difference I noticed is a ":".


The "embed" program is custom-made for Elostat 1.3. It takes into account many peculiarities of 1.3 that appear when certain limits are reached. For BayesElo, I would have to write another program.

Volker Pittlik wrote:
And (entering impertinent mode): I found more than 500 (!) different "Event" tags in the CEGT 40/20 pgn, not to mention the "Site" tag. I tried to change them to [Event "CEGT 40/20"] and [Site "CEGT"] with an editor, which ended in endless swapping. Is there a way to do that with your tools that I have missed so far?

Volker


The CEGT 40/20 pgn is very difficult to work with because there is a lot of non-conformity as well as other problems.

What about [Event "*"] and [Site "*"]? I have seen [Event "?"] and [Site "?"] before, but [Event "*"] and [Site "*"] are new, at least to me.

To avoid the swapping, I would suggest a custom-made program that replaces all existing Event and Site tags with user-supplied tag values. A text editor goes nuts with so much search/replace on such a large file.
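A minimal sketch of such a program in Python, offered only as an illustration and not as one of the existing tools. It reads the file one line at a time, so memory use stays small no matter how large the pgn is. The function names are made up, and it assumes every tag sits on its own line and that the file can be read as Latin-1.

```python
def rewrite_tags(lines, event, site):
    """Yield lines with every Event and Site tag replaced by the
    supplied values. Assumes each tag occupies its own line."""
    for line in lines:
        if line.startswith("[Event "):
            yield f'[Event "{event}"]\n'
        elif line.startswith("[Site "):
            yield f'[Site "{site}"]\n'
        else:
            yield line

def rewrite_file(src_path, dst_path, event, site):
    """Stream src_path to dst_path, never holding the whole file in memory."""
    # Latin-1 is an assumption; it maps every byte, so nothing is lost.
    with open(src_path, encoding="latin-1") as src, \
         open(dst_path, "w", encoding="latin-1") as dst:
        dst.writelines(rewrite_tags(src, event, site))
```

A call such as rewrite_file("in.pgn", "out.pgn", "CEGT 40/20", "CEGT") would write a cleaned copy and leave the original file untouched (both file names here are placeholders).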

Thanks for two ideas for possible future programs: BayesElo insertion and an Event/Site replacement program. The most difficult part of writing programs is coming up with the idea!

-Norm

Re: A New PGN Collection - CCG.pgn

Posted: 24 Jul 2007, 19:48
by Ron Murawski
Norm Pollock wrote:A text editor goes nuts with so much search/replace on such a large file.

-Norm


The free Context Editor has been able to edit large files for me.
http://www.context.cx/
I have not tried search-and-replace using Context on this particular pgn collection though. YMMV.

Ron

Re: A New PGN Collection - CCG.pgn

Posted: 25 Jul 2007, 01:23
by Norm Pollock
Volker Pittlik wrote:...I used your embed tool for the same purpose.

Is there any way to make it use a BayesElo rating output? The output is very similar to EloStat. The only difference I noticed is a ":".

Volker


Done, I hope! But I would appreciate some beta testing by anyone interested.

This utility, embayes.exe, takes elo values created by bayeselo.exe and inserts them into a pgn file.

embayes.exe is based on embed.exe, which does a similar service for elo values created by elostat 1.3. embed.exe is part of 40H which can be downloaded from Jim Ablett's site.

embayes.exe takes elo values from a text file created by bayeselo.exe, and inserts those values into a pgn file. Usually it is the pgn file that was used to create the elo values. However, where the pgn file has an existing elo value in place, that elo value is not replaced.
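The "insert but never replace" rule can be sketched in Python as follows. This is not the embayes source, only an illustration under stated assumptions: the ratings are already parsed into a name-to-elo dictionary (the bayeselo output format is not handled here), a tag counts as "existing" whenever a WhiteElo or BlackElo line is present, and the function name insert_elos is made up.

```python
import re

TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')

def insert_elos(lines, ratings):
    """Return a copy of the pgn lines with WhiteElo/BlackElo tags added
    at the end of each game's header when they are missing.
    Existing elo tags are left exactly as they are."""
    out = []
    tags = {}          # tags of the header currently being read
    in_header = False
    for line in lines:
        m = TAG_RE.match(line)
        if m:
            in_header = True
            tags[m.group(1)] = m.group(2)
            out.append(line)
            continue
        if in_header:
            # First non-tag line: the header is complete, so add what is missing.
            for side in ("White", "Black"):
                elo_tag = side + "Elo"
                name = tags.get(side)
                if elo_tag not in tags and name in ratings:
                    out.append(f'[{elo_tag} "{ratings[name]}"]')
            tags = {}
            in_header = False
        out.append(line)
    return out
```

The lines are expected without trailing newlines, for example from str.splitlines(). Players that do not appear in the ratings dictionary simply get no elo tag.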

Exact directions on how to use embayes.exe are in an included readme file.

Also included is embayes.class for Linux users and anyone else with a Java runtime.

One further note. I also have another utility in 40H that removes all existing elo values in a pgn file. This may be helpful. It is called cleanelo.exe.

http://www.mydatabus.com/public/ten/embayes.rar

Re: A New PGN Collection - CCG.pgn

Posted: 26 Jul 2007, 12:14
by Volker Pittlik
Norm Pollock wrote:...
Done, I hope! But I would appreciate some beta testing by anyone interested.
...


Thanks Norm!

The tool works as described, and I didn't notice any bugs, crashes, or anything else that shouldn't happen.

Volker

Re: A New PGN Collection - CCG.pgn

Posted: 26 Jul 2007, 18:09
by jdart
I also have a large PGN computer game collection, available here:

http://www.arasanchess.org/ccomp.zip

It also contains long time-control games. There is probably some overlap with yours, but not a lot. There are about 77,000 games in this file.

--Jon