Moderator: Andres Valverde
Michel wrote:I wrote an extension to polyglot to dump polyglot books in human readable format.
I need to do a little bit of code cleanup but it is late. In any case here you can see
the output for gm2600.bin, one of the books that comes with Scid.
Lines for white:
http://alpha.uhasselt.be/Research/Algeb ... _white.txt
Lines for black:
http://alpha.uhasselt.be/Research/Algeb ... _black.txt
Regards,
Michel
Lines for white : 15177
Lines for black : 9944
White positions : 28164
Black positions : 18974
Unreachable positions : 30734
A B C D E F G H (1-0)
C B A D E F G H (1-0)
If a position is in both books, take the moves from the first one.
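The merge rule above can be sketched in a few lines. This is only a toy model and not PolyGlot's own code: a book is assumed to be a Python dict mapping a position key to a {move: weight} dict, and the function name `merge_books` and the position labels are made up for the illustration.

```python
def merge_books(first, second):
    """Merge two toy books; on a shared position the first book's moves win."""
    # Start from a copy of the second book...
    merged = {key: dict(moves) for key, moves in second.items()}
    # ...then let every position of the first book overwrite it.
    for key, moves in first.items():
        merged[key] = dict(moves)
    return merged


# Hypothetical position labels; a real book would use 64-bit hash keys.
first = {"start": {"e2e4": 10}, "after_e4_e5": {"g1f3": 5}}
second = {"start": {"d2d4": 7}, "after_d4_d5": {"c2c4": 3}}

merged = merge_books(first, second)
for key in sorted(merged):
    print(key, merged[key])
```

In this toy example the merged book keeps "after_d4_d5", but from "start" it only knows e2e4, so a White player who sticks to the merged book never gets there. That is one way such a rule can leave positions off every line.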
Michel wrote:I think it should be safe to delete unreachable positions from the book.
I would appreciate some comments from PG book authors on this. In any case I will write a utility (an extension of polyglot) to do this.
There is no reason why these unreachable positions should remain in the merged book. I think.
Marc Lacrosse wrote:Michel wrote:I think it should be safe to delete unreachable positions from the book.
I would appreciate some comments from PG book authors on this. In any case I will write a utility (an extension of polyglot) to do this.
There is no reason why these unreachable positions should remain in the merged book. I think.
Hi Michel
I have exactly the opposite view on this topic.
What you call "unreachable positions" are not unreachable at all!
They just were not reached in the subset of games that have been retained within the book.
This does not mean at all that a game following a slightly different path than the ones explicitly followed by the selected games could not reach one of these "unreachable" positions.
It could be the bookmaker's wish that, in this precise case, his book should include some knowledge for the conduct of this game and analogous ones.
And secondly these "unreachable positions" do not do any harm in actual play. They even prove useful when your engine faces an opponent that uses unusual transpositional paths to enter his favorite openings.
Books are not about games, lines or any sequences of moves. They are actually collections of positions together with some knowledge regarding the optimal continuation from each of these positions.
If you consider that book building is about lines you have to be exhaustive regarding any paths to a given position when you build a book and that's a pretty impossible task. Older programs like Yace or Pharaon had only this kind of book building process (you had to collect recommended sequences of moves, not just positions with favored moves for each of them).
As a bookmaker I may notice that in one precise position my engine tends to choose a wrong move. I may wish to add this precise position with my recommended moves to the book without any consideration for the path or paths that could lead to this precise position.
This whole line of argument was extremely important for Fabien when he defined the book functions to be integrated in Fruit and Polyglot!
In fact the one and only negative aspect of what you call "unreachable" positions is that any attempt at a dump in the form of sequences of moves will fail!
So if you build a utility for eliminating positions toward which there is no explicit path within the book, you will actually lower the amount of knowledge that is included in this book, and I am ready to bet that, for example, a version of performance.bin that you would prune this way would perform badly compared with the original.
I suppose I would almost never use such a utility myself.
Marc
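The "collection of positions" view matches the on-disk layout of a Polyglot book: each entry is 16 bytes, big-endian, holding a 64-bit position key, a 16-bit move, a 16-bit weight and a 32-bit learn field. The sketch below decodes such entries from a bytes object. The helper names `decode_move` and `read_entries` and the key value are made up for the illustration, and castling (stored as king-to-rook-square) is not translated.

```python
import struct

# One book entry: key (uint64), move (uint16), weight (uint16), learn (uint32).
ENTRY = struct.Struct(">QHHI")
FILES = "abcdefgh"
PROMOTION = ["", "n", "b", "r", "q"]


def decode_move(move):
    """Turn the 16-bit move field into coordinate notation such as 'e2e4'."""
    to_file = move & 7
    to_row = (move >> 3) & 7
    from_file = (move >> 6) & 7
    from_row = (move >> 9) & 7
    promo = (move >> 12) & 7
    suffix = PROMOTION[promo] if promo < len(PROMOTION) else "?"
    return (FILES[from_file] + str(from_row + 1)
            + FILES[to_file] + str(to_row + 1) + suffix)


def read_entries(data):
    """Return (key, move, weight) for every 16-byte entry in data."""
    return [(key, decode_move(move), weight)
            for key, move, weight, _learn in ENTRY.iter_unpack(data)]


# Build one entry in memory; the key 0x1234 is a placeholder, not a real hash.
e2e4 = (1 << 9) | (4 << 6) | (3 << 3) | 4
data = ENTRY.pack(0x1234, e2e4, 10, 0)
print(read_entries(data))
```

An entry carries a position key and one move, with no reference to how the position was reached, so nothing in the format ties a position to a path from the starting position.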
- "-only-white" *** NEW ***
Save only white moves. This allows to use different parameters for
white and black books, and merge them into a single file with the
"merge-book" command, see below.
- "-only-black" *** NEW ***
Same for black moves.
I guess I am overlooking something, but anyway:
I thought all positions are entered in the book with/via pgn files.
So how can there be unreachable positions in it?
Michel wrote:
only-white and only-black do NOT produce unreachable positions by themselves. They simply produce books without black or white lines.
EDIT: perhaps I should be clear what I mean by a line. A line for white is a sequence
of moves from the starting position containing only book moves for white and arbitrary moves for black. A line for black is defined similarly.
Regards,
Michel
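The definition of a line for white can be turned into a small search. The following is a sketch on a made-up toy game, not the code of the dump utility: `legal` is assumed to map each position to its legal moves and resulting positions, `book` maps white-to-move positions to their book moves, and positions whose name starts with "w" have white to move.

```python
from collections import deque


def positions_on_white_lines(start, legal, book, white_to_move):
    """Book positions reachable when White plays only book moves
    and Black plays any legal move."""
    seen = {start}
    queue = deque([start])
    on_lines = set()
    while queue:
        pos = queue.popleft()
        if white_to_move(pos):
            if pos not in book:
                continue  # White is out of book: the line ends here.
            on_lines.add(pos)
            moves = book[pos]
        else:
            moves = legal.get(pos, {}).keys()
        for move in moves:
            nxt = legal[pos][move]
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return on_lines


# Toy game tree with hypothetical position names.
legal = {
    "w0": {"e4": "b1", "d4": "b2"},
    "b1": {"e5": "w1"},
    "b2": {"d5": "w2"},
    "w1": {"Nf3": "b3"},
    "w2": {"c4": "b4"},
}
book = {"w0": {"e4"}, "w1": {"Nf3"}, "w2": {"c4"}}

on_lines = positions_on_white_lines("w0", legal, book, lambda p: p.startswith("w"))
off_lines = set(book) - on_lines
print(sorted(on_lines), sorted(off_lines))
```

Here "w2" is in the book but on no line for white, because the book never plays d4 from "w0".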
Michel wrote:Hi Marc,
I fully agree that an opening book is about positions and not lines. I am NOT somehow trying to subvert the PG format in terms of lines. But a view in terms of lines may be
beneficial since humans tend to think in those terms.
Michel wrote:The question is how we should view isolated positions (positions not on any line). I DO understand your points in this matter. For the benefit of others let me try to summarize them.
(1) MY POINT: I think that many isolated positions produced by the PG utilities are provably unreachable in the sense that they cannot arise on the board if one of the players sticks to the book.
YOUR POINT: Even if a position is provably unreachable it still represents knowledge about the game. And it may become reachable if we introduce for example a new move. Furthermore one may very well introduce isolated positions manually which are not provably unreachable and thus beneficial.
(2) MY POINT: Isolated positions are not discoverable. You cannot know what they are until they arise in actual gameplay.
YOUR POINT: Not a problem.
(3) YOUR POINT: I am convinced "performance.bin" will be much weaker if we remove the isolated positions (and it will certainly not become stronger).
MY POINT: Hmm... I don't know. But the isolated positions take up almost half the book (if I did not make a mistake). So it would be interesting to know what their true benefit is.
This last point can of course only be resolved by testing. So I will make a version with the isolated positions pruned. It would only be for experimental purposes. I hope you are not offended by that.
Regards,
Michel
Michel wrote:Just to add one more ingredient to this discussion. I can confirm that it is the merge utility that creates huge numbers of isolated positions (which are probably provably unreachable).
Merging two pgn files and then creating a book gives quite different results from first making two books and then merging them.
Michel wrote:Here are two separate proposals for enhancing the merge utility which do not create isolated positions.
(1) When a position appears in both books, give the moves from the second book that are not in the first book weight zero. This is equivalent to the current behavior.
(2) Make the probabilities in the merged book the average of the probabilities in the original books.
These could be options to the "merge-book" command.
Michel
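Both proposals can be sketched on the same kind of toy book, a dict from position key to {move: weight}. This is only an illustration of the idea and not a patch for "merge-book"; the function names are made up, and a real book would need the averaged probabilities scaled back to 16-bit integer weights.

```python
def probabilities(moves):
    """Normalise weights so they sum to 1 (empty if all weights are zero)."""
    total = sum(moves.values())
    return {m: w / total for m, w in moves.items()} if total else {}


def zero_weight_merge(first, second):
    """Proposal (1): keep the first book's weights on shared positions and
    add the second book's extra moves with weight zero."""
    merged = {key: dict(moves) for key, moves in second.items()}
    for key, moves in first.items():
        combined = dict(moves)
        for move in second.get(key, {}):
            combined.setdefault(move, 0)
        merged[key] = combined
    return merged


def average_merge(first, second):
    """Proposal (2): on shared positions, average the move probabilities."""
    merged = {}
    for key in first.keys() | second.keys():
        if key in first and key in second:
            p1, p2 = probabilities(first[key]), probabilities(second[key])
            merged[key] = {m: (p1.get(m, 0.0) + p2.get(m, 0.0)) / 2
                           for m in p1.keys() | p2.keys()}
        else:
            merged[key] = probabilities(first[key] if key in first else second[key])
    return merged


first = {"start": {"e2e4": 3, "d2d4": 1}}
second = {"start": {"d2d4": 1, "c2c4": 1}}
print(zero_weight_merge(first, second))
print(average_merge(first, second))
```

With these inputs the averaged book gives e2e4 and d2d4 a probability of 0.375 each and c2c4 0.25, while the zero-weight merge keeps the first book's weights and lists c2c4 with weight 0.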
As you say yourself, option "1" is equivalent to the current behavior.
Hmm, it is difficult to predict the practical effect of all this averaging.
./polyglot info-book -bin /usr/local/share/scid/books/Performance.bin -exact
PolyGlot 1.4 by Fabien Letouzey
Lines for white : 15177
Lines for black : 9944
Positions on lines for white : 28164
Positions on lines for black : 18974
Unreachable white positions(?) : 10865
Unreachable black positions(?) : 11127
Isolated positions : 8742
./polyglot info-book -bin spitaleri.bin -exact
PolyGlot 1.4 by Fabien Letouzey
Lines for white : 9749
Lines for black : 14000
Positions on lines for white : 84822
Positions on lines for black : 122999
Unreachable white positions(?) : 54351
Unreachable black positions(?) : 27467
Isolated positions : 852
./polyglot info-book -bin /usr/local/share/scid/books/gm2600.bin -exact
Lines for white : 5360
Lines for black : 5459
Positions on lines for white : 7750
Positions on lines for black : 8143
Unreachable white positions(?) : 117
Unreachable black positions(?) : 116
Isolated positions : 194
./polyglot info-book -bin /usr/local/share/scid/books/varied.bin -exact
Lines for white : 18426
Lines for black : 13435
Positions on lines for white : 34388
Positions on lines for black : 25592
Unreachable white positions(?) : 4472
Unreachable black positions(?) : 4343
Isolated positions : 8817
./polyglot info-book -bin /usr/local/share/scid/books/Elo2400.bin -exact
Lines for white : 46005
Lines for black : 48602
Positions on lines for white : 51081
Positions on lines for black : 55459
Unreachable white positions(?) : 691
Unreachable black positions(?) : 799
Isolated positions : 1021
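To compare the books, the counts above can be turned into a share of positions that lie on no line. The sketch below assumes that the five position categories in the output are disjoint and together make up the whole book; that is an assumption about the output and is not stated in the thread. The function name `off_line_share` is made up.

```python
# (on white lines, on black lines, unreachable white, unreachable black, isolated)
books = {
    "Performance.bin": (28164, 18974, 10865, 11127, 8742),
    "spitaleri.bin": (84822, 122999, 54351, 27467, 852),
    "gm2600.bin": (7750, 8143, 117, 116, 194),
    "varied.bin": (34388, 25592, 4472, 4343, 8817),
    "Elo2400.bin": (51081, 55459, 691, 799, 1021),
}


def off_line_share(on_white, on_black, unreach_white, unreach_black, isolated):
    """Return (positions off every line, their share of all positions),
    assuming the five categories do not overlap."""
    off = unreach_white + unreach_black + isolated
    return off, off / (on_white + on_black + off)


for name, counts in books.items():
    off, share = off_line_share(*counts)
    print(f"{name}: {off} positions off the lines ({share:.1%})")
```

For Performance.bin the three off-line categories add up to 30734, the same figure as the "Unreachable positions" count in the earlier output.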
Michel wrote:I improved my dump utility a bit. It now also shows probabilities.
...
Recall that these are the lines obtained when one (not both) of the players makes only book moves. These dumps do not reflect isolated positions.
Will you publish the dump utility too?
Can your utility display the unreachable positions?
If you dump them as EPD strings, I think it would be interesting to examine them.
It seems a great puzzle that they would be entered into the book if they cannot be achieved by retrograde somehow.