Winboard Forum

by **Stef Luijten** » 21 Jan 2006, 18:33

Hello everybody,

I am considering to write a EGTB generator myself (just for fun).
Is there a possibility/trick/workaround in C++ on a 32-bit machine to create arrays with 64-bit addresses (unsigned __int64)?
Some of the 6-man tables will need more than 32 bit addresses.....

Thanks for any input!
Stef

by **Gian-Carlo Pascutto** » 21 Jan 2006, 20:01

Uhm, in 32 bits mode you have an address space of maximally 2G or 3G. There's no way to make an array so big that you would need >32 bit addresses!

by **H.G.Muller** » 21 Jan 2006, 20:03

Well, in the first place you would of course need more that 4GB of memory to make it sensible to inplement the TB as a memory array. Otherwise you would have to swap the (virtual) memory to disk all the time, and it is probably better to decide by hand wat you want to swap out as part of your TB generation algorithm than rely on the operating system to make the right choice.

In principle Pentiums can handle more that 4G of memory, to this end there are (model-specific) memory type and range registers (MTRR) in the CPU, supplying the extra address bits that can not be derived from the 32-bit address mode. But you need to be in kernel mode to access them, so you would have to write your own memory drivers. It does not seem worth it.

What TBs would you be interested in? If they contain pawns it is comparatively easy to slice up the TB in fragments that never have to reside in memory simultaneously, as I posted elsewhere. Then a 32 -bit machine would be more than enough, even for 8-men TB's...

by **Stef Luijten** » 21 Jan 2006, 20:45

H.G.Muller wrote:... What TBs would you be interested in? ...

Not one specifically. I have deviced an indexing scheme that would require for instance 17,427,972,240 entries for a KQPKRB table, which is more than 32-bits addressing can handle (max 4,294,967,295).

5-man are all fine, for instance KQRKB would need 99,764,280 entries and KPPKQ would need 115,184,874 entries.

Stef

by **H.G.Muller** » 22 Jan 2006, 14:14

For KQPKRB, ignoring the pawn and not using symmetry, even the most obvious indexing scheme requires only 64^5 = 1GB if you store DTM or DTC for only one side (I usually tak btm). With the pawn you then make use of the fact that the pawn has only one move, i.e. to fill the TB with the pawn on e5 you only have to work backwards from positions
1) without the pawn (KQKRB)
2) with the pawn on e6
3) with the pawn on d6 in KQPKR
4) with the pawn on d6 in KQPKB
5) with the pawn on f6 in KQPKR
6) with the pawn on f6 in KQPKB

from these (2) is the largest (also 1GB), since (1) can use full symmetry, reducing it to ~130MB), and (3)-(5) are only 16MB each. So together with the fragment you are working on you only need 2.2GB, which fits well within the address space of a 32-bit machine. After you completed the fragment with the pawn on e5, you write it to disk in whatever (compressed) format you like, but keep it in memory, where you next use it to work your way back to the fragment of KQPKRB with the pawn on e4 in the 1GB of memory that was released by discarding (2), after replacing (3)-(6) by the corresponding selections of KQPKx with the pawn on d5 or f5.

This minimizes disk I/O to hardly anything more than what you would have to do anyway (to save the result). If reading the 16MB fragments of KQPKx is still eating significant time, you probably have enough RAM left to store the entire d and f file of those endings, and after you completed the e-file fragment of KQPKRB you can work next on the g file, which also needs the f-file data of KQPKx, so that you don't have to read that twice.

The best indexing scheme to implement this is probably to use the pawn position in the highest-order bits of the index, so that you can mask them off to access the data in your 1GB buffer of choice.

by **Sven Schüle** » 22 Jan 2006, 22:49

H.G.Muller wrote:For KQPKRB, ignoring the pawn and not using symmetry, even the most obvious indexing scheme requires only 64^5 = 1GB if you store DTM or DTC for only one side (I usually tak btm).

Hi H.G.,

I do not understand your numbers. Do you mean "1GB after compression"? 64^5 = 2^30 = ~ 10^9 but for DTM/DTC this would require e.g. 128GB (uncompressed) when using 1 byte per position.

1GB or 128GB, seems to make a difference, doesn't it?

Sven

by **H.G.Muller** » 23 Jan 2006, 09:59

H.G.Muller wrote:For KQPKRB, ignoring the pawn and not using symmetry, even the most obvious indexing scheme requires only 64^5 = 1GB

Hi Sven,

Indeed 1GB or 128GB makes a huge difference, which is exactly my point: The 1GB pertains to the number of positions that have the pawn fixed on a certain square (e5 in the example). So there are only 5 pieces left that can be anywhere, i.e. 64^5 positions = 1GB without compression. So the memory requirement is significantly less than the 128GB you mention (shouldn't this be 64GB, by the way?).

So the proposed scheme indeed leads to huge memory savings without any performance penalty: it doesn't do a single disk access that you would not have had to do if you had used 128GB of RAM (for writing the result or accessing the end-games that you can convert to: KQKRB, KQPKR, KQPKB and KQQKRB, and I guess also KPKRB, which I forgot to mention earlier and probably is only relevant with the pawn on the 7th rank, and a sure loss with the pawn on e5).

With only 2.5GB of DRAM you can run at a speed as if the TB would have fit entirely in DRAM. And that is fast, because you don't have to worry about symmetry either. The left-right symmetry you can exploit afterwards, by copying the computed results with the pawn on e6 to all TB positions with the pawn on d6. (Of course it would be smarter not to burden the TB with this redundant information, but let the access routine perform the mirroring when you use the TB, and retrieve the data from the half that you store.)

by **Sven Schüle** » 24 Jan 2006, 00:11

Hi H.G.,

o.k., I ignored that you ignored the pawn and store only with black to move. Indeed 64^5 positions need 64^5 bytes = 1GB. Then everything else becomes much clearer, too.

Did you really try it, i.e. have you implemented it, and if yes, does it work? I assume yes, otherwise you wouldn't have described it in detail.

Sorry for my yesterday's late-night blindness ...
Sven

by **H.G.Muller** » 24 Jan 2006, 10:36

Well, I am working on building a generator that works exactly along the lines described above. The description is the design that came out after some trials with earlier versions. My PC has only 0.5GB of memory, though, so I still could not do KqpKrb even if my code could in principle do it. My interest goes out to end-games like KrpKrp, KbpKnp, or even KrppKrp and KbpppKr. Ignoring the pawns there remain only 4 pieces, resulting in 16M positions for each pawn formation. This is just as well, because - unlike the example above with a single pawn - from every position you can go to (or come from) several positions with another pawn formation, by advancing each of the pawns. So in KbpppKr to make the fragment of the TB with pawns on e3,d4 and g4 you would have to consult the (e4,d4,g4), (e3,d5,g4) and (e3,d4,g5) fragment of that same TB, and to do that efficiently all three have to be kept in memory. And you don't want to discard them immediately after you're done, because sometimes they are needed again later. So with multiple pawns you need to hold many fragments in memory simultaneously, but if the fragments are only 16MB that is no problem even if your DRAM totals only 512M.

Although TBs like KbppKnp are of course huge, in a specific match you would need only a very small part of it, because you know where your pawns are in the current position (e.g. that they are e,d and g pawns, so that only 1/8x8x8 = 1/512 of the TB can be of relevance, even much less if the pawns are already advanced. So I hope it will be able to build the part of the TB that it needs during the match (e.g. while pondering). This seems a more realistic approach than having to pre-compute and store all complete 7-men TBs. From my previous experiments I know that a 16MB fragment with KbKn can be built in ~10 sec.

Main obstacle at the moment is that to build a TB with pawns (let's stick to the KqpKrb example), you need to know which queening positions are won, e,g, you need KqqKrb. This is a much harder nut to crack (despite the fact that you can make use of 8-fold symmetry then). I still don't feel like maintaining a library of all such TBs, even if you would need to store only the fragments that can result from queening, i.e. with one of the Queens on the 8th rank.

So my research at this time focuses on how I can efficiently obtain the won queening positions that I need as a starting point for building the TB with pawns (KqqKrb with one Queen on e8 in the example). I want to solve that before I actually implement the described algorithm, because for my intended application of the TB generator this is a real obstacle. That would not have to stop people that want to build TBs for storage on disk, though.

Fortunately the situation seems far from hopeless. This is mainly because in an end-game that is interesting before queening, you have an overwhelming advantage after queening. (Unless both sides promote a Pawn at the same time, in which case it is almost always a draw.) This means that any position that is won, usually remains won if you change the rules a little bit to give the winning side a handicap. With KqqKrb the handicap could be that the side with the Queens has to keep one of the Queens on the promotion square as long as you stay in KqqKrb. This handicap means that you confine the game to a fragment of KqqKrb that is only as large as KqKrb: one of the Queens does not participate other than by capturing enemy pieces (which would leave KqqKrb by conversion to KqqKx, also small). At the moment I cannot imagine positions that stay in KqqKrb for several moves (i.e. no immediate conversion to KqKrb because one of the Queens was hanging) and that cannot be won even with the handicap of only using one Queen and keeping the other as a 'spare' that becomes fully active after you sacrifice the moving one. The very small fraction of promotion positions that does not get labelled as 'won', you can probably analyze by more expensive means without much impact on performance.

Winboard Forum

Arrays with 64-bit addresses on a 32 bit machine?

Arrays with 64-bit addresses on a 32 bit machine?

Re: Arrays with 64-bit addresses on a 32 bit machine?

Re: Arrays with 64-bit addresses on a 32 bit machine?

Re: Arrays with 64-bit addresses on a 32 bit machine?

Re: Arrays with 64-bit addresses on a 32 bit machine?

Re: Arrays with 64-bit addresses on a 32 bit machine?

Re: Arrays with 64-bit addresses on a 32 bit machine?

Re: Arrays with 64-bit addresses on a 32 bit machine?

Re: Arrays with 64-bit addresses on a 32 bit machine?

Who is online