Moderator: Andres Valverde
vancekic wrote:So I get the same index in the table and I am about to overwrite the previous value. Shall I implement a strategy to overcome this issue (like linked list) or is this absolutely not normal and should be very unlikely?
Most primitive way is simply to overwrite the old entry, under the philosophy that newly generated positions are more likely to be still relevant than older ones. This way is completely safe: your table will never be filled up with obsolete entries.
You might need a larger hash table to get the same hit-rate, though.
Just to make sure you have this right: You certainly store the rest of
your hash key in the hash entry, do you? That is rest = key / 20940347.
index = hashFromCurrentPosition % 20940347;
HashPosition* hashPosition = m_HashTable[index];
if (hashPosition)
{
if (hashPosition->key != index)
{
// that's the case where I am getting a hash collision
// two different positions with two different keys
// hash to the same index in the hash table
// I will just replace the old one by the new one
m_HashTable[index] = hashPosition;
}
else
{
// we found a hit, return the value we stored
// if we're at the same depth kind of logic, etc etc...
}
}
else
{
// otherwise, create new entry
m_HashTable[index] = new HashPosition;
}
vancekic wrote:I do:
- Code: Select all
index = hashFromCurrentPosition % 20940347;
HashPosition* hashPosition = m_HashTable[index];
if (hashPosition)
{
if (hashPosition->key != index)
{
// that's the case where I am getting a hash collision
// two different positions with two different keys
// hash to the same index in the hash table
// I will just replace the old one by the new one
m_HashTable[index] = hashPosition;
}
else
{
// we found a hit, return the value we stored
// if we're at the same depth kind of logic, etc etc...
}
}
else
{
// otherwise, create new entry
m_HashTable[index] = new HashPosition;
}
There's a lot of confusion between the terms index, hash code, hash key, key, etc...
delete m_HashTable[index];
m_HashTable[index] = new HashPosition;
// fill new entry with data, store the key, ...
if (m_HashTable[index].key != 0)
{
...
}
if (existingHashPosition->key != index)
if (existingHashPosition->key != hashKeyFromCurrentPosition)
vancekic wrote:You're right for the memory leak, it's meant to be deleted before being overwritten.
In fact I will revert to an array of plain members and not pointers. My exe will just increase 300+ Mb but who cares It's much less error prone.
The point was not the memory leak but that your code snippet does not look as if you do a "new" when overwriting, instead you store the old pointer over itself. But now I guess your real code does this right Smile
The size of your exe should not increase. Instead, at runtime it will even use less memory than before by not storing 20M pointers + 20M allocated entries of 16 byte each but only one array of 20M * 16 byte. You might as well do one "new" for the whole "m_HashTable" at program startup, which would also be future proof when thinking of a configurable hash table size. (Of course you would need different prime numbers for different table sizes.)
vancekic wrote:The exe DOES increase - not the exe size mind you, I meant its memory usage, because it stores a 300Mb struct - try it for yourself, check your task manager.
Why would I need a configurable hash table size? What does this achieve?
H.G.Muller wrote:This is quite normal. With 5400 entries you have 5400 x 5399 / 2 = 14.5M pairs of entries. The probability that there is a collision within a pair (i.e. the two positions in the pair map accidentally to the same table entry) is ~1/21M. So there already is a 14.5/21 = ~65% probability that one of the pairs will collide. And that is what you see.
Return to Programming and Technical Discussions
Users browsing this forum: No registered users and 6 guests