Hash Function, denoted as \(h = H(M)\), is an unkey function that condense arbitrary lengths of value into a fixed size output (fingerprint). It could be used as a mapping function of #Hash Table key into an appropriate cell, in #Digital Signature for authentication#, creating one-way password, detecting intrusion or virus, or construct pseudorandom# function (PRF) or a pseudorandom number generator (PRNG).
Attention
- Hash Function should be easy to compute
- Needs to handle Hash Collision#
-
There are three practical requirements for Hash Function
- variable input size (can be applied to a block of data of any size)
- fixed output/fingerprint size
- efficiency
-
There are four security requirements for Hash Function
- preimage resistant/one-way property (given \(h\), computationally infeasible# to find \(y\) (preimage) such that \(H(y) = h\))
- second preimage resistant/weak collision resistant (given \(x\) (first preimage), computationally infeasible# to find \(y\) (second preimage) such that \(H(x) = H(y)\))
- collision resistant/strong collision resistant (computationally infeasible# to find any pair \((x, y)\) such that \(H(x) = H(y)\))
- pseudorandomness#
Note: Weak collision resistance is a stronger security guarantee than Strong collision resistance.
Implementation
Depends on the data type, hash function could be different.
Integer
Simply return Key
mod TableSize
.
String
Use Horner’s Rule:
Index Hash(const char *Key, int TableSize) {
unsigned int HashVal = 0;
while(*Key != '\0')
HashVal = (HashVal << 5) + *Key++;
return HashVal % TableSize;
}
Even though this will not result in the best table distribution, this algorithm is simple and fast if overflow is allowed. However, if the key is very long, it could take a long time to compute the key. One could compute for just a few characters to improve the performance.