# Algorithm Implementation/Hashing

Hashing algorithms are generically split into three sub-sets:

- Indexing Algorithms
- generally used to quickly find items, using lists called "hash tables".
- Checksums
- often used for simple data checking, to detect any accidental bit errors during communication—we discuss them in an earlier chapter, Checksums.
- Message Digests
- a cryptographically secure one-way function, and many are closely examined for their security in the computer security field.

## Indexing Algorithms[edit | edit source]

### Jenkins one-at-a-time hash[edit | edit source]

The "Jenkins One-at-a-time hash",
from an article by Bob Jenkins
in *Dr. Dobb's* September 1997.

C:

```
uint32 joaat_hash(uchar *key, size_t key_len) {
uint32 hash = 0;
size_t i;
for (i = 0; i < key_len; i++) {
hash += key[i];
hash += (hash << 10);
hash ^= (hash >> 6);
}
hash += (hash << 3);
hash ^= (hash >> 11);
hash += (hash << 15);
return hash;
}
```

Java:

```
int joaat_hash(byte[] key) {
int hash = 0;
for (byte b : key) {
hash += (b & 0xFF);
hash += (hash << 10);
hash ^= (hash >>> 6);
}
hash += (hash << 3);
hash ^= (hash >>> 11);
hash += (hash << 15);
return hash;
}
```

See Data Structures/Hash Tables#Choosing a good hash function for more details on the "Jenkins One-at-a-time hash".

### other hash functions for hash tables[edit | edit source]

*(FIXME: say a few words about " universal hash function")*

Other popular hash functions for hash tables include: other Jenkins hash functions, CityHash, and MurmurHash

## Checksums and Cyclic Redundancy Checks[edit | edit source]

See Algorithm Implementation/Checksums.

## Message Digests[edit | edit source]

The state-of-the-art for message digests and what is considered secure change frequently. The US NSA holds algorithm contests and select the winners as SHAs, the "Secure Hashing Algorithms".

- MD5 (RFC 1321) and its predecessors MD2 and MD4 are all broken. They are now both obsolete and insecure.
- SHA0/SHA1 (FIPS-180-1) are partially broken. They are obsolete and considered insecure against well-funded opponents since 2005.
- SHA-2 is not yet considered broken, but is vulnerable to length extension attacks.
- SHA-3 (Keccak) is not vulnerable to length extension attacks.
- KangarooTwelve is an extremely parallelized (hence
*very fast*) version of Keccak. It is not vetted by the NSA, but the authors believe it is as secure as SHA-3. - BLAKE is a hash function that made it to the final round of the SHA-3 contest. The BLAKE2b variant is widely used and considered secure (as of 2020). It also has an extremely parallelized version called BLAKE3.

- KangarooTwelve is an extremely parallelized (hence

Although these algorithms can be implemented directly from the specification, as with other forms of cryptography it is usually safer and faster to use a thoroughly-reviewed open-source library.

## Further reading[edit | edit source]

- see Data Structures/Hash Tables for perhaps the most common application of hashing.
- The Microsoft FCIV (File Checksum Integrity Verifier)
- Running the FCIV Utility from VBA
- String Hashing in VBA
- File Hashing in VBA