Algorithm Implementation/Hashing

From Wikibooks, open books for an open world

Hashing algorithms are generally divided into three categories:

  • Indexing Algorithms
  • Checksums and Cyclic Redundancy Checks
  • Message Digests

An indexing hash is generally used to find items quickly in a data structure called a "hash table". A checksum or a cyclic redundancy check is often used for simple data checking, to detect accidental bit errors during communication; these are discussed in an earlier chapter, Checksums. A message digest is designed to be a cryptographically secure one-way function, and many message digests have been closely examined for their security by the computer security field.

Indexing Algorithms

Jenkins one-at-a-time hash

The "Jenkins One-at-a-time hash" comes from an article by Bob Jenkins in the September 1997 issue of Dr. Dobb's.

C:

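#include <stddef.h>  /* size_t */
#include <stdint.h>  /* uint32_t */

/* Hash key_len bytes starting at key, mixing in one byte at a time. */
uint32_t joaat_hash(const unsigned char *key, size_t key_len) {
    uint32_t hash = 0;
    size_t i;

    for (i = 0; i < key_len; i++) {
        hash += key[i];
        hash += (hash << 10);
        hash ^= (hash >> 6);
    }
    /* Final mixing steps after the last byte. */
    hash += (hash << 3);
    hash ^= (hash >> 11);
    hash += (hash << 15);
    return hash;
}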

Java:

int joaat_hash(byte[] key) {
    int hash = 0;

    for (byte b : key) {
        hash += (b & 0xFF);     // treat the byte as unsigned (0..255)
        hash += (hash << 10);
        hash ^= (hash >>> 6);   // >>> is the unsigned (logical) right shift
    }
    hash += (hash << 3);
    hash ^= (hash >>> 11);
    hash += (hash << 15);
    return hash;
}

See Data Structures/Hash Tables#Choosing a good hash function for more details on the "Jenkins One-at-a-time hash".
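As a rough sketch of how an indexing hash is used with a hash table, the hash value can be reduced to a bucket number. The code below restates the one-at-a-time hash with standard fixed-width types so that it stands alone. The helper bucket_index is a hypothetical name introduced for this example, and it assumes the number of buckets is a power of two.

```c
#include <stddef.h>
#include <stdint.h>

/* One-at-a-time hash, restated here so the example is self-contained. */
uint32_t joaat_hash(const unsigned char *key, size_t key_len) {
    uint32_t hash = 0;
    for (size_t i = 0; i < key_len; i++) {
        hash += key[i];
        hash += hash << 10;
        hash ^= hash >> 6;
    }
    hash += hash << 3;
    hash ^= hash >> 11;
    hash += hash << 15;
    return hash;
}

/* Hypothetical helper: map a key to a bucket number.
 * Assumes num_buckets is a power of two, so that masking with
 * (num_buckets - 1) keeps the low bits of the hash. */
size_t bucket_index(const unsigned char *key, size_t key_len,
                    size_t num_buckets) {
    return joaat_hash(key, key_len) & (num_buckets - 1);
}
```

For the one-byte key "a" the hash is 0xca2e9442, so with 16 buckets the key lands in bucket 2. A table whose size is not a power of two would use the remainder operator (%) instead of the mask.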

Other hash functions for hash tables

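A universal hash function is one drawn at random from a family of hash functions chosen so that any two distinct keys collide with probability at most 1/m, where m is the number of buckets. Picking the function at random when the table is created protects against input deliberately chosen to cause many collisions.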
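As a rough illustration of a universal hash function, the sketch below uses the classic multiply-add family h(x) = ((a*x + b) mod p) mod m for integer keys. The names universal_hash, universal_hash_apply and UH_PRIME are made up for this example. It assumes p is fixed to the prime 2^31 - 1, keys are smaller than p, and the caller picks a and b at random.

```c
#include <stdint.h>

/* Assumed modulus for this sketch: 2^31 - 1, a Mersenne prime. */
#define UH_PRIME 2147483647ULL

/* One member of the family, selected by the pair (a, b). */
typedef struct {
    uint64_t a;  /* multiplier, chosen at random with 1 <= a < UH_PRIME */
    uint64_t b;  /* offset, chosen at random with 0 <= b < UH_PRIME */
    uint64_t m;  /* number of buckets, m >= 1 */
} universal_hash;

/* Compute ((a * key + b) mod p) mod m.
 * The collision bound only holds for keys smaller than UH_PRIME.
 * a < 2^31 and key < 2^32, so a * key + b cannot overflow 64 bits. */
uint32_t universal_hash_apply(const universal_hash *h, uint32_t key) {
    return (uint32_t)(((h->a * key + h->b) % UH_PRIME) % h->m);
}
```

With a = 3, b = 5 and m = 10, the key 7 maps to bucket (3*7 + 5) mod 10 = 6. In practice a and b would be drawn from a random source once, when the table is created, and kept for the lifetime of the table.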

Other popular hash functions for hash tables include other Jenkins hash functions,[1] CityHash,[2] and MurmurHash.[3]

Checksums and Cyclic Redundancy Checks

See Algorithm Implementation/Checksums.

Message Digests

Further reading


References