Algorithm Implementation/Hashing

From Wikibooks, open books for an open world

Hashing algorithms are generally divided into three categories:

  • Indexing Algorithms
  • Checksums and Cyclic Redundancy Checks
  • Message Digests

An indexing hash is used to find items quickly in a data structure called a "hash table". A checksum or cyclic redundancy check (CRC) is typically used for simple data checking, to detect accidental bit errors during communication; these are discussed in an earlier chapter, Checksums. A message digest is a one-way function designed to be cryptographically secure, and many digest algorithms are closely scrutinized in the computer security field.
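
As a small illustration of the "simple data checking" role, the sketch below computes an additive checksum (the sum of all bytes modulo 256) and compares it with an expected value. The names simple_checksum and checksum_matches are hypothetical and chosen only for this example; real protocols normally use stronger checks such as a CRC.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: additive checksum, i.e. the sum of all bytes
   modulo 256. Illustration only; not taken from the Checksums chapter. */
uint8_t simple_checksum(const unsigned char *data, size_t len) {
    uint8_t sum = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        sum = (uint8_t)(sum + data[i]);  /* wraps around at 256 */
    }
    return sum;
}

/* Returns 1 if the received bytes still match the checksum that the
   sender computed, 0 otherwise. */
int checksum_matches(const unsigned char *data, size_t len, uint8_t expected) {
    return simple_checksum(data, len) == expected;
}
```

A sender would transmit simple_checksum(data, len) alongside the data, and the receiver would call checksum_matches on what arrived. An additive checksum catches any single-bit error, but it misses some multi-bit errors (for example, two bytes swapped), which is one reason CRCs are preferred in practice.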

Indexing Algorithms

The following is the "Jenkins One-at-a-time hash", from an article by Bob Jenkins in Dr. Dobb's Journal, September 1997.

C:

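#include <stddef.h>  /* size_t */
#include <stdint.h>  /* uint32_t */

/* Jenkins one-at-a-time hash of the key_len bytes starting at key. */
uint32_t joaat_hash(const unsigned char *key, size_t key_len) {
    uint32_t hash = 0;
    size_t i;

    /* Mix each byte of the key into the running hash. */
    for (i = 0; i < key_len; i++) {
        hash += key[i];
        hash += (hash << 10);
        hash ^= (hash >> 6);
    }
    /* Final mixing, so that the last bytes also affect all bits. */
    hash += (hash << 3);
    hash ^= (hash >> 11);
    hash += (hash << 15);
    return hash;
}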

Java:

int joaat_hash(byte[] key) {
    int hash = 0;

    for (byte b : key) {
        hash += (b & 0xFF);    // treat the signed byte as a value in 0..255
        hash += (hash << 10);
        hash ^= (hash >>> 6);  // unsigned shift, as in the C version
    }
    hash += (hash << 3);
    hash ^= (hash >>> 11);
    hash += (hash << 15);
    return hash;
}

See "Choosing a good hash function" in Data Structures/Hash Tables for more details on the "Jenkins One-at-a-time hash".
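
To show how such a hash is typically used for indexing, the following sketch reduces the 32-bit hash value to a bucket index in a hash table. It repeats the C function above using <stdint.h> types so that it compiles on its own; bucket_index and the table size passed to it are assumptions made for illustration, not part of Jenkins's article.

```c
#include <stddef.h>
#include <stdint.h>

/* One-at-a-time hash, same steps as the C listing above. */
uint32_t joaat_hash(const unsigned char *key, size_t key_len) {
    uint32_t hash = 0;
    size_t i;

    for (i = 0; i < key_len; i++) {
        hash += key[i];
        hash += (hash << 10);
        hash ^= (hash >> 6);
    }
    hash += (hash << 3);
    hash ^= (hash >> 11);
    hash += (hash << 15);
    return hash;
}

/* Hypothetical helper: map a key to one of num_buckets slots.
   num_buckets must be non-zero. */
size_t bucket_index(const unsigned char *key, size_t key_len,
                    size_t num_buckets) {
    return (size_t)(joaat_hash(key, key_len) % num_buckets);
}
```

For example, the one-byte key "a" hashes to 0xca2e9442, so with 16 buckets it would be stored in bucket 2. Taking the remainder is only one way to reduce a hash; when the table size is a power of two, masking with (num_buckets - 1) gives the same index.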

Message Digests

Further reading