Algorithm Implementation/Hashing

From Wikibooks, open books for an open world
Jump to: navigation, search

Hashing algorithms are generically split into three sub-sets:

  • Indexing Algorithms
  • Checksums and Cyclic Redundancy Checks
  • Message Digests

An indexing algorithm hash is generally used to quickly find items, using lists called "hash tables". A Checksum or a Cyclic Redundancy Check is often used for simple data checking, to detect any accidental bit errors during communication -- we discuss them in an earlier chapter, Checksums. A Message Digest is a cryptographically secure one-way function, and many are closely examined for their security in the computer security field.

Indexing Algorithms[edit]

Jenkins one-at-a-time hash[edit]

The "Jenkins One-at-a-time hash", from an article by Bob Jenkins in Dr. Dobb's September 1997.

C:

uint32 joaat_hash(uchar *key, size_t key_len) {
    uint32 hash = 0;
    size_t i;

    for (i = 0; i < key_len; i++) {
        hash += key[i];
        hash += (hash << 10);
        hash ^= (hash >> 6);
    }
    hash += (hash << 3);
    hash ^= (hash >> 11);
    hash += (hash << 15);
    return hash;
}

Java:

int joaat_hash(byte[] key) {
    int hash = 0;

    for (byte b : key) {
        hash += (b & 0xFF);
        hash += (hash << 10);
        hash ^= (hash >>> 6);
    }
    hash += (hash << 3);
    hash ^= (hash >>> 11);
    hash += (hash << 15);
    return hash;
}

See Data Structures/Hash Tables#Choosing a good hash function for more details on the "Jenkins One-at-a-time hash".

other hash functions for hash tables[edit]

(FIXME: say a few words about " universal hash function")

Other popular hash functions for hash tables include:

Other Jenkins hash functions,[1] CityHash,[2] MurmurHash[3]

Checksums and Cyclic Redundancy Checks[edit]

The Microsoft FCIV Utility[edit]

The utility FCIV, the File Checksum Integrity Verifier , is a free downloadable zipped package that allows a user to produce both SHA1 and MD5 hashes for any single file or recursively for files in folders, exporting its entire results package to a nominated .xml file. It can also conduct verification of file sets against a previously saved listing. It is used from an elevated command prompt, but can be run using the Shell function from code like VBA, or from a batch file. For further reading in its use see: Availability and description of the File Checksum Integrity Verifier utility.

FCIV at the command line[edit]

Notice the use of double quotes for paths that contain spaces. A typical command line use is of this form:

c:\>FCIV\fciv.exe -r "C:\users\My Folder\Documents" -sha1 -xml "c:\users\My Folder\Documents\myhash.xml"

FCIV run from VBA[edit]

The Shell function in VBA has no Wait feature, so this line is best as the last. The quotes are a little different in this case from the usual VBA expectation. Note that all of the paths must be enclosed in two sets of double quotes and that the entire command line is enclosed in one set of double quotes. Assuming that the fciv.exe has been downloaded and installed as shown, this code line exports all of the hash strings for every file in the users Documents folder to the file myhash.xml. An exclusion file path can also be added. (See examples on the page File Checksum Integrity Verifier (FCIV) Examples).

Sub FCIV()
   'runs the fciv function from VBA   
Dim Ret
   Ret = Shell("""c:\FCIV\fciv.exe"" -r ""C:\users\My Folder\Documents"" -sha1 -xml ""c:\users\My Folder\Documents\myhash.xml""")
End Sub

Algorithm Implementation/Checksums

Message Digests[edit]

VBA Hashes[edit]

The VBA code below generates the digests for the SHA1, SHA2/256, and SHA2/512 hashes, for strings, in either of the hex or base-64 output formats. These codings each make use of MS Office's built-in functions, and provide consistent results. It has been noted that original implementations elsewhere for the same digests can differ widely in their outputs.

Sub Test()
    'run this to test sha1, sha2/256, or sha2/512
    Dim sIn As String, sOut As String, bOpt As Boolean
    
    sIn = ""
    'select as required
    bOpt = False   'output hex
    'bOpt = True   'output base-64
    
    If SHA1(sIn, sOut, bOpt) Then MsgBox sOut: Debug.Print sOut
    'If SHA256(sIn, sOut, bopt) Then MsgBox sOut: Debug.Print sOut
    'If SHA512(sIn, sOut, bopt) Then MsgBox sOut: Debug.Print sOut

End Sub

Function SHA1(sIn As String, sOut As String, Optional bOpt As Boolean = 0) As Boolean
    'Set a reference to mscorlib 4.0 64-bit
    
    'Test with empty string input:
    '40 Hex:   da39a3ee5e6...etc
    '28 Base-64:   2jmj7l5rSw0yVb...etc
    
    Dim oT As Object, oSHA1 As Object
    Dim TextToHash() As Byte
    Dim bytes() As Byte
            
    Set oT = CreateObject("System.Text.UTF8Encoding")
    Set oSHA1 = CreateObject("System.Security.Cryptography.SHA1Managed")
    
    TextToHash = oT.GetBytes_4(sIn)
    bytes = oSHA1.ComputeHash_2((TextToHash))
        
    If bOpt = True Then
       sOut = ConvToBase64String(bytes)
    Else
       sOut = ConvToHexString(bytes)
    End If
            
    Set oT = Nothing
    Set oSHA1 = Nothing
    
    SHA1 = True

End Function

Function SHA256(sIn As String, sOut As String, Optional bOpt As Boolean = 0) As Boolean
    'Set a reference to mscorlib 4.0 64-bit
    
    'Test with empty string input:
    '64 Hex:   e3b0c44298f...etc
    '44 Base-64:   47DEQpj8HBSa+/...etc
    
    Dim oT As Object, oSHA256 As Object
    Dim TextToHash() As Byte, bytes() As Byte
    
    Set oT = CreateObject("System.Text.UTF8Encoding")
    Set oSHA256 = CreateObject("System.Security.Cryptography.SHA256Managed")
    
    TextToHash = oT.GetBytes_4(sIn)
    bytes = oSHA256.ComputeHash_2((TextToHash))
    
    If bOpt = True Then
       sOut = ConvToBase64String(bytes)
    Else
       sOut = ConvToHexString(bytes)
    End If
    
    Set oT = Nothing
    Set oSHA256 = Nothing
    
    SHA256 = True

End Function

Function SHA512(sIn As String, sOut As String, Optional bOpt As Boolean = 0) As Boolean
    'Set a reference to mscorlib 4.0 64-bit
    
    'Test with empty string input:
    '128 Hex:   cf83e1357eefb8bd...etc
    '88 Base-64:   z4PhNX7vuL3xVChQ...etc
    
    Dim oT As Object, oSHA512 As Object
    Dim TextToHash() As Byte, bytes() As Byte
    
    Set oT = CreateObject("System.Text.UTF8Encoding")
    Set oSHA512 = CreateObject("System.Security.Cryptography.SHA512Managed")
    
    TextToHash = oT.GetBytes_4(sIn)
    bytes = oSHA512.ComputeHash_2((TextToHash))
    
    If bOpt = True Then
       sOut = ConvToBase64String(bytes)
    Else
       sOut = ConvToHexString(bytes)
    End If
    
    Set oT = Nothing
    Set oSHA512 = Nothing
    
    SHA512 = True

End Function

Function ConvToBase64String(vIn As Variant) As Variant

    Dim oD As Object
      
    Set oD = CreateObject("MSXML2.DOMDocument")
      With oD
        .LoadXML "<root />"
        .DocumentElement.DataType = "bin.base64"
        .DocumentElement.nodeTypedValue = vIn
      End With
    ConvToBase64String = Replace(oD.DocumentElement.text, vbLf, "")
    
    Set oD = Nothing

End Function

Function ConvToHexString(vIn As Variant) As Variant

    Dim oD As Object
      
    Set oD = CreateObject("MSXML2.DOMDocument")
      
      With oD
        .LoadXML "<root />"
        .DocumentElement.DataType = "bin.Hex"
        .DocumentElement.nodeTypedValue = vIn
      End With
    ConvToHexString = Replace(oD.DocumentElement.text, vbLf, "")
    
    Set oD = Nothing

End Function

Further reading[edit]