Visual Basic for Applications/File Hashing in VBA

From Wikibooks, open books for an open world
Jump to navigation Jump to search

Summary[edit]

  • This section contains code for making file hashes, that is, hashes of entire files.
    • Several algorithms are provided, with output options for base64 or hex.
    • The code is made for single files, but the code given on an adjacent page, Folder Hashing in VBA, can be used for recursive hash listings, again with a choice of hashes and output options.
    • String hash routines are given in another section.
    • In general these hashes do not make use of a seed value, but to illustrate the method, the code contains one such example, (FileToSHA512SALT()). Please note that its output differs from that of the SHA512Managed class. A note exists in the respective procedure in case other salted (seeded) inputs are of interest.
    • These listed algorithms can hash any single file up to about 200MB (Mega Bytes) in length, beyond which an out of memory error will be generated in GetFileBytes(). Specific tests found that hashes work well for a 200MB zip file but fail for a 500MB zip file; the exact break point is unclear. For files larger than 200MB, other facilities exist.
  • Large file hashing, say beyond 200MB is best done with other tools. Four such examples are mentioned here:
    • Microsoft's FCIV utility, is free download. It is a command-line application, capable of hashing both single files and whole folder trees. It handles large files with ease, but only for MD5 and SHA1 hashes. It sends both base64 and HEX outputs to the screen but only b64 output format to a file. Prepared files can be verified against any new run, but results only to the screen. It is a bit tricky to use, even with their instructions, so the pages Running the FCIV Utility from VBA and File Checksum Integrity Verifier (FCIV) Examples might be found of use to the novice. So far, Microsoft have not extended the coding to include contemporary algorithms.
    • PowerShell in Windows 8.1 and above, can make large single-file hashes, using all of the MD5. SHA1, SHA256, SHA384, and SHA512 algorithms. It produces output on the screen only, though the output can also be piped to the clipboard for pasting as required. There are no simple options for hashing a folder or for output to an xml file. For completion, an example of its use is given in File Checksum Integrity Verifier (FCIV) Examples.
    • An external application that can handle large files is MD5 and SHA Checksum Utility. It is a stand-alone application, and a basic version is available as a free download. It produces MD5, SHA1, SHA2/256, and SHA2/512 hashes for single files. The outputs are in HEX and are displayed together on a neat user interface. A more complex commercial version is also available.
    • FSUM Fast File Integrity Checker is another free, external application for command line use. It resembles FCIV in many ways but includes up to date algorithms. (MD2, MD4, MD5, SHA-1, SHA-2( 256, 384, 512), RIPEMD-160, PANAMA, TIGER, ADLER32, and CRC32). In addition to large file HEX hashes it can carry out flat or recursive folder hashes. The code to enter is not identical to that of FCIV but a text file is provided with examples in its use. The web page FSUM Fast File Integrity Checker has the download and other details, though the text file fails to mention that results can be easily piped to the clipboard with |clip. Although a graphical interface exists elsewhere, the command-line application has been found the most stable..
  • The permissions for files need to be considered when attempting hashing. Hashing has to access files to obtain the bytes that they contain. Although this does not involve actually running the files, some folder and file types might be found locked at run time. In fact, this type of access is the main difference between string hashing and file hashing. Whenever files are accessed, error handling tends to be needed. It is assumed here that the user will add his own error-handling, or that he will go-around files that are troublesome before the hashing attempt. Users should know that the code cannot handle an empty text file; for example, a Notepad file that has been saved without any text in it. The GetFileBytes routine will error. A message and exit will be produced if an empty file is encountered, as for a file in excess of 200MB.
  • User files and folders have few restrictions. The empty file problem apart, those who want to access user files in folders that they have made themselves will not usually have any problems, and interested parties should know that there is a recursive folder hashing module in another section of this series that might be of related interest. Folder Hashing in VBA also contains notes on how to avoid virtual folder problems with music, video, and other Microsoft libraries.
  • Hashing is concerned only with the content of a file, and not its name, or other file details. This means that duplicates of files under any name can be found by comparing their hashes. In secure systems with deliberately confusing file names, this means that a very long file list could be hashed until a searched-for hash value is found, rather than depending on a less secure file name to find it. Alternatively, file names are sometimes just the file's hash value, so that hashing can reveal any error or illegal change. In such a case a hacker might change the file then name the file with a corresponding hash, but he does not know the required hash algorithm or private string to use, so changes will always be detected when the owner runs his own hash verification.

Code Listings[edit]

Modifications[edit]

  • Added file selection dialog, and file size limits, 17 Jun 2019
  • Set file selection dialog to open with all-file types to be listed, 25 July 2019

Using Built-in Windows Functions in VBA[edit]

The code to make hashes of STRINGS and for bulk file hashing is given elsewhere in this set. The panel below bears code that is virtually identical to that for strings, but with only slight modification, is used to make hashes of single whole FILES. The user provides a full path to the file via a selection dialog as the starting parameter. A parameter option allows for a choice of hex or base-64 outputs. Functions are included for MD5, SHA1, SHA2-256, SHA2-384, and SHA2-512 hashes.

For frequent use, the selection dialog is most convenient, though the code contains a commented-out line for those who intend to type the file address into the procedure; simply comment out the line not needed.

In each case, coders can find the unmodified hash values in the bytes() array and at that point they are in 8-bit bytes, that is, the numbers that represent the ASCI code as it applies to a full eight-bit, 256 character set. The code that follows the filling of the bytes() array in each case decides which version of the ASCI character set to deliver. For a hex set of characters, 0-9, and A to F, the total bit set is broken into double the number of four-bit bytes, then returned for use. For the base-64 set, lower case letters,upper case letters, and integers mainly, six bit characters are made for output. These two sets are the most useful here, since they consist of commonly used characters. The 128 and 256 ASCI sets are too full of both exotic and non-printing characters to be useful. For each hash version its bit count is a constant, so the length of its output will vary according to the chosen type.

As a general point; message boxes do not allow copying of their text. If copying is needed, replace the message box with an input box, and set the output hash to be the default value of the box. Then it can be copied with ease. Alternatively use the output of the Debug.Print method in the immediate window.

Option Explicit

Private Sub TestFileHashes()
    'run this to obtain file hashes in a choice of algorithms
    'select any one algorithm call below
    'Limited to unrestricted files less than 200MB and not zero
    'Set a reference to mscorlib 4.0 64-bit, and Scripting Runtime
    
    Dim sFPath As String, b64 As Boolean, bOK As Boolean
    Dim sH As String, sSecret As String, nSize As Long, reply
    
    'USER SETTINGS
    '======================================================
    '======================================================
    'set output format here
    b64 = True     'true for output base-64, false for hex
    '======================================================
    'set chosen file here
    'either set path to target file in hard-typed line
    'or choose a file using the file dialog procedure
    'sFPath = "C:\Users\Your Folder\Documents\test.txt" 'eg.
    sFPath = SelectFile2("SELECT A FILE TO HASH...") 'uses file dialog
        
    'check the file
    If sFPath = "" Then 'exit sub for no file selection
        MsgBox "No selection made - closing"
        Exit Sub
    End If
    bOK = GetFileSize(sFPath, nSize)
    If nSize = 0 Or nSize > 200000000 Then 'exit sub for zero size
        MsgBox "File has zero contents or greater than 200MB - closing"
        Exit Sub
    End If
    '======================================================
    'set secret key here if using HMAC class of algorithms
    sSecret = "Set secret key for FileToSHA512Salt selection"
    '======================================================
    'choose algorithm
    'enable any one line to obtain that hash result
    'sH = FileToMD5(sFPath, b64)
    'sH = FileToSHA1(sFPath, b64)
    'sH = FileToSHA256(sFPath, b64)
    'sH = FileToSHA384(sFPath, b64)
    'sH = FileToSHA512Salt(sFPath, sSecret, b64)
    sH = FileToSHA512(sFPath, b64)
    '======================================================
    '======================================================
    
    'Results Output - open the immediate window as required
    Debug.Print sFPath & vbNewLine & sH & vbNewLine & Len(sH) & " characters in length"
    MsgBox sFPath & vbNewLine & sH & vbNewLine & Len(sH) & " characters in length"
    'reply = InputBox("The selected text can be copied with Ctrl-C", "Output is in the box...", sH)
    
    'decomment this block to place the hash in first cell of sheet1
'    With ThisWorkbook.Worksheets("Sheet1").Cells(1, 1)
'        .Font.Name = "Consolas"
'        .Select: Selection.NumberFormat = "@" 'make cell text
'        .Value = sH
'    End With
End Sub

Public Function FileToMD5(sFullPath As String, Optional bB64 As Boolean = False) As String
    'parameter full path with name of file returned in the function as an MD5 hash
    'Set a reference to mscorlib 4.0 64-bit
    
    Dim enc, bytes, outstr As String, pos As Integer
    
    Set enc = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider")
    'Convert the string to a byte array and hash it
    bytes = GetFileBytes(sFullPath)
    bytes = enc.ComputeHash_2((bytes))
    
    If bB64 = True Then
       FileToMD5 = ConvToBase64String(bytes)
    Else
       FileToMD5 = ConvToHexString(bytes)
    End If
        
    Set enc = Nothing

End Function

Public Function FileToSHA1(sFullPath As String, Optional bB64 As Boolean = False) As String
    'parameter full path with name of file returned in the function as an SHA1 hash
    'Set a reference to mscorlib 4.0 64-bit
    
    Dim enc, bytes, outstr As String, pos As Integer
    
    Set enc = CreateObject("System.Security.Cryptography.SHA1CryptoServiceProvider")
    'Convert the string to a byte array and hash it
    bytes = GetFileBytes(sFullPath) 'returned as a byte array
    bytes = enc.ComputeHash_2((bytes))
    
    If bB64 = True Then
       FileToSHA1 = ConvToBase64String(bytes)
    Else
       FileToSHA1 = ConvToHexString(bytes)
    End If
        
    Set enc = Nothing

End Function

Function FileToSHA512Salt(ByVal sPath As String, ByVal sSecretKey As String, _
                           Optional ByVal bB64 As Boolean = False) As String
    'Returns a sha512 FILE HASH in function name, modified by parameter sSecretKey.
    'This hash differs from that of FileToSHA512 using the SHA512Managed class.
    'HMAC class inputs are hashed twice;first input and key are mixed before hashing,
    'then the key is mixed with the result and hashed again.
    'Set a reference to mscorlib 4.0 64-bit
    
    Dim asc As Object, enc As Object
    Dim SecretKey() As Byte
    Dim bytes() As Byte
    
    'create a text and crypto objects
    Set asc = CreateObject("System.Text.UTF8Encoding")
    
    'Any of HMACSHAMD5,HMACSHA1,HMACSHA256,HMACSHA384,or HMACSHA512 can be used
    'for corresponding hashes, albeit not matching those of Managed classes.
    Set enc = CreateObject("System.Security.Cryptography.HMACSHA512")

    'make a byte array of the text to hash
    bytes = GetFileBytes(sPath)
    
    'make a byte array of the private key
    SecretKey = asc.Getbytes_4(sSecretKey)
    'add the key property
    enc.Key = SecretKey

    'make a byte array of the hash
    bytes = enc.ComputeHash_2((bytes))
    
    'convert the byte array to string
    If bB64 = True Then
       FileToSHA512Salt = ConvToBase64String(bytes)
    Else
       FileToSHA512Salt = ConvToHexString(bytes)
    End If
    
    'release object variables
    Set asc = Nothing
    Set enc = Nothing

End Function

Public Function FileToSHA256(sFullPath As String, Optional bB64 As Boolean = False) As String
    'parameter full path with name of file returned in the function as an SHA2-256 hash
    'Set a reference to mscorlib 4.0 64-bit
    
    Dim enc, bytes, outstr As String, pos As Integer
    
    Set enc = CreateObject("System.Security.Cryptography.SHA256Managed")
    'Convert the string to a byte array and hash it
    bytes = GetFileBytes(sFullPath) 'returned as a byte array
    bytes = enc.ComputeHash_2((bytes))
    
    If bB64 = True Then
       FileToSHA256 = ConvToBase64String(bytes)
    Else
       FileToSHA256 = ConvToHexString(bytes)
    End If
        
    Set enc = Nothing

End Function

Public Function FileToSHA384(sFullPath As String, Optional bB64 As Boolean = False) As String
    'parameter full path with name of file returned in the function as an SHA2-384 hash
    'Set a reference to mscorlib 4.0 64-bit
    
    Dim enc, bytes, outstr As String, pos As Integer
    
    Set enc = CreateObject("System.Security.Cryptography.SHA384Managed")
    'Convert the string to a byte array and hash it
    bytes = GetFileBytes(sFullPath) 'returned as a byte array
    bytes = enc.ComputeHash_2((bytes))
    
    If bB64 = True Then
       FileToSHA384 = ConvToBase64String(bytes)
    Else
       FileToSHA384 = ConvToHexString(bytes)
    End If
    
    Set enc = Nothing

End Function

Public Function FileToSHA512(sFullPath As String, Optional bB64 As Boolean = False) As String
    'parameter full path with name of file returned in the function as an SHA2-512 hash
    'Set a reference to mscorlib 4.0 64-bit
    
    Dim enc, bytes, outstr As String, pos As Integer
    
    Set enc = CreateObject("System.Security.Cryptography.SHA512Managed")
    'Convert the string to a byte array and hash it
    bytes = GetFileBytes(sFullPath) 'returned as a byte array
    bytes = enc.ComputeHash_2((bytes))
    
    If bB64 = True Then
       FileToSHA512 = ConvToBase64String(bytes)
    Else
       FileToSHA512 = ConvToHexString(bytes)
    End If
    
    Set enc = Nothing

End Function

Private Function GetFileBytes(ByVal sPath As String) As Byte()
    'makes byte array from file
    'Set a reference to mscorlib 4.0 64-bit
    
    Dim lngFileNum As Long, bytRtnVal() As Byte, bTest
    
    lngFileNum = FreeFile
    
    If LenB(Dir(sPath)) Then ''// Does file exist?
        
        Open sPath For Binary Access Read As lngFileNum
        
        'a zero length file content will give error 9 here
        
        ReDim bytRtnVal(0 To LOF(lngFileNum) - 1&) As Byte
        Get lngFileNum, , bytRtnVal
        Close lngFileNum
    Else
        Err.Raise 53 'File not found
    End If
    
    GetFileBytes = bytRtnVal
    
    Erase bytRtnVal

End Function

Function ConvToBase64String(vIn As Variant) As Variant
    'used to produce a base-64 output
    'Set a reference to mscorlib 4.0 64-bit
    
    Dim oD As Object
      
    Set oD = CreateObject("MSXML2.DOMDocument")
      With oD
        .LoadXML "<root />"
        .DocumentElement.DataType = "bin.base64"
        .DocumentElement.nodeTypedValue = vIn
      End With
    ConvToBase64String = Replace(oD.DocumentElement.Text, vbLf, "")
    
    Set oD = Nothing

End Function

Function ConvToHexString(vIn As Variant) As Variant
     'used to produce a hex output
    'Set a reference to mscorlib 4.0 64-bit
    
    Dim oD As Object
      
    Set oD = CreateObject("MSXML2.DOMDocument")
      
      With oD
        .LoadXML "<root />"
        .DocumentElement.DataType = "bin.Hex"
        .DocumentElement.nodeTypedValue = vIn
      End With
    ConvToHexString = Replace(oD.DocumentElement.Text, vbLf, "")
    
    Set oD = Nothing

End Function

Function GetFileSize(sFilePath As String, nSize As Long) As Boolean
    'use this to test for a zero file size
    'takes full path as string in sFilePath
    'returns file size in bytes in nSize
    'Make a reference to Scripting Runtime
    
    Dim fs As FileSystemObject, f As File
    
    Set fs = CreateObject("Scripting.FileSystemObject")
    
    If fs.FileExists(sFilePath) Then
        Set f = fs.GetFile(sFilePath)
        nSize = f.Size
        GetFileSize = True
        Exit Function
    End If

End Function

Function SelectFile2(Optional sTitle As String = "") As String
    'opens a file-select dialog and on selection
    'returns its full path string in the function name
    'If Cancel or OK without selection, returns empty string
    
    Dim fd As FileDialog, sPathOnOpen As String, sOut As String
    
    Set fd = Application.FileDialog(msoFileDialogFilePicker)
    'do not include backslash here
    sPathOnOpen = Application.DefaultFilePath
    
    'set the file-types list on the dialog and other properties
    With fd
        .Filters.Clear
        'the first filter line below sets the default on open (here all files are listed)
        .Filters.Add "All Files", "*.*"
        .Filters.Add "Excel workbooks", "*.xlsx;*.xlsm;*.xls;*.xltx;*.xltm;*.xlt;*.xml;*.ods"
        .Filters.Add "Word documents", "*.docx;*.docm;*.dotx;*.dotm;*.doc;*.dot;*.odt"
                
        .AllowMultiSelect = False
        .InitialFileName = sPathOnOpen
        .Title = sTitle
        .InitialView = msoFileDialogViewList 'msoFileDialogViewSmallIcons
        .Show
        
        If .SelectedItems.Count = 0 Then
            'MsgBox "Canceled without selection"
            Exit Function
        Else
            sOut = .SelectedItems(1)
            'MsgBox sOut
        End If
    End With
    
    SelectFile2 = sOut

End Function

See Also[edit]

  • String Hashing in VBA : A companion page in this series for those who want only to hash strings.
  • Folder Hashing in VBA :Another companion page that makes recursive folder hash listings, and logs. Uses up to date hash algorithms, but limited to files no larger than about 200MB.
  • Running the FCIV Utility from VBA: How to use the Microsoft fciv.exe command line utility to make MD5 and SHA1 file hashes from VBA. The MS utility, although awkward to use, allows hashing and verification of entire folder trees.
  • File Checksum Integrity Verifier (FCIV) Examples: More details on how to use the FCIV utility for those without much experience working from the command line prompt.

External Links[edit]

  • MD5 and SHA Checksum Utility: a external site's free application to simultaneously display MD5,SHA1,SHA256,and SHA512 hashes of single files (Hex only). Includes a browse-for-file function and drag and drop to make life easy. This utility can also be used to hash large files; author-tested for a 500MB file.
  • FSUM Fast File Integrity Checker : The basic command-line version of the FSUM hasher download.
  • FSUM GUI : A graphical interface download site for the FSUM utility. This allows file browsing, drag and drop, and other facilities to simplify the otherwise command-line working.