Visual Basic for Applications/The Knuth String Shuffle
Jump to navigation Jump to search
- This code module includes both a Fisher-Yates character shuffling routine for strings, and a Durstenfeld-Knuth routine to shuffle one-dimensional arrays.
- The procedure FisherYatesStrShuffle() shuffles characters of a single string. This is limited to shifting single characters around within one string.
- Procedure KnuthArrShuffle() sorts the elements of a one-dimensional array. The procedure is limited only by what can be stored in array elements.
- The two methods are pseudo-random and bias-free. Elsewhere, the use of a random generator does not necessarily guarantee that the results will be free from bias.
- The code can work in any of the MS Office applications that support VBA.
- The Fisher-Yates shuffle applies a pseudo random selection method. It is described here for characters in a string but a related method, the Durstenfeld-Knuth method is preferred for arrays.
- Taking each element of a string in sequence for repositioning leaves one end of the result string badly biased. The Knuth algorithm instead proposes a random position within the string. The element at that position is then accumulated into the output and removed from the original. Subsequent selections are made in the same way from the ever shortened string.
- Note that there is still the possibility of a given character being unmoved in the process, but only within expectation.
- Set the number of strings required with variable Cycles in the top procedure. The Immediate Window has proved the best place for display and copying.
- It should be pointed out that any attempt to avoid the unmoved elements, will not only change the random nature of the shuffle but prevent the use of other than non-repeat strings. That is to say strings with repeated characters could not then be shuffled.
- The Durstenfeld-Knuth method for arrays differs only slightly from that of the Fisher-Yates implementation.
- To reduce processing, and no doubt to overcome the burden of removing an element from the middle of an array during shortening, the algorithm instead overwrites the element selected for output with the last element. In this VBA implementation the array is then conveniently shortened by one element with Redim Preserve.
- See Fisher Yates Shuffle for a good description of both methods.
The VBA Code Module
Copy all of the code below into say, an MS Excel standard module, save the workbook as an xlsm file type, and run either of the test procedures to test the requisite code. Be sure to open the Immediate Window for output.
Option Explicit Private Sub testFisherYatesStrShuffle() 'run this to test the string shuffle Dim bOK As Boolean, sStr As String, sR As String Dim sOut As String, n As Long, Cycles As Long 'set number of shuffled versions needed Cycles = 1 'test string sStr = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789" For n = 1 To Cycles bOK = FisherYatesStrShuffle(sStr, sR) sOut = sR If bOK = False Then MsgBox "Problems in shuffle" Exit Sub End If 'output to message box and immediate window 'MsgBox "Before : " & sStr & vbCrLf & _ "After : " & sR Debug.Print "Before : " & sStr Debug.Print "After : " & sOut & vbCrLf Next n End Sub Private Function FisherYatesStrShuffle(ByVal sIn As String, sOut As String) As Boolean 'Performs a naive Fisher-Yates shuffle on the input string. 'Returns result in sOut. Pseudo random character sequencing. 'Note: Some or all elements could occupy their original positions, 'but only in accordance with expectation based on the random generator. 'This can be seen best for very short character strings, like "ABC". Dim sL As String, sR As String, sT1 As String, sT2 As String, sMod As String Dim sAcc As String, i As Long, j As Long, nL As Long, n As Long 'check input string If sIn = "" Or Len(sIn) < 2 Then MsgBox "At least 2 characters needed - closing" Exit Function End If 'initial assignments nL = Len(sIn) sMod = sIn Randomize For i = 1 To Len(sIn) 'first get a random number j = Int((nL - 1 + 1) * Rnd + 1) 'find string value of jth element sT1 = Mid$(sMod, j, 1) DoEvents 'allow break 'accumulate jth element sAcc = sAcc & sT1 'remove current character sL = Left$(sMod, j - 1) sR = Right$(sMod, nL - j) sMod = sL & sR 'new string length nL = Len(sMod) DoEvents 'allow break Next i 'transfer sOut = sAcc FisherYatesStrShuffle = True End Function Private Sub testKnuthArrShuffle() 'run this to test the array shuffle Dim bOK As Boolean, sStr As String, sOut As String Dim Cycles As Long, n As Long, bF As Boolean Dim vS As Variant, vA As Variant, vB As Variant 'define a typical array for shuffling vS = Array("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", _ "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z", _ "0", "1", "2", "3", "4", "5", "6", "7", "8", "9") 'set number of shuffled versions needed Cycles = 1 For n = 1 To Cycles 'shuffle array bOK = KnuthArrShuffle(vS, vA) If bOK = False Then MsgBox "Problems in array shuffle" Exit Sub End If 'arrays to strings for display sStr = Arr1DToStr2(vS) sOut = Arr1DToStr2(vA) 'display ' MsgBox "Before : " & sStr & vbCrLf & _ ' "After : " & sOut Debug.Print "Before : " & sStr Debug.Print "After : " & sOut & vbCrLf 'return to an array in vB if needed bF = StrTo1DArr2(sOut, vB) Next n End Sub Private Function KnuthArrShuffle(vIn As Variant, vR As Variant) As Boolean ' Performs a modified Knuth random shuffle on the elements of the input array. ' The original by Fisher-Yates, was modified for computers by Durstenfeld ' then popularised by Knuth. Returns result in vR with vIn unchanged. 'Note: Some or all elements COULD occupy their original positions, 'but only in accordance with expectation based on the random generator. 'This is best seen for small arrays, say with only 3 elements or so. Dim vW As Variant Dim LB As Long, UB As Long, nL As Long Dim i As Long, j As Long 'initial assignments LB = LBound(vIn): UB = UBound(vIn) ReDim vR(LB To UB) 'return array ReDim vW(LB To UB) 'work array nL = UB - LB + 1 'length of input array vW = vIn 'transfer to a work array 'working Randomize For i = LB To nL 'first get a random number j = Int((UB - LB + 1) * Rnd + LB) 'transfer jth of vW to ith of vR vR(i) = vW(j) 'replace selection with current last of vW vW(j) = vW(UB) 'remove last of vW by shortening array ReDim Preserve vW(LB To UB - 1) 'get new UBound of shortened vW UB = UBound(vW) 'exception; return if last chara If UB = LB Then vR(i + 1) = vW(UB) Exit For End If DoEvents 'allow breaks Next i KnuthArrShuffle = True End Function Function StrTo1DArr2(ByVal sIn As String, vRet As Variant, _ Optional ByVal bLB1 As Boolean = True) As Boolean ' Loads string characters into 1D array (vRet). One per element. ' Optional choice of lower bound. bLB1 = True for one-based (default), ' else bLB1 = False for zero-based. vRet dimensioned in proc. Dim nC As Long, sT As String Dim LB As Long, UB As Long If sIn = "" Then MsgBox "Empty string - closing" Exit Function End If 'allocate array for chosen lower bound If bLB1 = True Then ReDim vRet(1 To Len(sIn)) Else ReDim vRet(0 To Len(sIn) - 1) End If LB = LBound(vRet): UB = UBound(vRet) 'load charas of string into array For nC = LB To UB If bLB1 = True Then sT = Mid$(sIn, nC, 1) Else sT = Mid$(sIn, nC + 1, 1) End If vRet(nC) = sT Next StrTo1DArr2 = True End Function Function Arr1DToStr2(vIn As Variant) As String ' Makes a single string from 1D array string elements. ' Works for any array bounds. Dim nC As Long, sT As String, sAccum As String Dim LB As Long, UB As Long LB = LBound(vIn): UB = UBound(vIn) 'join characters of array into string For nC = LB To UB sT = vIn(nC) sAccum = sAccum & sT Next Arr1DToStr2 = sAccum End Function
- Fisher Yates Shuffle: A very clearly written article in Wikipedia, that explains worked examoles step by step.