Algorithm Implementation/Strings/Longest repeated substring

From Wikibooks, open books for an open world
Jump to navigation Jump to search

A search conducted within one string for its longest repeated substring. This function finds the non-overlapping repeats as opposed to the overlapping repeats of other implementations. That is to say, for banana, this form will find only an and not the overlapping ana. The function has application in the pattern search of cipher text. A detected pattern of this non-overlapping kind can be used to find a Caesar cipher's key length when certain other factors obtain.

VBA[edit | edit source]

Returns the longest repeated non-overlapping substring for a single string, and the number of repeats. In contests, if length is matched, the substring with most repeats is returned. If still matched, the one that was found first is returned. The worst case is for no repeats found where the number of cycles becomes maximum at (0.5*n)^2.

Function LongestRepeatSubstring(sIn As String, Optional nSS As Long) As String
    'finds longest repeated non-overlapping substring and number of repeats
    
    Dim s1 As String, s2 As String, X As Long
    Dim sPrev As String, nPrev As Long, nLPrev As Long
    Dim nL As Long, nTrial As Long, nPos As Long, vAr as Variant
        
    nL = Len(sIn)
    For nTrial = Int(nL / 2) To 1 Step -1
        For nPos = 1 To (nL - (2 * nTrial) + 1)
            X = 0
            s1 = Mid(sIn, nPos, nTrial)
            s2 = Right(sIn, (nL - nPos - nTrial + 1))
            vAr = Split(s2, s1)            
            X = UBound(vAr) - LBound(vAr)
            If X > 0 Then
                If nPrev < X Then
                    sPrev = s1
                    nPrev = X
                End If
            End If
        Next nPos
        If nPrev <> 0 Then
            LongestRepeatSubstring = sPrev
            nSS = nPrev
            Exit Function
        End If
    Next nTrial
End Function