Jump to content

Algorithm Implementation/Strings/Longest common leading/trailing substring

From Wikibooks, open books for an open world

This algorithm is a special case of the Longest Common Substring algorithm. The special case of leading (or trailing) gives a dramatic performance improvement - in my application roughly two orders of magnitude better.

Unfortunately I've only provided Perl at the moment, because thats what I need for my project! I'm hoping others who need this algorithm in other languages will record their implementation here.

Longest common leading and trail substrings

[edit | edit source]
    public static string LCS_start(string word1, string word2)
    {
      int end1 = word1.Length - 1;
      int end2 = word2.Length - 1;

      int pos = 0;

      while (pos <= end1 && pos <= end2)
      {
        if (word1[pos] != word2[pos])
        {
          pos++;
        }
      }
      return word1.Substring(0,pos);
    }

    public static string LCS_end(string word1, string word2)
    {
      int pos1 = word1.Length - 1;
      int pos2 = word2.Length - 1;

      while (pos1 >= 0 && pos2 >= 0)
      {
        if (word1[pos1] != word2[pos2])
        {
          pos1--;
          pos2--;
        }
      }
      return word1.Substring(pos1 + 1);
    }

Javascript

[edit | edit source]

Longest common leading and trail substrings

[edit | edit source]
function LCS_start( word1,  word2) {

  let end1 = word1.length - 1
  let end2 = word2.length - 1

  let pos = 0

  while (pos <= end1 && pos <= end2){
    if (word1[pos] != word2[pos]) {
       return word1.substring(0,pos)
    } else {
        pos++
    }
  }
}

function LCS_end( word1,  word2) {

  let pos1 = word1.length - 1
  let pos2 = word2.length - 1

  while (pos1 >= 0 && pos2 >= 0){
    if (word1[pos1] != word2[pos2]) {
       return word1.substring(pos1 + 1)
    } else {
        pos1--
        pos2--
    }
  }
}

Longest common leading substring

[edit | edit source]
sub lcl_substr
{
  my $s1 = shift;
  my $s2 = shift;

  my $end1 = length($s1) - 1;
  my $end2 = length($s2) - 1;

  my $pos = 0;

  while ($pos <= $end1 && $pos <= $end2)
  {
   last if substr($s1, $pos, 1) ne substr($s2, $pos, 1);
   $pos++;
  }

  return substr($s1, 0, $pos);
}

Longest common trailing substring

[edit | edit source]
sub lct_substr
{
  my $s1 = shift;
  my $s2 = shift;

  my $pos1 = length($s1) - 1;
  my $pos2 = length($s2) - 1;

  while ($pos1 >= 0 && $pos2 >= 0)
  {
   last if substr($s1, $pos1, 1) ne substr($s2, $pos2, 1);
   $pos1--;
   $pos2--;
  }

  return substr($s1, $pos1+1);
}

Python

[edit | edit source]

Longest common leading substring

[edit | edit source]
def longest_common_leading_substring(string1, string2):

  common = ''
  for s1, s2 in zip(string1, string2):
    if s1 == s2:
      common += s1
  return common

Longest common trailing substring

[edit | edit source]
def longest_common_trailing_substring(string1, string2):

  common = ''
  for s1, s2 in zip(string1[-1::-1], string2[-1::-1]):
    if s1 == s2:
      common += s1
  return common[-1::-1]