longest common subsequence print all subsequences

2.5k Views Asked by At

how to display all the substrings of a longest common substring of two substring i know the dp method to calculate length of lcs but how to display all those lcs code for lcs

       function LCSLength(X[1..m], Y[1..n])
C = array(0..m, 0..n)
for i := 0..m
   C[i,0] = 0
for j := 0..n
   C[0,j] = 0
for i := 1..m
    for j := 1..n
        if X[i] = Y[j]
            C[i,j] := C[i-1,j-1] + 1
        else
            C[i,j] := max(C[i,j-1], C[i-1,j])
return C[m,n]

i couldnt find a good article on net for how to find all lcs

string 1=abcabcaa string 2=acbacba

all lcs

ababa abaca abcba acaba acaca acbaa acbca

i already know the dp method to calculate lcs any help would be appreciated

i found this on wiki

             function backtrackAll(C[0..m,0..n], X[1..m], Y[1..n], i, j)
  if i = 0 or j = 0
    return {""}
    else if X[i] = Y[j]
    return {Z + X[i] for all Z in backtrackAll(C, X, Y, i-1, j-1)}
   else
    R := {}
    if C[i,j-1] ≥ C[i-1,j]
        R := backtrackAll(C, X, Y, i, j-1)
    if C[i-1,j] ≥ C[i,j-1]
        R := R ∪ backtrackAll(C, X, Y, i-1, j)
    return R

but am having diificulty in understanding it

3

There are 3 best solutions below

0
On

A very intuitive way to do this is to store in additional 2D array backpointers, i.e.:

BP[i,j] := [(i-1,j-1)] if C[i,j] was created from C[i-1,j-1] (there was a match)
BP[i,j] := [(i,j-1), (i-1,j)] if C[i,j] could came from C[i,j-1] or C[i-1,j]
BP[i,j] := [(i,j-1)] if C[i,j] was created from C[i,j-1]
BP[i,j] := [(i-1,j)] if C[i,j] was created from C[i-1,j]

Now, let's define a directed graph. Any entry [i,j] in the array C corresponds to a vertex and backpointers correspond to edges.

There are n*m vertices and at most 2*n*m edges, since one vertex has at most 2 backpointers.

Now the problem is to return all paths in this graph from the vertex [n,m] to the vertex [0,0].

The graph is directed and there are no cycles, so you can simply follow the pointers by a DFS (without marking vertices as visited) and for each edge ([i,j] -> [i-1,j-1]) append the letter that corresponds to this match to a resulting string. The LCS is the string accumulated after reaching the [0,0] vertex.

0
On

Each c[i,j] entry depends on only three other c table entries: c[i-1,j-1], c[i-1,j], and c[i,j-1]. Given the value of c[i,j], we can determine in O(1) time which of these three values was used to compute c[i,j]. Different values would be possible when there are more than one possible values for the previous 3 points, that is that they are same and highest at the same point. Then you would have to consider each of them.

1
On

this is the simplest java program of LCS :-

import java.util.Scanner;

public class StrCmp {

public static void main(String[] args) {
    Scanner s = new Scanner(System.in);
    System.out.println("Enter the string");
    String s1=s.nextLine();
    String s2=s.nextLine();
    String temp = "";
    char[] a = s1.toCharArray();
    char[] b= s2.toCharArray();
    for(int i=0;i<a.length;i++)
    {
        for(int j=0;j<b.length;j++)
        {
            if(s1.charAt(i)==s2.charAt(j))
            {   
                if(temp.contains(String.valueOf(s2.charAt(j)))){
                    break;
                }
                temp = temp + s2.charAt(j);
            }   
        }   
    }
    System.out.println(temp);
    }

}