Build Palindrome from two strings

Question

Build Palindrome from two strings

143 Views Asked by Duck Dodgers At 18 March 2024 at 12:16

I want to write a python function that does this efficiently:

The function will take two strings, 'a' and 'b', and attempt to find the longest palindromic string that can be formed such that it is a concatenation of a non-empty substring of 'a' and a non-empty substring of 'b'. If there are multiple valid answers, it will return the lexicographically smallest one. If no such string can be formed, it will return '-1'.

I have an inefficient solution that generates all the substrings of both strings, and then creates all possible concatenations whle tracking the longest which is a valid palindrome:

def is_palindrome(word):
    """Check if a word is a palindrome."""
    reversed_word = word[::-1]
    return word == reversed_word


def all_substrings_of_word(word):
    """Generate all possible non-empty substrings of a given string."""
    substrings = []
    for sub_string_length in range(1, len(word) + 1):
        for i in range(len(word) - sub_string_length + 1):
            new_word = word[i:i + sub_string_length]
            substrings.append(new_word)
    return substrings


def buildPalindrome(a, b):
    """Attempt to find the longest palindromic string created by concatenating
    a substring of `a` with a substring of `b`."""
    sub_strings_a = all_substrings_of_word(a)
    sub_strings_b = all_substrings_of_word(b)

    # Generate all possible concatenations of substrings from `a` and `b`
    multiplexed_array = [
        word_a + word_b for word_a in sub_strings_a for word_b in sub_strings_b]

    # Find the best palindrome (longest, then lexicographically smallest)
    best_palindrome = ""
    for word in multiplexed_array:
        if is_palindrome(word):
            if len(word) > len(best_palindrome):
                best_palindrome = word
            elif len(word) == len(best_palindrome) and word < best_palindrome:
                best_palindrome = word

    return best_palindrome if best_palindrome else "-1"

print(buildPalindrome("bac", "bac"))   # EXPECTED OUTPUT -- aba
print(buildPalindrome("abc", "def"))   # EXPECTED OUTPUT -- -1
print(buildPalindrome("jdfh", "fds"))   # EXPECTED OUTPUT -- dfhfd

Can I please get an explanation on how this can be improved?

Original Q&A

There are 1 best solutions below

**trincot** · Accepted Answer · 2024-03-20T14:53:05.857000

You could take this approach:

Build a trie for all substrings in b. A suffix tree would be even better, as it is more efficient.
Consider all possible "centers" for potential palindromes in the string a. So these can be between two consecutive characters (when palindrome has an even size) or on a character (when palindrome has an odd size). For each of these centers do:
- Find the largest palindrome p at that center only considering string a
- extend p to the left as long as the added characters (in the order of being added) are a word in the trie of b. This is a potential solution. Compare it with the longest palindrome so far to retain the longest.
- If it is not possible to extend p in this way, then shorten p until a character is so removed that exists in b. In that case we have a potential solution.
- If in the latter case there are no characters in p that occur in b, then we have no suitable palindrome at the chosen center.

Then turn the tables and apply the above procedure where a becomes the reversal of b, and b the reversal of a. This practically means we search for palindrome centers in the original b.

Here is an implementation of that idea:

# Of the two given strings choose the one that is longest, or if equal, comes first in lexical order
def longest(x, y):
    return min((-len(x), x), (-len(y), y))[1]

def buildTrie(s):
    trie = {}
    for i in range(len(s)):
        node = trie
        for j in range(i, len(s)):
            node = node.setdefault(s[j], {})
    return trie

def buildPalindromeTrincot(s1, s2):
    palin = ""
    # Two passes: one for where the center of a palindrome is in s1, one for the reverse case
    for a, b in ((s1, s2), (s2[::-1], s1[::-1])):
        # Build a trie for B substrings (could be suffixtree for better performance)
        trie = buildTrie(b)
        n = len(a)
        # Visit all possible centers in A for a potential solution of at least 2 characters 
        for center in range(2*n-1, 0, -1):
            # Get the offsets of the innermost characters that must be equal 
            #   for a palindrome of at least two characters
            mid1 = (center - 1)//2
            mid2 = (center + 2)//2
            # Get largest palindrome at this center in A alone: 
            #   `left` will point to the left-neighboring character to that palindrome
            left = next((left for left, right in zip(range(mid1, 0, -1), range(mid2, n)) 
                         if a[left] != a[right]), 
                        max(0, mid1 + mid2 - n))
            # Must extend the palindrome with a substring from B
            node = trie.get(a[left], None)
            if node is not None:  # We can extend the palindrome using B
                for left in range(left-1, -1, -1):
                    if a[left] not in node:
                        left += 1
                        break
                    node = node[a[left]]                    
            else: 
                # See if we can drop characters from the palindrome in A 
                #    until we can replace one with the same character from B
                left = next((left for left in range(left+1, mid1+1) if a[left] in trie), None)
                if left is None:
                    continue  # No solution found here
            palin = longest(a[left:mid2] + a[left:mid1+1][::-1], palin)
                
    return palin or "-1"

For input strings of around 40, this implementation runs 100 times faster than the original code you provided. For inputs of size 70, that becomes a factor of 1000 times. For inputs with strings of size 500, this implementation returns an answer in less than a second.

There is still room for improvement, like:

Using a suffix tree instead of a trie
Exiting early when the future palindromes can never be longer than the one already found
Have the center move from the center of a+b and "fan" outward, so that a larger palindrome is found sooner. For this you'll need to have both tries built first as you'll toggle between one and the other.

But as the above code already brought a dramatic improvement, I didn't pursue any of these improvements.

Build Palindrome from two strings

There are 1 best solutions below

Related Questions in PYTHON

Related Questions in ALGORITHM

Related Questions in PALINDROME

Trending Questions

Popular # Hahtags

Popular Questions