I want to write a python function that does this efficiently:
The function will take two strings, 'a' and 'b', and attempt to find the longest palindromic string that can be formed such that it is a concatenation of a non-empty substring of 'a' and a non-empty substring of 'b'. If there are multiple valid answers, it will return the lexicographically smallest one. If no such string can be formed, it will return '-1'.
I have an inefficient solution that generates all the substrings of both strings, and then creates all possible concatenations whle tracking the longest which is a valid palindrome:
def is_palindrome(word):
"""Check if a word is a palindrome."""
reversed_word = word[::-1]
return word == reversed_word
def all_substrings_of_word(word):
"""Generate all possible non-empty substrings of a given string."""
substrings = []
for sub_string_length in range(1, len(word) + 1):
for i in range(len(word) - sub_string_length + 1):
new_word = word[i:i + sub_string_length]
substrings.append(new_word)
return substrings
def buildPalindrome(a, b):
"""Attempt to find the longest palindromic string created by concatenating
a substring of `a` with a substring of `b`."""
sub_strings_a = all_substrings_of_word(a)
sub_strings_b = all_substrings_of_word(b)
# Generate all possible concatenations of substrings from `a` and `b`
multiplexed_array = [
word_a + word_b for word_a in sub_strings_a for word_b in sub_strings_b]
# Find the best palindrome (longest, then lexicographically smallest)
best_palindrome = ""
for word in multiplexed_array:
if is_palindrome(word):
if len(word) > len(best_palindrome):
best_palindrome = word
elif len(word) == len(best_palindrome) and word < best_palindrome:
best_palindrome = word
return best_palindrome if best_palindrome else "-1"
print(buildPalindrome("bac", "bac")) # EXPECTED OUTPUT -- aba
print(buildPalindrome("abc", "def")) # EXPECTED OUTPUT -- -1
print(buildPalindrome("jdfh", "fds")) # EXPECTED OUTPUT -- dfhfd
Can I please get an explanation on how this can be improved?
You could take this approach:
Build a trie for all substrings in
b. A suffix tree would be even better, as it is more efficient.Consider all possible "centers" for potential palindromes in the string
a. So these can be between two consecutive characters (when palindrome has an even size) or on a character (when palindrome has an odd size). For each of these centers do:Find the largest palindrome
pat that center only considering stringaextend
pto the left as long as the added characters (in the order of being added) are a word in the trie ofb. This is a potential solution. Compare it with the longest palindrome so far to retain the longest.If it is not possible to extend
pin this way, then shortenpuntil a character is so removed that exists inb. In that case we have a potential solution.If in the latter case there are no characters in
pthat occur inb, then we have no suitable palindrome at the chosen center.Then turn the tables and apply the above procedure where
abecomes the reversal ofb, andbthe reversal ofa. This practically means we search for palindrome centers in the originalb.Here is an implementation of that idea:
For input strings of around 40, this implementation runs 100 times faster than the original code you provided. For inputs of size 70, that becomes a factor of 1000 times. For inputs with strings of size 500, this implementation returns an answer in less than a second.
There is still room for improvement, like:
a+band "fan" outward, so that a larger palindrome is found sooner. For this you'll need to have both tries built first as you'll toggle between one and the other.But as the above code already brought a dramatic improvement, I didn't pursue any of these improvements.