Is String.find() faster than O(N) in python?

79 Views Asked by At

There's a question on leetcode which goes as follows:

Given strings s1 and s2, return the minimum contiguous substring part of s1, so that s2 is a subsequence of the part. If there is no such window in s1 that covers all characters in s2, return the empty string "". If there are multiple such minimum-length windows, return the one with the left-most starting index.

My proposed solution was:

def minWindow(self, s1: str, s2: str) -> str:
    if len(s1) == 0 or len(s2) == 0:
        return ""
    recordKeeper = []

    minLengthSubsequence = float('inf')
    for i in range(len(s1)):
        if s1[i] == s2[0]:
            t = time.time()
            matchLength = self.matchMinSubsequence(s1[i + 1:], s2[1:])
            print(time.time() - t)
            if matchLength == float('inf'):
                continue
            recordKeeper.append((i, matchLength + 1))
            minLengthSubsequence = min(minLengthSubsequence, matchLength + 1)

    for i in recordKeeper:
        if i[1] == minLengthSubsequence:
            return s1[i[0]:i[0] + i[1]]
    return ""

# Searches if s2 is a subsequence of s1 and returns the minimum length which can make it a subsequence or returns inf otherwise.
def matchMinSubsequence(self, s1, s2):
    if len(s2) == 0:
        return 0
    target = s2[0]

    # search for the next match
    searchResult = -1
    for i in range(len(s1)):
        if s1[i] == target:
            searchResult = i
            break

    if searchResult is not -1:
        return searchResult + 1 + self.matchMinSubsequence(s1[searchResult+1:], s2[1:])

    return float('inf')

This TLEs. However, if I make a small change to the matchMinSubsequence function and search for the next match using searchResult = s1.find(target). The solution gets accepted by a good margin. The function becomes as follows after the change:

    def matchMinSubsequence(self, s1, s2):
    if len(s2) == 0:
        return 0

    target = s2[0]

    searchResult = s1.find(target)
    if searchResult is not -1:
        return searchResult + 1 + self.matchMinSubsequence(s1[searchResult+1:], s2[1:])

    return float('inf')

My Question then is: Finding a first occurrence of a character in a string is an O(N) operation. What does s1.find() do so it becomes as fast as 100x. I have checked the time taken is of order 10^-3 for manual search and 10^-5 for s1.find.

The string is not ordered or sorted that we could use any b-search. The only O(1) algo is to create a map which in itself takes O(N) to construct. I am confused...

0

There are 0 best solutions below