How can I find all the matches of two sorted arrays in O(n) with limitations on number of comparisons?

105 Views Asked by At

I have two arrays. Array A contains n elements, array B contains m elements, and m < n. They are sorted in ascending order. I want to find all the matches between the two arrays. Two adjacent elements in array A may correspond to the same element of array B.

There are two constraints:

  1. The algorithm should have a time complexity of O(n)

  2. I can only compare each element in A a limited number of times with values in B: at most ceil(log(n)) times. This means at most one binary search can be performed per value in A.

To compare the elements, I need to use a compare function that runs in O(1). This function can have three different return values:

  1. if the two given values match, or
  2. the first is less than the second, or
  3. the first is greater than the second.

So we cannot simply merge the two arrays.

How can I finish this in O(n) time?

I have tried the following ways:

  1. Compare each element of these two arrays from beginning to the end. However, if all of A's elements match B's last element we will exceed the compare time limitation of ceil(log(n)).

  2. Pick the middle element of A, binary search it in B, and then split the B array into 2 partitions. If we do not consider the worst situation, the overall time complexity is determined by

     T(n) = 2T(n/2) + logm, 
    

    ...which is OK. However, in the worst case, all of A's elements correspond to the minimum or maximum element in B, and then the time complexity is determined by

     T(n) = 2T(n/2) + logm, 
    

    ...which is not O(n).

I am out of other ideas. How can I design this algorithm with the given constraints? Or maybe I have a mistake in how I calculated the time complexity?

3

There are 3 best solutions below

4
Bobojon Bobojonov On
public static List<Integer> findMatchedElements(int[] arr1, int[] arr2) {
        List<Integer> matchedElements= new ArrayList<>();
        int i = 0; // Pointer for arr1
        int j = 0; // Pointer for arr2
        
        while (i < arr1.length && j < arr2.length) {
            if (arr1[i] == arr2[j]) {
                matchedElements.add(arr1[i]);
                i++;
                j++;
            } else if (arr1[i] < arr2[j]) {
                i++;
            } else {
                j++;
            }
        }
        
        return matchedElements;
    }
1
Bobojon Bobojonov On
public static List<Integer> findMatchedElements(int[] arr1, int[] arr2) {
    List<Integer> matchedElements = new ArrayList<>();
    int i = 0; // Pointer for arr1
    int j = 0; // Pointer for arr2

    while (i < arr1.length && j < arr2.length) {
        if (arr1[i] == arr2[j]) {
            matchedElements.add(arr1[i]);
            i++;
            j++;
        } else if (arr1[i] < arr2[j]) {
            i++;
        } else {
            // Use binary search to find the next greater or equal element in arr2
            int searchResult = binarySearch(arr2, arr1[i], j, arr2.length - 1);
            if (searchResult != -1) {
                j = searchResult;
            } else {
                break; // No need to continue, no more matches possible
            }
        }
    }

    return matchedElements;
}

private static int binarySearch(int[] arr2, int target, int left, int right) {
    while (left <= right) {
        int mid = left + (right - left) / 2;
        if (arr2[mid] == target) {
            return mid;
        } else if (arr2[mid] < target) {
            left = mid + 1;
        } else {
            right = mid - 1;
        }
    }
    // If no exact match found, return the index of the next greater element
    if (left < arr2.length) {
        return left;
    }
    return -1; // No greater or equal element found
}
2
Hans Olsson On

Instead of a normal binary search for finding all matches you use an algorithm corresponding to binary search in an infinite array, so after matching A[i] with B[j] you don't perform a normal binary search in B, but instead compare A[i] with B[j+1], B[j+2], B[j+4], B[j+8], B[j+16], etc until you get hit (or run into the end of the array) and then you do a normal binary search based on that result and skip ahead in B.

That keeps the overall time to O(n) and doesn't need more than O(log n) searches per element in A.

(Could add detailed algorithm later.)