Optimizing Algorithm for Image Translation with maximum common foreground pixel

104 Views Asked by At

I'm searching for an algorithm to efficiently determine the optimal translation of one 2D image onto another (limited to x and y axes, without rotation). The goal is to find the maximum common pixel, where pixels are considered black on a white background and vice versa. The foreground-to-background ratio is significantly skewed towards the background.

Example:

enter image description here

enter image description here

Here, the best translation is [-171, 97].

I've already implemented an algorithm that compares all pixels and iterates through translations by incrementing x and y. However, this approach is time-consuming. To address this, I've attempted to speed up the process by focusing only on the translation of white pixels to other white pixels in the second image. Although it works, it's still slow.

Below is the code snippet of my current algorithm (arrayImage is only contains 0 and 1, it's a binary image).

public static int[] findBestTranslation(int[][] arrayImage1, int[][] arrayImage2) {
    int[] bestTranslation = new int[2];
    int bestSimilarity = Integer.MIN_VALUE;

    int img2Length = arrayImage2.length;
    int img2Width = arrayImage2[0].length;
    
    List<int[]> whitePixelsImage1 = findWhitePixels(arrayImage1);
    List<int[]> whitePixelsImage2 = findWhitePixels(arrayImage2);
    boolean[][] checkedOffsets = new boolean[img2Length * 2][img2Width * 2];
    
    int i = 0;
    for (int[] pixel1 : whitePixelsImage1) {
        System.out.println(i++ + ": " + bestSimilarity);
        for (int[] pixel2 : whitePixelsImage2) {
            //calculate translation
            int xOffset = pixel2[0] - pixel1[0];
            int yOffset = pixel2[1] - pixel1[1];

            // Check if this offset has been selected before
            if (checkedOffsets[xOffset + img2Length][yOffset + img2Width]) {
                continue;
            } else {
                checkedOffsets[xOffset + img2Length][yOffset + img2Width] = true;
            }

            int similarity = 0;
            for (int[] pixelNotTranslated : whitePixelsImage1) {
                int xTranslated = pixelNotTranslated[0] + xOffset;
                int yTranslated = pixelNotTranslated[1] + yOffset;
                if (xTranslated >= 0 && xTranslated < img2Length && yTranslated >= 0 && yTranslated < img2Width) {
                    similarity += arrayImage2[xTranslated][yTranslated];
                }
            }

            if (similarity > bestSimilarity) {
                bestSimilarity = similarity;
                bestTranslation[0] = xOffset;
                bestTranslation[1] = yOffset;
            }
        }
    }

    return bestTranslation;
}
1

There are 1 best solutions below

1
On BEST ANSWER

I don't speak Java, but I think you want a "phase correlation", which you can do with scikit-image in Python as follows:

#!/usr/bin/env python3

import numpy as np
from skimage import io
from skimage.registration import phase_cross_correlation

# Define image names and load them
im1name = 'MVlJh.png'
im2name = 'XSj05.png'
im1 = io.imread(im1name)
im2 = io.imread(im2name)

# Peform phase correlation
shift, error, diffphase = phase_cross_correlation(im1, im2)

print(f'Detected pixel offset (y, x): {shift}')

Output

For your two images, it outputs the following in around 1 second:

Detected pixel offset (y, x): [-97. 171.   0.]

I guess you could either "shell out" to Python with some type of exec() or system() function, or find a Java implementation of the phase correlation technique.


There is also an OpenCV implementation available, which may have Java bindings. It is noticeably faster than the method above and gets the result in well under 1 second:

#!/usr/bin/env python3

import numpy as np
import cv2 as cv

im1name = 'MVlJh.png'
im2name = 'XSj05.png'

im1 = cv.imread(im1name, cv.IMREAD_GRAYSCALE)
im2 = cv.imread(im2name, cv.IMREAD_GRAYSCALE)

# Peform phase correlation
retval, response = cv.phaseCorrelate(im1.astype(float), im2.astype(float))

print(f'{retval=}')

Output

retval=(-171.0011860055988, 97.00465314052326)

Here's what I get if I plot one image in green, then the second image in red shifted by the calculated amount.

enter image description here

I used a "hardmix" blending mode, so the yellow parts are where red and green coincide. For my own reference, this was my blend command with ImageMagick:

magick MVlJh.png -fill lime -opaque white \( XSj05.png -geometry +171-97  -fill red -opaque white  \) -compose hardmix -composite result.png