make a Fourier result invariant to rotation, scale, translation

5k Views Asked by At

I would like to know which values in a fourier matrix are responsible on the change of size, rotation, translation and so on, of an image.

I have coded a 2 dimensional DFT function that output a dft matrix of complex numbers.

how can i remove the values responsible on the scaling, translation, and rotation of an image, such that when i have 2 images for example:

Image1

Image2 = Image1 rotated by 90degree

once we compare the DFT matrix of both images, we find that they're equal.

here's the code of the DFT function that i have:

%----------------------------------------------------------------
function [Xk] = dft1(xn)
N=length(xn);
n = 0:1:N-1; % row vector for n
k = 0:1:N-1; % row vecor for k
WN = exp(-1j*2*pi/N); % Twiddle factor (w)
nk = n'*k; % creates a N by N matrix of nk values
WNnk = WN .^ nk; % DFT matrix
Xk = (WNnk*xn );
%----------------------------------------------------------------

%----------------------------------------------------------------
function out=dft2(x)

y=zeros(size(x));
y1=y;
C=size(x,2); %number of columns
for c=1:C
y(:,c)=dft1(x(:,c));
end
R=size(x,1); %number of rows
for r=1:R
y1(r,:)=dft1(y(r,:).');
end
out=y1;
%----------------------------------------------------------------
2

There are 2 best solutions below

4
On

Assuming your image is infinite (because we do not want boundary effects clouding the theory here).

Changing image size / scale results in a corresponding scale in the Fourier domain, scaling an image by factor s scales the frequency domain by 1/s. For finite images, this means that if you scale your image by 2, you loose the upper half of the frequency domain - that is the high frequencies and details in the image.

Rotating the image corresponds to a similar rotation in the frequency domain.

Translating the image amounts to change in the phase of the Fourier coefficients: translating by x pixels result in a factor exp( -j pi x ) in the frequency domain (up to some const scaling).

A nice summary of these properties of the Fourier transform can be found here.


Now, although the theory suggests nice and clear conditions where the DFT of two signals can be matched despite scaling/rotation/translation in practice it is not always that simple.
Consider for example the case of translation: the properties of Fourier transform suggests that two signals that differ only by translation, their DFT differs only by modulation, thus ideally dividing the DFT of the two signals will yield only the modulation component and reveal not only that the signals are identical up to translation but also the amount of translation between the two signals.
However, in practice this is not the case. For finite signals, translating one signal causes boundary effect: some pixels are lost (translated "outside" the visible signal) while some new pixels are introduced. Since DFT is global (that is each value of the transformed signal is affected by all values of the original signal) these boundary effects causes all values of the transformed translated signals to be different not only by the desired modulation but altogether different, making the ratio between the two signals quite arbitrary.
The same goes for scaling and rotation as well.

2
On

Take the magnitude of your DFT, transform it to logpolar coordinates, take the FFT of that, then take the magnitude. It will give you translation, rotation and scale invariance, but it will not give you robusteness against other transforms.