Statistical meaning of pre-probability formula used in Python computer vision code? (matplotlib.image.imread)

import numpy as np
import matplotlib.image as img
C = 1.0 - np.mean(img.imread('circle.png'), axis=2)
C /= np.sum(C)

The image has 256 x 256 pixels. The resulting array C appears to be a 256 x 256 array of probabilities, one per pixel (probabilities of what exactly, color or depth, I don't know, due to my lack of knowledge of what matplotlib.image.imread does).

I understand the 4th line of code, C /= np.sum(C), to produce probability values (real data normalized by their sum, so that they sum to 1), but what is the meaning of the formula on the 3rd line, C = 1.0 - np.mean(...), both in and outside of the image context? What is the statistical intuition?

Ultimately, I would like to apply the above probability transformation to a single time series of real data, so I'm wondering if the formula 1.0 - np.mean(...) would still apply in my (non-computer-vision, but distributional) setting.
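For reference, the same two steps carry over to a 1D series mechanically. A sketch (the array x below is made up, and the inversion step only makes sense if your data is already scaled to [0, 1]):

```python
import numpy as np

# hypothetical 1D series standing in for your real data
x = np.array([0.2, 0.5, 0.9, 0.1])

# the same two steps as the image code: invert, then normalize to sum to 1
# (the inversion only makes sense if the data already lies in [0, 1])
p = 1.0 - x
p /= np.sum(p)

print(p)          # larger weight where the original value was smaller
print(np.sum(p))  # 1.0, up to float rounding
```

Note that the inversion flips which values get the most weight; whether that is what you want depends on what your series measures.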

Best answer:

An image contains pixels.

An image can have one color channel (grayscale) or several (red, green, blue).

"Depth" is a term describing the number of gradations a pixel value can take. 8 bits is common, which means 2^8 = 256 different levels per channel, or 256^3 = 16.7 million different colors for RGB. 1 bit would be pure black and white. Advanced cameras and scanners may have 10, 12, or more bits of depth.
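Those counts are easy to verify directly:

```python
# number of levels per channel at various bit depths
for bits in (1, 8, 10, 12):
    print(bits, 2 ** bits)   # 1 bit -> 2 levels, 8 bits -> 256 levels, ...

# total distinct colors for 8-bit RGB: 256 levels in each of 3 channels
print(256 ** 3)  # 16777216, i.e. about 16.7 million
```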

I see nothing involving probabilities here.

img.imread('circle.png') reads the image. You will get a numpy array of shape (height, width, 3) because the image is likely in color (a PNG with an alpha channel would come back as (height, width, 4)). The third dimension (axis 2) holds the color channels of each pixel. For PNG files, matplotlib loads the values as floating point with a range of 0.0 to 1.0.
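A sketch of what that array looks like, using a synthetic array in place of the actual file so it runs without circle.png:

```python
import numpy as np

# stand-in for img.imread('circle.png'): for PNGs, matplotlib returns a
# float32 array with values in [0.0, 1.0] and shape (height, width, channels)
rgb = np.random.default_rng(0).random((256, 256, 3), dtype=np.float32)

print(rgb.shape)                           # (256, 256, 3)
print(rgb.dtype)                           # float32
print(float(rgb.min()), float(rgb.max()))  # both within [0.0, 1.0]
```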

np.mean(..., axis=2) takes the average of all color channels for every pixel: a mean along axis 2 (the third one), which holds the per-pixel color values. The result has shape (height, width) and represents the input image as grayscale. The equal weighting of the channels is a little questionable: perceptual grayscale conversions usually give the green channel more weight (it appears brightest) and the blue channel less weight (it appears darkest).
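To make the weighting point concrete, here is the equal-weight mean next to one common perceptual weighting (the Rec. 601 luma coefficients), on a tiny synthetic image:

```python
import numpy as np

rgb = np.random.default_rng(1).random((4, 4, 3))  # tiny synthetic color image

# equal weighting of R, G, B, as in the question's code
gray_equal = np.mean(rgb, axis=2)

# Rec. 601 luma coefficients: green counts most, blue least
w = np.array([0.299, 0.587, 0.114])
gray_luma = rgb @ w   # weighted sum over the channel axis

print(gray_equal.shape, gray_luma.shape)  # (4, 4) (4, 4)
```

Both collapse the channel axis; they differ only in how much each color contributes to the gray value.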

C = 1.0 - ... inverts the picture: dark pixels get large values and bright pixels get small ones.

np.sum(C) simply sums up all the grayscale pixel values of C. You get a measure of the overall (inverted) brightness of the picture.

C /= np.sum(C) divides by that brightness measure, so the values now sum to 1. A factor for the image's size (width * height) is missing, so the individual values will be very small (the picture would look dim/dark if displayed).
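Both the sum-to-1 property and the smallness of the values are easy to check on a synthetic stand-in for C:

```python
import numpy as np

# synthetic inverted grayscale image in place of the real one
C = 1.0 - np.random.default_rng(2).random((256, 256))
C /= np.sum(C)

print(np.sum(C))  # 1.0: the values form a distribution over pixels
print(C.mean())   # about 1 / (256 * 256), roughly 1.5e-5, hence "very small"
```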

Use this instead (mean instead of sum) to adjust the intensity values to average 0.5 (middle gray):

C /= np.mean(C) # now 1.0 on average
C *= 0.5
# or just C *= 0.5 / np.mean(C)
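Checking that adjustment on a synthetic stand-in for C:

```python
import numpy as np

# synthetic stand-in for the inverted grayscale image C
C = 1.0 - np.random.default_rng(3).random((256, 256))
C *= 0.5 / np.mean(C)

print(np.mean(C))  # 0.5, up to float rounding
```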