I have 256 different colorIDs, each matching an rgb value, from doing pixel segmentation, and I need to write a script that retrieves the colorID of each pixel in a segmented rgb picture. There are going to be a lot of high-resolution pictures, so I need an efficient way to do this in Python. I have a 256x1 png which contains the rgb values as the color of each pixel, and the index of each pixel from left to right corresponds to its colorID.
I have working code, but it has some flaws, as you will see. The idea is to deconstruct each image into single pixels and retrieve the rgb values of each pixel. These rgb values can then be used as indexes or keys to retrieve the colorID using a converter based on the 256x1 png.
Here is the code for generating the converter from rgb to colorID:
    import numpy as np
    from PIL import Image

    def generate_rgb_to_colorID_converter(self):
        color_palette = Image.open(path_to_image)
        # rgb values will be used as index and the corresponding colorID will be the elements
        rgb_to_colorID_converter = np.zeros((256, 256, 256), dtype=np.uint8)
        for x_coord in range(256):
            pixel_coord = (x_coord, 0)
            rgb = color_palette.getpixel(pixel_coord)  # Returns decimal rgb value at coordinate in a tuple
            # Sets the element at given index in the 3D array to the colorID value
            rgb_to_colorID_converter[rgb[0], rgb[1], rgb[2]] = x_coord
        return rgb_to_colorID_converter
This is the function that translates the rgb values to their corresponding colorIDs:
    def rgb_to_colorID(self, rgb_array, rgb_to_colorID_converter):
        """rgb_array[:, :, 0] returns the r-channel for every pixel, while the indexes 1 and 2 do the same for g and b.
        [rgb_array[:, :, 0], rgb_array[:, :, 1], rgb_array[:, :, 2]] then returns the rgb value at every pixel, which is used as the index into rgb_to_colorID_converter."""
        mapped_image = rgb_to_colorID_converter[rgb_array[:, :, 0], rgb_array[:, :, 1], rgb_array[:, :, 2]]
        return mapped_image
As you might see, the problem with this code is that rgb_to_colorID_converter contains 256^3 elements while only 256 of them actually hold colorID values (the rest are zeros). According to tracemalloc, this array uses about 16 MB of memory, which is wasteful since it only stores 256 uint8 values. Running this implementation in my full system takes an average of 0.017 seconds to convert a 1920x1080 picture from rgb to grayscale with the colorID values.
Trying to remove this wasteful memory usage, I rewrote the functions as follows:
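For reference, the 16 MB figure is consistent with a quick back-of-the-envelope calculation. This is only a sketch of the arithmetic; the variable names are made up for illustration:

```python
import numpy as np

# One uint8 entry for every possible (r, g, b) combination.
entries = 256 ** 3
bytes_per_entry = np.dtype(np.uint8).itemsize  # 1 byte per entry
table_bytes = entries * bytes_per_entry

print(table_bytes)                 # 16777216
print(table_bytes / 2**20, "MiB")  # 16.0 MiB
print(256 / entries)               # fraction of the entries that hold a real colorID
```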
    def generate_rgb_to_colorID_converter(self):
        color_palette = Image.open(path_to_image)
        rgb_to_colorID_converter = {}
        for x_coord in range(256):
            pixel_coord = (x_coord, 0)
            rgb = color_palette.getpixel(pixel_coord)  # Returns decimal rgb value at coordinate in a tuple
            # Uses the rgb tuple as the dictionary key and stores the colorID as the value
            rgb_to_colorID_converter[(rgb[0], rgb[1], rgb[2])] = np.uint8(x_coord)
        return rgb_to_colorID_converter
    def rgb_to_colorID(self, rgb_array, rgb_to_colorID_converter):
        # columns and rows of the original array are needed to reshape the colorID array back to the right shape
        columns = len(rgb_array[0])
        rows = len(rgb_array)
        # Turns the 3D array into a 2D array where each row holds the rgb values of a single pixel.
        # Makes it easier to loop through the array
        rgb_array = rgb_array.reshape(-1, 3)
        colorID_array = np.empty(len(rgb_array), dtype=np.uint8)  # will be filled with the colorID of each pixel
        for pixel_index, pixel in enumerate(rgb_array):
            # indexes in the rgb and colorID arrays correspond; tuple(pixel) is the dictionary key
            colorID_array[pixel_index] = rgb_to_colorID_converter[tuple(pixel)]
        # reshapes it back, but it is now a 2D array since each pixel has a single channel instead of 3
        colorID_array = colorID_array.reshape(rows, columns)
        return colorID_array
The memory usage of the dictionary is very low. But the problem now is the speed of conversion from rgb to colorID arrays. The for-loop which goes through every pixel and changes it from rgb to colorID takes an average of 8.1 seconds in the full system. This is not usable at all. Taking these statistics into consideration, the first implementation is the absolute winner.
What I am really asking is how to shrink the size of the rgb_to_colorID_converter array in the first implementation, or how to speed up or replace the for-loop in the second implementation.
Assuming the performance problem is in the dictionary indexed by tuples, you can replace it with a lower-level implementation.
For example, implement a hash table using a numpy array.
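Before building a real hash table, it may be worth trying something simpler that is also just numpy arrays: pack each rgb triple into one 24-bit integer and look it up in a sorted array of the 256 palette keys with `np.searchsorted`. To be clear, this is a sorted-array lookup, not a hash table, and it is only a sketch. It assumes the palette colors are unique and that every pixel in the image is one of them; the function names and the synthetic `palette` are made up for the example.

```python
import numpy as np

def pack_rgb(rgb):
    """Pack the last axis (r, g, b) of a uint8 array into one 24-bit integer per pixel."""
    rgb = rgb.astype(np.uint32)
    return (rgb[..., 0] << np.uint32(16)) | (rgb[..., 1] << np.uint32(8)) | rgb[..., 2]

def build_sorted_lookup(palette_rgb):
    """palette_rgb: (N, 3) uint8 array, N <= 256, where the row index is the colorID."""
    keys = pack_rgb(palette_rgb)
    order = np.argsort(keys)
    # sorted keys, plus the colorID that belongs to each sorted key
    return keys[order], order.astype(np.uint8)

def rgb_to_color_id_sorted(rgb_array, sorted_keys, sorted_ids):
    """rgb_array: (rows, columns, 3) uint8 image containing only palette colors."""
    positions = np.searchsorted(sorted_keys, pack_rgb(rgb_array))
    return sorted_ids[positions]

# Synthetic stand-in for the 256x1 palette png (hypothetical colors, all unique).
indices = np.arange(256)
palette = np.stack([indices, (indices * 7) % 256, (indices * 13) % 256], axis=1).astype(np.uint8)

sorted_keys, sorted_ids = build_sorted_lookup(palette)
expected_ids = np.arange(256).reshape(16, 16)[::-1]
image = palette[expected_ids]  # small fake segmented image
color_ids = rgb_to_color_id_sorted(image, sorted_keys, sorted_ids)
print(np.array_equal(color_ids, expected_ids))
```

The lookup structure here is two arrays of 256 elements, so its memory footprint is negligible. Whether it gets close to the 0.017 seconds of the 16 MB table would have to be measured.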
You might want to generate random hash functions until you find a perfect one (one with no collisions among the 256 palette colors), although I am not sure how useful this idea is.
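A minimal sketch of that idea, under several assumptions that are mine and not from the question: the hash is a multiply-shift hash on the packed 24-bit rgb value, the table has 2^18 uint8 slots (256 KiB instead of 16 MiB), the palette colors are unique, and every pixel in the image is a palette color. All names and the synthetic `palette` are hypothetical.

```python
import numpy as np

def pack_rgb(rgb):
    """Pack the last axis (r, g, b) into one 24-bit integer per pixel (stored as uint64)."""
    rgb = rgb.astype(np.uint64)
    return (rgb[..., 0] << np.uint64(16)) | (rgb[..., 1] << np.uint64(8)) | rgb[..., 2]

def hash_keys(keys, multiplier, table_bits):
    """Multiply-shift hash: keep the top table_bits bits of (key * multiplier) mod 2**32."""
    product = (keys * np.uint64(multiplier)) & np.uint64(0xFFFFFFFF)
    return (product >> np.uint64(32 - table_bits)).astype(np.intp)

def find_perfect_hash(palette_rgb, table_bits=18, max_tries=1000, seed=0):
    """Try random odd multipliers until no two palette colors share a slot."""
    rng = np.random.default_rng(seed)
    keys = pack_rgb(palette_rgb)
    for _ in range(max_tries):
        multiplier = int(rng.integers(1, 2**32)) | 1  # force an odd multiplier
        slots = hash_keys(keys, multiplier, table_bits)
        if np.unique(slots).size == keys.size:  # collision-free for this palette
            table = np.zeros(1 << table_bits, dtype=np.uint8)
            table[slots] = np.arange(keys.size).astype(np.uint8)
            return multiplier, table
    raise RuntimeError("no collision-free multiplier found; try a larger table_bits")

def rgb_to_color_id_hashed(rgb_array, multiplier, table):
    table_bits = table.size.bit_length() - 1  # table size is a power of two
    return table[hash_keys(pack_rgb(rgb_array), multiplier, table_bits)]

# Synthetic stand-in for the 256x1 palette png (hypothetical colors, all unique).
indices = np.arange(256)
palette = np.stack([indices, (indices * 7) % 256, (indices * 13) % 256], axis=1).astype(np.uint8)

multiplier, table = find_perfect_hash(palette)
expected_ids = np.arange(256).reshape(16, 16)
color_ids = rgb_to_color_id_hashed(palette[expected_ids], multiplier, table)
print(table.nbytes, np.array_equal(color_ids, expected_ids))
```

The table stores no keys, so a pixel whose color is not in the palette silently maps to whatever colorID sits in its slot. The per-image temporaries (the packed keys and slot indexes) also cost memory, so the overall gain is something to measure rather than assume.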
All of the above is a wild guess about where the performance bottleneck is. If you don't want to believe me (you really shouldn't) and want to be systematic about it, you should use a profiler. In this case, where the code is so small, you specifically need a line profiler to pinpoint the problem to a specific line of code.
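A dedicated line profiler gives per-line timings directly. As a rough stand-in that only uses the standard library's `timeit`, the loop body can be split into its parts and each part timed separately. This is a sketch with a synthetic palette and a made-up pixel count, not a measurement of the real system:

```python
import timeit
import numpy as np

# Synthetic stand-in for the palette and for a flattened image (hypothetical data).
indices = np.arange(256)
palette = np.stack([indices, (indices * 7) % 256, (indices * 13) % 256], axis=1).astype(np.uint8)
converter = {tuple(int(c) for c in rgb): color_id for color_id, rgb in enumerate(palette)}
pixels = palette[np.arange(10_000) % 256]  # (10000, 3) array of palette colors

def iterate_only():
    # cost of the Python-level loop over numpy rows
    for pixel in pixels:
        pass

def iterate_and_tuple():
    # adds the cost of building a tuple from each row
    for pixel in pixels:
        tuple(pixel)

def iterate_tuple_and_lookup():
    # adds the dictionary lookup itself
    for pixel in pixels:
        converter[tuple(pixel)]

timings = {
    func.__name__: timeit.timeit(func, number=3)
    for func in (iterate_only, iterate_and_tuple, iterate_tuple_and_lookup)
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```

Comparing the three numbers shows how much of the time goes to the loop itself, to the tuple conversion, and to the dictionary lookup, which tells you whether replacing the dictionary is worth the effort.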