How could you prevent this rgb to colorID conversion from creating 256^3 sized arrays and still keep it vectorized?

Question

68 Views Asked by volticus At 23 November 2023 at 17:37

I have 256 different colorIDs, each matching a specific rgb value used in pixel segmentation, and I need to write a script that retrieves the colorID of each pixel in a segmented rgb picture. There will be a lot of high-resolution pictures, so I need an efficient way to do this in Python. I have a 256x1 png whose pixel colors are the rgb values; the index of each pixel, counted from left to right, is the corresponding colorID.

I have working code, but it has some flaws, as you will see. The idea is to deconstruct each image into single pixels and retrieve the rgb values of each pixel. These rgb values can then be used as indexes or keys to retrieve the colorID using a converter built from the 256x1 png.

Here is the code for generating the converter from rgb to colorID:

import numpy as np
from PIL import Image

def generate_rgb_to_colorID_converter(self):
    color_palette = Image.open(path_to_image)  # path_to_image: path to the 256x1 palette png, defined elsewhere
    # rgb values are used as indexes and the corresponding colorID is the element
    rgb_to_colorID_converter = np.zeros((256, 256, 256), dtype=np.uint8)
    for x_coord in range(256):
        pixel_coord = (x_coord, 0)
        rgb = color_palette.getpixel(pixel_coord)  # returns the rgb value at the coordinate as a tuple
        rgb_to_colorID_converter[rgb[0], rgb[1], rgb[2]] = x_coord  # sets the element at the rgb index to the colorID
    return rgb_to_colorID_converter

This is the function that translates the rgb values over to its corresponding colorID:

def rgb_to_colorID(self, rgb_array, rgb_to_colorID_converter):
    """rgb_array[:, :, 0] returns the r-channel for every pixel, while indexes 1 and 2 do the same for g and b.
    Indexing with the three channel arrays at once looks up every pixel's rgb value in rgb_to_colorID_converter."""
    mapped_image = rgb_to_colorID_converter[rgb_array[:, :, 0], rgb_array[:, :, 1], rgb_array[:, :, 2]]
    return mapped_image

As you might see, the problem with this code is that rgb_to_colorID_converter contains 256^3 elements while only 256 of them actually hold colorID values (the rest are zeros). According to tracemalloc, this array uses about 16 MB of memory, which is wasteful since it only stores 256 uint8 values. Running this implementation in my full system takes an average of 0.017 seconds to convert a 1920x1080 picture from rgb to a single-channel (grayscale) image of colorID values.
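For reference, the memory figure can be checked without tracemalloc. This is only a quick sketch; the variable names `lookup` and `used_entries` are made up for the illustration.

```python
import numpy as np

# Same shape and dtype as the converter in the first implementation.
lookup = np.zeros((256, 256, 256), dtype=np.uint8)
used_entries = 256  # one entry per colorID

print(lookup.nbytes)               # 16777216 bytes, i.e. 16 MiB
print(used_entries / lookup.size)  # fraction of entries that carry a colorID
```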

Trying to remove this wasteful memory usage, I rewrote the functions as follows:

def generate_rgb_to_colorID_converter(self):
    color_palette = Image.open(path_to_image)
    rgb_to_colorID_converter = {}
    for x_coord in range(256):
        pixel_coord = (x_coord, 0)
        rgb = color_palette.getpixel(pixel_coord)  # returns the rgb value at the coordinate as a tuple
        rgb_to_colorID_converter[(rgb[0], rgb[1], rgb[2])] = np.uint8(x_coord)  # maps the rgb tuple (key) to the colorID (value)

    return rgb_to_colorID_converter

def rgb_to_colorID(self, rgb_array, rgb_to_colorID_converter):
    # the columns and rows of the original array are needed to reshape the colorID array back to the right shape
    columns = len(rgb_array[0])
    rows = len(rgb_array)
    # turns the 3D array into a 2D array where each row holds the rgb values of a single pixel, which makes it easier to loop over
    rgb_array = rgb_array.reshape(-1, 3)
    colorID_array = np.empty(len(rgb_array), dtype=np.uint8)  # will be filled with the colorID of each pixel

    for pixel_index, pixel in enumerate(rgb_array):
        # indexes in the rgb and colorID arrays correspond; tuple(pixel) is the dictionary key
        colorID_array[pixel_index] = rgb_to_colorID_converter[tuple(pixel)]

    colorID_array = colorID_array.reshape(rows, columns)  # reshapes it back; it is now 2D since each pixel has one channel instead of three
    return colorID_array

The memory usage of the dictionary is very low, but the problem now is the speed of the conversion from rgb to colorID arrays. The for-loop that goes through every pixel and changes it from rgb to colorID takes an average of 8.1 seconds in the full system, which is unusable. Given these numbers, the first implementation is the clear winner.

What I am really asking is how to shrink the size of the rgb_to_colorID_converter array in the first implementation, or how to speed up or replace the for-loop in the second implementation.

**anatolyg** · Answer 1 · 2023-11-24T20:51:27.643000

Assuming the performance problem is in the dictionary indexed by tuples, you can replace it with a lower-level implementation.

For example, implement a hash table using a numpy array.

You might want to generate random hash functions until you find a perfect one; not sure how useful this idea is.
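A minimal sketch of what such a NumPy hash table with a randomly searched perfect hash could look like. Everything here is an assumption made for illustration: the function names, the 2^16-slot table size, the multiplicative hash, and the requirement that all palette colours are distinct (otherwise no perfect hash exists and the search fails). Colours that are not in the palette map to colorID 0, as in the original 3D-array version.

```python
import numpy as np

TABLE_BITS = 16  # assumed table size: 2**16 slots (64 KiB of uint8 IDs)

def pack_rgb(rgb):
    """Pack the r, g, b channels into one 24-bit integer per pixel."""
    px = np.asarray(rgb)[..., :3].astype(np.uint32)
    return (px[..., 0] << 16) | (px[..., 1] << 8) | px[..., 2]

def hash_slots(packed, multiplier):
    """Multiplicative hash: uint32 wrap-around product, keep the top TABLE_BITS bits."""
    return (packed * np.uint32(multiplier)) >> np.uint32(32 - TABLE_BITS)

def find_perfect_multiplier(keys, seed=0, max_tries=10_000):
    """Try random odd multipliers until every key lands in its own slot."""
    rng = np.random.default_rng(seed)
    for _ in range(max_tries):
        multiplier = int(rng.integers(1, 2**32)) | 1
        if np.unique(hash_slots(keys, multiplier)).size == keys.size:
            return multiplier
    raise RuntimeError("no perfect hash found (are the palette colours distinct?)")

def build_hash_table(palette_rgb):
    """palette_rgb: (N, 3) uint8 array with N <= 256; row index is the colorID."""
    keys = pack_rgb(palette_rgb)
    multiplier = find_perfect_multiplier(keys)
    slots = hash_slots(keys, multiplier)
    table_ids = np.zeros(2**TABLE_BITS, dtype=np.uint8)
    table_ids[slots] = np.arange(keys.size, dtype=np.uint8)
    # Stored keys let us detect pixels whose colour is not in the palette.
    # 0xFFFFFFFF can never equal a 24-bit packed colour.
    table_keys = np.full(2**TABLE_BITS, 0xFFFFFFFF, dtype=np.uint32)
    table_keys[slots] = keys
    return multiplier, table_ids, table_keys

def rgb_to_colorID_hashed(rgb_array, multiplier, table_ids, table_keys):
    """Vectorized lookup for an (H, W, 3) uint8 image."""
    packed = pack_rgb(rgb_array)
    slots = hash_slots(packed, multiplier)
    ids = table_ids[slots]                  # fancy indexing returns a copy
    ids[table_keys[slots] != packed] = 0    # unknown colours fall back to 0
    return ids

# Tiny usage example with a made-up 3-colour palette.
palette = np.array([[0, 0, 0], [255, 0, 0], [0, 255, 0]], dtype=np.uint8)
image = palette[np.array([[0, 1], [2, 1]])]  # (2, 2, 3) image built from colorIDs
print(rgb_to_colorID_hashed(image, *build_hash_table(palette)))
```

With these sizes the two tables take 64 KiB and 256 KiB instead of 16 MiB. If every pixel is guaranteed to use a palette colour, `table_keys` and the mismatch check could be dropped. Whether this is actually faster than the 3D-array lookup is untested here and would need measuring.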

All the above is a wild guess about where the performance bottleneck is. If you don't want to believe me (you really shouldn't) and want to be systematic about it, you should use a profiler. In this case, where the code is so small, you specifically need a line profiler to pinpoint the problem to a specific place in code.
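If profiling does point at the per-pixel dictionary lookup, a simpler vectorized replacement is also possible. Note that this is not a hash table: it is a sorted lookup with `np.searchsorted` over packed 24-bit colour keys. The sketch below uses hypothetical function names and assumes the palette colours are distinct; colours missing from the palette map to colorID 0.

```python
import numpy as np

def build_sorted_palette(palette_rgb):
    """palette_rgb: (N, 3) uint8 array with N <= 256; row index is the colorID."""
    p = palette_rgb.astype(np.uint32)
    keys = (p[:, 0] << 16) | (p[:, 1] << 8) | p[:, 2]  # one 24-bit key per colour
    order = np.argsort(keys)
    # sorted keys, and the colorID belonging to each sorted key
    return keys[order], order.astype(np.uint8)

def rgb_to_colorID_sorted(rgb_array, sorted_keys, sorted_ids):
    """Vectorized lookup for an (H, W, 3) uint8 image."""
    px = rgb_array[..., :3].astype(np.uint32)
    packed = (px[..., 0] << 16) | (px[..., 1] << 8) | px[..., 2]
    # binary search of every pixel key in the sorted palette keys
    pos = np.searchsorted(sorted_keys, packed)
    pos = np.clip(pos, 0, len(sorted_keys) - 1)
    ids = sorted_ids[pos]
    ids[sorted_keys[pos] != packed] = 0  # colour not in the palette
    return ids

# Tiny usage example with a made-up 3-colour palette.
palette = np.array([[10, 20, 30], [0, 0, 255], [200, 0, 0]], dtype=np.uint8)
image = palette[np.array([[2, 0], [1, 1]])]  # (2, 2, 3) image built from colorIDs
sorted_keys, sorted_ids = build_sorted_palette(palette)
print(rgb_to_colorID_sorted(image, sorted_keys, sorted_ids))
```

The lookup tables here hold only N entries each, at the cost of a binary search per pixel and a temporary uint32 array the size of the image. How the speed compares to the 16 MB array version is not measured here.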
