High performance variable blurring in very big images using Python

Question

High performance variable blurring in very big images using Python

2.8k Views Asked by Chau At 17 October 2025 at 10:02

I have a large collection of large images (ex. 15000x15000 pixels) that I would like to blur. I need to blur the images using a distance function, so the further away I move from some areas in the image the more heavier the blurring should be. I have a distance map describing how far a given pixel is from the areas.

Due to the large amount of images I have to consider performance. I have looked at NumPY/SciPY, they have some great functions but they seem to use a fixed kernel size and I need to reduce or increase the kernel size depending on the distance to the previous mentioned areas.

How can I solve this problem in python?

UPDATE: My solution so far based on the answer by rth:

# cython: boundscheck=False
# cython: cdivision=True
# cython: wraparound=False

import numpy as np
cimport numpy as np

def variable_average(int [:, ::1] data, int[:,::1] kernel_size):
    cdef int width, height, i, j, ii, jj
    width = data.shape[1]
    height = data.shape[0]
    cdef double [:, ::1] data_blurred = np.empty([width, height])
    cdef double res
    cdef int sigma, weight

    for i in range(width):
        for j in range(height):
            weight = 0
            res = 0
            sigma =  kernel_size[i, j]
            for ii in range(i - sigma, i + sigma + 1):
                for jj in range(j - sigma, j + sigma + 1):
                    if ii < 0 or ii >= width or jj < 0 or jj >= height:
                        continue
                    res += data[ii, jj]
                    weight += 1
            data_blurred[i, j] = res/weight

    return data_blurred

Test:

data = np.random.randint(256, size=(1024,1024))
kernel = np.random.randint(256, size=(1024,1024)) + 1
result = np.asarray(variable_average(data, kernel))

The method using the above settings takes around 186seconds to run. Is that what I can expect to ultimately squeeze out of the method or are there optimizations that I can use to further increase the performance (still using Python)?

Original Q&A

There are 2 best solutions below

Matt On 30 April 2021 at 18:02

I came across this while googling and thought I would share my own solution which is mostly vectorized and doesn't include any for loops on pixels. You can approximate a Gaussian blur by running a box blur multiple times in a row. So the approach I decided to use is to iteratively box blur the image, but to vary the number of iterations per pixel using a weighting function.

If you need a large blur radius, the number of iterations grows quadratically, so consider increasing the ksize.

Here is the implementation

import cv2

def variable_blur(im, sigma, ksize=3):
    """Blur an image with a variable Gaussian kernel.
    
    Parameters
    ----------
    im: numpy array, (h, w)
    
    sigma: numpy array, (h, w)
    
    ksize: int
        The box blur kernel size. Should be an odd number >= 3.
        
    Returns
    -------
    im_blurred: numpy array, (h, w)
    
    """
    variance = box_blur_variance(ksize)
    # Number of times to blur per-pixel
    num_box_blurs = 2 * sigma**2 / variance
    # Number of rounds of blurring
    max_blurs = int(np.ceil(np.max(num_box_blurs))) * 3
    # Approximate blurring a variable number of times
    blur_weight = num_box_blurs / max_blurs

    current_im = im
    for i in range(max_blurs):
        next_im = cv2.blur(current_im, (ksize, ksize))
        current_im = next_im * blur_weight + current_im * (1 - blur_weight)
    return current_im

def box_blur_variance(ksize):
    x = np.arange(ksize) - ksize // 2
    x, y = np.meshgrid(x, x)
    return np.mean(x**2 + y**2)

And here is an example

im = np.random.rand(300, 300)
sigma = 3

# Variable
x = np.linspace(0, 1, im.shape[1])
y = np.linspace(0, 1, im.shape[0])
x, y = np.meshgrid(x, y)
sigma_arr = sigma * (x + y)
im_variable = variable_blur(im, sigma_arr)

# Gaussian
ksize = sigma * 8 + 1
im_gauss = cv2.GaussianBlur(im, (ksize, ksize), sigma)

# Gaussian replica
sigma_arr = np.full_like(im, sigma)
im_approx = variable_blur(im, sigma_arr)

Blurring results

The plot is:

Top left: Source image
Top right: Variable blurring
Bottom left: Gaussian blurring
Bottom right: Approximated Gaussian blurring

**rth** · Accepted Answer

As you have noted related scipy functions do not support variable size blurring. You could implement this in pure python with for loops, then use Cython, Numba or PyPy to get a C-like performance.

Here is a low level python implementation, than uses numpy only for data storage,

import numpy as np

def variable_blur(data, kernel_size):
    """ Blur with a variable window size
    Parameters:
      - data: 2D ndarray of floats or integers
      - kernel_size: 2D ndarray of integers, same shape as data
    Returns:
      2D ndarray
    """
    data_blurred = np.empty(data.shape)
    Ni, Nj = data.shape
    for i in range(Ni):
        for j in range(Nj):
            res = 0.0
            weight = 0
            sigma =  kernel_size[i, j]
            for ii in range(i - sigma, i+sigma+1):
                for jj in range(j - sigma, j+sigma+1):
                    if ii<0 or ii>=Ni or jj < 0 or jj >= Nj:
                        continue
                    res += data[ii, jj]
                    weight += 1
            data_blurred[i, j] = res/weight
    return data_blurred

data = np.random.rand(50, 20)
kernel_size = 3*np.ones((50, 20), dtype=np.int)
variable_blur(data, kernel_size)

that calculates an arithmetic average of pixels with a variable kernel size. It is a bad implementation with respect to numpy, in a sense that is it not vectorized. However, this makes it convenient to port to other high performance solutions:

Cython: simply statically typing variables, and compiling should give you C-like performance,

def variable_blur(double [:, ::1] data, long [:, ::1] kernel_size):
     cdef double [:, ::1] data_blurred = np.empty(data.shape)
     cdef Py_ssize_t Ni, Nj
     Ni = data.shape[0]
     Nj = data.shape[1]
     for i in range(Ni):
         # [...] etc.

see this post for a complete example, as well as the compilation notes.

Numba: Wrapping the above function with the @jit decorator, should be mostly sufficient.
PyPy: installing PyPy + the experimental numpy branch, could be another alternative worth trying. Although, then you would have to use PyPy for all your code, which might not be possible at present.

Once you have a fast implementation, you can then use multiprocessing, etc. to process different images in parallel, if need be. Or even parallelize with OpenMP in Cython the outer for loop.

High performance variable blurring in very big images using Python

There are 2 best solutions below

Related Questions in PYTHON

Related Questions in IMAGE-PROCESSING

Related Questions in NUMPY

Related Questions in SCIPY

Related Questions in CYTHON

Trending Questions

Popular # Hahtags

Popular Questions