What is the shortest way to calculate running median in python?

1.2k Views Asked by At

I need to calculate a running median in python. Currently I do it like this:

med_y = []
med_x = []
for i in numpy.arange(240, 380, 1):

    med_y.append(numpy.median(dy[(dx > i)*(dx < i+20)]))
    med_x.append(i + 10)

Here the data is stored in dx (x-coordinate) and dy (y-coordinate) and the median is taken over the dy and plotted agaist the dx (which has to be shifted by window/2). Assuming even spacing for x and window size here is 20.

Is there a shorter way?

For example, running average can be done like this:

cs = numpy.cumsum(dy)
y_20 = (cs[20:] - cs[:-20])/20.0
x_20 = dx[10:-10]

Predefined running X functions in site-packages are also ok.

2

There are 2 best solutions below

0
On BEST ANSWER

This is the shortest:

from scipy.ndimage import median_filter
values = [1,1,1,0,1,1,1,1,1,1,1,2,1,1,1,10,1,1,1,1,1,1,1,1,1,1,0,1]
print median_filter(values, 7, mode='mirror')

and it works correctly in the edges (or you can choose how it works at the edges).

And any general running X is done like this (running standard deviation as an example):

import numpy
from scipy.ndimage.filters import generic_filter
values = numpy.array([0,1,2,3,4,5,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1]).astype('float')
print(generic_filter(values, numpy.std, size=7, mode='mirror'))

In the above, float input type is important.

Useful links:

https://nickc1.github.io/python,/matlab/2016/05/17/Standard-Deviation-(Filters)-in-Matlab-and-Python.html

improving code efficiency: standard deviation on sliding windows

1
On

Googling after writing the question revealed signal prosessing function called medfilt, e.g. scipy.signal.medfilt with two input parameters: list of numbers and window size.

It works when:

  • window size is uneven
  • more than (window+1)/2 distance from edges

Near the edges it gives the minimum inside window/2. I guess the reason is that it is meant originally for reducing black error pixels in images and you want the edges to be black.

For example:

from scipy.signal import medfilt 
values = [1,1,1,0,1,1,1,1,1,1,1,2,1,1,1,10,1,1,1,1,1,1,1,1,1,1,0,1]
print medfilt(values,7)

works perfectly for values[4:-4] and gives min(values[:4]) and min(values[-4:]) for edges. The output of above example is:

output = [0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 0.]