I have a dataset X of, let's say, 2000 rows. I want to take every 2 consecutive rows and average them column by column. The result should be a 1000-row dataset (the column count should stay the same).
I have already done this in MATLAB:
% MATLAB function
function [ divisibleMatrix ] = meanDown( matrix )
%MEANDOWN takes a matrix and means every 2 lines (makes it half the size)
newSize = (floor(size(matrix, 1)/2)*2); %make it divisible
divisibleMatrix = matrix(1:newSize, 1:end);
D = divisibleMatrix;
m=size(divisibleMatrix, 1);
n=size(divisibleMatrix, 2);
% compute mean for two neighboring rows
D=reshape(D, 2, m/2*n);
D=(D(1,:)+D(2,:))/2;
D=reshape(D, m/2, n);
divisibleMatrix = D;
end
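This is a one-liner. The key is that groupby can take an arbitrary mapping of index -> group. Assuming the default integer index (0, 1, 2, ...),
df.index // 2
maps each consecutive pair of rows to the same group label (0, 0, 1, 1, ...), so grouping on it and calling .mean() gives what you are looking for. Note the integer division: in Python 3, df.index / 2 yields floats (0.0, 0.5, 1.0, ...) and would put every row in its own group.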
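A minimal sketch of the idea, assuming pandas and NumPy are installed. The function name `mean_down` and the sample data are made up for illustration. It builds the group labels from row positions with `np.arange` rather than from `df.index`, so it does not depend on the index being the default 0..n-1, and it drops a trailing odd row the same way the MATLAB function does.

```python
import numpy as np
import pandas as pd


def mean_down(df: pd.DataFrame) -> pd.DataFrame:
    """Average every 2 consecutive rows, column by column (hypothetical helper)."""
    # Trim to an even number of rows, mirroring the MATLAB "make it divisible" step.
    n_even = (len(df) // 2) * 2
    trimmed = df.iloc[:n_even]
    # Positional group labels: 0, 0, 1, 1, 2, 2, ...
    groups = np.arange(n_even) // 2
    return trimmed.groupby(groups).mean()


# Illustrative data: 5 rows, so the last row is dropped.
df = pd.DataFrame({
    "a": [1.0, 3.0, 5.0, 7.0, 9.0],
    "b": [10.0, 20.0, 30.0, 40.0, 50.0],
})
result = mean_down(df)
print(result)
```

With a default integer index and an even number of rows, this reduces to the one-liner `df.groupby(df.index // 2).mean()`.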