Implementation of cross correlation function for pitch frequency detection

For my sound processing project (specifically pitch detection) I need to implement a cross-correlation function, and I'm having trouble with the results. I have 400 frames of 512 samples each, and consecutive frames overlap by 50 percent. I am working from the cross-correlation formula and have tried many ways to implement it correctly, but I couldn't. Here is my latest code:

import operator

import numpy as np


def pitch_detection(self, frame_matrix, frame_number, lag_vector, frequency):
        # Silence NumPy divide-by-zero and invalid-value warnings.
        np.seterr(divide='ignore', invalid='ignore')
        pitch_freq_vector = []
        for frame in range(frame_number):
            ccf = []
            # NumPy negative indices wrap around, so frames 0 and 1 borrow
            # their history from the end of the matrix.
            frame_expand_1 = frame_matrix[frame-1, :]
            frame_expand_2 = frame_matrix[frame-2, :]
            temp_corr_1 = frame_matrix[frame, :]
            # Prepend samples from the two previous frames to the current one.
            temp_corr_2 = np.append(frame_expand_1[256:], temp_corr_1, axis=0)
            temp_corr_2 = np.append(frame_expand_2[192:256], temp_corr_2, axis=0)
            len_tc2 = len(temp_corr_2)
            for lag in lag_vector:  # the pitch lag is the one with the highest correlation
                ccf.append(np.sum(temp_corr_1*temp_corr_2[len_tc2-lag-512:len_tc2-lag]))
            max_index, max_value = max(enumerate(ccf), key=operator.itemgetter(1))
            # Treat the frame as voiced if the peak exceeds 30% of the frame energy.
            if max_value > 0.3*np.sum(np.power(temp_corr_1, 2)):
                pitch_freq_vector.append(max_index)
            else:
                pitch_freq_vector.append(-1)
        return pitch_freq_vector

The problem is that the maximum is always at the last element of ccf, but it should vary between frames. Note that human pitch frequency varies between 50 and 400 Hz; the index into the lag vector is mapped to those frequencies later, and for frames that have no pitch, -1 is appended to the list.
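
As a point of comparison, here is a minimal sketch of a normalized cross-correlation pitch estimate for a single frame. It is not the code from the question: the helper name estimate_pitch, its parameters, and the 16 kHz sample rate used in the usage note are assumptions for illustration. It works on the raw signal instead of stitching overlapping frames together, divides each correlation by the energies of both segments so that no lag is favored just because its segment is louder, and converts the winning lag to a frequency as sample_rate / lag.

```python
import numpy as np


def estimate_pitch(signal, start, frame_len, sample_rate,
                   f_min=50.0, f_max=400.0, threshold=0.3):
    """Return the pitch in Hz of the frame at `start`, or -1.0 if none is found.

    Hypothetical helper: the name, parameters, and defaults are assumptions.
    """
    # Lags (in samples) that correspond to the f_max..f_min pitch range.
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    if start < lag_max or start + frame_len > len(signal):
        return -1.0  # not enough samples before or inside the frame

    frame = signal[start:start + frame_len]
    frame_energy = np.sum(frame ** 2)
    best_lag, best_score = -1, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Segment of the same length, taken `lag` samples earlier.
        delayed = signal[start - lag:start - lag + frame_len]
        norm = np.sqrt(frame_energy * np.sum(delayed ** 2))
        if norm == 0.0:
            continue  # silent segment, correlation undefined
        score = np.sum(frame * delayed) / norm
        # Small margin: a near-tie at a multiple of the period
        # does not replace the shorter lag already found.
        if score > best_score + 1e-9:
            best_lag, best_score = lag, score

    if best_lag < 0 or best_score < threshold:
        return -1.0
    return sample_rate / best_lag
```

For example, with a synthetic 100 Hz sine sampled at an assumed 16 kHz, estimate_pitch(tone, 512, 512, 16000) should pick a lag of 160 samples. The 0.3 threshold mirrors the 30 percent rule in the question, but here it applies to a score bounded by 1 rather than to the raw frame energy.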

The implementation works properly. It should be used with a frame matrix as input. I hope you enjoy it.
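
For anyone who wants to try this, a frame matrix like the one described in the question (512-sample frames, 50 percent overlap) could be built roughly as follows. This is a sketch under assumptions: the helper name build_frame_matrix is hypothetical, and it assumes the frames are plain, unwindowed slices of the signal, which the question does not state.

```python
import numpy as np


def build_frame_matrix(signal, frame_len=512, overlap=0.5):
    """Split a 1-D signal into overlapping frames, one frame per row.

    Hypothetical helper: the name and defaults are assumptions.
    """
    signal = np.asarray(signal)
    hop = int(frame_len * (1.0 - overlap))  # 256 samples for 50% overlap
    if len(signal) < frame_len:
        return np.empty((0, frame_len))
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Row k holds samples [k * hop, k * hop + frame_len).
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]
```

With this layout, the second half of frame k-1 is identical to the first half of frame k, so the samples that precede frame k are in frame_matrix[k-1, :256] rather than frame_matrix[k-1, 256:]. That is worth checking against how the history is assembled before correlating.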
