I've found similar posts about this before, but nothing really answers the question.
In my fingerprinting, I produce a record set of 5 integers. For example: 33,42,88,121,194
These correspond to the frequencies with the highest magnitude for a particular sample of music. E.g., for a 30 ms audio sample, I take the strongest frequency from each of the following buckets:
0-40
40-80
80-120
120-180
180-250
I'm trying to produce a forgiving hash, one that would produce the same hash for 33,42,88,121,194 as it would for, say,
33,43,88,122,195
so that minor differences in the frequencies still yield the same (or a similar) hash.
First off, is this LSH (locality-sensitive hashing)? I have read that it is the best fit for audio fingerprinting.
If not, could anyone provide some pseudocode or C# for a function that might do what I'm looking for? I have read up on LSH and looked at MATLAB and Perl implementations, but I don't understand them, so posting a link to them won't really help me much.
Thanks again!
This might be a duplicate of this: Compare two spectogram to find the offset where they match algorithm. It appears that what you are trying to do is produce a histogram of the rough distribution of the peaks in the sample. There are several methods to do this; another "example" is here: Compare two spectogram to find the offset where they match algorithm
One method of doing this is to use a Fast Fourier Transform of the peak data and its distribution (over time) to produce a rough equivalent of the sample in a distilled form. To do this you do something roughly similar to:
To compare the fingerprints, you run the same process over the second sample and then use a diff algorithm to compare the two, using some "fuzz" to decide how close they are. You will need to compare the fingerprints on two dimensions: the order of the discrete fingerprints, and the overall difference in each sample.
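As a rough illustration of comparing with "fuzz", here is a minimal Python sketch (close to pseudocode). It is a simplification, not a real diff: it assumes both samples start at the same offset and compares them position by position. The names `fingerprints_match` and `similarity` and the default tolerance of 2 are made up for this example.

```python
from typing import Sequence

# One fingerprint = the peak frequencies for one slice of audio,
# e.g. [33, 42, 88, 121, 194].
Fingerprint = Sequence[int]


def fingerprints_match(a: Fingerprint, b: Fingerprint, fuzz: int = 2) -> bool:
    """True when every peak in `a` is within `fuzz` of the peak
    at the same position in `b`."""
    return len(a) == len(b) and all(abs(x - y) <= fuzz for x, y in zip(a, b))


def similarity(sample_a: Sequence[Fingerprint],
               sample_b: Sequence[Fingerprint],
               fuzz: int = 2) -> float:
    """Fraction of positions, taken in order, whose fingerprints match.

    Assumption: the two samples are already aligned in time. Finding the
    offset between them would need a diff or a search over offsets.
    """
    if not sample_a or not sample_b:
        return 0.0
    hits = sum(fingerprints_match(a, b, fuzz)
               for a, b in zip(sample_a, sample_b))
    # Dividing by the longer length penalises samples of unequal length.
    return hits / max(len(sample_a), len(sample_b))


if __name__ == "__main__":
    first = [[33, 42, 88, 121, 194], [10, 50, 90, 130, 200]]
    second = [[33, 43, 88, 122, 195], [10, 70, 90, 130, 200]]
    print(similarity(first, second))
```

The per-fingerprint check covers the "overall difference" dimension and the in-order walk covers the "order" dimension; a threshold on the returned fraction would then decide whether two samples count as the same.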
This article on making a rough Java equivalent of Shazam was posted a while ago: http://www.redcode.nl/blog/2010/06/creating-shazam-in-java/ and may be of some help to you.
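For the "forgiving hash" asked about in the question, one simple approach is to round each peak down to a multiple of a bucket size before hashing, so nearby frequencies collapse to the same value. The sketch below is only an illustration of that idea in Python, not code taken from the article above; the function name `fuzzy_key` and the bucket size of 4 are assumptions chosen so that the question's two examples collide.

```python
def fuzzy_key(peaks, bucket=4):
    """Round every peak down to a multiple of `bucket` and return the
    result as a tuple, which can be used directly as a dictionary key.

    `bucket` is a tuning knob (assumed value): larger buckets forgive
    bigger differences but also make unrelated samples collide more often.
    """
    return tuple(p - (p % bucket) for p in peaks)


if __name__ == "__main__":
    a = fuzzy_key([33, 42, 88, 121, 194])
    b = fuzzy_key([33, 43, 88, 122, 195])
    print(a == b)  # both inputs quantize to (32, 40, 88, 120, 192)

    # Limitation: values on either side of a bucket boundary still differ,
    # e.g. 43 rounds down to 40 but 44 stays at 44.
    c = fuzzy_key([33, 44, 88, 122, 195])
    print(b == c)
```

Because of the boundary effect, this kind of hash works best for quick lookups of candidate matches, with a tolerance-based comparison afterwards to confirm them.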