Recently, my partner and I developed a chord recognition tool for research, using a neural network. As input, we use the output of a pitch class profile (PCP).
There are 12 inputs, one for each pitch class, and 5 output nodes. We train the neural network on input/output pairs such as:
for the chord C major: input: 1 0 0 0 1 0 0 1 0 0 0 0 and output: 1 0 0 0 0.
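To make the encoding concrete, here is a minimal sketch of how such a training pair could be built. It assumes the 12 inputs are in chromatic order starting from c, and that C major is the first of the 5 output classes; the names `PITCH_CLASSES`, `chord_template`, and `one_hot` are made up for illustration and are not from our actual code.

```python
# Assumed ordering of the 12 inputs: chromatic, starting from c.
PITCH_CLASSES = ["c", "c#", "d", "d#", "e", "f", "f#", "g", "g#", "a", "a#", "b"]


def chord_template(notes):
    """Return a 12-element binary vector with 1.0 at each pitch class in `notes`."""
    return [1.0 if name in notes else 0.0 for name in PITCH_CLASSES]


def one_hot(index, size=5):
    """Return a target vector of length `size` with 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# C major consists of c, e, and g; index 0 as its output class is an assumption.
c_major_input = chord_template({"c", "e", "g"})
c_major_target = one_hot(0)

print(c_major_input)   # [1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
print(c_major_target)  # [1.0, 0.0, 0.0, 0.0, 0.0]
```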
When we test it with c major.wav, the input actually produced by the pitch class profile method looks good. The 3 basic notes of C major are more dominant than the other notes, but the values are too small, for example:
c: 0.7123345
c#: 0.00024521
d: 0.00013312
d#: 0.009123
e: 0.445023
f: 0.0535852
f#: 0.000312
g: 0.51023
g#: 0.0002312
a: 0.1034
a#: 0.003122
b: 0.000102
If we check it manually, we can see that c, e, and g are dominant, as they should be, but when we feed it to the neural network, the result is not what we want. What can we do to improve this?
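For reference, the manual check amounts to something like the sketch below. It assumes the twelve measured values are in chromatic order starting from c (the same order as the training input), and `strongest` is a hypothetical helper written only for this post.

```python
# Assumed ordering of the 12 values: chromatic, starting from c.
PITCH_CLASSES = ["c", "c#", "d", "d#", "e", "f", "f#", "g", "g#", "a", "a#", "b"]

# Values measured from c major.wav, copied from the list above.
measured = [
    0.7123345, 0.00024521, 0.00013312, 0.009123,
    0.445023, 0.0535852, 0.000312, 0.51023,
    0.0002312, 0.1034, 0.003122, 0.000102,
]


def strongest(pcp, count=3):
    """Return the indices of the `count` largest values, in ascending index order."""
    ranked = sorted(range(len(pcp)), key=lambda i: pcp[i], reverse=True)
    return sorted(ranked[:count])


top = strongest(measured)
top_names = [PITCH_CLASSES[i] for i in top]
print(top_names)  # ['c', 'e', 'g']

# The same positions hold 1 in the training input, but the measured values are lower.
for i in top:
    print(PITCH_CLASSES[i], measured[i])
```

So the three strongest bins sit at the same positions as the 1s in the training input for C major, only with smaller magnitudes.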