Nothing trumps the feeling of seeing the results you so desire appear on screen. As the end of my internship period approached, the stress started piling up, as nothing seemed to yield the results I was looking for. However, this changed this week: with the use of MFCCs for timbre detection, the prediction accuracy of the model finally reached the level we were aiming for.
At the start of this week, the extracted features covered in last week’s blog were tested on many models: a Naive Bayes classifier, a random forest, an SVM, a softmax regression, and a multilayer perceptron. However, none of them reached an accuracy beyond 30%, which was almost as bad as the roll of a die. The small size of our dataset and the limited number of extracted features were the primary reasons for such results.
Consequently, we opted for a different strategy: using MFCC bands to capture timbre in the musical pieces and training the models on them. MFCC stands for Mel-frequency cepstral coefficients, a way of representing audio signals that mimics the human ear, which does not distinguish well between nearby frequencies in the higher range. To compute them, the fast Fourier transform of the signal is taken, a Mel filter bank is applied, the logarithm is taken, and finally the discrete cosine transform gives the final MFCCs. These were computed for each track and used as features for training the previously mentioned models. This time there was an improvement, but the best accuracy we got was 60%, as the models were over-fitting.
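To make the steps above a bit more concrete, here is a minimal sketch of the MFCC pipeline for a single audio frame, written with NumPy only. It is an illustration rather than the exact code used in the project: the function names, the Hamming window, the 26 Mel filters, the 13 coefficients, and the Hz-to-Mel formula are all assumptions chosen because they are common defaults.

```python
import numpy as np

def hz_to_mel(hz):
    # A widely used Hz -> Mel mapping (assumed here, other variants exist).
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the Mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((n_filters, len(fft_freqs)))
    for i in range(n_filters):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def mfcc_frame(frame, sample_rate, n_filters=26, n_coeffs=13):
    """MFCCs of one frame: FFT -> Mel filter bank -> log -> DCT."""
    n_fft = len(frame)
    windowed = frame * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(windowed)) ** 2            # power spectrum
    mel_energies = mel_filterbank(n_filters, n_fft, sample_rate) @ power
    log_energies = np.log(mel_energies + 1e-10)           # small offset avoids log(0)
    # Unnormalised DCT-II, keeping only the first n_coeffs coefficients.
    n = np.arange(n_filters)
    k = np.arange(n_coeffs)[:, None]
    dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    return dct_basis @ log_energies

# Example: a 440 Hz tone sampled at 22050 Hz, one 512-sample frame.
sample_rate = 22050
t = np.arange(512) / sample_rate
coeffs = mfcc_frame(np.sin(2 * np.pi * 440.0 * t), sample_rate)
```

In practice, a whole track is split into many short frames and the coefficients are computed per frame, then summarised or stacked into a feature vector. An audio library would normally handle this, so the sketch is only meant to show what happens at each stage.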
To solve this problem, the MFCC bands were taken again at different positions in the audio signal to increase the amount of training data and, with it, the performance of the machine learning models. It worked: accuracy rose to around 90%, the best result we have achieved so far.
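The idea of sampling each track at several positions can be sketched as follows. This is a hypothetical helper, not the project's actual code: `crop_segments`, `expand_dataset`, the random choice of offsets, and the fixed seed are all assumptions made for illustration. Each returned segment would then be passed through the MFCC extraction step.

```python
import random

def crop_segments(signal, segment_len, n_segments, seed=0):
    """Return n_segments fixed-length slices taken at random start offsets."""
    if len(signal) < segment_len:
        raise ValueError("signal is shorter than the requested segment length")
    rng = random.Random(seed)
    max_start = len(signal) - segment_len
    starts = sorted(rng.randint(0, max_start) for _ in range(n_segments))
    return [signal[s:s + segment_len] for s in starts]

def expand_dataset(tracks, labels, segment_len, n_segments):
    """Turn each (track, label) pair into n_segments labelled segments."""
    segments, segment_labels = [], []
    for track, label in zip(tracks, labels):
        for segment in crop_segments(track, segment_len, n_segments):
            segments.append(segment)
            segment_labels.append(label)
    return segments, segment_labels
```

One caveat worth keeping in mind with this kind of augmentation: if segments from the same track end up in both the training and the test set, the measured accuracy can look better than it really is, so it is safer to split by track first and crop afterwards.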
My days at Aeste are coming to an end soon, with my internship period wrapping up. I was almost convinced that my work would be left feeling incomplete, so it was nice to see it finally yield results. All the best to the future intern handling the project after me.