I challenge myself to finish an MIR course by the end of the year.
I’m a graduate student at Keio University’s Graduate School of Media and Governance, and I research music science for cross-cultural samples. Most of the research I have done is quite basic, and I haven’t really designed or built my own MIR algorithm. In 2018 I got to present some preliminary perceptual experiments where we asked three raters to rate a bunch of pairs of songs by similarity and compared their ratings with automated judgements of similarity. Here’s a poster of what I made. I got to present this research at a late-breaking demo session at ISMIR 2018 in Paris.
I learnt a lot from going to ISMIR for the first time, but I also felt a tremendous lack of knowledge and understanding. I could barely grasp half of the words people were passing around casually, like Recurrent Neural Network, Convolutional Neural Network, Power Spectrum Analysis, etc. This was all alien to me and I felt like an imposter eating the free food and enjoying the music. But I also took it on myself to make as many friends as I could. I had a pack of business cards with me and shamelessly advertised our lab that had literally just been formed. In the end I actually ended up getting to know a lot of cool people, some of whom I still talk to regularly today.
When I went back I got busy with college life and forgot about MIR for a good few months. That was until ISMIR 2019 submissions were due. I pulled together some last-minute shoddy research work, and by that point all of our co-authors had pulled out because our research was, quite frankly, crap. A genius joined my project, and that year he went to present an LBD at ISMIR 2019 instead of me. Toward the end of that year my professor suggested that I go to the Centre for Digital Music (C4DM) at Queen Mary University of London. There I studied under the gracious and legendary MIR god Emmanouil Benetos. He gave me all the resources I needed to familiarize myself with MIR features, and some open source toolboxes + VAMP plugins for Sonic Visualiser.
Here is a list of all the resources he suggested:
1) Instead of a book and material on DSP, given your project I would suggest a book and material on music signal processing instead, which is a bit more focused on what you are working on. Given this, I would definitely recommend that you take a look at the "Fundamentals of Music Processing" book by Meinard Muller. The book can also be found in the QMUL library. The book has also an excellent companion piece, called the "FMP notebooks", which are Jupyter notebooks with various implementations of MIR methods and features: https://www.audiolabs-erlangen.de/resources/MIR/FMP/C0/C0.html
A similar set of python notebooks for MIR can be found at: https://musicinformationretrieval.com/
2) The website with all soundscape analysis datasets and methods is: http://dcase.community/
3) You can very easily use off-the-shelf VAMP plugins, which were designed for both Sonic Visualiser (for GUI-based analysis) and Sonic Annotator (for batch processing and batch feature extraction): https://www.vamp-plugins.org/. In addition to that, you might want to take a look at the various code repositories that exist under the SoundSoftware website hosted by C4DM: https://code.soundsoftware.ac.uk/
And also the C4DM github repo: https://github.com/c4dm
After that he told me:
Following feature extraction, a next step would be on training machine learning models. Since your participant ratings do not refer to labels (used in supervised classification), but refer to similarity, the next step would be to start looking into metric learning methods.
A good starting point would be to go through the documentation for metric learning in scikit-learn:
In particular, you might want to put more attention to methods for weakly supervised metric learning, which appears to fit your problem statement well.
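To make that advice a bit more concrete for myself, here is a minimal sketch of what weakly supervised metric learning means: instead of class labels, the learner only sees pairs marked "similar" or "not similar" and adjusts a distance function to agree with them. This is not what we used in our paper and it is not the API of any library. The toy features, the function names (`weighted_sq_dist`, `fit_diagonal_metric`), the contrastive-style loss, and the hyperparameters are all assumptions I picked for illustration, and it only learns one non-negative weight per feature rather than a full Mahalanobis matrix.

```python
# Toy sketch: learn a per-feature (diagonal) distance metric from pair labels.
# All data, names, and hyperparameters below are illustrative assumptions.

def weighted_sq_dist(w, x, y):
    """Squared distance between x and y with one weight per feature."""
    return sum(wk * (xk - yk) ** 2 for wk, xk, yk in zip(w, x, y))

def fit_diagonal_metric(pairs, labels, margin=1.0, lr=0.05, epochs=200):
    """Projected gradient descent on a contrastive-style loss.

    Similar pairs are pulled together (their distance is the loss);
    dissimilar pairs are pushed apart until their squared distance
    reaches `margin`. Weights are clipped at zero after every step.
    """
    n_features = len(pairs[0][0])
    w = [1.0] * n_features
    for _ in range(epochs):
        grad = [0.0] * n_features
        for (x, y), similar in zip(pairs, labels):
            sq_diffs = [(xk - yk) ** 2 for xk, yk in zip(x, y)]
            if similar:
                # Gradient of the distance itself w.r.t. each weight.
                grad = [g + d for g, d in zip(grad, sq_diffs)]
            elif weighted_sq_dist(w, x, y) < margin:
                # Hinge is active: increase the distance of this pair.
                grad = [g - d for g, d in zip(grad, sq_diffs)]
        w = [max(0.0, wk - lr * gk) for wk, gk in zip(w, grad)]
    return w

# Hypothetical 2-D "song features": feature 0 tracks perceived similarity,
# feature 1 is noise that the raters ignored.
pairs = [
    ([0.0, 0.0], [0.1, 1.0]),   # rated similar
    ([1.0, 0.5], [1.1, -0.5]),  # rated similar
    ([0.5, 1.0], [0.4, 0.0]),   # rated similar
    ([0.0, 0.0], [1.0, 0.1]),   # rated dissimilar
    ([1.0, 0.5], [0.0, 0.4]),   # rated dissimilar
    ([0.5, 1.0], [1.5, 0.9]),   # rated dissimilar
]
labels = [True, True, True, False, False, False]

weights = fit_diagonal_metric(pairs, labels)
print("learned feature weights:", weights)
```

On this toy data the learner should end up trusting feature 0 much more than feature 1, which is the whole point: the metric is shaped by the ratings rather than chosen in advance. Real rater data would be noisier, and a dedicated library would be the better choice for anything beyond a sanity check.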
When I got back I tried to finish up some metric learning stuff for our paper for ISMIR (which got a pretty close rejection, meaning that it was on the border of being accepted). The metric learning didn’t make sense for our paper and I didn’t really look closely at MIR again for another six months. Now that ISMIR 2020 is over, I have a newfound resolve to start getting good at MIR again.
Here is my study plan:
“Information Retrieval for Music and Audio Data” by Meinard Muller
“Fundamentals of Music Processing” book by Meinard Muller
“Music Similarity and Retrieval” book by Peter Knees and Markus Schedl
“Audio Signal Processing for Music Applications” online class by Xavier Serra
“Hands-On Machine Learning with Scikit-Learn and TensorFlow” by Aurélien Géron
Well, that’s really more of a list of books + 1 online class, but I plan on getting through them one by one and posting about it on Substack/my blog. If you’re interested, you can subscribe, but I’m not really asking you to. This post was about my motivation to do this project. Anyway - Ciao!