May 2, 2022
Music emotion recognition method based on multifeature fusion
Software that can correlate musical changes in an audio recording of a song with perceived emotional content would be useful across the music industry, particularly for cataloging music and developing music recommendation systems for streaming services and sales. The same approach might also have utility in musical composition and music teaching as well as in music-based therapy. Research in the International Journal of Arts and Technology recognizes that there are numerous limitations in the current software and points to how such software might be improved.
Yali Zhang of the School of Music at Henan Polytechnic in Zhengzhou, China, explains how earlier research has focused on training a probabilistic neural network to recognize the nuance of a piece of music and correlate it with the likely emotional responses intended by the composer. However, such work has large error margins, which Zhang hopes to reduce with her new approach to music emotion recognition. Zhang's approach involves processing the music signal to suppress a proportion of the low-frequency information that is not necessarily a part of the music's emotional content. Her approach also splits the sound signal into frames and then applies a window function to each frame so that the frames can be processed by the emotion recognition software. In addition, noise is reduced by time-domain endpoint detection, she adds.
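The pre-processing steps described above can be sketched in a few lines of Python. This is only an illustration and not Zhang's implementation: it assumes that the low-frequency step resembles a standard first-order pre-emphasis filter, that the window is a Hamming window, and that endpoint detection is a simple short-time energy threshold. The function names, the filter coefficient (0.97), and the threshold ratio (0.1) are all hypothetical choices made for the example.

```python
import math

def pre_emphasis(signal, alpha=0.97):
    """Attenuate low frequencies: y[n] = x[n] - alpha * x[n-1] (assumed filter)."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]

def frame_signal(signal, frame_len, hop):
    """Split the signal into overlapping frames, dropping an incomplete tail."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def hamming(n):
    """Hamming window of length n (assumed window type)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]

def apply_window(frames):
    """Multiply every frame, sample by sample, by the window function."""
    if not frames:
        return []
    window = hamming(len(frames[0]))
    return [[s * w for s, w in zip(frame, window)] for frame in frames]

def detect_endpoints(frames, ratio=0.1):
    """Return (first, last) indices of frames whose short-time energy
    exceeds ratio * max energy, or None if every frame is silent."""
    energies = [sum(s * s for s in frame) for frame in frames]
    if not energies or max(energies) == 0:
        return None
    threshold = ratio * max(energies)
    active = [i for i, e in enumerate(energies) if e > threshold]
    return active[0], active[-1]

def preprocess(signal, frame_len=200, hop=100):
    """Run the three steps and keep only the frames between the endpoints."""
    frames = apply_window(frame_signal(pre_emphasis(signal), frame_len, hop))
    endpoints = detect_endpoints(frames)
    if endpoints is None:
        return []
    first, last = endpoints
    return frames[first:last + 1]
```

Calling `preprocess(samples)` on a list of audio samples returns the windowed frames that lie between the detected start and end of the sound, which is the form a downstream recognizer would consume under these assumptions.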
Once the sound file has been pre-processed, recognition can begin. This involves analyzing pitch changes, the rise and fall of tone, and the rate at which those changes occur. Zhang explains that a "weight coefficient" of musical emotion can thus be extracted from a sound file. The characteristics extracted from known sound files with human-described emotive content can then be used to train the system so that it can automatically recognize the emotive content in a previously uncategorized piece of music. The approach considerably reduces the error margins seen in earlier work, making the categorization of musical emotive content much more accurate.
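To give a sense of what analyzing pitch and its rate of change can look like in practice, the following sketch estimates a pitch per frame and then the frame-to-frame change. It is an assumption-laden illustration, not the method from the paper: the autocorrelation pitch estimator, the 80 to 1000 Hz search range, and the function names are hypothetical, and the article does not say how the "weight coefficient" itself is computed, so that step is not shown.

```python
def estimate_pitch(frame, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate the fundamental frequency in Hz from the strongest
    autocorrelation peak within the assumed [fmin, fmax] range.
    Returns 0.0 when no positive peak is found (e.g. silence)."""
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(len(frame) - 1, int(sample_rate / fmin))
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0

def pitch_change_rate(pitches, hop_seconds):
    """Frame-to-frame pitch change, in Hz per second."""
    return [(b - a) / hop_seconds for a, b in zip(pitches, pitches[1:])]
```

Features such as the pitch contour and its rate of change, computed for recordings that people have already labeled with an emotion, are the kind of input that could then be used to train a classifier.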