Detecting Music-Induced Emotion Based on Acoustic Analysis and Physiological Sensing: A Multimodal Approach

Abstract

The subjectivity of listeners’ emotional responses to music is at the crux of optimizing emotion-aware music recommendation. To address this challenge, we constructed a new multimodal dataset (“HKU956”) with aligned peripheral physiological signals (i.e., heart rate, skin conductance, blood volume pulse, skin temperature) and self-reported emotion collected from 30 participants, as well as original audio of 956 music pieces listened to by the participants. A comprehensive set of features was extracted from physiological signals using methods in physiological computing. This study then compared performances of three feature sets (i.e., acoustic, physiological, and combined) on the task of classifying music-induced emotion. Moreover, the classifiers were also trained on subgroups of users with different Big-Five personality traits for further customized modeling. The results reveal that (1) physiological features contribute to improving performance on valence classification with statistical significance; (2) classification models built for users in different personality groups could sometimes further improve arousal prediction; and (3) the multimodal classifier outperformed single-modality ones on valence classification for most user groups. This study contributes to designing music retrieval systems which incorporate user physiological data and model listeners’ emotional responses to music in a customized manner.

Publication
In Applied Sciences
Fanjie Li
Fanjie Li
PhD student

Designer / explorer / educational data scientist