Multi-modal Analysis of Music: A Large-Scale Evaluation
Multimedia data by definition comprises several different types of content. Music in particular has audio at its core, but also includes text in the form of lyrics, images in the form of album covers, and video in the form of music videos. Yet many Music Information Retrieval applications utilise only the audio content. A few recent studies have, however, shown the usefulness of incorporating other modalities as well; most of these studies employed textual information in the form of song lyrics or artist biographies. Following this direction, the contribution of this paper is a large-scale evaluation of the combination of audio and text (lyrics) features for genre classification, on a database comprising over 20,000 songs. We briefly present the audio and lyrics features employed, and provide an in-depth discussion of the experimental results.