Predicting how popular a song will become is difficult, but a reliable method would benefit artists, producers, and streaming platforms. Most research in this area has concentrated on audio features, social metadata, or the architecture of the models themselves. A new study by Yash Choudhary, Preeti Rao, and Pushpak Bhattacharyya examines an often-overlooked factor: the lyrics.
The researchers developed an automated pipeline that uses a language model to extract high-dimensional lyric embeddings, which capture the semantics, syntax, and sequential structure of the lyrics. These dense representations feed into a multimodal architecture called HitMusicLyricNet, which combines audio, lyrics, and social metadata to predict a popularity score on a scale of 0 to 100.
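To make the idea of combining modalities concrete, here is a minimal sketch of fusion by concatenation, assuming each track is already represented by three fixed-length feature vectors. It is not the authors' HitMusicLyricNet: the feature sizes, the single hidden layer, the random untrained weights, and the names `init_params` and `predict_popularity` are all hypothetical and chosen only for illustration.

```python
import math
import random

# Hypothetical feature sizes; the article does not give any of them.
AUDIO_DIM, LYRIC_DIM, META_DIM, HIDDEN_DIM = 4, 8, 3, 5


def init_params(seed=0):
    """Random, untrained weights for a one-hidden-layer fusion network."""
    rnd = random.Random(seed)
    in_dim = AUDIO_DIM + LYRIC_DIM + META_DIM
    return {
        "W1": [[rnd.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(HIDDEN_DIM)],
        "b1": [0.0] * HIDDEN_DIM,
        "w2": [rnd.gauss(0.0, 0.1) for _ in range(HIDDEN_DIM)],
        "b2": 0.0,
    }


def predict_popularity(audio, lyrics, meta, params):
    """Concatenate the three feature vectors and map them to a 0-100 score."""
    x = list(audio) + list(lyrics) + list(meta)  # fusion by concatenation
    hidden = [
        max(0.0, sum(w * v for w, v in zip(row, x)) + b)  # ReLU unit
        for row, b in zip(params["W1"], params["b1"])
    ]
    logit = sum(w * h for w, h in zip(params["w2"], hidden)) + params["b2"]
    return 100.0 / (1.0 + math.exp(-logit))  # sigmoid scaled to the 0-100 range


params = init_params()
rnd = random.Random(1)
audio = [rnd.random() for _ in range(AUDIO_DIM)]
lyrics = [rnd.random() for _ in range(LYRIC_DIM)]  # stand-in for a lyric embedding
meta = [rnd.random() for _ in range(META_DIM)]
score = predict_popularity(audio, lyrics, meta, params)
print(f"predicted popularity: {score:.1f}")
```

Because the weights are untrained, the printed score is meaningless as a prediction. The sketch only shows how three separate feature sources can be joined into one input and squashed into the 0 to 100 range.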
On the SpotGenTrack dataset, which contains over 100,000 tracks, HitMusicLyricNet outperformed existing baselines. The researchers reported a 9% improvement in Mean Absolute Error (MAE) and a 20% improvement in Mean Squared Error (MSE); for both metrics, lower values are better. Ablation studies attributed most of these gains to the language-model-driven lyrics feature pipeline, known as LyricsAENet, which underscores the value of dense lyric representations for predicting music popularity.
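For readers unfamiliar with the two metrics, the sketch below shows how MAE, MSE, and a percentage improvement over a baseline are typically computed. The popularity values and predictions are toy numbers invented for this example, not data from the study, and the helper name `relative_improvement` is hypothetical.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared difference; large errors are penalised more heavily."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_improvement(baseline_error, model_error):
    """Percentage reduction in error relative to the baseline."""
    return 100.0 * (baseline_error - model_error) / baseline_error


# Toy popularity scores on a 0-100 scale (invented for illustration).
y_true = [50.0, 20.0, 80.0, 35.0]
baseline_pred = [60.0, 10.0, 70.0, 45.0]  # every prediction is off by 10
model_pred = [58.0, 12.0, 72.0, 43.0]     # every prediction is off by 8

mae_gain = relative_improvement(
    mean_absolute_error(y_true, baseline_pred),
    mean_absolute_error(y_true, model_pred),
)
mse_gain = relative_improvement(
    mean_squared_error(y_true, baseline_pred),
    mean_squared_error(y_true, model_pred),
)
print(f"MAE improvement: {mae_gain:.1f}%")
print(f"MSE improvement: {mse_gain:.1f}%")
```

In this toy case the same predictions yield a larger percentage gain in MSE than in MAE, because squaring amplifies the effect of shrinking each error.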
The research has practical implications. Artists and producers may gain a new tool for crafting songs that resonate with audiences, and streaming platforms could use the approach to curate playlists and recommendations more effectively. As the music industry continues to evolve in the digital age, understanding the influence of lyrics could become a significant advantage.



