Genre and Temporal Dynamics in Spotify Popularity Prediction

Author

Listed:
  • Hausner, Ryan

Abstract

The rise of music streaming platforms has opened up access to track-level data, enabling quantitative analysis of song popularity. While existing literature has explored audio features as predictors of popularity, less attention has been given to the combined role of genre and temporal dynamics; this paper addresses that gap. Using a public dataset of 114,000 tracks from 2000 to 2022, we apply a data science framework combining iterative OLS regression, interaction modeling, random forest, and rolling coefficient analysis to explore the predictive power of five Spotify audio characteristics (loudness, danceability, energy, liveness, and valence) together with genre and release year. Four iterative OLS regression models are developed using an 80/20 train/test split, showing that genre accounts for the largest gain in explained variation, increasing R-squared from 0.042 to 0.434. A genre-year interaction model further improves R-squared to 0.641, and a partial F-test confirms that the interaction terms are jointly significant (F(80, 318243) = 653.72, p < .001), implying that the effect of genre on popularity varies across time: different genres rise and fall in prevalence at different periods. A random forest model confirms these findings, ranking genre and year well above the audio features in impurity-based feature importance. The most accurate model achieves RMSE = 9.64 on the 0-100 popularity scale, with remaining variance likely attributable to unmeasured factors such as Spotify playlist algorithms and social media exposure. Rolling coefficient analysis further reveals that the effects of audio features are unstable over time: energy's contribution to popularity turned strongly negative after 2010, while danceability's peaked around 2015-2016. This suggests that the streaming era has fundamentally reshaped which acoustic properties drive popularity.
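
The partial F-test mentioned in the abstract compares a reduced model (main effects only) against a full model that adds genre-year interaction terms. The Python sketch below illustrates that comparison on synthetic data. It is not the author's code: the two-genre setup, the coefficients, the noise level, and the helper names `solve`, `ols_rss`, and `partial_f` are all assumptions made for illustration.

```python
import random


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ols_rss(X, y):
    """Fit OLS through the normal equations and return the residual sum of squares."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    return sum(
        (yi - sum(b * v for b, v in zip(beta, row))) ** 2 for row, yi in zip(X, y)
    )


def partial_f(rss_reduced, rss_full, q, df_resid):
    """Partial F statistic for q extra terms; df_resid is n minus the full model's parameter count."""
    return ((rss_reduced - rss_full) / q) / (rss_full / df_resid)


# Synthetic data (assumption): two genres, a centered "year" variable, and a
# popularity score whose year slope differs between the two genres.
rng = random.Random(0)
n = 400
genre = [i % 2 for i in range(n)]
year = [rng.uniform(-1.0, 1.0) for _ in range(n)]
popularity = [
    50 + 5 * g + 2 * t + 8 * g * t + rng.gauss(0, 3) for g, t in zip(genre, year)
]

# Reduced model: intercept + genre + year. Full model adds the genre*year term.
X_reduced = [[1.0, g, t] for g, t in zip(genre, year)]
X_full = [[1.0, g, t, g * t] for g, t in zip(genre, year)]

rss_reduced = ols_rss(X_reduced, popularity)
rss_full = ols_rss(X_full, popularity)
f_stat = partial_f(rss_reduced, rss_full, q=1, df_resid=n - 4)
print(f"partial F = {f_stat:.2f} on (1, {n - 4}) degrees of freedom")
```

Here the full model adds a single interaction term, so `q=1`. In a setting with many genre dummies, `q` would be the number of added interaction columns, and `df_resid` would be the sample size minus the number of parameters in the full model.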

Suggested Citation

  • Hausner, Ryan, 2026. "Genre and Temporal Dynamics in Spotify Popularity Prediction," SocArXiv vba8f_v1, Center for Open Science.
  • Handle: RePEc:osf:socarx:vba8f_v1
    DOI: 10.31219/osf.io/vba8f_v1

    Download full text from publisher

    File URL: https://osf.io/download/69cc55c860c88fd37251f569/
    Download Restriction: no

    References listed on IDEAS

    1. Wright, Marvin N. & Ziegler, Andreas, 2017. "ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 77(i01).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Shelby C. McClelland & Deborah Bossio & Doria R. Gordon & Johannes Lehmann & Matthew N. Hayek & Stephen M. Ogle & Jonathan Sanderman & Stephen A. Wood & Yi Yang & Dominic Woolf, 2025. "Managing for climate and production goals on crop-lands," Nature Climate Change, Nature, vol. 15(6), pages 642-649, June.
    2. Backer, David & Billing, Trey, 2024. "Forecasting the prevalence of child acute malnutrition using environmental and conflict conditions as leading indicators," World Development, Elsevier, vol. 176(C).
    3. David Mouillot & Laure Velez & Camille Albouy & Nicolas Casajus & Joachim Claudet & Vincent Delbar & Rodolphe Devillers & Tom B. Letessier & Nicolas Loiseau & Stéphanie Manel & Laura Mannocci & Jessic, 2024. "The socioeconomic and environmental niche of protected areas reveals global conservation gaps and opportunities," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    4. Luis A Barboza & Shu-Wei Chou-Chen & Paola Vásquez & Yury E García & Juan G Calvo & Hugo G Hidalgo & Fabio Sanchez, 2023. "Assessing dengue fever risk in Costa Rica by using climate variables and machine learning techniques," PLOS Neglected Tropical Diseases, Public Library of Science, vol. 17(1), pages 1-13, January.
    5. Blum, Ricardo & Hiabu, Munir & Mammen, Enno & Meyer, Joseph T., 2025. "Pure interaction effects unseen by Random Forests," Computational Statistics & Data Analysis, Elsevier, vol. 212(C).
    6. Mariana Oliveira & Luís Torgo & Vítor Santos Costa, 2021. "Evaluation Procedures for Forecasting with Spatiotemporal Data," Mathematics, MDPI, vol. 9(6), pages 1-27, March.
    7. Massimo Bourquin & Hannes Peter & Grégoire Michoud & Susheel Bhanu Busi & Tyler J. Kohler & Andrew L. Robison & Mike Styllas & Leïla Ezzat & Aileen U. Geers & Matthias Huss & Stilianos Fodelianakis & , 2025. "Predicting climate-change impacts on the global glacier-fed stream microbiome," Nature Communications, Nature, vol. 16(1), pages 1-12, December.
    8. Frink, Nicolas & Schmid, Timo, 2025. "Small area prediction of counts under machine learning-type mixed models," Computational Statistics & Data Analysis, Elsevier, vol. 211(C).
    9. Yuanyuan Shi & Junyu Zhao & Xianchong Song & Zuoyu Qin & Lichao Wu & Huili Wang & Jian Tang, 2021. "Hyperspectral band selection and modeling of soil organic matter content in a forest using the Ranger algorithm," PLOS ONE, Public Library of Science, vol. 16(6), pages 1-15, June.
    10. Marcela Mendoza-Suárez & Turgut Yigit Akyol & Marcin Nadzieja & Stig U. Andersen, 2024. "Increased diversity of beneficial rhizobia enhances faba bean growth," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    11. Andreas D. Meid & Lucas Wirbka, 2022. "Can Machine Learning from Real-World Data Support Drug Treatment Decisions? A Prediction Modeling Case for Direct Oral Anticoagulants," Medical Decision Making, vol. 42(5), pages 587-598, July.
    12. Bokelmann, Björn & Lessmann, Stefan, 2024. "Improving uplift model evaluation on randomized controlled trial data," European Journal of Operational Research, Elsevier, vol. 313(2), pages 691-707.
    13. Joel Podgorski & Oliver Kracht & Luis Araguas-Araguas & Stefan Terzer-Wassmuth & Jodie Miller & Ralf Straub & Rolf Kipfer & Michael Berg, 2024. "Groundwater vulnerability to pollution in Africa’s Sahel region," Nature Sustainability, Nature, vol. 7(5), pages 558-567, May.
    14. Heinisch, Katja & Scaramella, Fabio & Schult, Christoph, 2025. "Assumption errors and forecast accuracy: A partial linear instrumental variable and double machine learning approach," IWH Discussion Papers 6/2025, Halle Institute for Economic Research (IWH).
    15. Christopher Weyant & Serin Lee & Jeremy D. Goldhaber-Fiebert, 2026. "Reinforcement Learning-Based Control of Epidemics on Networks of Communities and Correctional Facilities," Medical Decision Making, vol. 46(2), pages 216-225, February.
    16. Nayiri Galestian Pour & Soudabeh Shemehsavar, 2024. "Learning from high dimensional data based on weighted feature importance in decision tree ensembles," Computational Statistics, Springer, vol. 39(1), pages 313-342, February.
    17. Bazyli Czyżewski & Jakub Staniszewski & Joanna Staniszewska & Marta Guth, 2025. "Does Increasing Agricultural Efficiency Contribute to Food Security—Trade‐Offs of Value Addition in Crop Production?," Sustainable Development, John Wiley & Sons, Ltd., vol. 33(S1), pages 939-970, November.
    18. Enrica, Garau & Josep, Pueyo-Ros & Amanda, Jiménez-Aceituno & Garry, Peterson & Albert, Norström & Anna, Ribas Palom & Josep, Vila-Subirós, 2023. "Landscape features shape people’s perception of ecosystem service supply areas," Ecosystem Services, Elsevier, vol. 64(C).
    19. Kouame, Anselme K.K. & Bindraban, Prem S. & Kissiedu, Isaac N. & Atakora, Williams K. & El Mejahed, Khalil, 2023. "Identifying drivers for variability in maize (Zea mays L.) yield in Ghana: A meta-regression approach," Agricultural Systems, Elsevier, vol. 209(C).
    20. Arjan S. Gosal & Janine A. McMahon & Katharine M. Bowgen & Catherine H. Hoppe & Guy Ziv, 2021. "Identifying and Mapping Groups of Protected Area Visitors by Environmental Awareness," Land, MDPI, vol. 10(6), pages 1-14, May.
