Printed from https://ideas.repec.org/h/spr/advbcp/978-94-6463-652-9_69.html

Unlocking Stock Return Predictions: Using Financial Statements with Random Forest and PCA

In: Proceedings of the International Workshop on Navigating the Digital Business Frontier for Sustainable Financial Innovation (ICDEBA 2024)

Author

Listed:
  • Yinan Jin

    (Beijing University of Technology, College of Computer Science)

Abstract

Financial statements are pivotal for forecasting the future performance of stocks. Using the random forest machine learning model, this study aims to improve the prediction of quarterly stock returns from twelve critical financial indicators. The paper applies Principal Component Analysis (PCA) for dimensionality reduction and feature selection in order to optimize the model's predictive accuracy. The dataset comprises quarterly financial statements and stock data for the 100 constituent stocks of the NASDAQ 100 index from 2010 to 2020. The PCA results show that reducing the input features to six dimensions significantly improved the model's predictive performance, as measured by Mean Squared Error (MSE) and Mean Absolute Error (MAE). This finding suggests that retaining too many components introduces unnecessary complexity and can detract from the model's predictive capabilities. The feature importance assessment, conducted using the random forest algorithm, identified Volatility, Revenue Growth Rate, and Return as the most influential predictors. Notably, the optimal predictive performance was achieved with the top seven and top five features, respectively, highlighting the non-linear relationship between the number of features and model performance. The study underscores the utility of the random forest model in predicting stock returns and the role of dimensionality reduction and feature selection in improving predictive accuracy.
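
The workflow the abstract describes (PCA on twelve indicators, a random forest regressor, comparison by MSE and MAE, then a feature importance ranking) might be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: it assumes scikit-learn is available, the data are synthetic, the indicator names are hypothetical placeholders, and the candidate component counts and the 50-tree forest are arbitrary choices. A random train/test split is used for brevity; with real quarterly data a chronological split would likely be more appropriate.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_synthetic_panel(n_rows=400, n_features=12, seed=0):
    """Synthetic stand-in for (indicators, next-quarter return); NOT the paper's data."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_rows, n_features))
    # Assumed toy relationship with one interaction term plus noise.
    y = (0.5 * X[:, 0] - 0.3 * X[:, 1] + 0.2 * X[:, 0] * X[:, 2]
         + rng.normal(scale=0.5, size=n_rows))
    return X, y


def evaluate_components(X, y, component_counts, seed=0):
    """Fit scaler -> PCA(k) -> random forest for each k; return {k: (MSE, MAE)}."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    results = {}
    for k in component_counts:
        model = make_pipeline(
            StandardScaler(),
            PCA(n_components=k),
            RandomForestRegressor(n_estimators=50, random_state=seed),
        )
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        results[k] = (
            mean_squared_error(y_test, pred),
            mean_absolute_error(y_test, pred),
        )
    return results


def rank_features(X, y, names, seed=0):
    """Rank the original (un-projected) features by impurity-based importance."""
    forest = RandomForestRegressor(n_estimators=50, random_state=seed)
    forest.fit(X, y)
    order = np.argsort(forest.feature_importances_)[::-1]
    return [names[i] for i in order]


X, y = make_synthetic_panel()
names = [f"indicator_{i}" for i in range(X.shape[1])]  # hypothetical names

scores = evaluate_components(X, y, component_counts=[2, 4, 6, 8, 10, 12])
for k, (mse, mae) in scores.items():
    print(f"components={k:2d}  MSE={mse:.4f}  MAE={mae:.4f}")

ranking = rank_features(X, y, names)
print("top features:", ranking[:5])
```

Comparing the printed errors across component counts mirrors the kind of selection the abstract reports; the numbers from this synthetic run say nothing about the paper's actual results.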

Suggested Citation

  • Yinan Jin, 2025. "Unlocking Stock Return Predictions: Using Financial Statements with Random Forest and PCA," Advances in Economics, Business and Management Research, in: Junfeng Lu (ed.), Proceedings of the International Workshop on Navigating the Digital Business Frontier for Sustainable Financial Innovation (ICDEBA 2024), pages 664-673, Springer.
  • Handle: RePEc:spr:advbcp:978-94-6463-652-9_69
    DOI: 10.2991/978-94-6463-652-9_69