
Optimizing Financial Data Analysis: A Comparative Study of Preprocessing Techniques for Regression Modeling of Apple Inc.'s Net Income and Stock Prices

Author

Listed:
  • Kevin Ungar
  • Camelia Oprean-Stan

Abstract

This article presents a methodology for processing financial datasets of Apple Inc., covering quarterly income and daily stock prices from March 31, 2009, to December 31, 2023. Using 60 observations of quarterly income from Macrotrends and 3774 observations of daily stock prices from Yahoo Finance, the study constructs five distinct datasets through different preprocessing techniques. It explains in detail how aggregation, interpolation (linear, polynomial, and cubic spline), and lagged-variable methods transform the raw data into datasets suitable for analysis. The article then turns to regression analysis to determine which of the five data processing methods best suits capital market analysis: linear and polynomial regression models are fitted to each preprocessed dataset and evaluated by cross-validation score, MSE, MAE, RMSE, R-squared, and Adjusted R-squared. The findings show that linear interpolation combined with polynomial regression performs best, with the lowest validation MSE and MAE and the highest R-squared and Adjusted R-squared.
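The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic stand-ins (not Apple's figures), the grid of 63 trading days per quarter is an assumption, and a single hold-out split replaces full cross-validation to keep the example short. The helper names interpolate_linear and regression_metrics are hypothetical. The sketch brings a quarterly series to daily frequency by linear interpolation, fits linear and degree-2 polynomial regressions, and reports the metrics named in the abstract.

```python
import numpy as np


def interpolate_linear(t_known, y_known, t_query):
    """Piecewise-linear interpolation of low-frequency values onto a finer grid."""
    return np.interp(t_query, t_known, y_known)


def regression_metrics(y_true, y_pred, n_predictors):
    """MSE, MAE, RMSE, R-squared and Adjusted R-squared for one set of predictions."""
    residuals = y_true - y_pred
    n = y_true.size
    mse = float(np.mean(residuals ** 2))
    mae = float(np.mean(np.abs(residuals)))
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    return {"MSE": mse, "MAE": mae, "RMSE": mse ** 0.5, "R2": r2, "Adj_R2": adj_r2}


# Synthetic stand-in data (NOT Apple's figures): 60 quarterly values on an
# assumed grid of 63 trading days per quarter.
rng = np.random.default_rng(0)
quarter_days = np.arange(60) * 63.0
quarterly_income = 5.0 + 0.4 * np.arange(60) + rng.normal(0.0, 1.0, 60)

# Step 1: bring quarterly income to daily frequency by linear interpolation.
trading_days = np.arange(quarter_days[-1] + 1)
daily_income = interpolate_linear(quarter_days, quarterly_income, trading_days)

# Hypothetical daily price with a curved dependence on income plus noise.
daily_price = 2.0 + 0.05 * daily_income ** 2 + rng.normal(0.0, 0.5, trading_days.size)

# Step 2: hold out 20% of the days for validation (one random split, used
# here in place of full cross-validation).
order = rng.permutation(trading_days.size)
n_val = trading_days.size // 5
val_idx, train_idx = order[:n_val], order[n_val:]

# Step 3: fit degree-1 (linear) and degree-2 (polynomial) regressions and
# score each on the held-out days.
results = {}
for degree in (1, 2):
    model = np.polynomial.Polynomial.fit(
        daily_income[train_idx], daily_price[train_idx], degree
    )
    predictions = model(daily_income[val_idx])
    results[degree] = regression_metrics(
        daily_price[val_idx], predictions, n_predictors=degree
    )

for degree, metrics in results.items():
    print(degree, {name: round(value, 4) for name, value in metrics.items()})
```

Because the synthetic price is generated with a quadratic dependence on income, the degree-2 fit scores better here by construction; the sketch shows only the mechanics and says nothing about the paper's empirical results.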

Suggested Citation

  • Kevin Ungar & Camelia Oprean-Stan, 2025. "Optimizing Financial Data Analysis: A Comparative Study of Preprocessing Techniques for Regression Modeling of Apple Inc.'s Net Income and Stock Prices," Papers 2501.06587, arXiv.org.
  • Handle: RePEc:arx:papers:2501.06587

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2501.06587
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Fried, Harold O. & Lovell, C. A. Knox & Schmidt, Shelton S. (ed.), 2008. "The Measurement of Productive Efficiency and Productivity Growth," OUP Catalogue, Oxford University Press, number 9780195183528, December.
    2. Andreas Lanz & Gregor Reich & Ole Wilms, 2022. "Adaptive grids for the estimation of dynamic models," Quantitative Marketing and Economics (QME), Springer, vol. 20(2), pages 179-238, June.
    3. Gründler, Klaus & Krieger, Tommy, 2022. "Should we care (more) about data aggregation?," European Economic Review, Elsevier, vol. 142(C).
    4. Jianhong Guo & Che-Jung Chang & Yingyi Huang & Xiaotian Zhang & Andrea Murari, 2022. "An Aggregating Prediction Model for Management Decision Analysis," Complexity, Hindawi, vol. 2022, pages 1-7, May.
    5. Esra’a Alshdaifat & Doa’a Alshdaifat & Ayoub Alsarhan & Fairouz Hussein & Subhieh Moh’d Faraj S. El-Salhi, 2021. "The Effect of Preprocessing Techniques, Applied to Numeric Features, on Classification Algorithms’ Performance," Data, MDPI, vol. 6(2), pages 1-23, January.
    6. Vasile Brătian & Ana-Maria Acu & Camelia Oprean-Stan & Emil Dinga & Gabriela-Mariana Ionescu, 2021. "Efficient or Fractal Market Hypothesis? A Stock Indexes Modelling Using Geometric Brownian Motion and Geometric Fractional Brownian Motion," Mathematics, MDPI, vol. 9(22), pages 1-20, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Barros, Carlos Pestana & Williams, Jonathan, 2013. "The random parameters stochastic frontier cost function and the effectiveness of public policy: Evidence from bank restructuring in Mexico," International Review of Financial Analysis, Elsevier, vol. 30(C), pages 98-108.
    2. Borisova, Ekaterina & Gründler, Klaus & Hackenberger, Armin & Harter, Anina & Potrafke, Niklas & Schoors, Koen, 2023. "Crisis experience and the deep roots of COVID-19 vaccination preferences," European Economic Review, Elsevier, vol. 160(C).
    3. Aleksandar Kemiveš & Lidija Barjaktarović & Milan Ranđelović & Milan Čabarkapa & Dragan Ranđelović, 2024. "Assessing the Efficiency of Foreign Investment in a Certification Procedure Using an Ensemble Machine Learning Model," Mathematics, MDPI, vol. 12(7), pages 1-26, March.
    4. Margarita Genius & Spiro Stefanou & Vangelis Tzouvelekas, 2009. "Productivity Growth and Efficiency under Leontief Technology: An Application to US Steam-Electric Power Generation Utilities," Working Papers 0913, University of Crete, Department of Economics.
    5. Vaneet Bhatia & Sankarshan Basu & Subrata Kumar Mitra & Pradyumna Dash, 2018. "A review of bank efficiency and productivity," OPSEARCH, Springer;Operational Research Society of India, vol. 55(3), pages 557-600, November.
    6. Deng, Yaguo & Lopes Moreira da Veiga, María Helena & Wiper, Michael Peter, 2016. "Efficiency evaluation of Spanish hotel chains," DES - Working Papers. Statistics and Econometrics. WS 23897, Universidad Carlos III de Madrid. Departamento de Estadística.
    7. Yu, Chenyang & Tan, Yuanfang & Zhou, Yu & Zang, Chuanxiang & Tu, Chenglin, 2022. "Can functional urban specialization improve industrial energy efficiency? Empirical evidence from China," Energy, Elsevier, vol. 261(PA).
    8. Martins-Filho, Carlos & Ziegelmann, Flávio Augusto & Torrent, Hudson da Silva, 2013. "Local Exponential Frontier Estimation," Brazilian Review of Econometrics, Sociedade Brasileira de Econometria - SBE, vol. 33(2), November.
    9. Barnabé Walheer, 2018. "Cost Malmquist productivity index: an output-specific approach for group comparison," Journal of Productivity Analysis, Springer, vol. 49(1), pages 79-94, February.
    10. Cazals Catherine & Dudley Paul & Florens Jean-Pierre & Jones Michael, 2011. "The Effect of Unobserved Heterogeneity in Stochastic Frontier Estimation: Comparison of Cross Section and Panel with Simulated Data for the Postal Sector," Review of Network Economics, De Gruyter, vol. 10(3), pages 1-22, September.
    11. Mohamed E. Chaffai, 2022. "New evidence on Islamic and conventional bank efficiency: A meta‐regression analysis," Bulletin of Economic Research, Wiley Blackwell, vol. 74(1), pages 221-246, January.
    12. Mustafa U. Karakaplan & Levent Kutlu, 2019. "School district consolidation policies: endogenous cost inefficiency and saving reversals," Empirical Economics, Springer, vol. 56(5), pages 1729-1768, May.
    13. Simone Gitto, 2017. "Efficiency change, technological change and capital accumulation in Italian regions: a sectoral study," International Review of Applied Economics, Taylor & Francis Journals, vol. 31(2), pages 191-207, March.
    14. Barros, Carlos P. & Guironnet, Jean-Pascal & Peypoch, Nicolas, 2011. "Productivity growth and biased technical change in French higher education," Economic Modelling, Elsevier, vol. 28(1-2), pages 641-646, January.
    15. Genius, Margarita & Stefanou, Spiro E. & Tzouvelekas, Vangelis, 2012. "Measuring productivity growth under factor non-substitution: An application to US steam-electric power generation utilities," European Journal of Operational Research, Elsevier, vol. 220(3), pages 844-852.
    16. Sutirtha Bagchi & Matthew J. Fagerstrom, 2023. "Wealth inequality and democracy," Public Choice, Springer, vol. 197(1), pages 89-136, October.
    17. Walden, John & Fissel, Ben & Squires, Dale & Vestergaard, Niels, 2015. "Productivity change in commercial fisheries: An introduction to the special issue," Marine Policy, Elsevier, vol. 62(C), pages 289-293.
    18. Massimo Filippini & Luis Orea, 2014. "Applications of the stochastic frontier approach in Energy Economics," Economics and Business Letters, Oviedo University Press, vol. 3(1), pages 35-42.
    19. Planinc Tanja & Kukanja Marko & Žnidaršič Anja, 2022. "The Interplay of Restaurant SMEs’ Entrepreneurial and Environmental Characteristics, Management of the Requisite Assets, and Operational Efficiency," Organizacija, Sciendo, vol. 55(2), pages 160-177, May.
    20. Hery Purnomo Tunggal & Tati Suhartati Joesron, 2019. "Technical Efficiency Analysis Of Indonesian Small And Micro Industries: A Stochastic Frontier Approach," Working Papers in Economics and Development Studies (WoPEDS) 201903, Department of Economics, Padjadjaran University, revised Nov 2019.

    More about this item

    JEL classification:

    • G14 - Financial Economics - - General Financial Markets - - - Information and Market Efficiency; Event Studies; Insider Trading
    • G21 - Financial Economics - - Financial Institutions and Services - - - Banks; Other Depository Institutions; Micro Finance Institutions; Mortgages
    • C45 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods: Special Topics - - - Neural Networks and Related Topics
    • G32 - Financial Economics - - Corporate Finance and Governance - - - Financing Policy; Financial Risk and Risk Management; Capital and Ownership Structure; Value of Firms; Goodwill
