An Introduction to Machine Learning for Panel Data

Author

Listed:
  • James Ming Chen

    (Michigan State University; Silver Leaf Capital LLC)

Abstract

Machine learning has dramatically expanded the range of tools for evaluating economic panel data. This paper applies a variety of machine-learning methods to the Boston housing dataset, an iconic proving ground for machine learning. Though machine learning often lacks the overt interpretability of linear regression, methods based on decision trees score the relative importance of dataset features. In addition to addressing the theoretical tradeoff between bias and variance, this paper discusses practices rarely followed in traditional economics: the splitting of data into training, validation, and test sets; the scaling of data; and the preference for retaining all data. The choice between traditional and machine-learning methods hinges on practical rather than mathematical considerations. In settings emphasizing interpretative clarity through the scale and sign of regression coefficients, machine learning may best play an ancillary role. Wherever predictive accuracy is paramount, however, or where heteroskedasticity or high dimensionality might impair the clarity of linear methods, machine learning can deliver superior results.
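
The practices named in the abstract (a training/validation/test split, data scaling, and tree-based feature importance) can be illustrated with a minimal sketch. This is not the paper's code: the synthetic data from scikit-learn's make_regression stands in for the Boston housing dataset, and the 60/20/20 split, the random forest, and the candidate depths are all assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption: NOT the Boston housing dataset).
X, y = make_regression(n_samples=500, n_features=8, n_informative=3,
                       noise=10.0, random_state=0)

# Split into 60% training, 20% validation, and 20% test (assumed ratios).
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0)

# Fit the scaler on the training set only, then apply it to every split,
# so that no information from validation or test data leaks into training.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_val_s = scaler.transform(X_val)
X_test_s = scaler.transform(X_test)

# Use the validation set to choose a hyperparameter (here, tree depth).
best_depth, best_score = None, -np.inf
for depth in (2, 4, 8):
    model = RandomForestRegressor(n_estimators=100, max_depth=depth,
                                  random_state=0)
    model.fit(X_train_s, y_train)
    score = model.score(X_val_s, y_val)  # R^2 on the validation set
    if score > best_score:
        best_depth, best_score = depth, score

# Refit with the chosen depth and touch the test set exactly once.
final = RandomForestRegressor(n_estimators=100, max_depth=best_depth,
                              random_state=0)
final.fit(X_train_s, y_train)
test_r2 = final.score(X_test_s, y_test)

# Tree ensembles expose a relative importance score for each feature.
importances = final.feature_importances_
print("chosen depth:", best_depth)
print("test R^2:", round(test_r2, 3))
print("feature importances:", np.round(importances, 3))
```

Tree-based models are largely insensitive to feature scaling, so the scaling step here matters mainly when the same pipeline is reused with scale-sensitive methods such as regularized linear models or neural networks.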

Suggested Citation

  • James Ming Chen, 2021. "An Introduction to Machine Learning for Panel Data," International Advances in Economic Research, Springer;International Atlantic Economic Society, vol. 27(1), pages 1-16, February.
  • Handle: RePEc:kap:iaecre:v:27:y:2021:i:1:d:10.1007_s11294-021-09815-6
    DOI: 10.1007/s11294-021-09815-6

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11294-021-09815-6
    File Function: Abstract
    Download Restriction: Access to full text is restricted to subscribers.


    References listed on IDEAS

    1. Susan Athey & Guido W. Imbens, 2019. "Machine Learning Methods That Economists Should Know About," Annual Review of Economics, Annual Reviews, vol. 11(1), pages 685-725, August.
    2. Harrison, David Jr. & Rubinfeld, Daniel L., 1978. "Hedonic housing prices and the demand for clean air," Journal of Environmental Economics and Management, Elsevier, vol. 5(1), pages 81-102, March.
    3. Friedman, Jerome H., 2002. "Stochastic gradient boosting," Computational Statistics & Data Analysis, Elsevier, vol. 38(4), pages 367-378, February.
    4. Athey, Susan & Imbens, Guido W., 2019. "Machine Learning Methods Economists Should Know About," Research Papers 3776, Stanford University, Graduate School of Business.
    5. Roberto Rigobon, 2003. "Identification Through Heteroskedasticity," The Review of Economics and Statistics, MIT Press, vol. 85(4), pages 777-792, November.

    Citations

    Cited by:

    1. Evaggelia Siopi & Thomas Poufinas & James Ming Chen & Charalampos Agiropoulos, 2023. "Can Regulation Affect the Solvency of Insurers? New Evidence from European Insurers," International Advances in Economic Research, Springer;International Atlantic Economic Society, vol. 29(1), pages 15-30, May.
    2. Zhu, Haibin & Bai, Lu & He, Lidan & Liu, Zhi, 2023. "Forecasting realized volatility with machine learning: Panel data perspective," Journal of Empirical Finance, Elsevier, vol. 73(C), pages 251-271.
    3. James Ming Chen & Mira Zovko & Nika Šimurina & Vatroslav Zovko, 2021. "Fear in a Handful of Dust: The Epidemiological, Environmental, and Economic Drivers of Death by PM 2.5 Pollution," IJERPH, MDPI, vol. 18(16), pages 1-59, August.
    4. Chen, James Ming & Rehman, Mobeen Ur & Vo, Xuan Vinh, 2021. "Clustering commodity markets in space and time: Clarifying returns, volatility, and trading regimes through unsupervised machine learning," Resources Policy, Elsevier, vol. 73(C).
    5. Charalampos Agiropoulos & Georgios Galanos & Thomas Poufinas, 2021. "Entrepreneurship, Income Inequality and Public Spending: A Spatial Analysis into Regional Determinants of Growing Firms in Greece," International Advances in Economic Research, Springer;International Atlantic Economic Society, vol. 27(3), pages 197-218, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Barzin, Samira & Avner, Paolo & Maruyama Rentschler, Jun Erik & O’Clery, Neave, 2022. "Where Are All the Jobs? A Machine Learning Approach for High Resolution Urban Employment Prediction in Developing Countries," Policy Research Working Paper Series 9979, The World Bank.
    2. Mehmet Güney Celbiş & Pui-Hang Wong & Karima Kourtit & Peter Nijkamp, 2021. "Innovativeness, Work Flexibility, and Place Characteristics: A Spatial Econometric and Machine Learning Approach," Sustainability, MDPI, vol. 13(23), pages 1-29, December.
    3. Islam, Towhidul & Meade, Nigel & Carson, Richard T. & Louviere, Jordan J. & Wang, Juan, 2022. "The usefulness of socio-demographic variables in predicting purchase decisions: Evidence from machine learning procedures," Journal of Business Research, Elsevier, vol. 151(C), pages 324-338.
    4. Mehmet Güney Celbiş, 2021. "A machine learning approach to rural entrepreneurship," Papers in Regional Science, Wiley Blackwell, vol. 100(4), pages 1079-1104, August.
    5. Mehmet Güney Celbiş & Pui‐hang Wong & Karima Kourtit & Peter Nijkamp, 2023. "Impacts of the COVID‐19 outbreak on older‐age cohorts in European Labor Markets: A machine learning exploration of vulnerable groups," Regional Science Policy & Practice, Wiley Blackwell, vol. 15(3), pages 559-584, April.
    6. Akash Malhotra, 2021. "A hybrid econometric–machine learning approach for relative importance analysis: prioritizing food policy," Eurasian Economic Review, Springer;Eurasia Business and Economics Society, vol. 11(3), pages 549-581, September.
    7. Gabriel Okasa & Kenneth A. Younge, 2022. "Sample Fit Reliability," Papers 2209.06631, arXiv.org.
    8. Sophie-Charlotte Klose & Johannes Lederer, 2020. "A Pipeline for Variable Selection and False Discovery Rate Control With an Application in Labor Economics," Papers 2006.12296, arXiv.org, revised Jun 2020.
    9. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    10. Ay, Jean-Sauveur & Le Gallo, Julie, 2021. "The Signaling Values of Nested Wine Names," Working Papers 321851, American Association of Wine Economists.
    11. Dangxing Chen & Luyao Zhang, 2023. "Monotonicity for AI ethics and society: An empirical study of the monotonic neural additive model in criminology, education, health care, and finance," Papers 2301.07060, arXiv.org.
    12. Daniel Levy & Tamir Mayer & Alon Raviv, 2020. "Academic Scholarship in Light of the 2008 Financial Crisis: Textual Analysis of NBER Working Papers," Working Papers hal-02488796, HAL.
    13. Combes, Pierre-Philippe & Gobillon, Laurent & Zylberberg, Yanos, 2022. "Urban economics in a historical perspective: Recovering data with machine learning," Regional Science and Urban Economics, Elsevier, vol. 94(C).
    14. Arenas, Andreu & Calsamiglia, Caterina, 2022. "Gender Differences in High-Stakes Performance and College Admission Policies," IZA Discussion Papers 15550, Institute of Labor Economics (IZA).
    15. Tsang, Andrew, 2021. "Uncovering Heterogeneous Regional Impacts of Chinese Monetary Policy," MPRA Paper 110703, University Library of Munich, Germany.
    16. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP54/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    17. Daniel Goller, 2023. "Analysing a built-in advantage in asymmetric darts contests using causal machine learning," Annals of Operations Research, Springer, vol. 325(1), pages 649-679, June.
    18. Doumpos, Michalis & Zopounidis, Constantin & Gounopoulos, Dimitrios & Platanakis, Emmanouil & Zhang, Wenke, 2023. "Operational research and artificial intelligence methods in banking," European Journal of Operational Research, Elsevier, vol. 306(1), pages 1-16.
    19. Hannes Wallimann & Silvio Sticher, 2023. "On suspicious tracks: machine-learning based approaches to detect cartels in railway-infrastructure procurement," Papers 2304.11888, arXiv.org.
    20. Rodríguez-Vargas, Adolfo, 2020. "Forecasting Costa Rican inflation with machine learning methods," Latin American Journal of Central Banking (previously Monetaria), Elsevier, vol. 1(1).
