Printed from https://ideas.repec.org/p/clm/pomwps/1003.html

The Paradox of Big Data

Author

  • Smith, Gary (Pomona College)

Abstract

Data mining is often used to discover patterns in Big Data. It is tempting to believe that because an unearthed pattern is unusual it must be meaningful, but patterns are inevitable in Big Data and usually meaningless. The paradox of Big Data is that data mining is most seductive when there are a large number of variables, but a large number of variables exacerbates the perils of data mining.
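
The claim that patterns are inevitable when many variables are searched can be illustrated with a small simulation. The sketch below is not taken from the paper; the function names, the sample size of 30 observations, the variable counts, and the seed are all assumptions chosen for illustration. It draws a random target and a set of unrelated random variables, then reports the strongest correlation a search would turn up, averaged over repeated trials.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def best_spurious_correlation(n_obs, n_vars, rng):
    """Largest |r| between a random target and n_vars unrelated random variables."""
    target = [rng.gauss(0, 1) for _ in range(n_obs)]
    best = 0.0
    for _ in range(n_vars):
        # Each candidate is pure noise, independent of the target by construction.
        candidate = [rng.gauss(0, 1) for _ in range(n_obs)]
        best = max(best, abs(pearson(candidate, target)))
    return best


def average_best(n_obs, n_vars, trials, seed):
    """Average, over repeated trials, of the strongest correlation found by search."""
    rng = random.Random(seed)
    total = sum(best_spurious_correlation(n_obs, n_vars, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    # Hypothetical settings: 30 observations, increasing numbers of noise variables.
    for n_vars in (5, 50, 500):
        print(n_vars, round(average_best(n_obs=30, n_vars=n_vars, trials=20, seed=0), 3))
```

Every candidate variable here is noise, so any correlation the search finds is meaningless by construction. The strongest correlation found nevertheless tends to rise as the number of candidate variables grows, which is the sense in which more variables make data mining both more tempting and more hazardous.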

Suggested Citation

  • Smith, Gary, 2019. "The Paradox of Big Data," Economics Department, Working Paper Series 1003, Economics Department, Pomona College, revised 04 Jun 2019.
  • Handle: RePEc:clm:pomwps:1003

    Download full text from publisher

    File URL: https://scholarship.claremont.edu/cgi/viewcontent.cgi?article=1003&context=pomona_fac_econ
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Akash Malhotra, 2018. "A hybrid econometric-machine learning approach for relative importance analysis: Prioritizing food policy," Papers 1806.04517, arXiv.org, revised Aug 2020.
    2. Croux, Christophe & Jagtiani, Julapa & Korivi, Tarunsai & Vulanovic, Milos, 2020. "Important factors determining Fintech loan default: Evidence from a lendingclub consumer platform," Journal of Economic Behavior & Organization, Elsevier, vol. 173(C), pages 270-296.
    3. Galdo, Virgilio & Li, Yue & Rama, Martin, 2021. "Identifying urban areas by combining human judgment and machine learning: An application to India," Journal of Urban Economics, Elsevier, vol. 125(C).
    4. Robertas Damasevicius, 2023. "Progress, Evolving Paradigms and Recent Trends in Economic Analysis," Financial Economics Letters, Anser Press, vol. 2(2), pages 35-47, October.
    5. Onder Ozgur & Erdal Tanas Karagol & Fatih Cemil Ozbugday, 2021. "Machine learning approach to drivers of bank lending: evidence from an emerging economy," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 7(1), pages 1-29, December.
    6. Byron Botha & Rulof Burger & Kevin Kotzé & Neil Rankin & Daan Steenkamp, 2023. "Big data forecasting of South African inflation," Empirical Economics, Springer, vol. 65(1), pages 149-188, July.
    7. Elena Ivona DUMITRESCU & Sullivan HUE & Christophe HURLIN & Sessi TOKPAVI, 2020. "Machine Learning or Econometrics for Credit Scoring: Let’s Get the Best of Both Worlds," LEO Working Papers / DR LEO 2839, Orleans Economics Laboratory / Laboratoire d'Economie d'Orleans (LEO), University of Orleans.
    8. Paolo Brunori & Vito Peragine & Laura Serlenga, 2019. "Upward and downward bias when measuring inequality of opportunity," Social Choice and Welfare, Springer;The Society for Social Choice and Welfare, vol. 52(4), pages 635-661, April.
    9. Andini, Monica & Boldrini, Michela & Ciani, Emanuele & de Blasio, Guido & D'Ignazio, Alessio & Paladini, Andrea, 2022. "Machine learning in the service of policy targeting: The case of public credit guarantees," Journal of Economic Behavior & Organization, Elsevier, vol. 198(C), pages 434-475.
    10. Fabio Pammolli & Paolo Bonaretti & Massimo Riccaboni & Valentina Tortolini, 2019. "Quali Regole per la Spesa Farmaceutica? - Criticità, Impatti, Proposte," Working Papers CERM 01-2019, Competitività, Regole, Mercati (CERM).
    11. Kea BARET, 2021. "Fiscal rules’ compliance and Social Welfare," Working Papers of BETA 2021-38, Bureau d'Economie Théorique et Appliquée, UDS, Strasbourg.
    12. Pablo Picardo, 2019. "Predicción de precios de vivienda: Aprendizaje estadístico con datos de oferta y transacciones para la ciudad de Montevideo," Documentos de trabajo 2019002, Banco Central del Uruguay.
    13. Joey Blumberg & Gary Thompson, 2022. "Nonparametric segmentation methods: Applications of unsupervised machine learning and revealed preference," American Journal of Agricultural Economics, John Wiley & Sons, vol. 104(3), pages 976-998, May.
    14. Mehmet Güney Celbiş, 2021. "A machine learning approach to rural entrepreneurship," Papers in Regional Science, Wiley Blackwell, vol. 100(4), pages 1079-1104, August.
    15. James T. E. Chapman & Ajit Desai, 2023. "Macroeconomic Predictions Using Payments Data and Machine Learning," Forecasting, MDPI, vol. 5(4), pages 1-32, November.
    16. Andini, Monica & Ciani, Emanuele & de Blasio, Guido & D'Ignazio, Alessio & Salvestrini, Viola, 2018. "Targeting with machine learning: An application to a tax rebate program in Italy," Journal of Economic Behavior & Organization, Elsevier, vol. 156(C), pages 86-102.
    17. Michael T. Kiley, 2020. "Financial Conditions and Economic Activity: Insights from Machine Learning," Finance and Economics Discussion Series 2020-095, Board of Governors of the Federal Reserve System (U.S.).
    18. Brunori, Paolo & Hufe, Paul & Mahler, Daniel Gerszon, 2021. "The Roots of Inequality: Estimating Inequality of Opportunity from Regression Trees and Forests," IZA Discussion Papers 14689, Institute of Labor Economics (IZA).
    19. Sophie-Charlotte Klose & Johannes Lederer, 2020. "A Pipeline for Variable Selection and False Discovery Rate Control With an Application in Labor Economics," Papers 2006.12296, arXiv.org, revised Jun 2020.
    20. Philippe Goulet Coulombe & Maxime Leroux & Dalibor Stevanovic & Stéphane Surprenant, 2022. "How is machine learning useful for macroeconomic forecasting?," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 37(5), pages 920-964, August.

    More about this item

    Keywords

    data mining; big data; machine learning;
