IDEAS home Printed from https://ideas.repec.org/a/gam/jmathe/v10y2022i6p993-d774885.html

Statistical Methods with Applications in Data Mining: A Review of the Most Recent Works

Authors
  • Joaquim Fernando Pinto da Costa

    (CMUP, Departamento de Matemática, Faculdade de Ciências, Universidade do Porto, rua do Campo Alegre s/n, 4169-007 Porto, Portugal. These authors contributed equally to this work.)

  • Manuel Cabral

    (Departamento de Matemática, Faculdade de Ciências, Universidade do Porto, rua do Campo Alegre s/n, 4169-007 Porto, Portugal. These authors contributed equally to this work.)

Abstract

The importance of statistical methods for finding patterns and trends in large, complex, and otherwise unstructured data sets has grown over the past decade. The amount of data produced keeps growing exponentially, and the knowledge gained from understanding data allows quick, informed decisions that save time and provide a competitive advantage. For this reason, we have seen considerable advances over the past few years in statistical methods in data mining. This paper is a comprehensive and systematic review of these recent developments in the area of data mining.

Suggested Citation

  • Joaquim Fernando Pinto da Costa & Manuel Cabral, 2022. "Statistical Methods with Applications in Data Mining: A Review of the Most Recent Works," Mathematics, MDPI, vol. 10(6), pages 1-22, March.
  • Handle: RePEc:gam:jmathe:v:10:y:2022:i:6:p:993-:d:774885

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/10/6/993/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/10/6/993/
    Download Restriction: no



