
Efficient Estimation and Validation of Shrinkage Estimators in Big Data Analytics

Author

Listed:
  • Salomi du Plessis

    (Department of Statistics, Faculty of Natural and Agricultural Science, University of Pretoria, Pretoria 0028, South Africa)

  • Mohammad Arashi

    (Department of Statistics, Faculty of Natural and Agricultural Science, University of Pretoria, Pretoria 0028, South Africa
    Department of Statistics, Faculty of Mathematical Sciences, Ferdowsi University of Mashhad, Mashhad 9177948974, Iran)

  • Gaonyalelwe Maribe

    (Department of Statistics, Faculty of Natural and Agricultural Science, University of Pretoria, Pretoria 0028, South Africa)

  • Salomon M. Millard

    (Department of Statistics, Faculty of Natural and Agricultural Science, University of Pretoria, Pretoria 0028, South Africa)

Abstract

Shrinkage estimators are often used to mitigate the consequences of multicollinearity in linear regression models. Despite the ease with which these techniques can be applied to small- or moderate-size datasets, they encounter significant challenges in the big data domain. Some of these challenges are that the volume of data often exceeds the storage capacity of a single computer and that the time required to obtain results becomes infeasible due to the computational burden of a high volume of data. We propose an algorithm for the efficient model estimation and validation of various well-known shrinkage estimators to be used in scenarios where the volume of the data is large. Our proposed algorithm utilises sufficient statistics that can be computed and updated at the row level, thus minimizing access to the entire dataset. A simulation study, as well as an application on a real-world dataset, illustrates the efficiency of the proposed approach.
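The idea of row-level sufficient statistics can be illustrated with a minimal sketch. It assumes ridge regression as the shrinkage estimator and takes X'X, X'y and y'y as the accumulated statistics; the abstract does not specify either choice, and the class name `RidgeSufficientStats` and its methods are hypothetical, not the authors' algorithm.

```python
import numpy as np


class RidgeSufficientStats:
    """Hypothetical helper: accumulates X'X, X'y and y'y one row or chunk at a time."""

    def __init__(self, p):
        self.xtx = np.zeros((p, p))  # running X'X
        self.xty = np.zeros(p)       # running X'y
        self.yty = 0.0               # running y'y
        self.n = 0                   # rows seen so far

    def update(self, X_chunk, y_chunk):
        # Accepts a single row (shape (p,)) or a chunk (shape (m, p)).
        X_chunk = np.atleast_2d(np.asarray(X_chunk, dtype=float))
        y_chunk = np.atleast_1d(np.asarray(y_chunk, dtype=float))
        self.xtx += X_chunk.T @ X_chunk
        self.xty += X_chunk.T @ y_chunk
        self.yty += float(y_chunk @ y_chunk)
        self.n += X_chunk.shape[0]

    def ridge(self, k):
        # Ridge estimate (X'X + k I)^{-1} X'y, computed from the statistics only.
        p = self.xtx.shape[0]
        return np.linalg.solve(self.xtx + k * np.eye(p), self.xty)

    def sse(self, beta):
        # Residual sum of squares without revisiting the data:
        # y'y - 2 beta'X'y + beta'X'X beta
        return self.yty - 2.0 * beta @ self.xty + beta @ self.xtx @ beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 4))
    y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(size=1000)

    stats = RidgeSufficientStats(p=4)
    for start in range(0, 1000, 100):  # simulate reading the data in chunks
        stats.update(X[start:start + 100], y[start:start + 100])

    for k in (0.1, 1.0, 10.0):  # refit for several penalties without re-reading the data
        beta = stats.ridge(k)
        print(k, beta, stats.sse(beta))
```

Under these assumptions, the data are read once; trying a different shrinkage parameter, or scoring a fit, needs only the p-by-p and p-length statistics rather than the full dataset.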

Suggested Citation

  • Salomi du Plessis & Mohammad Arashi & Gaonyalelwe Maribe & Salomon M. Millard, 2023. "Efficient Estimation and Validation of Shrinkage Estimators in Big Data Analytics," Mathematics, MDPI, vol. 11(22), pages 1-11, November.
  • Handle: RePEc:gam:jmathe:v:11:y:2023:i:22:p:4632-:d:1279268

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/11/22/4632/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/11/22/4632/
    Download Restriction: no

    References listed on IDEAS

    1. Nusrat Shaheen & Ismail Shah & Amani Almohaimeed & Sajid Ali & Hana N. Alqifari, 2023. "Some Modified Ridge Estimators for Handling the Multicollinearity Problem," Mathematics, MDPI, vol. 11(11), pages 1-19, May.
    2. Tonglin Zhang & Baijian Yang, 2017. "An exact approach to ridge regression for big data," Computational Statistics, Springer, vol. 32(3), pages 909-928, September.