
Squared error-based shrinkage estimators of discrete probabilities and their application to variable selection

Author

Listed:
  • Małgorzata Łazęcka

    (Warsaw University of Technology
    Polish Academy of Sciences Warsaw)

  • Jan Mielniczuk

    (Warsaw University of Technology
    Polish Academy of Sciences Warsaw)

Abstract

In this paper we consider a new approach to regularizing the maximum likelihood estimator of a discrete probability distribution, and its application to variable selection. The method forms a convex combination of the maximum likelihood estimator with a low-dimensional target distribution and chooses the weight parameter by minimising the squared error (SE) rather than the mean squared error (MSE). Choosing the optimal parameter for every sample yields an MSE no larger than that of the James–Stein shrinkage estimator of a discrete probability distribution. The introduced parameter is estimated by cross-validation and is shown to perform promisingly for synthetic dependence models. The method is then used to introduce regularized versions of information-based variable selection criteria, which are investigated in numerical experiments and turn out to work better than commonly used plug-in estimators under several scenarios.
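As a rough illustration of the convex-combination idea described in the abstract, the following minimal sketch computes, for a single sample, the weight that minimises the squared error between the shrunken estimate and a known true distribution. It is not the authors' implementation: the function names, the uniform target, and the example counts are assumptions made here for illustration, and the weight shown is an oracle quantity (it uses the true distribution), whereas the paper estimates the parameter by cross-validation.

```python
from typing import List, Sequence


def mle(counts: Sequence[int]) -> List[float]:
    """Maximum likelihood estimate: relative cell frequencies."""
    n = sum(counts)
    return [c / n for c in counts]


def shrink(p_hat: Sequence[float], target: Sequence[float], lam: float) -> List[float]:
    """Convex combination lam * target + (1 - lam) * p_hat."""
    return [lam * t + (1.0 - lam) * p for p, t in zip(p_hat, target)]


def squared_error(q: Sequence[float], p: Sequence[float]) -> float:
    """Sum of squared differences between two probability vectors."""
    return sum((a - b) ** 2 for a, b in zip(q, p))


def oracle_lambda(p_hat: Sequence[float], target: Sequence[float],
                  p_true: Sequence[float]) -> float:
    """Weight in [0, 1] minimising the squared error for this sample.

    SE(lam) = sum(((p_hat - p_true) - lam * (p_hat - target)) ** 2) is a
    convex quadratic in lam, so the constrained minimiser is the
    unconstrained one clipped to [0, 1].
    """
    num = sum((ph - pt) * (ph - t) for ph, t, pt in zip(p_hat, target, p_true))
    den = sum((ph - t) ** 2 for ph, t in zip(p_hat, target))
    if den == 0.0:  # p_hat already equals the target; any weight works
        return 0.0
    return min(1.0, max(0.0, num / den))


# Hypothetical example: 10 observations over 4 cells, uniform target (assumed).
counts = [7, 2, 1, 0]
p_true = [0.4, 0.3, 0.2, 0.1]
target = [0.25, 0.25, 0.25, 0.25]

p_hat = mle(counts)
lam = oracle_lambda(p_hat, target, p_true)
p_shrunk = shrink(p_hat, target, lam)
print(lam, squared_error(p_hat, p_true), squared_error(p_shrunk, p_true))
```

Because the weight is chosen per sample, the shrunken estimate can never have a larger squared error than either the unregularized estimator (weight 0) or the target itself (weight 1) on that sample.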

Suggested Citation

  • Małgorzata Łazęcka & Jan Mielniczuk, 2023. "Squared error-based shrinkage estimators of discrete probabilities and their application to variable selection," Statistical Papers, Springer, vol. 64(1), pages 41-72, February.
  • Handle: RePEc:spr:stpapr:v:64:y:2023:i:1:d:10.1007_s00362-022-01308-w
    DOI: 10.1007/s00362-022-01308-w

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00362-022-01308-w
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Delsol, Laurent & Ferraty, Frédéric & Vieu, Philippe, 2011. "Structural test in regression on functional variables," Journal of Multivariate Analysis, Elsevier, vol. 102(3), pages 422-447, March.
    2. Estévez-Pérez, Graciela, 2002. "On convergence rates for quadratic errors in kernel hazard estimation," Statistics & Probability Letters, Elsevier, vol. 57(3), pages 231-241, April.
    3. Fakoor, Vahid & Jomhoori, Sarah & Azarnoosh, Hasanali, 2009. "Asymptotic expansion for ISE of kernel density estimators under censored dependent model," Statistics & Probability Letters, Elsevier, vol. 79(17), pages 1809-1817, September.
    4. Majid Mojirsheibani & William Pouliot, 2017. "Weighted bootstrapped kernel density estimators in two-sample problems," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 29(1), pages 61-84, January.
    5. Hannart, Alexis & Naveau, Philippe, 2014. "Estimating high dimensional covariance matrices: A new look at the Gaussian conjugate framework," Journal of Multivariate Analysis, Elsevier, vol. 131(C), pages 149-162.
    6. Cui, Xueting & Zhu, Shushang & Sun, Xiaoling & Li, Duan, 2013. "Nonlinear portfolio selection using approximate parametric Value-at-Risk," Journal of Banking & Finance, Elsevier, vol. 37(6), pages 2124-2139.
    7. Marcelo Fernandes & Breno Neri, 2010. "Nonparametric Entropy-Based Tests of Independence Between Stochastic Processes," Econometric Reviews, Taylor & Francis Journals, vol. 29(3), pages 276-306.
    8. Prabal Das & D. A. Sachindra & Kironmala Chanda, 2022. "Machine Learning-Based Rainfall Forecasting with Multiple Non-Linear Feature Selection Algorithms," Water Resources Management: An International Journal, Published for the European Water Resources Association (EWRA), Springer;European Water Resources Association (EWRA), vol. 36(15), pages 6043-6071, December.
    9. Candelon, B. & Hurlin, C. & Tokpavi, S., 2012. "Sampling error and double shrinkage estimation of minimum variance portfolios," Journal of Empirical Finance, Elsevier, vol. 19(4), pages 511-527.
    10. Su, Liangjun, 2006. "A simple test for multivariate conditional symmetry," Economics Letters, Elsevier, vol. 93(3), pages 374-378, December.
    11. Torben G. Andersen & Tim Bollerslev & Peter Christoffersen & Francis X. Diebold, 2007. "Practical Volatility and Correlation Modeling for Financial Market Risk Management," NBER Chapters, in: The Risks of Financial Institutions, pages 513-544, National Bureau of Economic Research, Inc.
    12. Fan, Jianqing & Liao, Yuan & Shi, Xiaofeng, 2015. "Risks of large portfolios," Journal of Econometrics, Elsevier, vol. 186(2), pages 367-387.
    13. Ouimet, Frédéric & Tolosana-Delgado, Raimon, 2022. "Asymptotic properties of Dirichlet kernel density estimators," Journal of Multivariate Analysis, Elsevier, vol. 187(C).
    14. Atanda Mustapha Saidi, 2017. "Working Paper 273 - Stock (Mis)pricing and investment dynamics in Africa," Working Paper Series 2390, African Development Bank.
    15. Sven Husmann & Antoniya Shivarova & Rick Steinert, 2019. "Cross-validated covariance estimators for high-dimensional minimum-variance portfolios," Papers 1910.13960, arXiv.org, revised Oct 2020.
    16. Jianqing Fan & Xu Han, 2017. "Estimation of the false discovery proportion with unknown dependence," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(4), pages 1143-1164, September.
    17. Fernandes, Marcelo & Grammig, Joachim, 2005. "Nonparametric specification tests for conditional duration models," Journal of Econometrics, Elsevier, vol. 127(1), pages 35-68, July.
    18. Roland R. Ramsahai, 2020. "Connecting actuarial judgment to probabilistic learning techniques with graph theory," Papers 2007.15475, arXiv.org.
    19. Tang, Kayu & Parsons, David J. & Jude, Simon, 2019. "Comparison of automatic and guided learning for Bayesian networks to analyse pipe failures in the water distribution system," Reliability Engineering and System Safety, Elsevier, vol. 186(C), pages 24-36.
    20. repec:ebl:ecbull:v:3:y:2005:i:11:p:1-10 is not listed on IDEAS
    21. Hoderlein, Stefan & Su, Liangjun & White, Halbert & Yang, Thomas Tao, 2016. "Testing for monotonicity in unobservables under unconfoundedness," Journal of Econometrics, Elsevier, vol. 193(1), pages 183-202.
