IDEAS home Printed from https://ideas.repec.org/a/eee/csdana/v138y2019icp201-221.html

Estimating the mean and variance of a high-dimensional normal distribution using a mixture prior

Author

Listed:
  • Sinha, Shyamalendu
  • Hart, Jeffrey D.

Abstract

A framework is provided for estimating the mean and variance of a high-dimensional normal density. The main setting considered is a fixed number of vectors following a high-dimensional normal distribution with unknown mean and diagonal covariance matrix. The diagonal covariance matrix can be known or unknown. If the covariance matrix is unknown, the sample size can be as small as 2. The proposed estimator is based on the idea that the unobserved mean/variance pairs across dimensions are drawn from an unknown bivariate distribution, which is modeled as a mixture of normal-inverse gammas. The mixture of normal-inverse gamma distributions provides advantages over more traditional empirical Bayes methods, which are based on a normal–normal model. When fitting a mixture model, the algorithm is essentially clustering the unobserved mean and variance pairs into different groups, with each group having a different normal-inverse gamma distribution. The proposed estimator of each mean is the posterior mean of shrinkage estimates, each of which shrinks a sample mean towards a different component of the mixture distribution. The proposed estimator of variance has an analogous interpretation in terms of sample variances and components of the mixture distribution. If the diagonal covariance matrix is known, then the sample size can be as small as 1, and the pairs of known variances and unknown means across dimensions are treated as random observations coming from a flexible mixture of normal-inverse gamma distributions.
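
The shrinkage interpretation described in the abstract can be sketched for a single dimension as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes the mixture weights and the normal-inverse gamma hyperparameters are already known, whereas the paper fits the mixture to the data. It also assumes the parameterization sigma^2 ~ InvGamma(a, b) and mu | sigma^2 ~ N(m, sigma^2 / lam). The function names and all hyperparameter values are hypothetical.

```python
import math


def nig_posterior(xbar, ss, n, m, lam, a, b):
    """Conjugate update for one NIG(m, lam, a, b) component (assumed parameterization).

    xbar is the sample mean, ss the sum of squared deviations from xbar,
    and n the number of observations for this dimension.
    """
    lam_n = lam + n
    m_n = (lam * m + n * xbar) / lam_n          # sample mean shrunk towards m
    a_n = a + n / 2.0
    b_n = b + 0.5 * ss + 0.5 * lam * n * (xbar - m) ** 2 / lam_n
    # Log marginal likelihood of the data under this component.
    log_ml = (
        -0.5 * n * math.log(2.0 * math.pi)
        + 0.5 * (math.log(lam) - math.log(lam_n))
        + a * math.log(b) - a_n * math.log(b_n)
        + math.lgamma(a_n) - math.lgamma(a)
    )
    return m_n, a_n, b_n, log_ml


def mixture_estimates(x, components):
    """Posterior means of (mu, sigma^2) for one dimension.

    x: list of n >= 2 observations.
    components: list of (weight, m, lam, a, b) tuples, assumed already fitted.
    Returns (mean estimate, variance estimate, posterior component weights).
    """
    n = len(x)
    xbar = sum(x) / n
    ss = sum((xi - xbar) ** 2 for xi in x)

    posts = []
    log_w = []
    for weight, m, lam, a, b in components:
        m_n, a_n, b_n, log_ml = nig_posterior(xbar, ss, n, m, lam, a, b)
        posts.append((m_n, a_n, b_n))
        log_w.append(math.log(weight) + log_ml)

    # Normalize the posterior component weights in a numerically stable way.
    top = max(log_w)
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    w = [v / total for v in w]

    # Each component contributes its own shrinkage estimate; the final
    # estimate is the weighted average over components.
    mean_hat = sum(wk * m_n for wk, (m_n, _, _) in zip(w, posts))
    var_hat = sum(wk * b_n / (a_n - 1.0) for wk, (_, a_n, b_n) in zip(w, posts))
    return mean_hat, var_hat, w
```

With n = 2 observations, a_n = a + 1 exceeds 1 for any a > 0, so the component-wise posterior mean of the variance, b_n / (a_n - 1), is finite. This matches the abstract's remark that the sample size can be as small as 2 when the variances are unknown.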

Suggested Citation

  • Sinha, Shyamalendu & Hart, Jeffrey D., 2019. "Estimating the mean and variance of a high-dimensional normal distribution using a mixture prior," Computational Statistics & Data Analysis, Elsevier, vol. 138(C), pages 201-221.
  • Handle: RePEc:eee:csdana:v:138:y:2019:i:c:p:201-221
    DOI: 10.1016/j.csda.2019.04.006

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947319300908
    Download Restriction: Full text for ScienceDirect subscribers only.


    Citations

    Cited by:

    1. Lang Zhao & Yuan Zeng & Zhidong Wang & Yizheng Li & Dong Peng & Yao Wang & Xueying Wang, 2023. "Robust Optimal Scheduling of Integrated Energy Systems Considering the Uncertainty of Power Supply and Load in the Power Market," Energies, MDPI, vol. 16(14), pages 1-14, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Koen Jochmans & Martin Weidner, 2018. "Inference on a distribution from noisy draws," CeMMAP working papers CWP14/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    2. Jiafeng Chen, 2022. "Empirical Bayes When Estimation Precision Predicts Parameters," Papers 2212.14444, arXiv.org, revised Apr 2024.
    3. Wan-Lun Wang, 2019. "Mixture of multivariate t nonlinear mixed models for multiple longitudinal data with heterogeneity and missing values," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(1), pages 196-222, March.
    4. Mark S. Handcock & Adrian E. Raftery & Jeremy M. Tantrum, 2007. "Model‐based clustering for social networks," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 170(2), pages 301-354, March.
    5. Arman Oganisian & Nandita Mitra & Jason A. Roy, 2021. "A Bayesian nonparametric model for zero‐inflated outcomes: Prediction, clustering, and causal estimation," Biometrics, The International Biometric Society, vol. 77(1), pages 125-135, March.
    6. Yao, Weixin & Wei, Yan & Yu, Chun, 2014. "Robust mixture regression using the t-distribution," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 116-127.
    7. Rufo, M.J. & Pérez, C.J. & Martín, J., 2009. "Local parametric sensitivity for mixture models of lifetime distributions," Reliability Engineering and System Safety, Elsevier, vol. 94(7), pages 1238-1244.
    8. Jeong Eun Lee & Christian Robert, 2013. "Importance Sampling Schemes for Evidence Approximation in Mixture Models," Working Papers 2013-42, Center for Research in Economics and Statistics.
    9. Aßmann, Christian & Boysen-Hogrefe, Jens & Pape, Markus, 2012. "The directional identification problem in Bayesian factor analysis: An ex-post approach," Kiel Working Papers 1799, Kiel Institute for the World Economy (IfW Kiel).
    10. Sphiwe B. Skhosana & Salomon M. Millard & Frans H. J. Kanfer, 2023. "A Novel EM-Type Algorithm to Estimate Semi-Parametric Mixtures of Partially Linear Models," Mathematics, MDPI, vol. 11(5), pages 1-20, February.
    11. Sun-Joo Cho & Allan S. Cohen, 2010. "A Multilevel Mixture IRT Model With an Application to DIF," Journal of Educational and Behavioral Statistics, vol. 35(3), pages 336-370, June.
    12. Bing-Yi Jing & Zhouping Li & Guangming Pan & Wang Zhou, 2016. "On SURE-Type Double Shrinkage Estimation," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1696-1704, October.
    13. Ungolo, Francesco & Kleinow, Torsten & Macdonald, Angus S., 2020. "A hierarchical model for the joint mortality analysis of pension scheme data with missing covariates," Insurance: Mathematics and Economics, Elsevier, vol. 91(C), pages 68-84.
    14. Ioannis Ntzoufras & Claudia Tarantola, 2012. "Conjugate and Conditional Conjugate Bayesian Analysis of Discrete Graphical Models of Marginal Independence," Quaderni di Dipartimento 178, University of Pavia, Department of Economics and Quantitative Methods.
    15. Brian Hartley, 2020. "Corridor stability of the Kaleckian growth model: a Markov-switching approach," Working Papers 2013, New School for Social Research, Department of Economics, revised Nov 2020.
    16. Park, Byung-Jung & Zhang, Yunlong & Lord, Dominique, 2010. "Bayesian mixture modeling approach to account for heterogeneity in speed data," Transportation Research Part B: Methodological, Elsevier, vol. 44(5), pages 662-673, June.
    17. Papastamoulis, Panagiotis, 2018. "Overfitting Bayesian mixtures of factor analyzers with an unknown number of components," Computational Statistics & Data Analysis, Elsevier, vol. 124(C), pages 220-234.
    18. Simen Alexander Linge Johnsen & Jörg Bollmann, 2020. "Coccolith mass and morphology of different Emiliania huxleyi morphotypes: A critical examination using Canary Islands material," PLOS ONE, Public Library of Science, vol. 15(3), pages 1-29, March.
    19. Nichole E. Carlson & Timothy D. Johnson & Morton B. Brown, 2009. "A Bayesian Approach to Modeling Associations Between Pulsatile Hormones," Biometrics, The International Biometric Society, vol. 65(2), pages 650-659, June.
    20. M. Rufo & J. Martín & C. Pérez, 2006. "Bayesian analysis of finite mixture models of distributions from exponential families," Computational Statistics, Springer, vol. 21(3), pages 621-637, December.
