
A general algorithm for covariance modeling of discrete data

Author

Listed:
  • Popovic, Gordana C.
  • Hui, Francis K.C.
  • Warton, David I.

Abstract

We propose an algorithm that generalizes to discrete data any given covariance modeling algorithm originally intended for Gaussian responses, via a Gaussian copula approach. Covariance modeling is a powerful tool for extracting meaning from multivariate data, and fast algorithms for Gaussian data, such as factor analysis and Gaussian graphical models, are widely available. Our algorithm makes these tools generally available to analysts of discrete data and can combine any likelihood-based covariance modeling method for Gaussian data with any set of discrete marginal distributions. Previously, tools for discrete data were generally specific to one family of distributions or covariance modeling paradigm, or otherwise did not exist. Our algorithm is more flexible than alternate methods, takes advantage of existing fast algorithms for Gaussian data, and simulations suggest that it outperforms competing graphical modeling and factor analysis procedures for count and binomial data. We additionally show that in a Gaussian copula graphical model with discrete margins, conditional independence relationships in the latent Gaussian variables are inherited by the discrete observations. Our method is illustrated with a graphical model and factor analysis on an overdispersed ecological count dataset of species abundances.
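
As a rough illustration of the Gaussian copula idea described above, the Python sketch below simulates two correlated count variables and then maps them back to latent Gaussian scores. It is not the authors' algorithm. Everything in it is an assumption made for illustration: the Poisson margins with known means, the latent correlation of 0.6, all function names, and the use of a single uniform draw between F(y-1) and F(y) to turn each count into a Gaussian score. Only the Python standard library is used.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: provides cdf() and inv_cdf()


def poisson_cdf(k, mean):
    """Return P(Y <= k) for Y ~ Poisson(mean); zero when k < 0."""
    if k < 0:
        return 0.0
    term = math.exp(-mean)  # P(Y = 0)
    total = term
    for i in range(1, k + 1):
        term *= mean / i  # P(Y = i) obtained from P(Y = i - 1)
        total += term
    return min(total, 1.0)


def poisson_quantile(u, mean, max_count=10_000):
    """Return the smallest k with P(Y <= k) >= u (capped at max_count)."""
    k = 0
    term = math.exp(-mean)
    total = term
    while total < u and k < max_count:
        k += 1
        term *= mean / k
        total += term
    return k


def simulate_copula_counts(n, rho, means, rng):
    """Draw n pairs of Poisson counts linked by a Gaussian copula."""
    rows = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0)
        # Push each latent normal through Phi, then the Poisson quantile.
        rows.append(tuple(
            poisson_quantile(STD_NORMAL.cdf(z), m)
            for z, m in zip((z1, z2), means)
        ))
    return rows


def latent_scores(rows, means, rng, eps=1e-12):
    """Map each count to a Gaussian score via a uniform draw in [F(y-1), F(y)]."""
    scores = []
    for row in rows:
        zs = []
        for y, m in zip(row, means):
            lower, upper = poisson_cdf(y - 1, m), poisson_cdf(y, m)
            u = lower + rng.random() * (upper - lower)
            u = min(max(u, eps), 1.0 - eps)  # keep inv_cdf inside (0, 1)
            zs.append(STD_NORMAL.inv_cdf(u))
        scores.append(tuple(zs))
    return scores


def correlation(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    rng = random.Random(1)
    means = (2.0, 5.0)  # assumed known Poisson means
    counts = simulate_copula_counts(2000, 0.6, means, rng)
    z = latent_scores(counts, means, rng)
    print("correlation of latent scores:", round(correlation(z), 3))
```

Once the counts have been turned into Gaussian scores, any covariance tool written for Gaussian data could in principle be applied to them; the plain correlation here stands in for such a tool. Because the uniform draws are independent across variables, a single set of scores like this tends to understate the latent correlation, and the sketch makes no attempt to correct for that.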

Suggested Citation

  • Popovic, Gordana C. & Hui, Francis K.C. & Warton, David I., 2018. "A general algorithm for covariance modeling of discrete data," Journal of Multivariate Analysis, Elsevier, vol. 165(C), pages 86-100.
  • Handle: RePEc:eee:jmvana:v:165:y:2018:i:c:p:86-100
    DOI: 10.1016/j.jmva.2017.12.002


    References listed on IDEAS

    1. Olivier Ledoit & Michael Wolf, 2003. "Honey, I shrunk the sample covariance matrix," Economics Working Papers 691, Department of Economics and Business, Universitat Pompeu Fabra.
    2. J. G. Booth & J. P. Hobert, 1999. "Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 61(1), pages 265-285.
    3. Jianqing Fan & Han Liu & Yang Ning & Hui Zou, 2017. "High dimensional semiparametric latent graphical model for mixed data," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(2), pages 405-421, March.
    4. Ming Yuan & Yi Lin, 2007. "Model selection and estimation in the Gaussian graphical model," Biometrika, Biometrika Trust, vol. 94(1), pages 19-35.
    5. Joe, Harry, 2005. "Asymptotic efficiency of the two-stage estimation method for copula-based models," Journal of Multivariate Analysis, Elsevier, vol. 94(2), pages 401-419, June.
    6. Klaus Holst & Esben Budtz-Jørgensen, 2013. "Linear latent variable models: the lava-package," Computational Statistics, Springer, vol. 28(4), pages 1385-1452, August.
    7. Heinen, Andreas & Rengifo, Erick, 2007. "Multivariate autoregressive modeling of time series count data using copulas," Journal of Empirical Finance, Elsevier, vol. 14(4), pages 564-583, September.
    8. Carvalho, Carlos M. & Chang, Jeffrey & Lucas, Joseph E. & Nevins, Joseph R. & Wang, Quanli & West, Mike, 2008. "High-Dimensional Sparse Factor Modeling: Applications in Gene Expression Genomics," Journal of the American Statistical Association, American Statistical Association, vol. 103(484), pages 1438-1456.

    Citations


    Cited by:

    1. Raphaëlle Momal & Stéphane Robin & Christophe Ambroise, 2021. "Accounting for missing actors in interaction network inference from abundance data," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 70(5), pages 1230-1258, November.
    2. Nurudeen A. Adegoke & Andrew Punnett & Marti J. Anderson, 2022. "Estimation of Multivariate Dependence Structures via Constrained Maximum Likelihood," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 27(2), pages 240-260, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Patton, Andrew, 2013. "Copula Methods for Forecasting Multivariate Time Series," Handbook of Economic Forecasting, in: G. Elliott & C. Granger & A. Timmermann (ed.), Handbook of Economic Forecasting, edition 1, volume 2, chapter 0, pages 899-960, Elsevier.
    2. Sung, Bongjung & Lee, Jaeyong, 2023. "Covariance structure estimation with Laplace approximation," Journal of Multivariate Analysis, Elsevier, vol. 198(C).
    3. Fan, Xinyan & Zhang, Qingzhao & Ma, Shuangge & Fang, Kuangnan, 2021. "Conditional score matching for high-dimensional partial graphical models," Computational Statistics & Data Analysis, Elsevier, vol. 153(C).
    4. Kevin H. Lee & Qian Chen & Wayne S. DeSarbo & Lingzhou Xue, 2022. "Estimating Finite Mixtures of Ordinal Graphical Models," Psychometrika, Springer;The Psychometric Society, vol. 87(1), pages 83-106, March.
    5. Shu Yang & Jae Kwang Kim, 2016. "Likelihood-based Inference with Missing Data Under Missing-at-Random," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 43(2), pages 436-454, June.
    6. Avagyan, Vahe & Alonso Fernández, Andrés Modesto & Nogales, Francisco J., 2015. "D-trace Precision Matrix Estimation Using Adaptive Lasso Penalties," DES - Working Papers. Statistics and Econometrics. WS 21775, Universidad Carlos III de Madrid. Departamento de Estadística.
    7. Bouteska, Ahmed & Sharif, Taimur & Abedin, Mohammad Zoynul, 2023. "COVID-19 and stock returns: Evidence from the Markov switching dependence approach," Research in International Business and Finance, Elsevier, vol. 64(C).
    8. Kim, Hyun Hak & Swanson, Norman R., 2018. "Mining big data using parsimonious factor, machine learning, variable selection and shrinkage methods," International Journal of Forecasting, Elsevier, vol. 34(2), pages 339-354.
    9. Byrd, Michael & Nghiem, Linh H. & McGee, Monnie, 2021. "Bayesian regularization of Gaussian graphical models with measurement error," Computational Statistics & Data Analysis, Elsevier, vol. 156(C).
    10. Hemant Kulkarni & Jayabrata Biswas & Kiranmoy Das, 2019. "A joint quantile regression model for multiple longitudinal outcomes," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 103(4), pages 453-473, December.
    11. Tatiyana V. Apanasovich & David Ruppert & Joanne R. Lupton & Natasa Popovic & Nancy D. Turner & Robert S. Chapkin & Raymond J. Carroll, 2008. "Aberrant Crypt Foci and Semiparametric Modeling of Correlated Binary Data," Biometrics, The International Biometric Society, vol. 64(2), pages 490-500, June.
    12. BAUWENS, Luc & HAUTSCH, Nikolaus, 2003. "Dynamic latent factor models for intensity processes," LIDAM Discussion Papers CORE 2003103, Université catholique de Louvain, Center for Operations Research and Econometrics (CORE).
    13. Sylvia Fruhwirth-Schnatter, 2023. "Generalized Cumulative Shrinkage Process Priors with Applications to Sparse Bayesian Factor Analysis," Papers 2303.00473, arXiv.org.
    14. Duo Jiang & Thomas Sharpton & Yuan Jiang, 2021. "Microbial Interaction Network Estimation via Bias-Corrected Graphical Lasso," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 13(2), pages 329-350, July.
    15. Li, Feng & Kang, Yanfei, 2018. "Improving forecasting performance using covariate-dependent copula models," International Journal of Forecasting, Elsevier, vol. 34(3), pages 456-476.
    16. Lam, Clifford, 2008. "Estimation of large precision matrices through block penalization," LSE Research Online Documents on Economics 31543, London School of Economics and Political Science, LSE Library.
    17. Giraud Christophe & Huet Sylvie & Verzelen Nicolas, 2012. "Graph Selection with GGMselect," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(3), pages 1-52, February.
    18. Seunghwan Lee & Sang Cheol Kim & Donghyeon Yu, 2023. "An efficient GPU-parallel coordinate descent algorithm for sparse precision matrix estimation via scaled lasso," Computational Statistics, Springer, vol. 38(1), pages 217-242, March.
    19. Ricardo Smith Ramírez, 2007. "FIML estimation of treatment effect models with endogenous selection and multiple censored responses via a Monte Carlo EM Algorithm," Working papers DTE 403, CIDE, División de Economía.
    20. Matteo Iacopini & Carlo R.M.A. Santagiustina, 2021. "Filtering the intensity of public concern from social media count data with jumps," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(4), pages 1283-1302, October.
