Infinite hidden Markov models for multiple multivariate time series with missing data

Authors

  • Lauren Hoskovec
  • Matthew D. Koslovsky
  • Kirsten Koehler
  • Nicholas Good
  • Jennifer L. Peel
  • John Volckens
  • Ander Wilson

Abstract

Exposure to air pollution is associated with increased morbidity and mortality. Recent technological advancements permit the collection of time‐resolved personal exposure data. Such data are often incomplete with missing observations and exposures below the limit of detection, which limit their use in health effects studies. In this paper, we develop an infinite hidden Markov model for multiple asynchronous multivariate time series with missing data. Our model is designed to include covariates that can inform transitions among hidden states. We implement beam sampling, a combination of slice sampling and dynamic programming, to sample the hidden states, and a Bayesian multiple imputation algorithm to impute missing data. In simulation studies, our model excels in estimating hidden states and state‐specific means and imputing observations that are missing at random or below the limit of detection. We validate our imputation approach on data from the Fort Collins Commuter Study. We show that the estimated hidden states improve imputations for data that are missing at random compared to existing approaches. In a case study of the Fort Collins Commuter Study, we describe the inferential gains obtained from our model including improved imputation of missing data and the ability to identify shared patterns in activity and exposure among repeated sampling days for individuals and among distinct individuals.
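
The abstract describes beam sampling as a combination of slice sampling and dynamic programming. As a rough illustration only, the sketch below runs one beam-sampling sweep for a single series under simplifying assumptions that are not taken from the paper: a fixed, finite number of states K, a known transition matrix, no covariates, and no missing data. The function name `beam_sample_states` and all parameter names are hypothetical. Auxiliary slice variables restrict each time step to transitions whose probability exceeds the slice, a forward pass accumulates filtered state probabilities over the allowed transitions, and the states are then drawn backward.

```python
import numpy as np


def beam_sample_states(states, pi0, pi, lik, rng):
    """One beam-sampling sweep over a hidden state sequence (illustrative sketch).

    states : (T,) current integer state sequence
    pi0    : (K,) initial state probabilities
    pi     : (K, K) transition matrix, rows sum to 1
    lik    : (T, K) likelihood of each observation under each state
    rng    : numpy.random.Generator
    """
    T, K = lik.shape

    # Slice step: u_t ~ Uniform(0, probability of the transition used at time t).
    used = np.empty(T)
    used[0] = pi0[states[0]]
    used[1:] = pi[states[:-1], states[1:]]
    u = rng.uniform(0.0, used)

    # Forward pass (dynamic programming) over transitions with probability > u_t.
    alpha = np.zeros((T, K))
    alpha[0] = lik[0] * (pi0 > u[0])
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        allowed = pi > u[t]  # (K, K) indicator of transitions that survive the slice
        alpha[t] = lik[t] * (alpha[t - 1] @ allowed)
        alpha[t] /= alpha[t].sum()

    # Backward sampling: draw the last state, then condition on each successor.
    new_states = np.empty(T, dtype=int)
    new_states[-1] = rng.choice(K, p=alpha[-1])
    for t in range(T - 2, -1, -1):
        w = alpha[t] * (pi[:, new_states[t + 1]] > u[t + 1])
        new_states[t] = rng.choice(K, p=w / w.sum())
    return new_states


# Toy usage with made-up values: two states with unit-variance Gaussian emissions.
rng = np.random.default_rng(0)
means = np.array([0.0, 5.0])  # hypothetical state-specific means
y = np.array([0.1, -0.3, 4.8, 5.2, 5.1, 0.2])
lik = np.exp(-0.5 * (y[:, None] - means[None, :]) ** 2)  # density up to a constant
pi0 = np.array([0.5, 0.5])
pi = np.array([[0.9, 0.1], [0.1, 0.9]])
states = np.zeros(len(y), dtype=int)
for _ in range(20):
    states = beam_sample_states(states, pi0, pi, lik, rng)
print(states)
```

In an infinite hidden Markov model the slice variables are what keep the number of states considered at each sweep finite; the fixed K here stands in for that mechanism and is a simplification.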

Suggested Citation

  • Lauren Hoskovec & Matthew D. Koslovsky & Kirsten Koehler & Nicholas Good & Jennifer L. Peel & John Volckens & Ander Wilson, 2023. "Infinite hidden Markov models for multiple multivariate time series with missing data," Biometrics, The International Biometric Society, vol. 79(3), pages 2592-2604, September.
  • Handle: RePEc:bla:biomet:v:79:y:2023:i:3:p:2592-2604
    DOI: 10.1111/biom.13715

    Download full text from publisher

    File URL: https://doi.org/10.1111/biom.13715
    Download Restriction: no
