Printed from https://ideas.repec.org/a/wly/envmet/v32y2021i7ne2681.html

Spatiotemporal clustering using Gaussian processes embedded in a mixture model

Author

Listed:
  • Jarno Vanhatalo
  • Scott D. Foster
  • Geoffrey R. Hosack

Abstract

The categorization of multidimensional data into clusters is a common task in statistics. Many applications of clustering, including the majority of tasks in ecology, use data that is inherently spatial and is often also temporal. However, spatiotemporal dependence is typically ignored when clustering multivariate data. We present a finite mixture model for spatial and spatiotemporal clustering that incorporates spatial and spatiotemporal autocorrelation by including appropriate Gaussian processes (GP) into a model for the mixing proportions. We also allow for flexible and semiparametric dependence on environmental covariates, once again using GPs. We propose to use Bayesian inference through three tiers of approximate methods: a Laplace approximation that allows efficient analysis of large datasets, and both partial and full Markov chain Monte Carlo (MCMC) approaches that improve accuracy at the cost of increased computational time. Comparison of the methods shows that the Laplace approximation is a useful alternative to the MCMC methods. A decadal analysis of 253 species of teleost fish from 854 samples collected along the biodiverse northwestern continental shelf of Australia between 1986 and 1997 shows the added clarity provided by accounting for spatial autocorrelation. For these data, the temporal dependence is comparatively small, which is an important finding given the changing human pressures over this time.
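
The core idea in the abstract, placing Gaussian processes inside the mixing proportions of a finite mixture, can be illustrated with a minimal simulation sketch. This is not the authors' implementation. The squared-exponential kernel, the softmax link, and all names (`sq_exp_kernel`, `gp_mixing_proportions`, `length_scale`, `jitter`) are assumptions chosen for illustration, and the site coordinates are synthetic.

```python
import numpy as np

def sq_exp_kernel(coords, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between all pairs of site coordinates.

    Assumed kernel for illustration; the abstract does not name a kernel.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    return variance * np.exp(-0.5 * sq_dist / length_scale ** 2)

def gp_mixing_proportions(coords, n_clusters, rng, length_scale=1.0, jitter=1e-6):
    """Draw one latent GP per cluster and map the draws to proportions.

    The softmax link is an assumption; it guarantees that the proportions
    at each site are non-negative and sum to one.
    """
    n_sites = coords.shape[0]
    # Jitter on the diagonal keeps the Cholesky factorization stable.
    cov = sq_exp_kernel(coords, length_scale) + jitter * np.eye(n_sites)
    chol = np.linalg.cholesky(cov)
    # Each column is one spatially correlated latent function.
    latent = chol @ rng.standard_normal((n_sites, n_clusters))
    # Subtract the row maximum before exponentiating for numerical stability.
    latent = latent - latent.max(axis=1, keepdims=True)
    weights = np.exp(latent)
    return weights / weights.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 10.0, size=(50, 2))  # synthetic site locations
pi = gp_mixing_proportions(coords, n_clusters=3, rng=rng, length_scale=2.0)

# Cluster labels are drawn site by site from the spatially varying proportions,
# so nearby sites tend to share a cluster.
labels = np.array([rng.choice(3, p=row) for row in pi])
```

Because the latent functions are spatially smooth, the proportions `pi` change gradually across space. A model of this kind can therefore capture the spatial autocorrelation that an ordinary mixture with constant proportions would ignore. Covariate dependence and temporal structure could be added by extending the kernel inputs, which this sketch does not attempt.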

Suggested Citation

  • Jarno Vanhatalo & Scott D. Foster & Geoffrey R. Hosack, 2021. "Spatiotemporal clustering using Gaussian processes embedded in a mixture model," Environmetrics, John Wiley & Sons, Ltd., vol. 32(7), November.
  • Handle: RePEc:wly:envmet:v:32:y:2021:i:7:n:e2681
    DOI: 10.1002/env.2681

    Download full text from publisher

    File URL: https://doi.org/10.1002/env.2681
    Download Restriction: no

    File URL: https://libkey.io/10.1002/env.2681?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Douglas R. M. Azevedo & Marcos O. Prates & Dipankar Bandyopadhyay, 2021. "MSPOCK: Alleviating Spatial Confounding in Multivariate Disease Mapping Models," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 26(3), pages 464-491, September.
    2. Matthew J. Heaton & Abhirup Datta & Andrew O. Finley & Reinhard Furrer & Joseph Guinness & Rajarshi Guhaniyogi & Florian Gerber & Robert B. Gramacy & Dorit Hammerling & Matthias Katzfuss & Finn Lindgr, 2019. "A Case Study Competition Among Methods for Analyzing Large Spatial Data," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 24(3), pages 398-425, September.
    3. Scott D. Foster & Nicole A. Hill & Mitchell Lyons, 2017. "Ecological grouping of survey sites when sampling artefacts are present," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 66(5), pages 1031-1047, November.
    4. Francesco Bartolucci & Alessio Farcomeni, 2022. "A hidden Markov space–time model for mapping the dynamics of global access to food," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(1), pages 246-266, January.
    5. Sijia Xiang & Weixin Yao, 2020. "Semiparametric mixtures of regressions with single-index for model based clustering," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 14(2), pages 261-292, June.
    6. Janine B. Illian & David F. R. P. Burslem, 2017. "Improving the usability of spatial point process methodology: an interdisciplinary dialogue between statistics and ecology," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 101(4), pages 495-520, October.
    7. E. Lázaro & C. Armero & V. Gómez-Rubio, 2020. "Approximate Bayesian inference for mixture cure models," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 29(3), pages 750-767, September.
    8. Erin M. Schliep, 2018. "Comments on: Process modeling for slope and aspect with application to elevation data maps," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 27(4), pages 778-782, December.
    9. Rufo, M.J. & Perez, C.J. & Martin, J., 2007. "Bayesian analysis of finite mixtures of multinomial and negative-multinomial distributions," Computational Statistics & Data Analysis, Elsevier, vol. 51(11), pages 5452-5466, July.
    10. K. Shuvo Bakar & Nicholas Biddle & Philip Kokic & Huidong Jin, 2020. "A Bayesian spatial categorical model for prediction to overlapping geographical areas in sample surveys," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(2), pages 535-563, February.
    11. Nikoline N. Knudsen & Jörg Schullehner & Birgitte Hansen & Lisbeth F. Jørgensen & Søren M. Kristiansen & Denitza D. Voutchkova & Thomas A. Gerds & Per K. Andersen & Kristine Bihrmann & Morten Grønbæk, 2017. "Lithium in Drinking Water and Incidence of Suicide: A Nationwide Individual-Level Cohort Study with 22 Years of Follow-Up," IJERPH, MDPI, vol. 14(6), pages 1-13, June.
    12. Cho, Daegon & Hwang, Youngdeok & Park, Jongwon, 2018. "More buzz, more vibes: Impact of social media on concert distribution," Journal of Economic Behavior & Organization, Elsevier, vol. 156(C), pages 103-113.
    13. Brown, Paul T. & Joshi, Chaitanya & Joe, Stephen & Rue, Håvard, 2021. "A novel method of marginalisation using low discrepancy sequences for integrated nested Laplace approximations," Computational Statistics & Data Analysis, Elsevier, vol. 157(C).
    14. Yao, Weixin & Wei, Yan & Yu, Chun, 2014. "Robust mixture regression using the t-distribution," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 116-127.
    15. Michaela Prokešová & Eva Jensen, 2013. "Asymptotic Palm likelihood theory for stationary point processes," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 65(2), pages 387-412, April.
    16. Mayer Alvo & Jingrui Mu, 2023. "COVID-19 Data Analysis Using Bayesian Models and Nonparametric Geostatistical Models," Mathematics, MDPI, vol. 11(6), pages 1-13, March.
    17. Jeong Eun Lee & Christian Robert, 2013. "Importance Sampling Schemes for Evidence Approximation in Mixture Models," Working Papers 2013-42, Center for Research in Economics and Statistics.
    18. Aßmann, Christian & Boysen-Hogrefe, Jens & Pape, Markus, 2012. "The directional identification problem in Bayesian factor analysis: An ex-post approach," Kiel Working Papers 1799, Kiel Institute for the World Economy (IfW Kiel).
    19. Sun-Joo Cho & Allan S. Cohen, 2010. "A Multilevel Mixture IRT Model With an Application to DIF," Journal of Educational and Behavioral Statistics, vol. 35(3), pages 336-370, June.
    20. Yuan Yan & Eva Cantoni & Chris Field & Margaret Treble & Joanna Mills Flemming, 2023. "Spatiotemporal modeling of mature‐at‐length data using a sliding window approach," Environmetrics, John Wiley & Sons, Ltd., vol. 34(2), March.
