Printed from https://ideas.repec.org/a/spr/jagbes/v29y2024i2d10.1007_s13253-023-00574-x.html

Covariance Clustering: Modelling Covariance in Designed Experiments When the Number of Variables is Greater than Experimental Units

Author

Listed:
  • Clayton R. Forknall

    (The University of Queensland
    Queensland Department of Agriculture and Fisheries)

  • Arūnas P. Verbyla

    (The University of Queensland)

  • Yoni Nazarathy

    (The University of Queensland)

  • Adel Yousif

    (University of Tasmania)

  • Sarah Osama

    (Department of Regional New South Wales)

  • Shirley H. Jones

    (University of Southern Queensland)

  • Edward Kerr

    (The University of Queensland)

  • Benjamin L. Schulz

    (The University of Queensland)

  • Glen P. Fox

    (University of California)

  • Alison M. Kelly

    (The University of Queensland)

Abstract

The size and complexity of datasets resulting from comparative research experiments in the agricultural domain are constantly increasing. Often, the number of variables measured in an experiment exceeds the number of experimental units. When the covariance relationships between variables in these experiments must be modelled, estimation difficulties can arise because the resulting covariance structure is of reduced rank. A statistical method, based on a linear mixed model framework, is presented for the analysis of designed experiments in which datasets have more variables than experimental units and for which the modelling of complex covariance structures between variables is desired. Aided by a clustering algorithm, the method enables the estimation of covariance by introducing covariance clusters as random effects into the modelling framework, extending the traditional variance components model for building covariance structures. The method was applied to a multi-phase mass spectrometry-based proteomics experiment, with the aim of exploring changes in the proteome of barley grain over time during the malting process. The modelling approach provides a new linear mixed model-based method for estimating covariance structures between variables measured in designed experiments when only a small number of experimental units, or observations, inform the covariance parameter estimates.
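The reduced-rank problem described in the abstract, and the general idea of building covariance from cluster-level random effects, can be illustrated with a small numerical sketch. This is not the authors' implementation: the function names, the simulated data, the cluster labels and the variance values are all hypothetical, and the sketch assumes that variables in the same cluster share a single random effect, so that the implied covariance has the variance components form Z G Z' + R. The clustering step itself and the multi-phase experimental design are not shown.

```python
import numpy as np


def sample_covariance_rank(n_units, n_vars, seed=0):
    """Rank of the sample covariance of simulated data (rows = units)."""
    rng = np.random.default_rng(seed)
    data = rng.normal(size=(n_units, n_vars))  # hypothetical measurements
    cov = np.cov(data, rowvar=False)           # n_vars x n_vars matrix
    return int(np.linalg.matrix_rank(cov))


def cluster_covariance(labels, cluster_var, residual_var):
    """Covariance between variables implied by shared cluster effects.

    Assumption for illustration only: every variable in a cluster shares
    one random effect with variance `cluster_var`, and each variable has
    independent residual variance `residual_var`, giving Z G Z' + R.
    """
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    # Z maps each variable (row) to its cluster (column).
    Z = (labels[:, None] == clusters[None, :]).astype(float)
    # G holds one variance component per cluster (scalar is broadcast).
    G = np.diag(np.broadcast_to(cluster_var, clusters.shape).astype(float))
    return Z @ G @ Z.T + residual_var * np.eye(labels.size)


if __name__ == "__main__":
    n_units, n_vars = 6, 20
    rank = sample_covariance_rank(n_units, n_vars)
    # After centring, the sample covariance has rank at most n_units - 1.
    print("sample covariance rank:", rank, "of", n_vars)

    labels = [0, 0, 1, 1, 2]  # hypothetical cluster assignment
    V = cluster_covariance(labels, cluster_var=0.5, residual_var=1.0)
    print("cluster-implied covariance rank:", np.linalg.matrix_rank(V))
```

With more variables than units, the unstructured sample covariance cannot be of full rank, whereas the clustered structure is positive definite whenever the residual variance is positive and needs only one parameter per cluster plus a residual variance, rather than one parameter per pair of variables.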

Suggested Citation

  • Clayton R. Forknall & Arūnas P. Verbyla & Yoni Nazarathy & Adel Yousif & Sarah Osama & Shirley H. Jones & Edward Kerr & Benjamin L. Schulz & Glen P. Fox & Alison M. Kelly, 2024. "Covariance Clustering: Modelling Covariance in Designed Experiments When the Number of Variables is Greater than Experimental Units," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 29(2), pages 232-256, June.
  • Handle: RePEc:spr:jagbes:v:29:y:2024:i:2:d:10.1007_s13253-023-00574-x
    DOI: 10.1007/s13253-023-00574-x

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s13253-023-00574-x
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



