Printed from https://ideas.repec.org/a/oup/biomet/v113y2026i1pasaf057..html

Decomposing Gaussians with unknown covariance

Author

Listed:
  • A Dharamshi
  • A Neufeld
  • L L Gao
  • J Bien
  • D Witten

Abstract

Common workflows in machine learning and statistics rely on the ability to partition the information in a dataset into independent portions. Recent work has shown that this may be possible even when conventional sample splitting is not, such as when the number of samples, n, is one or when observations are not independent and identically distributed. In the case of multivariate Gaussian data, these alternatives to sample splitting require knowledge of the covariance matrix. In many important problems, such as in spatial or longitudinal data analysis and in graphical modelling, the covariance matrix may be unknown and even of primary interest. Therefore, in this work we develop new approaches for decomposing multivariate Gaussians with unknown covariance. First, we present a general algorithm that encompasses all previous decomposition methods for Gaussian data as special cases and which can further handle the case of unknown covariance. It yields a new and more flexible alternative to sample splitting when n > 1. When n = 1, we prove that it is impossible to partition the information in a multivariate Gaussian into independent portions without knowing the covariance matrix. Hence, we use the general algorithm to decompose a single multivariate Gaussian with unknown covariance into dependent parts with tractable conditional distributions and demonstrate their use for inference and validation. The proposed decomposition strategy extends naturally to Gaussian processes. In simulations and for electroencephalography data, we apply these decompositions to the tasks of model selection and post-selection inference in settings where alternative strategies are unavailable.
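To make the known-covariance setting mentioned in the abstract concrete, the sketch below shows one standard way to split a single Gaussian draw into two independent parts when the covariance matrix is known: add and subtract extra noise drawn with that same covariance. It is not the paper's general algorithm, only an illustration of why prior decompositions need the covariance as an input. The function names (`cholesky`, `draw_gaussian`, `split_known_covariance`, `max_abs_cross_cov`), the tuning parameter `tau`, and the example values of `mu` and `sigma` are all assumptions made for this illustration.

```python
import math
import random


def cholesky(sigma):
    """Lower-triangular L with L L^T = sigma (sigma symmetric positive definite)."""
    p = len(sigma)
    L = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                L[i][j] = (sigma[i][j] - s) / L[j][j]
    return L


def draw_gaussian(mean, chol, rng):
    """One draw from N(mean, chol chol^T)."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(chol[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mean)]


def split_known_covariance(x, sigma, tau, rng):
    """Split one draw x ~ N(mu, sigma) into two independent Gaussian parts.

    Assumption: sigma is known. With noise w ~ N(0, sigma) independent of x,
    x1 = x + tau * w and x2 = x - w / tau satisfy
    Cov(x1, x2) = sigma - sigma = 0, so the jointly Gaussian parts are independent.
    """
    w = draw_gaussian([0.0] * len(x), cholesky(sigma), rng)
    x1 = [xi + tau * wi for xi, wi in zip(x, w)]
    x2 = [xi - wi / tau for xi, wi in zip(x, w)]
    return x1, x2


def max_abs_cross_cov(parts1, parts2):
    """Largest absolute entry of the sample cross-covariance between the parts."""
    n, p = len(parts1), len(parts1[0])
    m1 = [sum(v[i] for v in parts1) / n for i in range(p)]
    m2 = [sum(v[j] for v in parts2) / n for j in range(p)]
    return max(
        abs(sum((a[i] - m1[i]) * (b[j] - m2[j])
                for a, b in zip(parts1, parts2)) / (n - 1))
        for i in range(p) for j in range(p)
    )


# Illustrative values only (not taken from the paper).
rng = random.Random(0)
mu = [1.0, -2.0]
sigma = [[2.0, 0.6], [0.6, 1.0]]
chol = cholesky(sigma)

firsts, seconds = [], []
for _ in range(5000):
    x = draw_gaussian(mu, chol, rng)
    x1, x2 = split_known_covariance(x, sigma, tau=1.0, rng=rng)
    firsts.append(x1)
    seconds.append(x2)

# Should be close to zero, reflecting independence of the two parts.
print("max |sample cross-covariance|:", max_abs_cross_cov(firsts, seconds))
```

The construction only works because the added noise has exactly the covariance of the data. If `sigma` is misspecified, the two parts remain correlated, which is the difficulty the abstract describes for the unknown-covariance case.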

Suggested Citation

  • A Dharamshi & A Neufeld & L L Gao & J Bien & D Witten, 2026. "Decomposing Gaussians with unknown covariance," Biometrika, Biometrika Trust, vol. 113(1), asaf057.
  • Handle: RePEc:oup:biomet:v:113:y:2026:i:1:p:asaf057.

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1093/biomet/asaf057
    Download Restriction: Access to full text is restricted to subscribers.

