Authors
- Tatjana Pavlenko
(KTH Royal Institute of Technology, Department of Mathematics)
- Annika Tillander
(Linköping University, Department of Statistics and Machine Learning)
- Justine Debelius
(Karolinska Institutet, The Centre for Translational Microbiome Research (CTMR), Department of Microbiology, Tumor, and Cell Biology)
- Fredrik Boulund
(Karolinska Institutet, The Centre for Translational Microbiome Research (CTMR), Department of Microbiology, Tumor, and Cell Biology)
Abstract
We present a family of goodness-of-fit (GOF) test statistics specifically designed for the detection of sparse-weak mixtures, where only a small fraction of the observational units are contaminated, arising from a different distribution. The test statistics are constructed as sup-functionals of weighted empirical processes, where the weight functions employed are the Chibisov-O’Reilly functions of a Brownian bridge. The study recovers and extends a number of previously known results on sparse detection using a weighted GOF (wGOF) approach. In particular, the results demonstrate the advantage of our approach over a common approach that uses a family of regularly varying weight functions. We show that the Chibisov-O’Reilly family has important advantages over better-known approaches, as it allows for optimally adaptive, fully data-driven test procedures. The theory is further developed to demonstrate that the entire family is a flexible device that adapts to many situations in modern scientific practice where the number of observations stays fixed or grows very slowly, while the number of automatically measured features grows dramatically and only a small fraction of these features are useful. Numerical studies are performed to investigate the finite-sample properties of the theoretical results. We show that the Chibisov-O’Reilly family compares favorably to related test statistics over a broad range of sparsity and weakness regimes for the Gaussian and high-dimensional Dirichlet types of sparse mixture. Finally, an example based on a human gut microbiome data set is presented to illustrate that the family of tests is applicable to real-life sparse signal detection problems where the sample size is small relative to the feature dimension.
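The construction described in the abstract, a supremum of a weighted empirical process, can be sketched roughly as follows. This is only an illustration under stated assumptions and not the chapter's procedure: the function name `weighted_sup_statistic`, the one-sided supremum taken over the sorted p-values, the clipping of p-values to [1/(2n), 1 - 1/(2n)], and the particular weight q(t) = sqrt(t(1-t) log log(1/(t(1-t)))) (used here as one candidate weight of Chibisov-O’Reilly type) are all assumptions not confirmed by this record. The `"sd"` option uses the standard-deviation weight sqrt(t(1-t)) for comparison.

```python
import math


def weighted_sup_statistic(pvalues, weight="loglog"):
    """One-sided weighted sup statistic for p-values under a uniform null.

    Hypothetical helper for illustration only; the weights and the
    clipping rule are assumptions, not taken from the chapter.
    """
    p = sorted(float(x) for x in pvalues)
    n = len(p)
    if n == 0:
        raise ValueError("need at least one p-value")
    eps = 1.0 / (2 * n)  # assumed clipping level, keeps the weight away from 0
    best = -math.inf
    for i, x in enumerate(p, start=1):
        t = min(max(x, eps), 1.0 - eps)
        var = t * (1.0 - t)  # variance of a Brownian bridge at t
        if weight == "sd":
            q = math.sqrt(var)
        elif weight == "loglog":
            # 1/var >= 4, so log(log(1/var)) is strictly positive
            q = math.sqrt(var * math.log(math.log(1.0 / var)))
        else:
            raise ValueError("unknown weight: " + weight)
        # empirical CDF at the i-th order statistic is i/n
        best = max(best, math.sqrt(n) * (i / n - t) / q)
    return best
```

Large values of the statistic indicate an excess of small p-values relative to the uniform null, which is the pattern a sparse mixture would produce; calibrating a rejection threshold (for example by simulation under the null) is left out of this sketch.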
Suggested Citation
Tatjana Pavlenko & Annika Tillander & Justine Debelius & Fredrik Boulund, 2020.
"Detection of Sparse and Weak Effects in High-Dimensional Feature Space, with an Application to Microbiome Data Analysis,"
Springer Books, in: Thomas Holgersson & Martin Singull (ed.), Recent Developments in Multivariate and Random Matrix Analysis, chapter 0, pages 287-311,
Springer.
Handle:
RePEc:spr:sprchp:978-3-030-56773-6_17
DOI: 10.1007/978-3-030-56773-6_17