
Constructing a polygenic risk score for childhood obesity using functional data analysis

Authors
  • Craig, Sarah J.C.
  • Kenney, Ana M.
  • Lin, Junli
  • Paul, Ian M.
  • Birch, Leann L.
  • Savage, Jennifer S.
  • Marini, Michele E.
  • Chiaromonte, Francesca
  • Reimherr, Matthew L.
  • Makova, Kateryna D.

Abstract

Obesity is a highly heritable condition that affects increasing numbers of adults and, concerningly, of children. However, only a small fraction of its heritability has been attributed to specific genetic variants. These variants are traditionally ascertained from genome-wide association studies (GWAS), which utilize samples with tens or hundreds of thousands of individuals for whom a single summary measurement (e.g., BMI) is collected. An alternative approach is to focus on a smaller, more deeply characterized sample in conjunction with advanced statistical models that leverage longitudinal phenotypes. Novel functional data analysis (FDA) techniques are used to capitalize on longitudinal growth information from a cohort of children between birth and three years of age. In an ultra-high dimensional setting, hundreds of thousands of single nucleotide polymorphisms (SNPs) are screened, and selected SNPs are used to construct two polygenic risk scores (PRS) for childhood obesity using a weighting approach that incorporates the dynamic and joint nature of SNP effects. These scores are significantly higher in children with (vs. without) rapid infant weight gain—a predictor of obesity later in life. Using two independent cohorts, it is shown that the genetic variants identified in very young children are also informative in older children and in adults, consistent with early childhood obesity being predictive of obesity later in life. In contrast, PRSs based on SNPs identified by adult obesity GWAS are not predictive of weight gain in the cohort of young children. This provides an example of a successful application of FDA to GWAS. This application is complemented with simulations establishing that a deeply characterized sample can be just as, if not more, effective than a comparable study with a cross-sectional response. 
Overall, it is demonstrated that a deep, statistically sophisticated characterization of a longitudinal phenotype can provide increased statistical power to studies with relatively small sample sizes, and it is shown how FDA approaches can be used as an alternative to traditional GWAS.
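
The screening and scoring steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes growth curves observed on a common time grid, screens each SNP marginally (one at a time) rather than jointly, and weights each selected SNP by the time average of its fitted effect curve. The function names `marginal_effect_curves`, `screen_snps`, and `polygenic_risk_score` are hypothetical and chosen for illustration only.

```python
import numpy as np


def marginal_effect_curves(genotypes, curves):
    """Fit one least-squares effect curve per SNP, one SNP at a time.

    genotypes: (n, p) array of allele counts (0, 1, or 2).
    curves:    (n, T) array of growth curves on a common time grid (assumption).
    Returns (effects, strength): effects is (p, T); strength is (p,) and
    measures how much curve variation each SNP explains on its own.
    """
    g = genotypes - genotypes.mean(axis=0)   # center each SNP
    y = curves - curves.mean(axis=0)         # center each time point
    cross = g.T @ y                          # (p, T) cross-products
    ss = (g ** 2).sum(axis=0)                # (p,) genotype sums of squares
    safe_ss = np.where(ss > 0, ss, 1.0)      # monomorphic SNPs keep a zero effect
    effects = cross / safe_ss[:, None]
    strength = np.linalg.norm(cross, axis=1) / np.sqrt(safe_ss)
    return effects, strength


def screen_snps(strength, keep):
    """Return the sorted indices of the `keep` SNPs with the largest strength."""
    top = np.argsort(strength)[::-1][:keep]
    return np.sort(top)


def polygenic_risk_score(genotypes, effects, selected):
    """Weighted allele count per individual.

    Each weight is the time average of that SNP's effect curve, which is an
    illustrative choice and not the weighting used in the paper.
    """
    weights = effects[selected].mean(axis=1)
    return genotypes[:, selected] @ weights
```

A score built this way is a single number per child, so it can be compared between groups (for example, children with and without rapid infant weight gain) in the way the abstract describes. The paper's weighting accounts for the joint nature of SNP effects, which this marginal sketch does not.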

Suggested Citation

  • Craig, Sarah J.C. & Kenney, Ana M. & Lin, Junli & Paul, Ian M. & Birch, Leann L. & Savage, Jennifer S. & Marini, Michele E. & Chiaromonte, Francesca & Reimherr, Matthew L. & Makova, Kateryna D., 2023. "Constructing a polygenic risk score for childhood obesity using functional data analysis," Econometrics and Statistics, Elsevier, vol. 25(C), pages 66-86.
  • Handle: RePEc:eee:ecosta:v:25:y:2023:i:c:p:66-86
    DOI: 10.1016/j.ecosta.2021.10.014


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zhang, Shucong & Zhou, Yong, 2018. "Variable screening for ultrahigh dimensional heterogeneous data via conditional quantile correlations," Journal of Multivariate Analysis, Elsevier, vol. 165(C), pages 1-13.
    2. Wang, Christina Dan & Chen, Zhao & Lian, Yimin & Chen, Min, 2022. "Asset selection based on high frequency Sharpe ratio," Journal of Econometrics, Elsevier, vol. 227(1), pages 168-188.
    3. Zhang, Shucong & Pan, Jing & Zhou, Yong, 2018. "Robust conditional nonparametric independence screening for ultrahigh-dimensional data," Statistics & Probability Letters, Elsevier, vol. 143(C), pages 95-101.
    4. Chen, Xiaolin & Chen, Xiaojing & Wang, Hong, 2018. "Robust feature screening for ultra-high dimensional right censored data via distance correlation," Computational Statistics & Data Analysis, Elsevier, vol. 119(C), pages 118-138.
    5. Zhong, Wei & Wang, Jiping & Chen, Xiaolin, 2021. "Censored mean variance sure independence screening for ultrahigh dimensional survival data," Computational Statistics & Data Analysis, Elsevier, vol. 159(C).
    6. Xiaolin Chen & Xiaojing Chen & Yi Liu, 2019. "A note on quantile feature screening via distance correlation," Statistical Papers, Springer, vol. 60(5), pages 1741-1762, October.
    7. Yi Chu & Lu Lin, 2020. "Conditional SIRS for nonparametric and semiparametric models by marginal empirical likelihood," Statistical Papers, Springer, vol. 61(4), pages 1589-1606, August.
    8. Xiang-Jie Li & Xue-Jun Ma & Jing-Xiao Zhang, 2017. "Robust feature screening for varying coefficient models via quantile partial correlation," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 80(1), pages 17-49, January.
    9. Loann David Denis Desboulets, 2018. "A Review on Variable Selection in Regression Analysis," Econometrics, MDPI, vol. 6(4), pages 1-27, November.
    10. He, Yong & Zhang, Liang & Ji, Jiadong & Zhang, Xinsheng, 2019. "Robust feature screening for elliptical copula regression model," Journal of Multivariate Analysis, Elsevier, vol. 173(C), pages 568-582.
    11. Min Chen & Yimin Lian & Zhao Chen & Zhengjun Zhang, 2017. "Sure explained variability and independence screening," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 29(4), pages 849-883, October.
    12. Yang, Baoying & Yin, Xiangrong & Zhang, Nan, 2019. "Sufficient variable selection using independence measures for continuous response," Journal of Multivariate Analysis, Elsevier, vol. 173(C), pages 480-493.
    13. Dai, Linlin & Chen, Kani & Sun, Zhihua & Liu, Zhenqiu & Li, Gang, 2018. "Broken adaptive ridge regression and its asymptotic properties," Journal of Multivariate Analysis, Elsevier, vol. 168(C), pages 334-351.
    14. Ke, Chenlu & Yang, Wei & Yuan, Qingcong & Li, Lu, 2023. "Partial sufficient variable screening with categorical controls," Computational Statistics & Data Analysis, Elsevier, vol. 187(C).
    15. Li, Xingxiang & Cheng, Guosheng & Wang, Liming & Lai, Peng & Song, Fengli, 2017. "Ultrahigh dimensional feature screening via projection," Computational Statistics & Data Analysis, Elsevier, vol. 114(C), pages 88-104.
    16. Liming Wang & Xingxiang Li & Xiaoqing Wang & Peng Lai, 2022. "Unified mean-variance feature screening for ultrahigh-dimensional regression," Computational Statistics, Springer, vol. 37(4), pages 1887-1918, September.
    17. Xia, Xiaochao & Yang, Hu & Li, Jialiang, 2016. "Feature screening for generalized varying coefficient models with application to dichotomous responses," Computational Statistics & Data Analysis, Elsevier, vol. 102(C), pages 85-97.
    18. Jing Zhang & Haibo Zhou & Yanyan Liu & Jianwen Cai, 2021. "Conditional screening for ultrahigh-dimensional survival data in case-cohort studies," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 27(4), pages 632-661, October.
    19. Dong, Yuexiao & Yu, Zhou & Zhu, Liping, 2020. "Model-free variable selection for conditional mean in regression," Computational Statistics & Data Analysis, Elsevier, vol. 152(C).
    20. Ma, Xuejun & Zhang, Jingxiao, 2016. "Robust model-free feature screening via quantile correlation," Journal of Multivariate Analysis, Elsevier, vol. 143(C), pages 472-480.
