Printed from https://ideas.repec.org/a/bla/jorssc/v71y2022i3p541-561.html

Regularized regression on compositional trees with application to MRI analysis

Author

Listed:
  • Bingkai Wang
  • Brian S. Caffo
  • Xi Luo
  • Chin‐Fu Liu
  • Andreia V. Faria
  • Michael I. Miller
  • Yi Zhao
  • for the Alzheimer's Disease Neuroimaging Initiative*

Abstract

A compositional tree refers to a tree structure on a set of random variables where each random variable is a node and composition occurs at each non‐leaf node of the tree. As a generalization of compositional data, compositional trees handle more complex relationships among random variables and appear in many disciplines, such as brain imaging, genomics and finance. We consider the problem of sparse regression on data that are associated with a compositional tree and propose a transformation‐free tree‐based regularized regression method for component selection. The regularization penalty is designed based on the tree structure and encourages a sparse tree representation. We prove that our proposed estimator for regression coefficients is both consistent and model selection consistent. In the simulation study, our method shows higher accuracy than competing methods under different scenarios. By analysing a brain imaging data set from studies of Alzheimer's disease, our method identifies meaningful associations between memory decline and volume of brain regions that are consistent with current understanding.
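As an illustration of the definition above, the following is a minimal sketch of one possible reading of a compositional tree: each non-leaf node's value is the sum of its children's values, so the children's proportions within that node sum to one. The `Node` class, the `compositions` helper, the region names and the numbers are all hypothetical and are not taken from the article. The sketch does not implement the proposed penalty or estimator.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """A node of a hypothetical compositional tree (illustrative only)."""
    name: str
    value: float = 0.0  # used only when the node is a leaf
    children: List["Node"] = field(default_factory=list)

    def total(self) -> float:
        # Assumption: a non-leaf node's value is the sum of its children's values.
        if not self.children:
            return self.value
        return sum(child.total() for child in self.children)


def compositions(root: Node) -> Dict[str, Dict[str, float]]:
    """Map each non-leaf node to its children's proportions of the node total."""
    out: Dict[str, Dict[str, float]] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if node.children:
            total = node.total()  # assumed positive in this toy example
            out[node.name] = {c.name: c.total() / total for c in node.children}
            stack.extend(node.children)
    return out


# Toy values invented for illustration; they are not data from the study.
brain = Node("brain", children=[
    Node("left", children=[Node("left_a", 30.0), Node("left_b", 10.0)]),
    Node("right", children=[Node("right_a", 45.0), Node("right_b", 15.0)]),
])
comp = compositions(brain)
```

In this sketch every non-leaf node (`brain`, `left`, `right`) carries its own composition, which is the sense in which a compositional tree generalizes a single compositional vector.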

Suggested Citation

  • Bingkai Wang & Brian S. Caffo & Xi Luo & Chin‐Fu Liu & Andreia V. Faria & Michael I. Miller & Yi Zhao & for the Alzheimer's Disease Neuroimaging Initiative*, 2022. "Regularized regression on compositional trees with application to MRI analysis," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 71(3), pages 541-561, June.
  • Handle: RePEc:bla:jorssc:v:71:y:2022:i:3:p:541-561
    DOI: 10.1111/rssc.12545

    Download full text from publisher

    File URL: https://doi.org/10.1111/rssc.12545
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yi Zhao & Bingkai Wang & Chin‐Fu Liu & Andreia V. Faria & Michael I. Miller & Brian S. Caffo & Xi Luo, 2023. "Identifying brain hierarchical structures associated with Alzheimer's disease using a regularized regression method with tree predictors," Biometrics, The International Biometric Society, vol. 79(3), pages 2333-2345, September.
    2. Jacob Fiksel & Scott Zeger & Abhirup Datta, 2022. "A transformation‐free linear regression for compositional outcomes and predictors," Biometrics, The International Biometric Society, vol. 78(3), pages 974-987, September.
    3. Xiaofei Wu & Rongmei Liang & Hu Yang, 2022. "Penalized and constrained LAD estimation in fixed and high dimension," Statistical Papers, Springer, vol. 63(1), pages 53-95, February.
    4. Haixiang Zhang & Jun Chen & Zhigang Li & Lei Liu, 2021. "Testing for Mediation Effect with Application to Human Microbiome Data," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 13(2), pages 313-328, July.
    5. Kwon, Sunghoon & Oh, Seungyoung & Lee, Youngjo, 2016. "The use of random-effect models for high-dimensional variable selection problems," Computational Statistics & Data Analysis, Elsevier, vol. 103(C), pages 401-412.
    6. Howard D. Bondell & Brian J. Reich, 2009. "Simultaneous Factor Selection and Collapsing Levels in ANOVA," Biometrics, The International Biometric Society, vol. 65(1), pages 169-177, March.
    7. Laura Freijeiro‐González & Manuel Febrero‐Bande & Wenceslao González‐Manteiga, 2022. "A Critical Review of LASSO and Its Derivatives for Variable Selection Under Dependence Among Covariates," International Statistical Review, International Statistical Institute, vol. 90(1), pages 118-145, April.
    8. Gen Li & Yan Li & Kun Chen, 2023. "It's all relative: Regression analysis with compositional predictors," Biometrics, The International Biometric Society, vol. 79(2), pages 1318-1329, June.
    9. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    10. Jinsuk Yang & Qing Hao & Mahmut Yaşar, 2023. "Institutional investors and cross‐border mergers and acquisitions: The 2000–2018 period," International Review of Finance, International Review of Finance Ltd., vol. 23(3), pages 553-583, September.
    11. Alexander Klein & Karl Gunnar Persson & Paul Sharp, 2023. "Populism and the first wave of globalization: Evidence from the 1892 US presidential election," Rivista di storia economica, Società editrice il Mulino, issue 2, pages 163-202.
    12. Alperovych, Yan & Hübner, Georges & Lobet, Fabrice, 2015. "How does governmental versus private venture capital backing affect a firm's efficiency? Evidence from Belgium," Journal of Business Venturing, Elsevier, vol. 30(4), pages 508-525.
    13. Giuliani, Elisa & Martinelli, Arianna & Rabellotti, Roberta, 2016. "Is Co-Invention Expediting Technological Catch Up? A Study of Collaboration between Emerging Country Firms and EU Inventors," World Development, Elsevier, vol. 77(C), pages 192-205.
    14. Matthias Schmid & Florian Wickler & Kelly O Maloney & Richard Mitchell & Nora Fenske & Andreas Mayr, 2013. "Boosted Beta Regression," PLOS ONE, Public Library of Science, vol. 8(4), pages 1-15, April.
    15. Christophe Hurlin & Jérémy Leymarie & Antoine Patin, 2018. "Loss functions for LGD model comparison," Working Papers halshs-01516147, HAL.
    16. Blackburn, McKinley L. & Vermilyea, Todd, 2012. "The prevalence and impact of misstated incomes on mortgage loan applications," Journal of Housing Economics, Elsevier, vol. 21(2), pages 151-168.
    17. de Rassenfosse, Gaétan, 2013. "Do firms face a trade-off between the quantity and the quality of their inventions?," Research Policy, Elsevier, vol. 42(5), pages 1072-1079.
    18. Mazen Hassan & Sarah Mansour & Stefan Voigt & May Gadallah, 2022. "When Syria was in Egypt’s land: Egyptians cooperate with Syrians, but less with each other," Public Choice, Springer, vol. 191(3), pages 337-362, June.
    19. Qun Bao & Jiuli Huang & Yanling Wang, 2015. "Productivity and Firms’ Sales Destination: Chinese Characteristics," Review of International Economics, Wiley Blackwell, vol. 23(3), pages 620-637, August.
    20. Mkhadri, Abdallah & Ouhourane, Mohamed, 2013. "An extended variable inclusion and shrinkage algorithm for correlated variables," Computational Statistics & Data Analysis, Elsevier, vol. 57(1), pages 631-644.
