Printed from https://ideas.repec.org/a/bla/scjsta/v49y2022i3p917-942.html

Ultrahigh‐dimensional generalized additive model: Unified theory and methods

Authors
  • Kaixu Yang
  • Tapabrata Maiti

Abstract

The generalized additive model is a powerful statistical learning and predictive modeling tool that has been applied in a wide range of settings. The need for high‐dimensional additive modeling is evident when dealing with high‐throughput data, such as in genetic data analysis. In this article, we study a two‐step selection and estimation method for ultrahigh‐dimensional generalized additive models. The first step applies the group lasso to the expanded bases of the functions. With high probability, this selects all nonzero functions without too much over‐selection. The second step uses the adaptive group lasso with any initial estimator, including the group lasso estimator, that satisfies certain regularity conditions. The adaptive group lasso estimator is shown to be selection consistent with improved convergence rates. Tuning parameter selection is also discussed, and a generalized information criterion procedure is shown to select the true model consistently. The theoretical properties are supported by an extensive numerical study.
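
The two-step idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a Gaussian response with identity link (a special case of the generalized setting), a centered cubic polynomial basis as a stand-in for whatever basis expansion the paper uses, a plain proximal-gradient solver, and hand-picked tuning parameters instead of the generalized information criterion. All function names, the simulated data, and the values of `lam` are hypothetical.

```python
import numpy as np

def expand_basis(X, degree=3):
    """Centered, orthonormalized polynomial basis per covariate.
    Assumption: a simple stand-in for a spline basis."""
    n, p = X.shape
    blocks = []
    for j in range(p):
        B = np.column_stack([X[:, j] ** d for d in range(1, degree + 1)])
        B = B - B.mean(axis=0)          # center so each fitted f_j has mean zero
        Q, _ = np.linalg.qr(B)          # orthonormal columns spanning the block
        blocks.append(Q * np.sqrt(n))   # scale so block.T @ block / n = I
    return np.hstack(blocks)

def group_lasso(Z, y, m, lam, weights=None, n_iter=1000):
    """Proximal gradient for (1/2n)||y - Z b||^2 + lam * sum_j w_j ||b_j||_2,
    where b_j is the length-m coefficient block of covariate j."""
    n, pm = Z.shape
    p = pm // m
    w = np.ones(p) if weights is None else weights
    step = n / np.linalg.norm(Z, 2) ** 2        # 1 / Lipschitz constant
    b = np.zeros(pm)
    for _ in range(n_iter):
        grad = Z.T @ (Z @ b - y) / n
        u = (b - step * grad).reshape(p, m)
        norms = np.linalg.norm(u, axis=1)
        # Group-wise soft thresholding: whole blocks are set exactly to zero
        shrink = np.maximum(0.0, 1.0 - step * lam * w / np.maximum(norms, 1e-12))
        b = (u * shrink[:, None]).ravel()
    return b.reshape(p, m)

# Simulated data (hypothetical): 3 nonzero functions among p = 100 covariates
rng = np.random.default_rng(0)
n, p, m = 200, 100, 3
X = rng.uniform(-1, 1, size=(n, p))
signal = 2 * X[:, 0] + (3 * X[:, 1] ** 2 - 1) + 2 * np.sin(np.pi * X[:, 2])
y = signal + 0.5 * rng.standard_normal(n)
y = y - y.mean()
true_active = {0, 1, 2}

Z = expand_basis(X, degree=m)

# Step 1: group lasso on the expanded bases (screening)
b1 = group_lasso(Z, y, m, lam=0.2)
norms1 = np.linalg.norm(b1, axis=1)
selected_step1 = set(np.flatnonzero(norms1 > 0).tolist())

# Step 2: adaptive group lasso, weights = 1 / ||initial block||
# (groups dropped in step 1 get a very large weight and stay at zero)
w = 1.0 / np.maximum(norms1, 1e-10)
b2 = group_lasso(Z, y, m, lam=0.05, weights=w)
selected_step2 = set(np.flatnonzero(np.linalg.norm(b2, axis=1) > 0).tolist())

print("step 1 selected:", sorted(selected_step1))
print("step 2 selected:", sorted(selected_step2))
```

In this sketch the second step can only keep or shrink the set chosen in the first step, because covariates with a zero initial estimate receive an effectively infinite penalty. In practice `lam` would be chosen by a data-driven criterion rather than fixed by hand.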

Suggested Citation

  • Kaixu Yang & Tapabrata Maiti, 2022. "Ultrahigh‐dimensional generalized additive model: Unified theory and methods," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 49(3), pages 917-942, September.
  • Handle: RePEc:bla:scjsta:v:49:y:2022:i:3:p:917-942
    DOI: 10.1111/sjos.12548

    Download full text from publisher

    File URL: https://doi.org/10.1111/sjos.12548
    Download Restriction: no


    References listed on IDEAS

    1. Hansheng Wang & Bo Li & Chenlei Leng, 2009. "Shrinkage tuning parameter selection with a diverging number of parameters," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 71(3), pages 671-683, June.
    2. Zou, Hui, 2006. "The Adaptive Lasso and Its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 1418-1429, December.
    3. A. Belloni & V. Chernozhukov & L. Wang, 2011. "Square-root lasso: pivotal recovery of sparse signals via conic programming," Biometrika, Biometrika Trust, vol. 98(4), pages 791-806.
    4. Jiahua Chen & Zehua Chen, 2008. "Extended Bayesian information criteria for model selection with large model spaces," Biometrika, Biometrika Trust, vol. 95(3), pages 759-771.
    5. Gerhard Tutz & Harald Binder, 2006. "Generalized Additive Modeling with Implicit Variable Selection by Likelihood-Based Boosting," Biometrics, The International Biometric Society, vol. 62(4), pages 961-971, December.
    6. Zhang, Yiyun & Li, Runze & Tsai, Chih-Ling, 2010. "Regularization Parameter Selections via Generalized Information Criterion," Journal of the American Statistical Association, American Statistical Association, vol. 105(489), pages 312-323.
    7. Lee, Gee Y & Manski, Scott & Maiti, Tapabrata, 2020. "Actuarial Applications Of Word Embedding Models," ASTIN Bulletin, Cambridge University Press, vol. 50(1), pages 1-24, January.
    8. Marra, Giampiero & Wood, Simon N., 2011. "Practical variable selection for generalized additive models," Computational Statistics & Data Analysis, Elsevier, vol. 55(7), pages 2372-2387, July.
    9. Siddhartha Nandy & Chae Young Lim & Tapabrata Maiti, 2017. "Additive model building for spatial regression," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(3), pages 779-800, June.
    10. Nancy R. Zhang & David O. Siegmund, 2007. "A Modified Bayes Information Criterion with Applications to the Analysis of Comparative Genomic Hybridization Data," Biometrics, The International Biometric Society, vol. 63(1), pages 22-32, March.
    11. Yingying Fan & Cheng Yong Tang, 2013. "Tuning parameter selection in high dimensional penalized likelihood," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 75(3), pages 531-552, June.
    12. Qingliang Fan & Wei Zhong, 2018. "Nonparametric Additive Instrumental Variable Estimator: A Group Shrinkage Estimation Perspective," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 36(3), pages 388-399, July.
    13. Umberto Amato & Anestis Antoniadis & Italia De Feis, 2016. "Additive model selection," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 25(4), pages 519-564, November.
    14. Fan J. & Li R., 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
    15. Ming Yuan & Yi Lin, 2006. "Model selection and estimation in regression with grouped variables," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 68(1), pages 49-67, February.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Burman, Prabir & Paul, Debashis, 2017. "Smooth predictive model fitting in regression," Journal of Multivariate Analysis, Elsevier, vol. 155(C), pages 165-179.
    2. Fei Jin & Lung-fei Lee, 2018. "Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices," Econometrics, MDPI, vol. 6(1), pages 1-24, February.
    3. Qingliang Fan & Yaqian Wu, 2020. "Endogenous Treatment Effect Estimation with some Invalid and Irrelevant Instruments," Papers 2006.14998, arXiv.org.
    4. Luke Mosley & Idris A. Eckley & Alex Gibberd, 2022. "Sparse temporal disaggregation," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(4), pages 2203-2233, October.
    5. Jin, Fei & Lee, Lung-fei, 2018. "Irregular N2SLS and LASSO estimation of the matrix exponential spatial specification model," Journal of Econometrics, Elsevier, vol. 206(2), pages 336-358.
    6. Zhang, Shucong & Zhou, Yong, 2018. "Variable screening for ultrahigh dimensional heterogeneous data via conditional quantile correlations," Journal of Multivariate Analysis, Elsevier, vol. 165(C), pages 1-13.
    7. Heng Lian, 2012. "Variable selection in high-dimensional partly linear additive models," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 24(4), pages 825-839, December.
    8. Fabian Scheipl & Thomas Kneib & Ludwig Fahrmeir, 2013. "Penalized likelihood and Bayesian function selection in regression models," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 97(4), pages 349-385, October.
    9. Gabriel E Hoffman & Benjamin A Logsdon & Jason G Mezey, 2013. "PUMA: A Unified Framework for Penalized Multiple Regression Analysis of GWAS Data," PLOS Computational Biology, Public Library of Science, vol. 9(6), pages 1-19, June.
    10. Luke Mosley & Idris Eckley & Alex Gibberd, 2021. "Sparse Temporal Disaggregation," Papers 2108.05783, arXiv.org, revised Oct 2022.
    11. Lian, Heng, 2014. "Semiparametric Bayesian information criterion for model selection in ultra-high dimensional additive models," Journal of Multivariate Analysis, Elsevier, vol. 123(C), pages 304-310.
    12. Eduardo F. Mendes & Gabriel J. P. Pinto, 2023. "Generalized Information Criteria for Structured Sparse Models," Papers 2309.01764, arXiv.org.
    13. Fan, Rui & Lee, Ji Hyung & Shin, Youngki, 2023. "Predictive quantile regression with mixed roots and increasing dimensions: The ALQR approach," Journal of Econometrics, Elsevier, vol. 237(2).
    14. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    15. Gaorong Li & Liugen Xue & Heng Lian, 2012. "SCAD-penalised generalised additive models with non-polynomial dimensionality," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 24(3), pages 681-697.
    16. Yunxiao Chen & Xiaoou Li & Jingchen Liu & Zhiliang Ying, 2017. "Regularized Latent Class Analysis with Application in Cognitive Diagnosis," Psychometrika, Springer;The Psychometric Society, vol. 82(3), pages 660-692, September.
    17. Zhang, Ting & Wang, Lei, 2020. "Smoothed empirical likelihood inference and variable selection for quantile regression with nonignorable missing response," Computational Statistics & Data Analysis, Elsevier, vol. 144(C).
    18. Sakyajit Bhattacharya & Paul McNicholas, 2014. "A LASSO-penalized BIC for mixture model selection," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 8(1), pages 45-61, March.
    19. Zanhua Yin, 2020. "Variable selection for sparse logistic regression," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 83(7), pages 821-836, October.
    20. Achim Ahrens & Christian B. Hansen & Mark E. Schaffer, 2020. "lassopack: Model selection and prediction with regularized regression in Stata," Stata Journal, StataCorp LP, vol. 20(1), pages 176-235, March.
