Printed from https://ideas.repec.org/a/gam/jmathe/v10y2022i19p3695-d937026.html

Recent Advances on Penalized Regression Models for Biological Data

Author

Listed:
  • Pei Wang

    (School of Mathematics and Statistics, Henan University, Kaifeng 475004, China
    Center for Applied Mathematics of Henan Province, Henan University, Kaifeng 475004, China)

  • Shunjie Chen

    (School of Mathematics and Statistics, Henan University, Kaifeng 475004, China)

  • Sijia Yang

    (School of Mathematics and Statistics, Henan University, Kaifeng 475004, China)

Abstract

Increasing amounts of biological data have promoted the development of various penalized regression models. This review discusses recent advances in both linear and logistic regression models with penalization terms. It focuses mainly on the penalized regression models themselves, some of the corresponding optimization algorithms, and their applications to biological data. The pros and cons of different models in terms of response prediction, sample classification, network construction, and feature selection are also reviewed. The performance of different models on a real-world RNA-seq dataset for breast cancer is explored. Finally, some future directions are discussed.

Suggested Citation

  • Pei Wang & Shunjie Chen & Sijia Yang, 2022. "Recent Advances on Penalized Regression Models for Biological Data," Mathematics, MDPI, vol. 10(19), pages 1-24, October.
  • Handle: RePEc:gam:jmathe:v:10:y:2022:i:19:p:3695-:d:937026

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/10/19/3695/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/10/19/3695/
    Download Restriction: no

    References listed on IDEAS

    1. H. Jeong & S. P. Mason & A.-L. Barabási & Z. N. Oltvai, 2001. "Lethality and centrality in protein networks," Nature, Nature, vol. 411(6833), pages 41-42, May.
    2. Huang, Zhensheng & Lin, Bingqing & Feng, Fan & Pang, Zhen, 2013. "Efficient penalized estimating method in the partially varying-coefficient single-index model," Journal of Multivariate Analysis, Elsevier, vol. 114(C), pages 189-200.
    3. Jian Huang & Shuangge Ma & Huiliang Xie & Cun-Hui Zhang, 2009. "A group bridge approach for variable selection," Biometrika, Biometrika Trust, vol. 96(2), pages 339-355.
    4. Fan J. & Li R., 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
    5. Qifan Song & Faming Liang, 2015. "High-Dimensional Variable Selection With Reciprocal L1-Regularization," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(512), pages 1607-1620, December.
    6. Md. Ansari & Dinesh Jain & Haripriya Harikumar & Santu Rana & Sunil Gupta & Sandeep Budhiraja & Svetha Venkatesh, 2021. "Identification of predictors and model for predicting prolonged length of stay in dengue patients," Health Care Management Science, Springer, vol. 24(4), pages 786-798, December.
    7. P. Tseng & S. Yun, 2009. "Block-Coordinate Gradient Descent Method for Linearly Constrained Nonsmooth Separable Optimization," Journal of Optimization Theory and Applications, Springer, vol. 140(3), pages 513-535, March.
    8. Howard D. Bondell & Brian J. Reich, 2008. "Simultaneous Regression Shrinkage, Variable Selection, and Supervised Clustering of Predictors with OSCAR," Biometrics, The International Biometric Society, vol. 64(1), pages 115-123, March.
    9. Autcha Araveeporn, 2021. "The Higher-Order of Adaptive Lasso and Elastic Net Methods for Classification on High Dimensional Data," Mathematics, MDPI, vol. 9(10), pages 1-14, May.
    10. Zou, Hui, 2006. "The Adaptive Lasso and Its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 1418-1429, December.
    11. Lukas Meier & Sara Van De Geer & Peter Bühlmann, 2008. "The group lasso for logistic regression," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(1), pages 53-71, February.
    12. Wentao Wang & Jiaxuan Liang & Rong Liu & Yunquan Song & Min Zhang, 2022. "A Robust Variable Selection Method for Sparse Online Regression via the Elastic Net Penalty," Mathematics, MDPI, vol. 10(16), pages 1-18, August.
    13. Hakon Hakonarson & Struan F. A. Grant & Jonathan P. Bradfield & Luc Marchand & Cecilia E. Kim & Joseph T. Glessner & Rosemarie Grabs & Tracy Casalunovo & Shayne P. Taback & Edward C. Frackelton & Marg, 2007. "A genome-wide association study identifies KIAA0350 as a type 1 diabetes gene," Nature, Nature, vol. 448(7153), pages 591-594, August.
    14. Friedman, Jerome H. & Hastie, Trevor & Tibshirani, Rob, 2010. "Regularization Paths for Generalized Linear Models via Coordinate Descent," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 33(i01).
    15. Yuan Jiang & Yunxiao He & Heping Zhang, 2016. "Variable Selection With Prior Information for Generalized Linear Models via the Prior LASSO Method," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(513), pages 355-376, March.
    16. Ming Yuan & Yi Lin, 2007. "Model selection and estimation in the Gaussian graphical model," Biometrika, Biometrika Trust, vol. 94(1), pages 19-35.
    17. Douglas F. Easton & Karen A. Pooley & Alison M. Dunning & Paul D. P. Pharoah & Deborah Thompson & Dennis G. Ballinger & Jeffery P. Struewing & Jonathan Morrison & Helen Field & Robert Luben & Nicholas, 2007. "Genome-wide association study identifies novel breast cancer susceptibility loci," Nature, Nature, vol. 447(7148), pages 1087-1093, June.
    18. Robert Tibshirani & Michael Saunders & Saharon Rosset & Ji Zhu & Keith Knight, 2005. "Sparsity and smoothness via the fused lasso," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(1), pages 91-108, February.
    19. Hang, Zihua & Dai, Penglin & Jia, Shanshan & Yu, Zhaofei, 2020. "Network structure reconstruction with symmetry constraint," Chaos, Solitons & Fractals, Elsevier, vol. 139(C).
    20. Abhijeet R Patil & Sangjin Kim, 2020. "Combination of Ensembles of Regularized Regression Models with Resampling-Based Lasso Feature Selection in High Dimensional Data," Mathematics, MDPI, vol. 8(1), pages 1-23, January.
    21. Hui Zou & Trevor Hastie, 2005. "Addendum: Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(5), pages 768-768, November.
    22. Qiang Sun & Wen-Xin Zhou & Jianqing Fan, 2020. "Adaptive Huber Regression," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 115(529), pages 254-265, January.
    23. Jianhua Guo & Jianchang Hu & Bing-Yi Jing & Zhen Zhang, 2016. "Spline-Lasso in High-Dimensional Linear Regression," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(513), pages 288-297, March.
    24. Hui Lin & Chong Wang & Peng Liu & Derald Holtkamp, 2013. "Construction of disease risk scoring systems using logistic group lasso: application to porcine reproductive and respiratory syndrome survey data," Journal of Applied Statistics, Taylor & Francis Journals, vol. 40(4), pages 736-746.
    25. Hui Zou & Trevor Hastie, 2005. "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(2), pages 301-320, April.
    26. Ming Yuan & Yi Lin, 2006. "Model selection and estimation in regression with grouped variables," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 68(1), pages 49-67, February.
    27. Cai, Tony & Liu, Weidong & Luo, Xi, 2011. "A Constrained ℓ1 Minimization Approach to Sparse Precision Matrix Estimation," Journal of the American Statistical Association, American Statistical Association, vol. 106(494), pages 594-607.

    Citations

    Cited by:

    1. Chen, Shunjie & Yang, Sijia & Wang, Pei & Xue, Liugen, 2023. "Two-stage penalized algorithms via integrating prior information improve gene selection from omics data," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 628(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    2. Loann David Denis Desboulets, 2018. "A Review on Variable Selection in Regression Analysis," Econometrics, MDPI, vol. 6(4), pages 1-27, November.
    3. Siwei Xia & Yuehan Yang & Hu Yang, 2022. "Sparse Laplacian Shrinkage with the Graphical Lasso Estimator for Regression Problems," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(1), pages 255-277, March.
    4. Diego Vidaurre & Concha Bielza & Pedro Larrañaga, 2013. "A Survey of L1 Regression," International Statistical Review, International Statistical Institute, vol. 81(3), pages 361-387, December.
    5. Chen, Shunjie & Yang, Sijia & Wang, Pei & Xue, Liugen, 2023. "Two-stage penalized algorithms via integrating prior information improve gene selection from omics data," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 628(C).
    6. Takumi Saegusa & Tianzhou Ma & Gang Li & Ying Qing Chen & Mei-Ling Ting Lee, 2020. "Variable Selection in Threshold Regression Model with Applications to HIV Drug Adherence Data," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 12(3), pages 376-398, December.
    7. Zanhua Yin, 2020. "Variable selection for sparse logistic regression," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 83(7), pages 821-836, October.
    8. Justin B. Post & Howard D. Bondell, 2013. "Factor Selection and Structural Identification in the Interaction ANOVA Model," Biometrics, The International Biometric Society, vol. 69(1), pages 70-79, March.
    9. Jiang, Liewen & Bondell, Howard D. & Wang, Huixia Judy, 2014. "Interquantile shrinkage and variable selection in quantile regression," Computational Statistics & Data Analysis, Elsevier, vol. 69(C), pages 208-219.
    10. Yanfang Zhang & Chuanhua Wei & Xiaolin Liu, 2022. "Group Logistic Regression Models with lp,q Regularization," Mathematics, MDPI, vol. 10(13), pages 1-15, June.
    11. Young Joo Yoon & Cheolwoo Park & Erik Hofmeister & Sangwook Kang, 2012. "Group variable selection in cardiopulmonary cerebral resuscitation data for veterinary patients," Journal of Applied Statistics, Taylor & Francis Journals, vol. 39(7), pages 1605-1621, January.
    12. Howard D. Bondell & Brian J. Reich, 2009. "Simultaneous Factor Selection and Collapsing Levels in ANOVA," Biometrics, The International Biometric Society, vol. 65(1), pages 169-177, March.
    13. Matsui, Hidetoshi, 2014. "Variable and boundary selection for functional data via multiclass logistic regression modeling," Computational Statistics & Data Analysis, Elsevier, vol. 78(C), pages 176-185.
    14. Mingqiu Wang & Guo-Liang Tian, 2016. "Robust group non-convex estimations for high-dimensional partially linear models," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 28(1), pages 49-67, March.
    15. Shanshan Qin & Hao Ding & Yuehua Wu & Feng Liu, 2021. "High-dimensional sign-constrained feature selection and grouping," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 73(4), pages 787-819, August.
    16. Kaida Cai & Hua Shen & Xuewen Lu, 2022. "Adaptive bi-level variable selection for multivariate failure time model with a diverging number of covariates," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(4), pages 968-993, December.
    17. Laura Freijeiro‐González & Manuel Febrero‐Bande & Wenceslao González‐Manteiga, 2022. "A Critical Review of LASSO and Its Derivatives for Variable Selection Under Dependence Among Covariates," International Statistical Review, International Statistical Institute, vol. 90(1), pages 118-145, April.
    18. Guan Yu & Yufeng Liu, 2016. "Sparse Regression Incorporating Graphical Structure Among Predictors," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(514), pages 707-720, April.
    19. Yize Zhao & Matthias Chung & Brent A. Johnson & Carlos S. Moreno & Qi Long, 2016. "Hierarchical Feature Selection Incorporating Known and Novel Biological Information: Identifying Genomic Features Related to Prostate Cancer Recurrence," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1427-1439, October.
    20. Capanu, Marinela & Giurcanu, Mihai & Begg, Colin B. & Gönen, Mithat, 2023. "Subsampling based variable selection for generalized linear models," Computational Statistics & Data Analysis, Elsevier, vol. 184(C).
