Using Multivariate Regression Model with Least Absolute Shrinkage and Selection Operator (LASSO) to Predict the Incidence of Xerostomia after Intensity-Modulated Radiotherapy for Head and Neck Cancer

Author

Listed:
  • Tsair-Fwu Lee
  • Pei-Ju Chao
  • Hui-Min Ting
  • Liyun Chang
  • Yu-Jie Huang
  • Jia-Ming Wu
  • Hung-Yu Wang
  • Mong-Fong Horng
  • Chun-Ming Chang
  • Jen-Hong Lan
  • Ya-Yu Huang
  • Fu-Min Fang
  • Stephen Wan Leung

Abstract

Purpose: The aim of this study was to develop a multivariate logistic regression model with least absolute shrinkage and selection operator (LASSO) to make valid predictions about the incidence of moderate-to-severe patient-rated xerostomia among head and neck cancer (HNC) patients treated with intensity-modulated radiotherapy (IMRT).
Methods and Materials: Quality of life questionnaire datasets from 206 patients with HNC were analyzed. The European Organization for Research and Treatment of Cancer QLQ-H&N35 and QLQ-C30 questionnaires were used as the endpoint evaluation. The primary endpoint (grade 3+ xerostomia) was defined as moderate-to-severe xerostomia at 3 months (XER3m) and 12 months (XER12m) after the completion of IMRT. Normal tissue complication probability (NTCP) models were developed. The optimal and suboptimal numbers of prognostic factors for a multivariate logistic regression model were determined using the LASSO with a bootstrapping technique. Statistical analysis was performed using the scaled Brier score, Nagelkerke R², chi-squared test, Omnibus, Hosmer-Lemeshow test, and the area under the receiver operating characteristic curve (AUC).
Results: Eight prognostic factors were selected by LASSO for the 3-month time point: Dmean-c, Dmean-i, age, financial status, T stage, AJCC stage, smoking, and education. Nine prognostic factors were selected for the 12-month time point: Dmean-i, education, Dmean-c, smoking, T stage, baseline xerostomia, alcohol abuse, family history, and node classification. For the suboptimal number of prognostic factors, the LASSO selection was fine-tuned by the Hosmer-Lemeshow test and AUC, which gave three factors for the 3-month time point: Dmean-c, Dmean-i, and age. Five suboptimal prognostic factors were likewise selected for the 12-month time point: Dmean-i, education, Dmean-c, smoking, and T stage. The overall performance of the NTCP model at both time points, in terms of scaled Brier score, Omnibus, and Nagelkerke R², was satisfactory and corresponded well with the expected values.
Conclusions: Multivariate NTCP models with LASSO can be used to predict patient-rated xerostomia after IMRT.
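
The workflow named in the abstract (L1-penalized logistic regression, bootstrap resampling, scaled Brier score, Nagelkerke R²) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the proximal gradient solver, the penalty weight `lam`, and the helper names (`fit_lasso_logistic`, `selection_frequency`, `scaled_brier`, `nagelkerke_r2`) are assumptions made for illustration, and counting how often each coefficient stays non-zero across resamples is only one common way to combine LASSO with bootstrapping.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v towards zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def predict(X, w, b):
    """Predicted event probabilities for each row of X."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


def fit_lasso_logistic(X, y, lam, lr=0.5, n_iter=300):
    """Minimise mean log-loss + lam * sum(|w_j|) by proximal gradient descent.
    The intercept b is not penalised; features are assumed standardised."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        resid = [pi - yi for pi, yi in zip(predict(X, w, b), y)]
        grad = [sum(r * xi[j] for r, xi in zip(resid, X)) / n for j in range(p)]
        b -= lr * sum(resid) / n
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad)]
    return w, b


def selection_frequency(X, y, lam, n_boot=20, seed=0):
    """Fraction of bootstrap resamples in which each coefficient is non-zero."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        w, _ = fit_lasso_logistic([X[i] for i in idx], [y[i] for i in idx], lam)
        for j in range(p):
            counts[j] += int(w[j] != 0.0)
    return [c / n_boot for c in counts]


def scaled_brier(y, prob):
    """1 - Brier / Brier_max, where Brier_max is the Brier score obtained by
    always predicting the observed event rate."""
    n = len(y)
    brier = sum((p - yi) ** 2 for p, yi in zip(prob, y)) / n
    rate = sum(y) / n
    return 1.0 - brier / (rate * (1.0 - rate))


def nagelkerke_r2(y, prob):
    """Cox-Snell R2 rescaled by its maximum attainable value."""
    n = len(y)
    rate = sum(y) / n
    ll0 = n * (rate * math.log(rate) + (1 - rate) * math.log(1 - rate))
    ll1 = sum(math.log(p) if yi else math.log(1 - p) for p, yi in zip(prob, y))
    cox_snell = 1.0 - math.exp(2.0 * (ll0 - ll1) / n)
    return cox_snell / (1.0 - math.exp(2.0 * ll0 / n))


# Synthetic example: only the first of three features drives the outcome.
rng = random.Random(42)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(120)]
y = [1 if rng.random() < sigmoid(2.0 * xi[0]) else 0 for xi in X]

freq = selection_frequency(X, y, lam=0.1)
w, b = fit_lasso_logistic(X, y, lam=0.1)
prob = predict(X, w, b)
print("selection frequency per feature:", freq)
print("scaled Brier score:", round(scaled_brier(y, prob), 3))
print("Nagelkerke R2:", round(nagelkerke_r2(y, prob), 3))
```

In this sketch, factors with a high selection frequency would be candidates for the final model, and the two scores summarize overall performance. In practice the penalty weight would be tuned (for example by cross-validation) rather than fixed, and the metrics here are computed on the training data, so they are optimistic.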

Suggested Citation

  • Tsair-Fwu Lee & Pei-Ju Chao & Hui-Min Ting & Liyun Chang & Yu-Jie Huang & Jia-Ming Wu & Hung-Yu Wang & Mong-Fong Horng & Chun-Ming Chang & Jen-Hong Lan & Ya-Yu Huang & Fu-Min Fang & Stephen Wan Leung, 2014. "Using Multivariate Regression Model with Least Absolute Shrinkage and Selection Operator (LASSO) to Predict the Incidence of Xerostomia after Intensity-Modulated Radiotherapy for Head and Neck Cancer," PLOS ONE, Public Library of Science, vol. 9(2), pages 1-11, February.
  • Handle: RePEc:plo:pone00:0089700
    DOI: 10.1371/journal.pone.0089700

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0089700
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0089700&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0089700?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    Citations

    Cited by:

    1. Ekele Alih & Hong Choon Ong, 2015. "Cluster-based multivariate outlier identification and re-weighted regression in linear models," Journal of Applied Statistics, Taylor & Francis Journals, vol. 42(5), pages 938-955, May.
    2. Akram Farhadi & Joshua J. Chern & Daniel Hirsh & Tod Davis & Mingyoung Jo & Frederick Maier & Khaled Rasheed, 2018. "Intracranial Pressure Forecasting in Children Using Dynamic Averaging of Time Series Data," Forecasting, MDPI, vol. 1(1), pages 1-12, August.
    3. Tarun Mehra & Christian Thomas Benedikt Müller & Jörk Volbracht & Burkhardt Seifert & Rudolf Moos, 2015. "Predictors of High Profit and High Deficit Outliers under SwissDRG of a Tertiary Care Center," PLOS ONE, Public Library of Science, vol. 10(10), pages 1-18, October.
    4. Laura Cella & Giuseppe Palma & Joseph O Deasy & Jung Hun Oh & Raffaele Liuzzi & Vittoria D’Avino & Manuel Conson & Novella Pugliese & Marco Picardi & Marco Salvatore & Roberto Pacelli, 2014. "Complication Probability Models for Radiation-Induced Heart Valvular Dysfunction: Do Heart-Lung Interactions Play a Role?," PLOS ONE, Public Library of Science, vol. 9(10), pages 1-11, October.
    5. Steffen CE Schmidt & Jennifer Schneider & Anne Kerstin Reimers & Claudia Niessner & Alexander Woll, 2019. "Exploratory Determined Correlates of Physical Activity in Children and Adolescents: The MoMo Study," IJERPH, MDPI, vol. 16(3), pages 1-16, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Oxana Babecka Kucharcukova & Jan Bruha, 2016. "Nowcasting the Czech Trade Balance," Working Papers 2016/11, Czech National Bank.
    2. Eric Hillebrand & Huiyu Huang & Tae-Hwy Lee & Canlin Li, 2018. "Using the Entire Yield Curve in Forecasting Output and Inflation," Econometrics, MDPI, vol. 6(3), pages 1-27, August.
    3. Kim, Hyun Hak & Swanson, Norman R., 2018. "Mining big data using parsimonious factor, machine learning, variable selection and shrinkage methods," International Journal of Forecasting, Elsevier, vol. 34(2), pages 339-354.
    4. Bartosz Uniejewski & Katarzyna Maciejowska, 2022. "LASSO Principal Component Averaging -- a fully automated approach for point forecast pooling," Papers 2207.04794, arXiv.org.
    5. Philippe Goulet Coulombe & Maxime Leroux & Dalibor Stevanovic & Stéphane Surprenant, 2022. "How is machine learning useful for macroeconomic forecasting?," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 37(5), pages 920-964, August.
    6. Norman R. Swanson & Weiqi Xiong, 2018. "Big data analytics in economics: What have we learned so far, and where should we go from here?," Canadian Journal of Economics/Revue canadienne d'économique, John Wiley & Sons, vol. 51(3), pages 695-746, August.
    7. Mario Forni & Alessandro Giovannelli & Marco Lippi & Stefano Soccorsi, 2018. "Dynamic factor model with infinite‐dimensional factor space: Forecasting," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 33(5), pages 625-642, August.
    8. Ning Xu & Jian Hong & Timothy C. G. Fisher, 2016. "Finite-sample and asymptotic analysis of generalization ability with an application to penalized regression," Papers 1609.03344, arXiv.org, revised Sep 2016.
    9. Červená, Marianna & Schneider, Martin, 2014. "Short-term forecasting of GDP with a DSGE model augmented by monthly indicators," International Journal of Forecasting, Elsevier, vol. 30(3), pages 498-516.
    10. Bryan T. Kelly & Asaf Manela & Alan Moreira, 2019. "Text Selection," NBER Working Papers 26517, National Bureau of Economic Research, Inc.
    11. Tommaso Proietti, 2016. "On the Selection of Common Factors for Macroeconomic Forecasting," Advances in Econometrics, in: Dynamic Factor Models, volume 35, pages 593-628, Emerald Group Publishing Limited.
    12. Ralf Brüggemann & Christian Kascha, 2017. "Directed Graphs and Variable Selection in Large Vector Autoregressive Models," Working Paper Series of the Department of Economics, University of Konstanz 2017-06, Department of Economics, University of Konstanz.
    13. Todd E. Clark & Florian Huber & Gary Koop & Massimiliano Marcellino & Michael Pfarrhofer, 2023. "Tail Forecasting With Multivariate Bayesian Additive Regression Trees," International Economic Review, Department of Economics, University of Pennsylvania and Osaka University Institute of Social and Economic Research Association, vol. 64(3), pages 979-1022, August.
    14. Charles Rahal, 2015. "Housing Market Forecasting with Factor Combinations," Discussion Papers 15-05, Department of Economics, University of Birmingham.
    15. Bańbura, Marta & Giannone, Domenico & Modugno, Michele & Reichlin, Lucrezia, 2013. "Now-Casting and the Real-Time Data Flow," Handbook of Economic Forecasting, in: G. Elliott & C. Granger & A. Timmermann (ed.), Handbook of Economic Forecasting, edition 1, volume 2, chapter 0, pages 195-237, Elsevier.
    16. Rua, António, 2017. "A wavelet-based multivariate multiscale approach for forecasting," International Journal of Forecasting, Elsevier, vol. 33(3), pages 581-590.
    17. Davide Pettenuzzo & Rossen Valkanov & Allan Timmermann, 2014. "A Bayesian MIDAS Approach to Modeling First and Second Moment Dynamics," Working Papers 76, Brandeis University, Department of Economics and International Business School.
    18. Sium Bodha Hannadige & Jiti Gao & Mervyn J Silvapulle & Param Silvapulle, 2021. "Time Series Forecasting Using a Mixture of Stationary and Nonstationary Predictors," Monash Econometrics and Business Statistics Working Papers 6/21, Monash University, Department of Econometrics and Business Statistics.
    19. Anwen Yin, 2024. "Predictive model averaging with parameter instability and heteroskedasticity," Bulletin of Economic Research, Wiley Blackwell, vol. 76(2), pages 418-442, April.
    20. Medeiros, Marcelo C. & Mendes, Eduardo F., 2016. "ℓ1-regularization of high-dimensional time-series models with non-Gaussian and heteroskedastic errors," Journal of Econometrics, Elsevier, vol. 191(1), pages 255-271.
