Authors
- El-Housainy A. Rady
(Department of Applied Statistics and Econometrics, Faculty of Graduate Studies for Statistical Research, Cairo University, Egypt)
- Mohamed R. Abonazel
(Department of Applied Statistics and Econometrics, Faculty of Graduate Studies for Statistical Research, Cairo University, Egypt)
- Mariam H. Metawe’e
(Department of Applied Statistics and Econometrics, Faculty of Graduate Studies for Statistical Research, Cairo University, Egypt)
Abstract
Goodness-of-fit (GOF) tests for logistic regression assess whether the model is suitable for the data. The null hypothesis of every GOF test is that the model fits. R, a free software environment, offers many GOF tests across different packages. A Monte Carlo simulation was conducted to study two situations: first, the ability of each test, under its default settings, to accept the null hypothesis when the model truly fits; second, the power of these tests when the assumption of a sufficient linear combination of the explanatory variables is violated (by omitting a linear covariate term, a quadratic term, or an interaction term). We also checked whether the same test gives the same results in different R packages. Because sample size is expected to affect the simulation results, the pattern of change in the GOF test results was examined under different sample sizes as well as different model settings. All tests accepted the null hypothesis (in more than 95% of simulation trials) when the model truly fitted, except the modified Hosmer-Lemeshow test in the "LogisticDx" package under all model settings and Osius and Rojek's (OsRo) test when the true model had an interaction term between binary and categorical covariates. In addition, the le Cessie-van Houwelingen-Copas-Hosmer unweighted sum of squares (CHCH) test gave unexpectedly different results in different packages. In the power study, all tests had very low power when the departure was a missing covariate. Generally, Stukel's test (package "LogisticDx") and the CHCH test (package "rms") reached a power greater than 80% in detecting a missing quadratic term at the lower sample sizes, while the OsRo test (package "LogisticDx") was better at detecting a missing interaction term. Besides the simulation study, we evaluated the performance of the GOF tests using the breast cancer dataset.
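The two-part design described in the abstract (rejection rate when the fitted model is correct, and power when a quadratic term is omitted) can be illustrated with a minimal sketch. It is not the authors' code and does not use the R packages they compared: it is written in plain Python so that it is self-contained, it covers only the standard Hosmer-Lemeshow test with 10 groups, and the coefficients, covariate distribution, sample size, and number of replications are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(eta):
    # Clamp the linear predictor to avoid overflow in exp().
    eta = max(-30.0, min(30.0, eta))
    return 1.0 / (1.0 + math.exp(-eta))


def fit_logistic(x, y, iters=25):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson (one covariate)."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p              # score for the intercept
            g1 += (yi - p) * xi       # score for the slope
            h00 += w                  # entries of X'WX
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det <= 1e-12:
            break
        d0 = (h11 * g0 - h01 * g1) / det   # Newton step: (X'WX)^{-1} score
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b0, b1


def hosmer_lemeshow_p(p_hat, y, groups=10):
    """Hosmer-Lemeshow p-value with equal-size groups of sorted fitted values."""
    assert groups > 2 and groups % 2 == 0  # closed-form tail needs even df
    n = len(y)
    order = sorted(range(n), key=lambda i: p_hat[i])
    stat = 0.0
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        m = len(idx)
        observed = sum(y[i] for i in idx)
        expected = sum(p_hat[i] for i in idx)
        stat += (observed - expected) ** 2 / (expected * (1.0 - expected / m))
    # Chi-square upper tail with df = groups - 2 (even df has a finite sum).
    k = (groups - 2) // 2
    half = stat / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))


def rejection_rate(n, quad_coef, reps=200, alpha=0.05, seed=1):
    """Share of simulated datasets in which the linear-only fit is rejected.

    quad_coef = 0 means the fitted model is correct (type I error rate);
    quad_coef != 0 means a quadratic term was omitted (power).
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x = [rng.uniform(-3.0, 3.0) for _ in range(n)]
        # Assumed true model: logit = -0.5 + 0.8*x + quad_coef*x^2
        y = [1 if rng.random() < sigmoid(-0.5 + 0.8 * xi + quad_coef * xi * xi)
             else 0 for xi in x]
        b0, b1 = fit_logistic(x, y)
        p_hat = [sigmoid(b0 + b1 * xi) for xi in x]
        if hosmer_lemeshow_p(p_hat, y) < alpha:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    print("rejection rate, correct model:", rejection_rate(500, 0.0))
    print("rejection rate, omitted x^2:  ", rejection_rate(500, 0.5))
```

In this sketch the first rate should sit near the nominal 5% level and the second should be much higher, which mirrors the two quantities the study reports for each test. Extending it to a missing covariate or a missing interaction term would require a multi-covariate fit, which is left out here for brevity.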
Suggested Citation
El-Housainy A. Rady & Mohamed R. Abonazel & Mariam H. Metawe’e, 2021.
"A Comparison Study of Goodness of Fit Tests of Logistic Regression in R: Simulation and Application to Breast Cancer Data,"
Academic Journal of Applied Mathematical Sciences, Academic Research Publishing Group, vol. 7(1), pages 50-59, 01-2021.
Handle:
RePEc:arp:ajoams:2021:p:50-59
DOI: 10.32861/ajams.71.50.59