Printed from https://ideas.repec.org/a/eee/csdana/v117y2018icp76-89.html

Informativeness of diagnostic marker values and the impact of data grouping

Author

Listed:
  • Ma, Hua
  • Bandos, Andriy I.
  • Gur, David

Abstract

Assessing performance of diagnostic markers is a necessary step for their use in decision making regarding various conditions of interest in diagnostic medicine and other fields. Globally useful markers could, however, have ranges of values that are “diagnostically non-informative”. This paper demonstrates that the presence of marker values from diagnostically non-informative ranges could lead to a loss in statistical efficiency during nonparametric evaluation and shows that grouping non-informative values provides a natural resolution to this problem. These points are theoretically proven and an extensive simulation study is conducted to illustrate the possible benefits of using grouped marker values in a number of practically reasonable scenarios. The results contradict the common conjecture regarding the detrimental effect of grouped marker values during performance assessments. Specifically, contrary to the common assumption that grouped marker values lead to bias, grouping non-informative values does not introduce bias and could substantially reduce sampling variability. The proven concept that grouped marker values could be statistically beneficial without detrimental consequences implies that in practice, tied values do not always require resolution whereas the use of continuous diagnostic results without addressing diagnostically non-informative ranges could be statistically detrimental. Based on these findings, more efficient methods for evaluating diagnostic markers could be developed.
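The abstract's central claim (grouping values from a diagnostically non-informative range leaves the nonparametric estimate unbiased while reducing its sampling variability) can be illustrated with a small Monte Carlo sketch. Everything below is an assumption made for illustration and is not the authors' simulation design: the marker distributions, the sample size, the number of replications, and the helper names `draw_marker`, `empirical_auc`, `group_low` and `simulate` are all hypothetical. The marker is built so that values below 1 are uniform in both groups (a constant likelihood ratio, hence non-informative within that range), while values above 1 remain informative. The performance measure is the empirical area under the ROC curve, with ties counted as one half.

```python
import numpy as np

def draw_marker(rng, size, p_low, scale_high):
    """Marker values: Uniform(0, 1) with probability p_low, else 1 + Exponential(scale_high)."""
    low = rng.random(size) < p_low
    return np.where(low, rng.random(size), 1.0 + rng.exponential(scale_high, size))

def empirical_auc(x, y):
    """Mann-Whitney estimate of P(X < Y) + 0.5 * P(X = Y) for non-diseased x and diseased y."""
    diff = y[:, None] - x[None, :]
    return np.mean((diff > 0) + 0.5 * (diff == 0))

def group_low(values, cutoff=1.0):
    """Collapse every value below the cutoff into one tied value (the grouped range)."""
    return np.where(values < cutoff, 0.0, values)

def simulate(n_rep=2000, n=30, seed=0):
    """Return the raw and grouped AUC estimates computed on the same simulated samples."""
    rng = np.random.default_rng(seed)
    raw, grouped = np.empty(n_rep), np.empty(n_rep)
    for r in range(n_rep):
        x = draw_marker(rng, n, p_low=0.6, scale_high=1.0)  # non-diseased subjects
        y = draw_marker(rng, n, p_low=0.3, scale_high=2.0)  # diseased subjects
        raw[r] = empirical_auc(x, y)
        grouped[r] = empirical_auc(group_low(x), group_low(y))
    return raw, grouped

raw, grouped = simulate()
print(f"mean AUC   raw={raw.mean():.4f}  grouped={grouped.mean():.4f}")
print(f"variance   raw={raw.var(ddof=1):.6f}  grouped={grouped.var(ddof=1):.6f}")
```

Under these assumed distributions the true area is 0.42 + 0.09 + 0.28 × 2/3 ≈ 0.697 whether or not the low range is grouped, so both estimators target the same quantity. Any difference in their spread comes only from the arbitrary ordering of values inside the non-informative range, which grouping removes.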

Suggested Citation

  • Ma, Hua & Bandos, Andriy I. & Gur, David, 2018. "Informativeness of diagnostic marker values and the impact of data grouping," Computational Statistics & Data Analysis, Elsevier, vol. 117(C), pages 76-89.
  • Handle: RePEc:eee:csdana:v:117:y:2018:i:c:p:76-89
    DOI: 10.1016/j.csda.2017.07.008

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947317301664
    Download Restriction: Full text for ScienceDirect subscribers only.


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Y. Huang & M. S. Pepe, 2009. "A Parametric ROC Model-Based Approach for Evaluating the Predictiveness of Continuous Markers in Case–Control Studies," Biometrics, The International Biometric Society, vol. 65(4), pages 1133-1144, December.
    2. Heinzl, Harald & Tempfer, Clemens, 2001. "A cautionary note on segmenting a cyclical covariate by minimum P-value search," Computational Statistics & Data Analysis, Elsevier, vol. 35(4), pages 451-461, February.
    3. Chen, Zhelun & O’Neill, Zheng & Wen, Jin & Pradhan, Ojas & Yang, Tao & Lu, Xing & Lin, Guanjing & Miyata, Shohei & Lee, Seungjae & Shen, Chou & Chiosa, Roberto & Piscitelli, Marco Savino & Capozzoli, , 2023. "A review of data-driven fault detection and diagnostics for building HVAC systems," Applied Energy, Elsevier, vol. 339(C).
    4. Beom Seuk Hwang & Zhen Chen, 2015. "An Integrated Bayesian Nonparametric Approach for Stochastic and Variability Orders in ROC Curve Estimation: An Application to Endometriosis Diagnosis," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(511), pages 923-934, September.
    5. Wang, P. C. & Lin, D. F., 2001. "Dispersion effects in signal-response data from fractional factorial experiments," Computational Statistics & Data Analysis, Elsevier, vol. 38(1), pages 95-111, November.
    6. Jihnhee Yu & Albert Vexler & Lili Tian, 2010. "Analyzing Incomplete Data Subject to a Threshold using Empirical Likelihood Methods: An Application to a Pneumonia Risk Study in an ICU Setting," Biometrics, The International Biometric Society, vol. 66(1), pages 123-130, March.
    7. Simon Bussy & Mokhtar Z. Alaya & Anne‐Sophie Jannot & Agathe Guilloux, 2022. "Binacox: automatic cut‐point detection in high‐dimensional Cox model with applications in genetics," Biometrics, The International Biometric Society, vol. 78(4), pages 1414-1426, December.
    8. Elisa–María Molanes-López & Ricardo Cao, 2008. "Relative density estimation for left truncated and right censored data," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 20(8), pages 693-720.
    9. Wang, Dan & Tian, Lili, 2017. "Parametric methods for confidence interval estimation of overlap coefficients," Computational Statistics & Data Analysis, Elsevier, vol. 106(C), pages 12-26.
    10. Chen, Xiwei & Vexler, Albert & Markatou, Marianthi, 2015. "Empirical likelihood ratio confidence interval estimation of best linear combinations of biomarkers," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 186-198.
    11. Soutik Ghosal & Zhen Chen, 2022. "Discriminatory Capacity of Prenatal Ultrasound Measures for Large-for-Gestational-Age Birth: A Bayesian Approach to ROC Analysis Using Placement Values," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 14(1), pages 1-22, April.
    12. Junbeom Park & Pil-sung Yang & Tae-Hoon Kim & Jae-Sun Uhm & Joung-Youn Kim & Boyoung Joung & Moon-Hyoung Lee & Chun Hwang & Hui-Nam Pak, 2015. "Low Left Atrial Compliance Contributes to the Clinical Recurrence of Atrial Fibrillation after Catheter Ablation in Patients with Structurally and Functionally Normal Heart," PLOS ONE, Public Library of Science, vol. 10(12), pages 1-13, December.
    13. Zhongkai Liu & Howard D. Bondell, 2019. "Binormal Precision–Recall Curves for Optimal Classification of Imbalanced Data," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 11(1), pages 141-161, April.
    14. Holly Janes & Gary Longton & Margaret S. Pepe, 2009. "Accommodating covariates in receiver operating characteristic analysis," Stata Journal, StataCorp LP, vol. 9(1), pages 17-39, March.
    15. Martin Hellmich & Keith R. Abrams & David R. Jones & Paul C. Lambert, 1998. "A Bayesian Approach to a General Regression Model for ROC Curves," Medical Decision Making, vol. 18(4), pages 436-443, October.
    16. Rodríguez-Álvarez, María Xosé & Roca-Pardiñas, Javier & Cadarso-Suárez, Carmen, 2011. "A new flexible direct ROC regression model: Application to the detection of cardiovascular risk factors by anthropometric measures," Computational Statistics & Data Analysis, Elsevier, vol. 55(12), pages 3257-3270, December.
    17. Kelly Zou & W. J. Hall, 2002. "Semiparametric and parametric transformation models for comparing diagnostic markers with paired design," Journal of Applied Statistics, Taylor & Francis Journals, vol. 29(6), pages 803-816.
    18. Cheam, Amay S.M. & McNicholas, Paul D., 2016. "Modelling receiver operating characteristic curves using Gaussian mixtures," Computational Statistics & Data Analysis, Elsevier, vol. 93(C), pages 192-208.
    19. Theobald, Chris M. & Talbot, Mike, 2004. "Bayesian selection of fertilizer level when crop price depends on quality," Computational Statistics & Data Analysis, Elsevier, vol. 47(4), pages 867-880, November.
    20. Zhang, Biao, 2006. "A semiparametric hypothesis testing procedure for the ROC curve area under a density ratio model," Computational Statistics & Data Analysis, Elsevier, vol. 50(7), pages 1855-1876, April.
