
Calibrating for Class Weights by Modeling Machine Learning

Author

Listed:
  • Andrew Caplin
  • Daniel Martin
  • Philip Marx

Abstract

A much-studied issue is the extent to which the confidence scores provided by machine learning algorithms are calibrated to ground truth probabilities. Our starting point is that calibration is seemingly incompatible with class weighting, a technique often employed when one class is less common (class imbalance) or with the hope of achieving some external objective (cost-sensitive learning). We provide a model-based explanation for this incompatibility and use our anthropomorphic model to generate a simple method of recovering likelihoods from an algorithm that is miscalibrated due to class weighting. We validate this approach in the binary pneumonia detection task of Rajpurkar, Irvin, Zhu, et al. (2017).
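To make concrete the kind of recovery the abstract describes, here is a minimal sketch. It is not taken from the paper. It assumes a binary classifier whose positive class was up-weighted by a factor w and whose reported score s relates to the true probability p by s = w*p / (w*p + (1 - p)), which is the standard odds-scaling correction for class weighting. The function names and the parameter `pos_weight` are hypothetical, chosen for illustration only.

```python
def weight_score(prob: float, pos_weight: float) -> float:
    """Score a class-weighted model would report for true probability `prob`.

    Assumption (not from the paper): weighting the positive class by
    `pos_weight` multiplies the odds of the positive class by `pos_weight`.
    """
    if not 0.0 <= prob <= 1.0:
        raise ValueError("prob must lie in [0, 1]")
    if pos_weight <= 0:
        raise ValueError("pos_weight must be positive")
    return pos_weight * prob / (pos_weight * prob + (1.0 - prob))


def unweight_score(score: float, pos_weight: float) -> float:
    """Invert the odds scaling to recover a probability from a weighted score."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if pos_weight <= 0:
        raise ValueError("pos_weight must be positive")
    return score / (score + pos_weight * (1.0 - score))


if __name__ == "__main__":
    # A rare class with true probability 0.1, trained with weight 9 on positives.
    s = weight_score(0.1, 9.0)
    p = unweight_score(s, 9.0)
    print(f"weighted score = {s:.3f}, recovered probability = {p:.3f}")
```

Under this assumed relationship, a case with true probability 0.1 is reported at 0.5 by a model trained with weight 9 on the positive class, and dividing the odds by the weight restores 0.1. Whether the paper's method takes exactly this form should be checked against the full text.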

Suggested Citation

  • Andrew Caplin & Daniel Martin & Philip Marx, 2022. "Calibrating for Class Weights by Modeling Machine Learning," Papers 2205.04613, arXiv.org, revised Jul 2022.
  • Handle: RePEc:arx:papers:2205.04613

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2205.04613
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Andrew Caplin & Daniel Martin, 2015. "A Testable Theory of Imperfect Perception," Economic Journal, Royal Economic Society, vol. 125(582), pages 184-202, February.
    2. Emir Shuford & Arthur Albert & H. Edward Massengill, 1966. "Admissible probability measurement procedures," Psychometrika, Springer;The Psychometric Society, vol. 31(2), pages 125-145, June.

    Citations

    Cited by:

    1. Naudé, Wim, 2023. "Artificial Intelligence and the Economics of Decision-Making," IZA Discussion Papers 16000, Institute of Labor Economics (IZA).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Dirk Bergemann & Stephen Morris, 2019. "Information Design: A Unified Perspective," Journal of Economic Literature, American Economic Association, vol. 57(1), pages 44-95, March.
    2. Brocas, Isabelle & Carrillo, Juan D., 2021. "Value computation and modulation: A neuroeconomic theory of self-control as constrained optimization," Journal of Economic Theory, Elsevier, vol. 198(C).
    3. Manski, Charles F., 2006. "Interpreting the predictions of prediction markets," Economics Letters, Elsevier, vol. 91(3), pages 425-429, June.
    4. Pamela Giustinelli & Charles F. Manski, 2018. "Survey Measures Of Family Decision Processes For Econometric Analysis Of Schooling Decisions," Economic Inquiry, Western Economic Association International, vol. 56(1), pages 81-99, January.
    5. Daniel Martin & Philip Marx, 2022. "A Robust Test of Prejudice for Discrimination Experiments," Management Science, INFORMS, vol. 68(6), pages 4527-4536, June.
    6. Victor Jose, 2009. "A Characterization for the Spherical Scoring Rule," Theory and Decision, Springer, vol. 66(3), pages 263-281, March.
    7. Emerson Melo, 2021. "Learning in Random Utility Models Via Online Decision Problems," Papers 2112.10993, arXiv.org, revised Aug 2022.
    8. Roberto Leon-Gonzalez & Blessings Majoni, 2023. "Exact Likelihood for Inverse Gamma Stochastic Volatility Models," Working Paper series 23-11, Rimini Centre for Economic Analysis.
    9. Dirk Bergemann & Stephen Morris, 2013. "The Comparison of Information Structures in Games: Bayes Correlated Equilibrium and Individual Sufficiency," Cowles Foundation Discussion Papers 1909R, Cowles Foundation for Research in Economics, Yale University, revised May 2014.
    10. Francesco Giancaterini & Alain Hecq & Claudio Morana, 2022. "Is Climate Change Time-Reversible?," Econometrics, MDPI, vol. 10(4), pages 1-18, December.
    11. Geweke, John & Amisano, Gianni, 2011. "Optimal prediction pools," Journal of Econometrics, Elsevier, vol. 164(1), pages 130-141, September.
    12. Martin, Daniel, 2017. "Strategic pricing with rational inattention to quality," Games and Economic Behavior, Elsevier, vol. 104(C), pages 131-145.
    13. Carlos Alós-Ferrer & Ernst Fehr & Nick Netzer, 2021. "Time Will Tell: Recovering Preferences When Choices Are Noisy," Journal of Political Economy, University of Chicago Press, vol. 129(6), pages 1828-1877.
    14. Cristina Gualdani & Shruti Sinha, 2019. "Identification in discrete choice models with imperfect information," Papers 1911.04529, arXiv.org, revised Dec 2023.
    15. Bergemann, Dirk & Morris, Stephen, 2016. "Bayes correlated equilibrium and the comparison of information structures in games," Theoretical Economics, Econometric Society, vol. 11(2), May.
    16. Caplin, Andrew, 2014. "Rational inattention and revealed preference: The data-theoretic approach to economic modeling," Research in Economics, Elsevier, vol. 68(4), pages 295-305.
    17. Delavande, Adeline & Zafar, Basit, 2018. "Information and anti-American attitudes," Journal of Economic Behavior & Organization, Elsevier, vol. 149(C), pages 1-31.
    18. Fabian Krüger & Sebastian Lerch & Thordis Thorarinsdottir & Tilmann Gneiting, 2021. "Predictive Inference Based on Markov Chain Monte Carlo Output," International Statistical Review, International Statistical Institute, vol. 89(2), pages 274-301, August.
    19. Andrew Caplin & Mark Dean & John Leahy, 2022. "Rationally Inattentive Behavior: Characterizing and Generalizing Shannon Entropy," Journal of Political Economy, University of Chicago Press, vol. 130(6), pages 1676-1715.
    20. Tamer Boyaci & Yalçin Akçay, 2016. "Pricing when customers have limited attention," ESMT Research Working Papers ESMT-16-01, ESMT European School of Management and Technology, revised 19 Jan 2017.
