
A unifying causal framework for analyzing dataset shift-stable learning algorithms

Authors

Listed:
  • Subbaswamy Adarsh

    (Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, United States)

  • Chen Bryant

    (Brex Inc, San Francisco, California, United States)

  • Saria Suchi

    (Department of Computer Science, Johns Hopkins University and Bayesian Health, Baltimore, MD 21218, United States)

Abstract

Recent interest in the external validity of prediction models (i.e., the problem of different train and test distributions, known as dataset shift) has produced many methods for finding predictive distributions that are invariant to dataset shifts and can be used for prediction in new, unseen environments. However, these methods consider different types of shifts and have been developed under disparate frameworks, making it difficult to theoretically analyze how solutions differ with respect to stability and accuracy. Taking a causal graphical view, we use a flexible graphical representation to express various types of dataset shifts. Given a known graph of the data generating process, we show that all invariant distributions correspond to a causal hierarchy of graphical operators, which disable the edges in the graph that are responsible for the shifts. The hierarchy provides a common theoretical underpinning for understanding when and how stability to shifts can be achieved, and in what ways stable distributions can differ. We use it to establish conditions for minimax optimal performance across environments, and derive new algorithms that find optimal stable distributions. By using this new perspective, we empirically demonstrate that there is a tradeoff between minimax and average performance.

Suggested Citation

  • Subbaswamy Adarsh & Chen Bryant & Saria Suchi, 2022. "A unifying causal framework for analyzing dataset shift-stable learning algorithms," Journal of Causal Inference, De Gruyter, vol. 10(1), pages 64-89, January.
  • Handle: RePEc:bpj:causin:v:10:y:2022:i:1:p:64-89:n:2
    DOI: 10.1515/jci-2021-0042

    Download full text from publisher

    File URL: https://doi.org/10.1515/jci-2021-0042
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Paul Ellickson & Sanjog Misra, 2012. "Enriching interactions: Incorporating outcome data into static discrete games," Quantitative Marketing and Economics (QME), Springer, vol. 10(1), pages 1-26, March.
    2. Hajime Seya & Junyi Zhang & Makoto Chikaraishi & Ying Jiang, 2020. "Decisions on truck parking place and time on expressways: an analysis using digital tachograph data," Transportation, Springer, vol. 47(2), pages 555-583, April.
    3. Lo Turco, Alessia & Maggioni, Daniela, 2018. "Effects of Islamic religiosity on bilateral trust in trade: The case of Turkish exports," Journal of Comparative Economics, Elsevier, vol. 46(4), pages 947-965.
    4. Manuel Arellano & Stéphane Bonhomme, 2017. "Quantile Selection Models With an Application to Understanding Changes in Wage Inequality," Econometrica, Econometric Society, vol. 85, pages 1-28, January.
    5. Larry W. Hunter, 2000. "What Determines Job Quality in Nursing Homes?," ILR Review, Cornell University, ILR School, vol. 53(3), pages 463-481, April.
    6. S. I. Dolgikh & B. S. Potanin, 2023. "The Impact of Public Administration on the Efficiency of Russian Firms," Studies on Russian Economic Development, Springer, vol. 34(1), pages 59-67, February.
    7. Stephan Wachtel & Thomas Otter, 2013. "Successive Sample Selection and Its Relevance for Management Decisions," Marketing Science, INFORMS, vol. 32(1), pages 170-185, September.
    8. Tocco, Barbara & Bailey, Alastair & Davidova, Sophia, 2013. "Determinants to Leave Agriculture and Change Occupational Sector: Evidence from an Enlarged EU," Working papers 155704, Factor Markets, Centre for European Policy Studies.
    9. Benitez-Silva, Hugo & Dwyer, Debra S., 2006. "Expectation formation of older married couples and the rational expectations hypothesis," Labour Economics, Elsevier, vol. 13(2), pages 191-218, April.
    10. Nabanita Datta Gupta & Mona Larsen, 2010. "The impact of health on individual retirement plans: self‐reported versus diagnostic measures," Health Economics, John Wiley & Sons, Ltd., vol. 19(7), pages 792-813, July.
    11. Keisuke Hirano & Guido W. Imbens & Geert Ridder & Donald B. Rubin, 2001. "Combining Panel Data Sets with Attrition and Refreshment Samples," Econometrica, Econometric Society, vol. 69(6), pages 1645-1659, November.
    12. Arndt Reichert & Harald Tauchmann, 2014. "When outcome heterogeneously matters for selection: a generalized selection correction estimator," Applied Economics, Taylor & Francis Journals, vol. 46(7), pages 762-768, March.
    13. Takashi Yamagata & Chris Orme, 2005. "On Testing Sample Selection Bias Under the Multicollinearity Problem," Econometric Reviews, Taylor & Francis Journals, vol. 24(4), pages 467-481.
    14. João Pereira & Aurora Galego, 2014. "Inter-Regional Wage Differentials in Portugal: An Analysis Across the Wage Distribution," Regional Studies, Taylor & Francis Journals, vol. 48(9), pages 1529-1546, September.
    15. Hugo Benítez-Silva & Debra S. Dwyer, 2003. "What to Expect when you are Expecting Rationality: Testing Rational Expectations using Micro Data," Working Papers wp037, University of Michigan, Michigan Retirement Research Center.
    16. Beltran, Jesusa C. & Pannell, David J. & Doole, Graeme J. & White, Benedict, 2011. "Factors that affect the use of herbicides in Philippine rice farming systems," Working Papers 108769, University of Western Australia, School of Agricultural and Resource Economics.
    17. Kässi, Otto, 2012. "Uncertainty and Heterogeneity in Returns to Education: Evidence from Finland," MPRA Paper 43503, University Library of Munich, Germany.
    18. Robert Innes & Abdoul G. Sam, 2008. "Voluntary Pollution Reductions and the Enforcement of Environmental Law: An Empirical Study of the 33/50 Program," Journal of Law and Economics, University of Chicago Press, vol. 51(2), pages 271-296, May.
    19. Giambona, Erasmo & Golec, Joseph, 2010. "Strategic trading in the wrong direction by a large institutional insider," Journal of Empirical Finance, Elsevier, vol. 17(1), pages 1-22, January.
    20. Christopher R. Bollinger & Barry T. Hirsch, 2010. "GDP & Beyond – die europäische Perspektive," RatSWD Working Papers 165, German Data Forum (RatSWD).
