
Enhancing Propensity Score Analysis with data Missing Not at Random: Introducing Dual-Forest Proximity Imputation

Authors

  • Lee, Yongseok (University of Florida)
  • Leite, Walter

Abstract

Researchers who use propensity score analysis (PSA) to estimate treatment effects from secondary data may have to handle data that are missing not at random (MNAR). Existing methods for PSA with MNAR data model the missing data mechanisms with logistic regression, which requires manual specification of functional forms and is difficult to implement with a large number of covariates. To overcome these limitations, this study proposes alternatives to existing methods that replace logistic regression with a random forest. It also introduces the Dual-Forest Proximity imputation method, which leverages two types of random forest proximity matrices and incorporates missing pattern information in each matrix. Results from a Monte Carlo simulation show that Dual-Forest Proximity imputation reduces bias more than existing and alternative methods under various types of MNAR mechanisms. A case study using data from the National Longitudinal Survey of Youth 1979 (NLSY79) is also provided.
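
For readers unfamiliar with random forest proximities, the following is a minimal sketch of the general idea only. It is not the authors' Dual-Forest Proximity method, whose two matrix types and use of missing pattern information are not specified in this abstract. It assumes the standard definition of proximity (the fraction of trees in which two cases fall in the same terminal node) and fills a missing value with a proximity-weighted average of observed values. The leaf assignments, the toy data, and the function names `proximity_matrix` and `impute_by_proximity` are hypothetical; in practice the leaf assignments would come from a fitted forest.

```python
from typing import List, Optional


def proximity_matrix(leaves: List[List[int]]) -> List[List[float]]:
    """Proximity of cases i and j = share of trees where they land in the same leaf.

    leaves[i][t] is the terminal-node id of case i in tree t (hypothetical input).
    """
    n = len(leaves)
    n_trees = len(leaves[0])
    prox = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            shared = sum(1 for a, b in zip(leaves[i], leaves[j]) if a == b)
            prox[i][j] = shared / n_trees
    return prox


def impute_by_proximity(
    values: List[Optional[float]], prox: List[List[float]]
) -> List[float]:
    """Replace each missing value (None) with a proximity-weighted mean of observed values."""
    observed = [k for k, v in enumerate(values) if v is not None]
    filled = []
    for i, v in enumerate(values):
        if v is not None:
            filled.append(v)
            continue
        weights = [prox[i][k] for k in observed]
        total = sum(weights)
        if total == 0:
            # No observed case shares a leaf: fall back to the unweighted mean.
            filled.append(sum(values[k] for k in observed) / len(observed))
        else:
            filled.append(
                sum(w * values[k] for w, k in zip(weights, observed)) / total
            )
    return filled


# Toy example (assumed data): 4 cases, 3 trees, one missing covariate value.
leaves = [
    [1, 4, 2],
    [1, 4, 3],
    [2, 5, 3],
    [2, 5, 2],
]
values = [10.0, None, 30.0, 20.0]

prox = proximity_matrix(leaves)
filled = impute_by_proximity(values, prox)
print(filled)
```

In this toy example the second case shares a leaf with the first case in two of three trees and with the third case in one of three, so its imputed value is pulled mostly toward the first case. How such proximities should be built and combined under MNAR mechanisms is the subject of the paper itself.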

Suggested Citation

  • Lee, Yongseok & Leite, Walter, 2025. "Enhancing Propensity Score Analysis with data Missing Not at Random: Introducing Dual-Forest Proximity Imputation," OSF Preprints ex8ad_v1, Center for Open Science.
  • Handle: RePEc:osf:osfxxx:ex8ad_v1
    DOI: 10.31219/osf.io/ex8ad_v1

    Download full text from publisher

    File URL: https://osf.io/download/686c8a04c129c534e103ac37/
    Download Restriction: no


    References listed on IDEAS

    1. van Buuren, Stef & Groothuis-Oudshoorn, Karin, 2011. "mice: Multivariate Imputation by Chained Equations in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 45(i03).
    2. Jeffrey M Wooldridge, 2010. "Econometric Analysis of Cross Section and Panel Data," MIT Press Books, The MIT Press, edition 2, volume 1, number 0262232588, December.
    3. James Heckman, 2013. "Sample selection bias as a specification error," Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 31(3), pages 129-137.
    4. G. V. Kass, 1980. "An Exploratory Technique for Investigating Large Quantities of Categorical Data," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 29(2), pages 119-127, June.
    5. Daniel McNeish, 2017. "Missing data methods for arbitrary missingness with small samples," Journal of Applied Statistics, Taylor & Francis Journals, vol. 44(1), pages 24-39, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Manuel S. González Canché, 2017. "Financial Benefits of Rapid Student Loan Repayment: An Analytic Framework Employing Two Decades of Data," The ANNALS of the American Academy of Political and Social Science, vol. 671(1), pages 154-182, May.
    2. Ichev, Riste & Valentinčič, Aljoša, 2025. "The effect of impact investing on performance of private firms," Research in International Business and Finance, Elsevier, vol. 73(PA).
    3. Sandra Müllbacher & Wolfgang Nagl, 2017. "Labour supply in Austria: an assessment of recent developments and the effects of a tax reform," Empirica, Springer;Austrian Institute for Economic Research;Austrian Economic Association, vol. 44(3), pages 465-486, August.
    4. Campbell, Randall C. & Nagel, Gregory L., 2016. "Private information and limitations of Heckman's estimator in banking and corporate finance research," Journal of Empirical Finance, Elsevier, vol. 37(C), pages 186-195.
    5. Chang, Yingying & Du, Xingqiang & Zeng, Quan, 2021. "Does environmental information disclosure mitigate corporate risk? Evidence from China," Journal of Contemporary Accounting and Economics, Elsevier, vol. 17(1).
    6. Seneshaw Tamru & Bart Minten, 2023. "Value addition and farmers: Evidence from coffee in Ethiopia," PLOS ONE, Public Library of Science, vol. 18(1), pages 1-21, January.
    7. Maximilian Klöckner & Christoph G. Schmidt & Stephan M. Wagner, 2022. "When Blockchain Creates Shareholder Value: Empirical Evidence from International Firm Announcements," Production and Operations Management, Production and Operations Management Society, vol. 31(1), pages 46-64, January.
    8. Riillo, Cesare Fabio Antonio & Peroni, Chiara, 2022. "Immigration and entrepreneurship in Europe: cross-country evidence," MPRA Paper 114580, University Library of Munich, Germany.
    9. Juergen Bitzer & Erkan Goeren, 2018. "Foreign Aid and Subnational Development: A Grid Cell Analysis," Working Papers V-407-18, University of Oldenburg, Department of Economics, revised Mar 2018.
    10. Sande, Jon Bingen & Haugland, Sven A., 2015. "Strategic performance effects of misaligned formal contracting: The mediating role of relational contracting," International Journal of Research in Marketing, Elsevier, vol. 32(2), pages 187-194.
    11. repec:aer:wpaper:323 is not listed on IDEAS
    12. Cirillo, Valeria & Fanti, Lucrezia & Mina, Andrea & Ricci, Andrea, 2023. "The adoption of digital technologies: Investment, skills, work organisation," Structural Change and Economic Dynamics, Elsevier, vol. 66(C), pages 89-105.
    13. Yusen Dong & Pengcheng Ma & Lanzhu Sun & Daniel Han Ming Chng, 2024. "Goodwill Hunting: Why and When Ultimate Controlling Owners Affect Their Firms’ Corporate Social Responsibility Performance," Journal of Business Ethics, Springer, vol. 193(3), pages 535-553, September.
    14. Zheng, Yanfeng & Liu, Jing & George, Gerard, 2010. "The dynamic impact of innovative capability and inter-firm network on firm valuation: A longitudinal study of biotechnology start-ups," Journal of Business Venturing, Elsevier, vol. 25(6), pages 593-609, November.
    15. Kohler, Wilhelm & Kukharskyy, Bohdan, 2019. "Offshoring under uncertainty," European Economic Review, Elsevier, vol. 118(C), pages 158-180.
    16. Breunig, Christoph & Mammen, Enno & Simoni, Anna, 2018. "Nonparametric estimation in case of endogenous selection," Journal of Econometrics, Elsevier, vol. 202(2), pages 268-285.
    17. Tang, Ryan W., 2023. "Institutional unpredictability and foreign exit−reentry dynamics: The moderating role of foreign ownership," Journal of World Business, Elsevier, vol. 58(2).
    18. Lachos, Victor H. & Prates, Marcos O. & Dey, Dipak K., 2021. "Heckman selection-t model: Parameter estimation via the EM-algorithm," Journal of Multivariate Analysis, Elsevier, vol. 184(C).
    19. Nan Zhang & Qiaozhuan Liang & Huiying Li & Xiao Wang, 2022. "The organizational relationship–based political connection and debt financing: Evidence from Chinese private firms," Bulletin of Economic Research, Wiley Blackwell, vol. 74(1), pages 69-105, January.
    20. Maggie Xiaoyang Chen & Aaditya Mattoo, 2008. "Regionalism in standards: good or bad for trade?," Canadian Journal of Economics, Canadian Economics Association, vol. 41(3), pages 838-863, August.
    21. Changhyun Kim & Yoonseok Zang & Heli Wang & Kate Niu, 2024. "When Do Corporate Good Deeds Become a Burden? The Role of Corporate Social Responsibility Following Negative Events," Journal of Business Ethics, Springer, vol. 192(2), pages 285-306, June.
