The Impact of Generative AI on Collaborative Open-Source Software Development: Evidence from GitHub Copilot

Authors

  • Fangchen Song
  • Ashish Agarwal
  • Wen Wen

Abstract

Generative artificial intelligence (AI) enables automated content production, including coding in software development, which can significantly influence developer participation and performance. To explore its impact on collaborative open-source software (OSS) development, we investigate the role of GitHub Copilot, a generative AI pair programmer, in OSS development where multiple distributed developers voluntarily collaborate. Using GitHub's proprietary Copilot usage data, combined with public OSS repository data obtained from GitHub, we find that Copilot use increases project-level code contributions by 5.9%. This gain is driven by a 2.1% increase in individual code contributions and a 3.4% rise in developer coding participation. However, these benefits come at a cost as coordination time for code integration increases by 8% due to more code discussions enabled by AI pair programmers. This reveals an important tradeoff: While AI expands who can contribute and how much they contribute, it slows coordination in collective development efforts. Despite this tension, the combined effect of these two competing forces remains positive, indicating a net gain in overall project-level productivity from using AI pair programmers. Interestingly, we also find the effects differ across developer roles. Peripheral developers show relatively smaller gains in project-level code contributions and face a higher increase in coordination time than core developers, likely due to the difference in their project familiarity. In summary, our study underscores the dual role of AI pair programmers in affecting project-level code contributions and coordination time in OSS development. Our findings on the differential effects between core and peripheral developers also provide important implications for the structure of OSS communities in the long run.

Suggested Citation

  • Fangchen Song & Ashish Agarwal & Wen Wen, 2024. "The Impact of Generative AI on Collaborative Open-Source Software Development: Evidence from GitHub Copilot," Papers 2410.02091, arXiv.org, revised Jul 2025.
  • Handle: RePEc:arx:papers:2410.02091

Download full text from publisher

  • File URL: http://arxiv.org/pdf/2410.02091
  • File Function: Latest version
  • Download Restriction: no
