
Finding Doppelgängers in Scopus: how to build scientists control groups using sosia

Authors

  • Michael E. Rose (Max Planck Institute for Innovation and Competition)

  • Stefano H. Baruffaldi (Max Planck Institute for Innovation and Competition; Politecnico di Milano)

Abstract

Constructing control groups of scientists is often a daunting task. This paper presents sosia, an open-source Python-based software designed to query the Scopus database efficiently via its RESTful API. sosia searches for researchers whose publication profiles, up to a given year, are similar to that of a given researcher, based on all main standard bibliometric indicators. The user can flexibly choose a set of parameters to restrict the search to narrower or wider boundaries upfront, and can obtain additional similarity indicators to select a subset of authors after the search. Advanced settings also allow narrowing the search to a list of affiliations and minimizing possible errors arising from ambiguous author profiles. A basic search can be set up in a few commands, and the average computation time ranges between 60 and 300 minutes. We discuss the functioning, characteristics, limitations and possible extensions of the software.
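
The matching idea described in the abstract can be illustrated with a minimal sketch. It does not use sosia or the Scopus API; the `Profile` class, the `find_matches` function, the choice of indicators and the margin values are all hypothetical and serve only to show how upfront search boundaries might filter candidate authors around a given researcher.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical publication profile of an author up to a given year."""
    author_id: str
    first_year: int    # year of first publication
    n_pubs: int        # number of publications
    n_cites: int       # number of citations received
    n_coauthors: int   # number of distinct coauthors


def within_share(value: int, target: int, share: float) -> bool:
    """True if value deviates from target by at most share * target."""
    return abs(value - target) <= share * target


def find_matches(target, candidates, year_margin=1, share=0.2):
    """Return candidates whose indicators all fall inside the margins.

    year_margin is an absolute number of years; share is a relative
    tolerance applied to publications, citations and coauthors.
    Both defaults are illustrative assumptions, not sosia's defaults.
    """
    matches = []
    for cand in candidates:
        if cand.author_id == target.author_id:
            continue  # the researcher is never their own control
        if abs(cand.first_year - target.first_year) > year_margin:
            continue
        if not within_share(cand.n_pubs, target.n_pubs, share):
            continue
        if not within_share(cand.n_cites, target.n_cites, share):
            continue
        if not within_share(cand.n_coauthors, target.n_coauthors, share):
            continue
        matches.append(cand)
    return matches


# Made-up example data for illustration only.
target = Profile("T", 2005, 40, 800, 30)
candidates = [
    Profile("A", 2006, 44, 900, 28),   # close on every indicator
    Profile("B", 2001, 40, 800, 30),   # started four years earlier
    Profile("C", 2005, 60, 800, 30),   # far more publications
]
matched_ids = [c.author_id for c in find_matches(target, candidates)]
```

Widening `year_margin` or `share` makes the search boundaries less narrow and returns more candidates, which mirrors the trade-off between search breadth and similarity mentioned in the abstract.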

Suggested Citation

  • Michael E. Rose & Stefano H. Baruffaldi, 2025. "Finding Doppelgängers in Scopus: how to build scientists control groups using sosia," Scientometrics, Springer;Akadémiai Kiadó, vol. 130(5), pages 3013-3028, May.
  • Handle: RePEc:spr:scient:v:130:y:2025:i:5:d:10.1007_s11192-025-05298-y
    DOI: 10.1007/s11192-025-05298-y

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11192-025-05298-y
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Massimiliano Coda-Zabetta & Francesco Lissoni & Ernest Miguelez, 2024. "Star recruitment and internationalization effects: an analysis of the Alexander von Humboldt professorship programme," Economia e Politica Industriale: Journal of Industrial and Business Economics, Springer;Associazione Amici di Economia e Politica Industriale, vol. 51(3), pages 667-690, September.
    2. Chen, Kaihua & Ding, Yi & Zhao, Binbin & Guo, Rui & Ning, Lutao, 2025. "Benefits beyond the local network: Does indirect international collaboration ties contribute to research performance for young scientists?," Research Policy, Elsevier, vol. 54(5).
    3. Khanna, Rajat, 2021. "Aftermath of a tragedy: A star's death and coauthors’ subsequent productivity," Research Policy, Elsevier, vol. 50(2).
    4. Khanna, Rajat, 2023. "Passing the torch of knowledge: Star death, collaborative ties, and knowledge creation," Research Policy, Elsevier, vol. 52(1).
    5. Drivas, Kyriakos & Kremmydas, Dimitris, 2020. "The Matthew effect of a journal's ranking," Research Policy, Elsevier, vol. 49(4).
    6. Yadav, Anil & McHale, John & O'Neill, Stephen, 2023. "How does co-authoring with a star affect scientists' productivity? Evidence from small open economies," Research Policy, Elsevier, vol. 52(1).
    7. Graddy-Reed, Alexandra & Lanahan, Lauren & D'Agostino, Jesse, 2021. "Training across the academy: The impact of R&D funding on graduate students," Research Policy, Elsevier, vol. 50(5).
    8. Cristelli, Gabriele & Lissoni, Francesco, 2020. "Free movement of inventors: open-border policy and innovation in Switzerland," MPRA Paper 107433, University Library of Munich, Germany.
    9. Ejermo, Olof & Sofer, Yotam, 2024. "When colleges graduate: Micro-level effects on publications and scientific organization," Research Policy, Elsevier, vol. 53(6).
    10. Agrawal, Ajay & McHale, John & Oettl, Alexander, 2019. "Does scientist immigration harm US science? An examination of the knowledge spillover channel," Research Policy, Elsevier, vol. 48(5), pages 1248-1259.
    11. Corsini, Alberto & Pezzoni, Michele, 2023. "Does grant funding foster research impact? Evidence from France," Journal of Informetrics, Elsevier, vol. 17(4).
    12. Sebastian Hager & Carlo Schwarz & Fabian Waldinger, 2024. "Measuring Science: Performance Metrics and the Allocation of Talent," American Economic Review, American Economic Association, vol. 114(12), pages 4052-4090, December.
    13. Liu, Meijun & Hu, Xiao, 2021. "Will collaborators make scientists move? A Generalized Propensity Score analysis," Journal of Informetrics, Elsevier, vol. 15(1).
    14. Hmaddi, Ouafaa & Lanahan, Lauren & Murray, Alex, 2025. "Tracing entrepreneurial spillovers: Evidence from the U.S. State Small Business Credit initiative and Kickstarter," Research Policy, Elsevier, vol. 54(4).
    15. Matteo Prato & Fabrizio Ferraro, 2018. "Starstruck: How Hiring High-Status Employees Affects Incumbents’ Performance," Organization Science, INFORMS, vol. 29(5), pages 755-774, October.
    16. Galasso, Alberto & Luo, Hong & Zhu, Brooklynn, 2023. "Laboratory safety and research productivity," Research Policy, Elsevier, vol. 52(8).
    17. Xinyi Zhao & Samin Aref & Emilio Zagheni & Guy Stecklov, 2022. "Return migration of German-affiliated researchers: analyzing departure and return by gender, cohort, and discipline using Scopus bibliometric data 1996–2020," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(12), pages 7707-7729, December.
    18. Vikas A. Aggarwal & David H. Hsu, 2014. "Entrepreneurial Exits and Innovation," Management Science, INFORMS, vol. 60(4), pages 867-887, April.
    19. Prithwiraj Choudhury & Kirk Doran & Astrid Marinoni & Chungeun Yoon, 2022. "Loss of Peers and Individual Worker Performance: Evidence from H-1B Visa Denials," CESifo Working Paper Series 10152, CESifo.
    20. Zhu, Wanying & Jin, Ching & Ma, Yifang & Xu, Cong, 2023. "Earlier recognition of scientific excellence enhances future achievements and promotes persistence," Journal of Informetrics, Elsevier, vol. 17(2).

    More about this item

    JEL classification:

    • C00 - Mathematical and Quantitative Methods - - General - - - General
    • A14 - General Economics and Teaching - - General Economics - - - Sociology of Economics
