Printed from https://ideas.repec.org/a/spr/scient/v130y2025i7d10.1007_s11192-025-05341-y.html

Examining linguistic shifts in academic writing before and after the launch of ChatGPT: a study on preprint papers

Author

Listed:
  • Tong Bao

    (Nanjing University of Science and Technology)

  • Yi Zhao

    (Nanjing University of Science and Technology)

  • Jin Mao

    (Wuhan University)

  • Chengzhi Zhang

    (Nanjing University of Science and Technology)

Abstract

Large Language Models (LLMs), such as ChatGPT, have prompted concerns about their impact on academic writing. Existing studies have primarily examined LLM usage in academic writing through quantitative approaches, such as word frequency statistics and probability-based analyses. However, few have systematically examined the potential impact of LLMs on the linguistic characteristics of academic writing. To address this gap, we conducted a large-scale analysis of 823,798 abstracts published in the last decade, drawn from the arXiv dataset. A linguistic analysis of features such as the frequency of LLM-preferred words, lexical complexity, syntactic complexity, cohesion, readability and sentiment indicates a significant increase in the proportion of LLM-preferred words in abstracts, revealing the widespread influence of LLMs on academic writing. Additionally, we observed an increase in lexical complexity and sentiment in the abstracts, but a decrease in syntactic complexity, suggesting that LLMs introduce more new vocabulary and simplify sentence structure. However, the significant decrease in cohesion and readability indicates that abstracts have fewer connecting words and are becoming more difficult to read. Moreover, our analysis reveals that scholars with weaker English proficiency were more likely to use LLMs for academic writing and focused on improving the overall logic and fluency of their abstracts. Finally, at the discipline level, we found that scholars in Computer Science showed more pronounced changes in writing style, while the changes in Mathematics were minimal.
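As a rough illustration of the kind of per-abstract features the abstract mentions, the sketch below computes the share of LLM-preferred words, a type-token ratio as a simple stand-in for lexical complexity, and mean sentence length as a crude stand-in for syntactic complexity. It is not the authors' method: the word list, the function names and the choice of proxies are all assumptions made for illustration only.

```python
import re

# Hypothetical placeholder list: the study's actual LLM-preferred
# word list is not given in this record.
LLM_PREFERRED = {"delve", "intricate", "pivotal", "showcase"}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def abstract_features(text):
    """Return simple proxy features for one abstract (assumed proxies, see lead-in)."""
    tokens = tokenize(text)
    sentences = split_sentences(text)
    n = len(tokens)
    if n == 0:
        return {
            "preferred_share": 0.0,
            "type_token_ratio": 0.0,
            "mean_sentence_length": 0.0,
        }
    preferred = sum(1 for t in tokens if t in LLM_PREFERRED)
    return {
        # Proportion of tokens that appear in the placeholder word list.
        "preferred_share": preferred / n,
        # Distinct tokens divided by all tokens (a basic lexical-diversity proxy).
        "type_token_ratio": len(set(tokens)) / n,
        # Average number of word tokens per sentence.
        "mean_sentence_length": n / len(sentences),
    }
```

Averaging these values over abstracts grouped by submission year would give a before/after comparison in the spirit of the study, although published measures of cohesion, readability and sentiment typically rely on dedicated tools rather than counts this simple.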

Suggested Citation

  • Tong Bao & Yi Zhao & Jin Mao & Chengzhi Zhang, 2025. "Examining linguistic shifts in academic writing before and after the launch of ChatGPT: a study on preprint papers," Scientometrics, Springer;Akadémiai Kiadó, vol. 130(7), pages 3597-3627, July.
  • Handle: RePEc:spr:scient:v:130:y:2025:i:7:d:10.1007_s11192-025-05341-y
    DOI: 10.1007/s11192-025-05341-y

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11192-025-05341-y
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s11192-025-05341-y?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Melitz, Jacques & Toubal, Farid, 2014. "Native language, spoken language, translation and trade," Journal of International Economics, Elsevier, vol. 93(2), pages 351-363.
    2. Song, Ningyuan & Chen, Kejun & Zhao, Yuehua, 2023. "Understanding writing styles of scientific papers in the IS-LS domain: Evidence from abstracts over the past three decades," Journal of Informetrics, Elsevier, vol. 17(1).
    3. Bikun Chen & Dannan Deng & Zhouyan Zhong & Chengzhi Zhang, 2020. "Exploring linguistic characteristics of highly browsed and downloaded academic articles," Scientometrics, Springer;Akadémiai Kiadó, vol. 122(3), pages 1769-1790, March.
    4. Leonardo Costa Ribeiro & Márcia Siqueira Rapini & Leandro Alves Silva & Eduardo Motta Albuquerque, 2018. "Growth patterns of the network of international collaboration in science," Scientometrics, Springer;Akadémiai Kiadó, vol. 114(1), pages 159-179, January.
    5. Chao Lu & Yi Bu & Jie Wang & Ying Ding & Vetle Torvik & Matthew Schnaars & Chengzhi Zhang, 2019. "Examining scientific writing styles from the perspective of linguistic complexity," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 70(5), pages 462-475, May.
    6. Gui Wang & Hui Wang & Xinyi Sun & Nan Wang & Li Wang, 2023. "Linguistic complexity in scientific writing: A large-scale diachronic study from 1821 to 1920," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(1), pages 441-460, January.
    7. Corrêa Jr., Edilson A. & Silva, Filipi N. & da F. Costa, Luciano & Amancio, Diego R., 2017. "Patterns of authors contribution in scientific manuscripts," Journal of Informetrics, Elsevier, vol. 11(2), pages 498-510.
    8. Holly Else, 2023. "Abstracts written by ChatGPT fool scientists," Nature, Nature, vol. 613(7944), pages 423-423, January.
    9. Amber Dance, 2012. "Authorship: Who's on first?," Nature, Nature, vol. 489(7417), pages 591-593, September.
    10. Hongyu Zhou & Raf Guns & Tim C. E. Engels, 2023. "Towards indicating interdisciplinarity: Characterizing interdisciplinary knowledge flow," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 74(11), pages 1325-1340, November.

    Citations

    Cited by:

    1. Nikolaos Askitas, 2025. "The Behavioral Signature of GenAI in Scientific Communication," CESifo Working Paper Series 12069, CESifo.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Sun, Zhuanlan & He, Dongjin & Li, Yiwei, 2024. "How the readability of manuscript before journal submission advantages peer review process: Evidence from biomedical scientific publications," Journal of Informetrics, Elsevier, vol. 18(3).
    2. Song, Ningyuan & Chen, Kejun & Zhao, Yuehua, 2023. "Understanding writing styles of scientific papers in the IS-LS domain: Evidence from abstracts over the past three decades," Journal of Informetrics, Elsevier, vol. 17(1).
    3. Kun Sun & Haitao Liu & Wenxin Xiong, 2021. "The evolutionary pattern of language in scientific writings: A case study of Philosophical Transactions of Royal Society (1665–1869)," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(2), pages 1695-1724, February.
    4. Gui Wang & Hui Wang & Xinyi Sun & Nan Wang & Li Wang, 2023. "Linguistic complexity in scientific writing: A large-scale diachronic study from 1821 to 1920," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(1), pages 441-460, January.
    5. Brito, Ana C.M. & Silva, Filipi N. & de Arruda, Henrique F. & Comin, Cesar H. & Amancio, Diego R. & Costa, Luciano da F., 2021. "Classification of abrupt changes along viewing profiles of scientific articles," Journal of Informetrics, Elsevier, vol. 15(2).
    6. Fan Pan & Yiying Yang, 2025. "Diachronic change in lexical complexity of research articles (1970–2020): economics vs. medicine," Scientometrics, Springer;Akadémiai Kiadó, vol. 130(3), pages 1789-1812, March.
    7. Ali Zackery & Joseph Amankwah-Amoah & Zahra Heidari Darani & Shiva Ghasemi, 2022. "COVID-19 Research in Business and Management: A Review and Future Research Agenda," Sustainability, MDPI, vol. 14(16), pages 1-32, August.
    8. Magnus Lodefalk & Fredrik Sjöholm & Aili Tang, 2022. "International trade and labour market integration of immigrants," The World Economy, Wiley Blackwell, vol. 45(6), pages 1650-1689, June.
    9. Lin, Wenlian & Cao, Jerry & Zhou, Sili & Li, Yong, 2024. "Political affinity, multilateralism, and foreign direct investment worldwide," Journal of International Financial Markets, Institutions and Money, Elsevier, vol. 94(C).
    10. Lo Turco, Alessia & Maggioni, Daniela, 2018. "Effects of Islamic religiosity on bilateral trust in trade: The case of Turkish exports," Journal of Comparative Economics, Elsevier, vol. 46(4), pages 947-965.
    11. Tamara Gurevich & Peter R. Herman & Farid Toubal & Y. Yotov, 2024. "The Domestic and International Common Language Database," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) hal-04682625, HAL.
    12. Victor Ginsburgh & Shlomo Weber, 2020. "The Economics of Language," Journal of Economic Literature, American Economic Association, vol. 58(2), pages 348-404, June.
    13. GINSBURGH, Victor & MELITZ, Jacques & TOUBAL, Farid, 2014. "Foreign language learnings: An econometric analysis," LIDAM Discussion Papers CORE 2014049, Université catholique de Louvain, Center for Operations Research and Econometrics (CORE).
    14. Hans Pohl, 2021. "Internationalisation, innovation, and academic–corporate co-publications," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(2), pages 1329-1358, February.
    15. Wessel, Jan, 2019. "Evaluating the transport-mode-specific trade effects of different transport infrastructure types," Transport Policy, Elsevier, vol. 78(C), pages 42-57.
    16. David M. Kemme & Bhavik Parikh & Tanja Steigner, 2017. "Tax Havens, Tax Evasion and Tax Information Exchange Agreements in the OECD," European Financial Management, European Financial Management Association, vol. 23(3), pages 519-542, June.
    17. Jing Yan, 2018. "Do Merger Laws Deter Cross‐Border Mergers and Acquisitions?," Australian Economic Papers, Wiley Blackwell, vol. 57(3), pages 376-393, September.
    18. Marchal, Léa & Naiditch, Claire, 2016. "A micro-funded theory of multilateral resistance to migration," Kiel Working Papers 2051, Kiel Institute for the World Economy (IfW Kiel).
    19. Joan Llull, 2018. "The Effect of Immigration on Wages: Exploiting Exogenous Variation at the National Level," Journal of Human Resources, University of Wisconsin Press, vol. 53(3), pages 608-662.
    20. Gianluca Orefice & Hillel Rapoport & Gianluca Santoni, 2021. "How Do Immigrants Promote Exports? Networks, Knowledge, Diversity," CESifo Working Paper Series 9288, CESifo.
