Printed from https://ideas.repec.org/a/spr/qualqt/v55y2021i4d10.1007_s11135-020-01067-6.html

What’s in a text? Bridging the gap between quality and quantity in the digital era

Author

  • Roberto Franzosi (Emory University)

Abstract

The digital era has given us not only the world of big data but also the tools to deal with these mostly unstructured, mostly textual data: Natural Language Processing (NLP) tools. Yet most humanists and social scientists do not work with big data. They do not deal with millions of documents. Literary critics’ corpora are the handful of works produced by an author. Historians’ primary source documents number in the tens, perhaps hundreds. Social scientists deal with tens or hundreds of transcripts of focus groups and in-depth interviews, or at most a few thousand media articles. They analyze these data either qualitatively or quantitatively with a variety of manual or computer-assisted methodologies, from content analysis to frame analysis, discourse analysis, and quantitative narrative analysis. But, once developed, at least some of the NLP tools of automatic textual analysis and the data analytics visualization tools can be applied not just to big data but to small data as well. This paper illustrates how some of these tools can be used by focusing on a short first-person narrative. The NLP tools reveal patterns of language use perhaps not immediately discernible, thus proving useful in the analysis of even small data. But understanding and interpreting these patterns requires knowledge well beyond the NLP tools themselves. Humanists and social scientists need not fear computer scientists; rather, they need to learn to take advantage of them. NLP tools lay a bridge between quality and quantity, with much to be gained from a constant interaction between distant and close reading.

Suggested Citation

  • Roberto Franzosi, 2021. "What’s in a text? Bridging the gap between quality and quantity in the digital era," Quality & Quantity: International Journal of Methodology, Springer, vol. 55(4), pages 1513-1540, August.
  • Handle: RePEc:spr:qualqt:v:55:y:2021:i:4:d:10.1007_s11135-020-01067-6
    DOI: 10.1007/s11135-020-01067-6

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11135-020-01067-6
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s11135-020-01067-6?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    References listed on IDEAS

    1. Park, Han Woo & Leydesdorff, Loet, 2013. "Decomposing social and semantic networks in emerging “big data” research," Journal of Informetrics, Elsevier, vol. 7(3), pages 756-765.
    2. Roberto Franzosi, 2017. "A third road to the past? Historical scholarship in the age of big data," Historical Methods: A Journal of Quantitative and Interdisciplinary History, Taylor & Francis Journals, vol. 50(4), pages 227-244, October.
    3. Mauro Tebaldi & Marco Calaresu & Alberto Purpura, 2019. "The power of the President: a quantitative narrative analysis of the Diary of an Italian head of state (2006–2013)," Quality & Quantity: International Journal of Methodology, Springer, vol. 53(6), pages 3063-3095, November.

    Citations



    Cited by:

    1. Stefania Capogna, 2023. "Sociology between big data and research frontiers, a challenge for educational policies and skills," Quality & Quantity: International Journal of Methodology, Springer, vol. 57(1), pages 193-212, February.
    2. Sepideh Fahimifar & Khadijeh Mousavi & Fatemeh Mozaffari & Marcel Ausloos, 2023. "Identification of the most important external features of highly cited scholarly papers through 3 (i.e., Ridge, Lasso, and Boruta) feature selection data mining methods," Quality & Quantity: International Journal of Methodology, Springer, vol. 57(4), pages 3685-3712, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zhang, Yi & Huang, Ying & Porter, Alan L. & Zhang, Guangquan & Lu, Jie, 2019. "Discovering and forecasting interactions in big data research: A learning-enhanced bibliometric study," Technological Forecasting and Social Change, Elsevier, vol. 146(C), pages 795-807.
    2. Hyejin Park & Han Woo Park, 2018. "Two-side face of knowledge building using scientometric analysis," Quality & Quantity: International Journal of Methodology, Springer, vol. 52(6), pages 2815-2836, November.
    3. Maruccia, Ylenia & Solazzo, Gianluca & Del Vecchio, Pasquale & Passiante, Giuseppina, 2020. "Evidence from Network Analysis application to Innovation Systems and Quintuple Helix," Technological Forecasting and Social Change, Elsevier, vol. 161(C).
    4. Yuchul Jung, 2017. "A semantic annotation framework for scientific publications," Quality & Quantity: International Journal of Methodology, Springer, vol. 51(3), pages 1009-1025, May.
    5. Jungwon Yoon & Joshua SungWoo Yang & Han Woo Park, 2017. "Quintuple helix structure of Sino-Korean research collaboration in science," Scientometrics, Springer;Akadémiai Kiadó, vol. 113(1), pages 61-81, October.
    6. Vivek Kumar Singh & Sumit Kumar Banshal & Khushboo Singhal & Ashraf Uddin, 2015. "Scientometric mapping of research on ‘Big Data’," Scientometrics, Springer;Akadémiai Kiadó, vol. 105(2), pages 727-741, November.
    7. Xu, Guannan & Wu, Yuchen & Minshall, Tim & Zhou, Yuan, 2018. "Exploring innovation ecosystems across science, technology, and business: A case of 3D printing in China," Technological Forecasting and Social Change, Elsevier, vol. 136(C), pages 208-221.
    8. Arif Mehmood & Byung-Won On & Ingyu Lee & Han Woo Park & Gyu Sang Choi, 2018. "Corroborating social media echelon in cancer research," Quality & Quantity: International Journal of Methodology, Springer, vol. 52(2), pages 801-813, March.
    9. Kyujin Jung & Minsun Song & Han Woo Park, 2018. "Filling the gap between bureaucratic and adaptive approaches to crisis management: lessons from the Sewol Ferry sinking in South Korea," Quality & Quantity: International Journal of Methodology, Springer, vol. 52(1), pages 277-294, January.
    10. Han Woo Park, 2014. "An interview with Loet Leydesdorff: the past, present, and future of the triple helix in the age of big data," Scientometrics, Springer;Akadémiai Kiadó, vol. 99(1), pages 199-202, April.
    11. Chinnadurai Kathiravan & Murugesan Selvam & Desti Kannaiah & Kasilingam Lingaraja & Vadivel Thanikachalam, 2019. "On the relationship between weather and Agricultural Commodity Index in India: a study with reference to Dhaanya of NCDEX," Quality & Quantity: International Journal of Methodology, Springer, vol. 53(2), pages 667-683, March.
    12. Xiaozan Lyu & Rodrigo Costas, 2020. "How do academic topics shift across altmetric sources? A case study of the research area of Big Data," Scientometrics, Springer;Akadémiai Kiadó, vol. 123(2), pages 909-943, May.
    13. Jae-Hyuck Lee & Do-Kyun Kim, 2021. "Analysis of the Discriminatory Perceptions of Victims on Damage from Environmental Pollution: A Case Study of the Hebei Spirit Oil Spill in South Korea," Land, MDPI, vol. 10(10), pages 1-12, October.
    14. Marko M. Skoric, 2014. "The implications of big data for developing and transitional economies: Extending the Triple Helix?," Scientometrics, Springer;Akadémiai Kiadó, vol. 99(1), pages 175-186, April.
    15. Karlsson, Tobias, 2019. "Strikes and Lockouts in Sweden: Reconsidering Raphael’s List of Work Stoppages 1859-1902," Lund Papers in Economic History 192, Lund University, Department of Economic History.
    16. Hyo Chan Park & Jonghee M. Youn & Han Woo Park, 2019. "Global mapping of scientific information exchange using altmetric data," Quality & Quantity: International Journal of Methodology, Springer, vol. 53(2), pages 935-955, March.
    17. Mauro Tebaldi & Marco Calaresu & Alberto Purpura, 2022. "The actorness of the President of the Republic in Italian foreign policy: a quantitative narrative analysis of two case studies (1999–2013)," Quality & Quantity: International Journal of Methodology, Springer, vol. 56(4), pages 2035-2061, August.
    18. Ying Huang & Jannik Schuehle & Alan L. Porter & Jan Youtie, 2015. "A systematic method to create search strategies for emerging technologies based on the Web of Science: illustrated for ‘Big Data’," Scientometrics, Springer;Akadémiai Kiadó, vol. 105(3), pages 2005-2022, December.
    19. Han Woo Park & Jungwon Yoon & Loet Leydesdorff, 2016. "The normalization of co-authorship networks in the bibliometric evaluation: the government stimulation programs of China and Korea," Scientometrics, Springer;Akadémiai Kiadó, vol. 109(2), pages 1017-1036, November.
    20. Srijana Acharya & Han Woo Park, 2017. "Open data in Nepal: a webometric network analysis," Quality & Quantity: International Journal of Methodology, Springer, vol. 51(3), pages 1027-1043, May.
