IDEAS home Printed from https://ideas.repec.org/p/hal/journl/hal-01354368.html

Mining team characteristics to predict Wikipedia article quality

Author

  • Grace Gimon Betancourt

    (LUSSI - Département Logique des Usages, Sciences sociales et Sciences de l'Information - UEB - Université européenne de Bretagne - European University of Brittany - Télécom Bretagne - IMT - Institut Mines-Télécom [Paris])

  • Armando Segnini

    (LUSSI - Département Logique des Usages, Sciences sociales et Sciences de l'Information - UEB - Université européenne de Bretagne - European University of Brittany - Télécom Bretagne - IMT - Institut Mines-Télécom [Paris])

  • Carlos Trabuco

    (LUSSI - Département Logique des Usages, Sciences sociales et Sciences de l'Information - UEB - Université européenne de Bretagne - European University of Brittany - Télécom Bretagne - IMT - Institut Mines-Télécom [Paris])

  • Amira Rezgui

    (LUSSI - Département Logique des Usages, Sciences sociales et Sciences de l'Information - UEB - Université européenne de Bretagne - European University of Brittany - Télécom Bretagne - IMT - Institut Mines-Télécom [Paris], MARSOUIN - Môle Armoricain de Recherche sur la SOciété de l'information et des usages d'INternet - UR - Université de Rennes - UEB - Université européenne de Bretagne - European University of Brittany - UBS - Université de Bretagne Sud - ENSAI - Ecole Nationale de la Statistique et de l'Analyse de l'Information [Bruz] - UBO - Université de Brest - Télécom Bretagne - IMT - Institut Mines-Télécom [Paris] - UR2 - Université de Rennes 2, ICI - Laboratoire Information, Coordination, Incitations - UEB - Université européenne de Bretagne - European University of Brittany - UBO - Université de Brest - Télécom Bretagne - IMT - Institut Mines-Télécom [Paris] - IBSHS - Institut Brestois des Sciences de l'Homme et de la Société - UBO - Université de Brest)

  • Nicolas Jullien

    (LUSSI - Département Logique des Usages, Sciences sociales et Sciences de l'Information - UEB - Université européenne de Bretagne - European University of Brittany - Télécom Bretagne - IMT - Institut Mines-Télécom [Paris], MARSOUIN - Môle Armoricain de Recherche sur la SOciété de l'information et des usages d'INternet - UR - Université de Rennes - UEB - Université européenne de Bretagne - European University of Brittany - UBS - Université de Bretagne Sud - ENSAI - Ecole Nationale de la Statistique et de l'Analyse de l'Information [Bruz] - UBO - Université de Brest - Télécom Bretagne - IMT - Institut Mines-Télécom [Paris] - UR2 - Université de Rennes 2, ICI - Laboratoire Information, Coordination, Incitations - UEB - Université européenne de Bretagne - European University of Brittany - UBO - Université de Brest - Télécom Bretagne - IMT - Institut Mines-Télécom [Paris] - IBSHS - Institut Brestois des Sciences de l'Homme et de la Société - UBO - Université de Brest)

Abstract

This study investigates which characteristics of virtual teams are good predictors of the quality of their output. We obtained the Spanish Wikipedia database dump and applied data mining techniques suited to large data sets to label every article according to its quality, by comparison with the Featured/Good Articles (FA/GA). We then constructed attributes describing the characteristics of the team that produced each article and, using decision tree methods, identified the most relevant characteristics of the teams that produced FA/GA. The team's maximum efficiency and the total length of contribution are the most important predictors. This article contributes to the literature on virtual team organization.
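The attribute-ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it only computes the entropy-based information gain that a decision tree would use to choose its first split, on a tiny synthetic table. The attribute names (`max_efficiency`, `total_length`, `team_size`), the label `is_fa_ga`, and all values are hypothetical and chosen purely for illustration.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def best_split_gain(values, labels):
    """Return (gain, threshold) for the best binary split `value <= threshold`."""
    base = entropy(labels)
    best = (0.0, None)
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        gain = base - weighted
        if gain > best[0]:
            best = (gain, threshold)
    return best


def rank_attributes(rows, label_key):
    """Rank attributes by the information gain of their best single split."""
    labels = [row[label_key] for row in rows]
    names = [key for key in rows[0] if key != label_key]
    ranking = []
    for name in names:
        gain, threshold = best_split_gain([row[name] for row in rows], labels)
        ranking.append((name, gain, threshold))
    return sorted(ranking, key=lambda item: item[1], reverse=True)


# Synthetic toy data (hypothetical, NOT taken from the paper):
# one row per article, describing the team that wrote it.
TOY_TEAMS = [
    {"max_efficiency": 0.9, "total_length": 12000, "team_size": 5, "is_fa_ga": 1},
    {"max_efficiency": 0.8, "total_length": 9000, "team_size": 12, "is_fa_ga": 1},
    {"max_efficiency": 0.7, "total_length": 15000, "team_size": 3, "is_fa_ga": 1},
    {"max_efficiency": 0.3, "total_length": 2000, "team_size": 4, "is_fa_ga": 0},
    {"max_efficiency": 0.2, "total_length": 10000, "team_size": 11, "is_fa_ga": 0},
    {"max_efficiency": 0.1, "total_length": 1000, "team_size": 6, "is_fa_ga": 0},
]

if __name__ == "__main__":
    for name, gain, threshold in rank_attributes(TOY_TEAMS, "is_fa_ga"):
        print(f"{name}: gain={gain:.3f}, threshold={threshold}")
```

A full decision tree would apply this split selection recursively to each resulting subset. The sketch stops at the root because ranking attributes by the gain of their first split is enough to show how such a method surfaces the most relevant team characteristics.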

Suggested Citation

  • Grace Gimon Betancourt & Armando Segnini & Carlos Trabuco & Amira Rezgui & Nicolas Jullien, 2016. "Mining team characteristics to predict Wikipedia article quality," Post-Print hal-01354368, HAL.
  • Handle: RePEc:hal:journl:hal-01354368
    Note: View the original document on HAL open archive server: https://hal.science/hal-01354368v1

    Download full text from publisher

    File URL: https://hal.science/hal-01354368v1/document
    Download Restriction: no



    More about this item

    Keywords

    Wikipedia; Epistemic Community; Article quality; Teaming;
