How to become a regular/big Wikipedia contributor? A robust fuzzy predictive model of the propensity to contribute based on a French survey

Author

Listed:
  • Pascal Jollivet
  • G. Gueydan
  • P. Jollivet
  • N. Jullien
  • Y. Moulier-Boutang
  • M. Vicente
  • Z. Zalila

Abstract

Value creation processes based on the economic dynamics of contribution (also referred to as crowdsourcing or web 2.0) are taking on more and more importance in contemporary capitalism (Moulier-Boutang 2010; Moulier-Boutang 2011a). This paper’s aim is twofold: 1) to contribute to a better understanding of the kind of positive externalities at stake through an in-depth study of one of the most archetypal organizations of this contributive economy, i.e. Wikimedia (France), with original and exhaustive data; 2) to explore some distinctive heuristic capacities of a robust fuzzy predictive modeling and fuzzy optimization tool that is supposed to provide better performance for non-linear modeling on quali-quanti datasets (Zalila & al., 2008a, 2008b). Our theoretical framework relies upon a socio-economic approach, coupling evolutionist and cognitive capitalism economics with digital humanities sociology. Our main hypothesis deals with the critical importance of socio-cognitive processes such as interactive learning and socialization processes in the economy of contribution. More specifically, we will try to solve an apparent paradox concerning three major predictive variables of our model of contribution, suggesting a conflict between two underlying processes.

A robust fuzzy predictive modeling approach

Fuzzy inference systems make it possible to model any decision-making process easily and intuitively, whether it represents a physical measurement, a mathematical computation or a human evaluation. The decision-making process is modelled as a deterministic relationship between inputs (the available knowledge about the situation) and an output (the decision to be taken), implicitly expressed by linguistic rules. The classical fuzzy modelling method derived from Artificial Intelligence uses available knowledge about the decision-making process (Zadeh, 1965, 1973, 1975). The fuzzy rules are built from a linguistic expression of this knowledge.
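To make the idea of linguistic rules concrete, here is a minimal sketch of a fuzzy inference system in Python. It is not the xtractis® method and does not reproduce any rule from this study: the two inputs (`belonging`, `early_activity`), their 0-10 scale, the membership functions and the rule conclusions are all hypothetical, and the aggregation used (a weighted average of constant conclusions, in the style of a zero-order Takagi-Sugeno system) is only one common choice.

```python
def falling(x, start, end):
    """Membership in a 'low' set: 1 up to start, 0 from end, linear in between."""
    if x <= start:
        return 1.0
    if x >= end:
        return 0.0
    return (end - x) / (end - start)


def rising(x, start, end):
    """Membership in a 'high' set, defined as the complement of falling()."""
    return 1.0 - falling(x, start, end)


def infer(belonging, early_activity):
    """Deterministically map two inputs on a 0-10 scale to an output in [0, 1]."""
    b_low, b_high = falling(belonging, 2, 8), rising(belonging, 2, 8)
    a_low, a_high = falling(early_activity, 2, 8), rising(early_activity, 2, 8)

    # Each rule is (firing strength, conclusion). AND is taken as min.
    # All conclusions are made-up illustration values.
    rules = [
        (min(b_high, a_high), 0.9),  # IF belonging is high AND activity is high
        (b_low, 0.1),                # IF belonging is low
        (min(b_high, a_low), 0.5),   # IF belonging is high AND activity is low
    ]
    # The three rules cover the whole input space, so total is never zero.
    total = sum(strength for strength, _ in rules)
    return sum(strength * conclusion for strength, conclusion in rules) / total


print(infer(10, 10), infer(0, 5), infer(5, 5))
```

Because every rule fires to a degree rather than all or nothing, the output varies smoothly between the rule conclusions as the inputs change.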
However, in many situations (subjective evaluation, high complexity of the decision process), it is not possible to define a priori the linguistic rules that explain the process. In those cases, the xtractis® approach proposes to extract the linguistic rules explaining the process automatically, through automatic learning (Zalila & al., 2008a, 2008b). This learning is performed on a database of several decision cases corresponding to different situations. This approach is similar to training a neural network on a learning base, while keeping the advantages of the fuzzy model paradigm over neural networks.

Every automatic learning or training process is exposed to a risk of overfitting (or overtraining). It is in fact very easy to obtain a model that predicts the points of a learning database exactly but cannot generalize to unknown points. Thus, a fuzzy system composed of as many rules as learning points could easily give exact predictions (the conclusion of each rule would be the known output value of the corresponding point), but a prediction at any other point would be meaningless. The same behaviour occurs when trying to fit a statistical model with too many parameters to a small data sample. It is therefore necessary to implement several means of minimizing the risk of overfitting during the learning process (regularization methods) and to check the generalization capacity of the generated models (validation methods). xtractis® integrates these two classes of methods in order to build robust predictive models, when they exist. We choose to assess the robustness of the models built with a Monte-Carlo cross-validation estimator with 150 cycles, each cycle randomly drawing 15% of the reference dataset as validation points.

First partial results: a robust model and 5 major predictive variables allowing a better understanding of the propensity to contribute

a. The 5 best predictive variables: a paradox between progressive learning processes and initial endowment?
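Before the variables themselves are discussed, the validation scheme described above (150 cycles, 15% of the dataset held out in each cycle) can be sketched as follows. This is an illustration under stated assumptions, not the xtractis® implementation: the synthetic data, the error measure (RMSE) and both toy models are hypothetical. The "memorizer" stands in for a system with one rule per learning point, and the straight-line fit stands in for a model with few parameters.

```python
import math
import random


def monte_carlo_cv(points, fit, predict, cycles=150, validation_fraction=0.15, seed=0):
    """Average validation RMSE over repeated random train/validation splits."""
    rng = random.Random(seed)
    n_val = max(1, round(len(points) * validation_fraction))
    errors = []
    for _ in range(cycles):
        shuffled = points[:]
        rng.shuffle(shuffled)
        validation, training = shuffled[:n_val], shuffled[n_val:]
        model = fit(training)
        mse = sum((predict(model, x) - y) ** 2 for x, y in validation) / len(validation)
        errors.append(math.sqrt(mse))
    return sum(errors) / len(errors)


def fit_line(training):
    """Ordinary least-squares straight line: two parameters only."""
    n = len(training)
    mean_x = sum(x for x, _ in training) / n
    mean_y = sum(y for _, y in training) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in training)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in training)
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept


def fit_memorizer(training):
    """One 'rule' per learning point: exact on training data, blind elsewhere."""
    return dict(training)


def predict_memorizer(model, x):
    return model.get(x, 0.0)  # arbitrary answer for unseen inputs


# Hypothetical noisy data around y = 2x + 1.
data_rng = random.Random(1)
points = [(float(x), 2.0 * x + 1.0 + data_rng.uniform(-1.0, 1.0)) for x in range(40)]

print("line      :", monte_carlo_cv(points, fit_line, predict_line))
print("memorizer :", monte_carlo_cv(points, fit_memorizer, predict_memorizer))
```

The memorizer reproduces every training point exactly, yet its validation error is far larger than that of the line, which is the overfitting behaviour the validation step is meant to expose.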
Five best predictive variables

The xtractis® tool calculates, for each predictor, the individual contribution of this variable to the quality of the prediction. A value of 1 is given to the most influential predictor and 0 to a variable with no influence. Consequently, if the value of a predictor with a high contribution is not filled in, the quality of the prediction will fall drastically. Conversely, a missing value for a predictor with a low individual contribution will not affect the prediction much. It is important to recall that a high individual contribution does not mean that the predictor is a “positive driver” of the variable to predict, as it would in statistical approaches. Indeed, fuzzy theory is mostly focused on nonlinear and non-monotonic models; consequently, a predictor can be a positive driver in certain regions of the decision space (the higher the value of the predictor, the higher the value of the variable to predict) and, at the same time, a “negative driver” in other regions (the higher the value of the predictor, the lower the value of the variable to predict). The rules automatically built by xtractis® explain the relationships between the variable to predict and the predictor (and its interactions with the other predictors).

Five predictors are used by the three top models and have a strong average individual contribution to the quality of the prediction, above 0.7: REPAppartientCommunaute, QuandPremierMois1erContrib, SituationMomentContrib[EnEmploi], SituationMomentContrib[CollegeLyceen], OuiNouvelArticle.

The importance of learning processes in becoming a regular/big contributor

The variable REPAppartientCommunaute (see Table 2) proves to be the most important predictor of the three top models (with an average individual contribution of 0.914). The aim of the question is to test the respondent's feeling of belonging to a community, in the social sense.
This high contribution of the variable to the model leads us to consider whether the development of big (vs. small) contribution behaviors is closely linked to the socialization of the Wikipedian. Such socialization enables him or her to develop a feeling of belonging to a community. It has been argued that learning processes are embedded in processes of social interaction and that, consequently, community forms of organization are highly conducive to individual and collective learning. Lundvall (2010) explains that interactive learning is a key process for innovation and that it constitutes an intangible asset providing competitive advantage for organizations or nations. This statistical result therefore tends to reinforce the above hypothesis of a strong link between contributivity and socialization for high-contributing Wikipedians. This raises the question of the consequences of such a regularity for the social filtering of contributors, and of possible initiatives to correct it.

The critical influence of initial individual endowment

The QuandPremierMois1erContrib variable (see Table 2) also proves to be an important predictor of the models (with an average individual contribution of 0.822). At first glance, this result tends to go against the hypothesis of the importance of learning processes, understood as cumulative, experience-related processes. Actually, as argued in Dejean and Jullien (2012), who worked with the same database, a quite convincing interpretation may be proposed. It consists in asserting that there is a strong social determinism (and a barrier to entry) at work here, which may be summarized by a sentence such as “You are born a high-contributing Wikipedian, you don’t become one”. A social, cognitive and cultural capital (in the sense of Bourdieu) would be a prerequisite for becoming a future high contributor, and no learning process would be critical.
The high contribution of the variable OuiNouvelArticle (with an average individual contribution of 0.752) would seem to reinforce this interpretation: future high contributors not only contribute very quickly (soon after their first discovery of the wiki) but also contribute substantially right away (they do not correct a grammatical mistake or improve an article; they submit a new article). No learning process seems to matter. However, as pointed out by Dejean and Jullien, future high contributors often prove to have interacted with other Wikipedians (asking more experienced contributors for help) from the very beginning of their involvement. Here again, interactive learning comes back as a critical process for becoming a big contributor.

A synthetic interpretative hypothesis resolving the apparent paradox?

How can these apparently contradictory results be reconciled? We submit a threefold hypothesis that fits both the “initial capital determinism” and the “social learning by interacting” evolutionary dynamics:

- hypothesis 1 (H1): social learning-by-interacting evolutionary processes are key underlying mechanisms partially explaining how a contributor can become a big contributor rather than remain a small or intermediate one.
- hypothesis 2 (H2): initial individual socio-cognitive endowments (as social and cognitive assets) have critical importance in determining whether a contributor can become a big contributor rather than remain a small or intermediate one.
- hypothesis 3 (H3): the critical initial individual endowment consists of a socio-cognitive capability empowering the individual to access learning-by-interacting processes.

These interpretative hypotheses would, however, need to be deepened by further exploratory work.
Two heterogeneous categories of contributors (at least)

The last two variables that can claim “strong predictor” status (contribution > 0.7) in our models deal with the individual's occupation at the time of first contributing (SituationMomentContrib[EnEmploi] and SituationMomentContrib[CollegeLyceen]) (see Table 1). At first sight, they seem contradictory. Indeed, the SituationMomentContrib[EnEmploi] variable suggests that integration into the professional world is an important socio-economic feature for becoming a big contributor, whereas the SituationMomentContrib[CollegeLyceen] variable suggests, quite to the contrary, that being a teenager at school is quite conducive to becoming a big contributor. Further investigation and analysis are needed to understand, for instance, whether we are facing heterogeneous groups that could be distinguished.

Suggested Citation

  • Pascal Jollivet & G. Gueydan & P. Jollivet & N. Jullien & Y. Moulier-Boutang & M. Vicente & Z. Zalila, 2012. "How to become a regular/big Wikipedia contributor? A robust fuzzy predictive model of the propensity to contribute based on a French survey," EcoMod2012 4535, EcoMod.
  • Handle: RePEc:ekd:002672:4535

    Download full text from publisher

    File URL: http://ecomod.net/system/files/Gueydan%20%20%2526%20al%20%202012%20-%20%20How%20to%20become%20a%20regular-big%20wikipedia%20contributor%20-%20Final%20version%20of%20full%20article%20.doc
    Download Restriction: no