
In Search of Lost Profiles: The Reliability of VKontakte Data and Its Importance for Educational Research



Ivan Smirnov - Research Assistant, Institute of Education, National Research University Higher School of Economics. E-mail: ibsmirnov@hse.ru

Elizaveta Sivak - Research Fellow, Institute of Education, National Research University Higher School of Economics. E-mail: esivak@hse.ru

Yana Kozmina - Junior Research Fellow, Institute of Education, National Research University Higher School of Economics. E-mail: ikozmina@hse.ru

Address: 20 Myasnitskaya str., 101000 Moscow, Russian Federation.

The potential of VKontakte (VK) as a data source is now acknowledged in educational research, but little is known about the reliability of data obtained from this social network or about its sampling bias. Our article investigates the reliability of VK data, using the examples of a secondary school (766 students) and a university (15,757 students). We describe the procedure of matching VK profiles to real students. A direct comparison permitted us to identify the profiles of around 18% of students. A special technique introduced in the article raised this share to 88% for school students and to 93% for university students. We compare the age, gender and GPA of identified students with those of students whom we did not find on VK. We also compare the structure of social relationships retrieved from VK data to the expected structure of students' social ties. We found that the structure of virtual social relationships reproduces both the socio-demographic division of students into classes or majors and the spatial division into different school buildings or university campuses. To our knowledge, this is the first study of this kind and scale based on VK data. It contributes to the understanding of how reliable data from this SNS are, how their accuracy can be improved, and how they can be used in educational research.

Suggested Citation

  • Ivan Smirnov & Elizaveta Sivak & Yana Kozmina, 2016. "In Search of Lost Profiles: The Reliability of VKontakte Data and Its Importance for Educational Research," Voprosy obrazovaniya / Educational Studies Moscow, National Research University Higher School of Economics, issue 4, pages 106-122.
  • Handle: RePEc:nos:voprob:2016:i:4:p:106-122





    Cited by:

    1. Kotyrlo, Elena, 2017. "Social network sites: What users post and to whom they address. Some approaches to the study," Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 47, pages 74-99.
    2. Katerina Polivanova & Ivan Smirnov, 2017. "What's in My Profile: VKontakte Data as a Tool for Studying the Interests of Modern Teenagers," Voprosy obrazovaniya / Educational Studies Moscow, National Research University Higher School of Economics, issue 2, pages 134-152.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Duan, Zhengxiao & Zhang, Yanni & Deng, Jun & Shu, Pan & Yao, Di, 2023. "A systematic exploration of mapping knowledge domains for free radical research related to coal," Energy, Elsevier, vol. 282(C).
    2. Gong, Chen & Tang, Pan & Wang, Yutong, 2019. "Measuring the network connectedness of global stock markets," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 535(C).
    3. Gomez-Gonzalez, Jose E. & Hirs-Garzón, Jorge & Sanín-Restrepo, Sebastián, 2021. "Dynamic relations between oil and stock markets: Volatility spillovers, networks and causality," International Economics, Elsevier, vol. 165(C), pages 37-50.
    4. Florentin Gloetzl & Ernest Aigner, 2015. "Pluralism in the Market of Science? A citation network analysis of economic research at universities in Vienna," Ecological Economics Papers ieep5, Institute of Ecological Economics.
    5. Martin, Drew & Palakshappa, Nitha & Woodside, Arch, 2019. "Consumer metaphoria: Uncovering the automaticity of animal, product/brand, and country meanings," Australasian marketing journal, Elsevier, vol. 27(2), pages 113-125.
    6. Kamil Yilmaz, 2018. "Bank Volatility Connectedness in South East Asia," Koç University-TUSIAD Economic Research Forum Working Papers 1807, Koc University-TUSIAD Economic Research Forum.
    7. Costantini, Mauro & Maaitah, Ahmad & Mishra, Tapas & Sousa, Ricardo M., 2023. "Bitcoin market networks and cyberattacks," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 630(C).
    8. Nassar S. Al-Nassar & Abdulrahman A. Albahouth, 2023. "Inflation Spillovers among Advanced and Emerging Economies: Evidence from the G20 Group," Economies, MDPI, vol. 11(4), pages 1-25, April.
    9. Nicola Melluso & Andrea Bonaccorsi & Filippo Chiarello & Gualtiero Fantoni, 2021. "Rapid detection of fast innovation under the pressure of COVID-19," Papers 2102.00197,
    10. Yan, Weiwei & Zhang, Yin, 2018. "Research universities on the ResearchGate social networking site: An examination of institutional differences, research activity level, and social networks formed," Journal of Informetrics, Elsevier, vol. 12(1), pages 385-400.
    11. Watanabe, Nicholas M. & Kim, Jiyeon & Park, Joohyung, 2021. "Social network analysis and domestic and international retailers: An investigation of social media networks of cosmetic brands," Journal of Retailing and Consumer Services, Elsevier, vol. 58(C).
    12. Paflioti, Persa & Vitsounis, Thomas K. & Teye, Collins & Bell, Michael G.H. & Tsamourgelis, Ioannis, 2017. "Box dynamics: A sectoral approach to analyse containerized port throughput interdependencies," Transportation Research Part A: Policy and Practice, Elsevier, vol. 106(C), pages 396-413.
    13. Mert Demirer & Francis X. Diebold & Laura Liu & Kamil Yilmaz, 2018. "Estimating global bank network connectedness," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 33(1), pages 1-15, January.
    14. John Spray & Sebastian Wolf, 2017. "Industries without smokestacks in Uganda and Rwanda," WIDER Working Paper Series wp-2017-12, World Institute for Development Economic Research (UNU-WIDER).
    15. Debnath, Ramit & Bardhan, Ronita & Reiner, David M. & Miller, J.R., 2021. "Political, economic, social, technological, legal and environmental dimensions of electric vehicle adoption in the United States: A social-media interaction analysis," Renewable and Sustainable Energy Reviews, Elsevier, vol. 152(C).
    16. Callen, Michael & Blumenstock, Joshua & Ghani, Tarek, 2016. "Mobile-izing Savings with Automatic Contributions: Experimental Evidence on Present Bias and Default Effects in Afghanistan," CEPR Discussion Papers 11400, C.E.P.R. Discussion Papers.
    17. Baccini, Federica & Barabesi, Lucio & Baccini, Alberto & Khelfaoui, Mahdi & Gingras, Yves, 2022. "Similarity network fusion for scholarly journals," Journal of Informetrics, Elsevier, vol. 16(1).
    18. Gomez-Gonzalez, Jose Eduardo & Hirs-Garzon, Jorge & Uribe, Jorge M., 2020. "Spillovers beyond the variance: exploring the natural gas and oil higher order risk linkages with the global financial markets," Working papers 46, Red Investigadores de Economía.
    19. Jérôme Baray & Gérard Cliquet & Yves Soulabail, 2023. "Thematic communities and image of french mass-market retailers in the media [Communautés thématiques et image des groupes de la grande distribution française dans les médias]," Post-Print hal-04251853, HAL.
    20. Chen, Yanhua & Li, Youwei & Pantelous, Athanasios A. & Stanley, H. Eugene, 2022. "Short-run disequilibrium adjustment and long-run equilibrium in the international stock markets: A network-based approach," International Review of Financial Analysis, Elsevier, vol. 79(C).

