
Detecting anomalies in graph networks on digital markets

Author

Listed:
  • Agata Skorupka

Abstract

The study examines graph-based methods of detecting anomalous activity on digital markets and proposes the most efficient way to increase market actors’ protection and reduce information asymmetry. Anomalies are defined here as both bots and fraudulent users (who can be either bots or real people). The methods are compared against each other and against state-of-the-art results from the literature, and a new algorithm is proposed. The goal is to find a method suitable for threat detection in terms of both predictive performance and computational efficiency; it should scale well and remain robust to advances in the newest technologies. The article uses three publicly accessible graph-based datasets: one describing the Twitter social network (TwiBot-20) and two describing Bitcoin cryptocurrency markets (Bitcoin OTC and Bitcoin Alpha). In the former, an anomaly is a bot, as opposed to a human user; in the latter, an anomaly is a user who conducted a fraudulent transaction, which may (but does not have to) imply being a bot. The study shows that graph-based data is a better-performing predictor than text data. It compares different graph algorithms for extracting feature sets for anomaly detection models and finds that methods based on nodes’ statistics yield better model performance than state-of-the-art graph embeddings. They are also significantly more computationally efficient, which often means cutting the run time by hours or enabling modeling on significantly larger graphs (usually not feasible with embeddings). On that basis, the article proposes its own graph-based statistics algorithm. Furthermore, using embeddings requires two engineering choices: the type of embedding and its dimension. The research examines whether some types of graph embeddings and dimensions perform significantly better than others. The answer turned out to be dataset-specific: the choice must be tailored case by case, which adds further engineering overhead to using embeddings (building a leaderboard over a grid of embedding instances, each of which takes hours to generate). This, again, speaks in favor of the proposed algorithm based on nodes’ statistics, which makes this engineering overhead unnecessary.
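
To make the idea of features based on nodes’ statistics concrete, the following is a minimal sketch and not the algorithm proposed in the article. It assumes a directed, signed edge list of (rater, rated, rating) triples, loosely modeled on how trust networks such as Bitcoin OTC are commonly represented; the function name `node_statistics`, the chosen statistics, and the toy data are all hypothetical illustrations.

```python
from collections import defaultdict
from statistics import mean


def node_statistics(edges):
    """Compute simple per-node statistics from (source, target, rating) edges.

    Hypothetical illustration only: the statistics chosen here are assumptions,
    not the feature set used in the article.
    """
    out_deg = defaultdict(int)
    in_deg = defaultdict(int)
    in_ratings = defaultdict(list)
    nodes = set()

    # Single pass over the edge list: cost is linear in the number of edges.
    for src, dst, rating in edges:
        nodes.update((src, dst))
        out_deg[src] += 1
        in_deg[dst] += 1
        in_ratings[dst].append(rating)

    features = {}
    for node in sorted(nodes):
        received = in_ratings[node]
        features[node] = {
            "in_degree": in_deg[node],
            "out_degree": out_deg[node],
            # Average rating the node received (0.0 if it was never rated).
            "mean_rating_received": mean(received) if received else 0.0,
            # Fraction of received ratings that are negative.
            "negative_share": (
                sum(r < 0 for r in received) / len(received) if received else 0.0
            ),
        }
    return features


# Toy, made-up ratings: node "d" receives only negative ratings.
edges = [
    ("a", "b", 5), ("a", "c", 3), ("b", "c", 4),
    ("c", "a", 2), ("a", "d", -8), ("b", "d", -10), ("c", "d", -6),
]
features = node_statistics(edges)
```

Each node ends up with a small, fixed-length feature vector that could be passed to any tabular classifier. Unlike an embedding, nothing has to be trained and no type or dimension has to be selected beforehand, which is the kind of overhead the abstract refers to.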

Suggested Citation

  • Agata Skorupka, 2024. "Detecting anomalies in graph networks on digital markets," PLOS ONE, Public Library of Science, vol. 19(12), pages 1-30, December.
  • Handle: RePEc:plo:pone00:0315849
    DOI: 10.1371/journal.pone.0315849

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0315849
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0315849&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0315849?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Hui, Xiang & Klein, Tobias & Stahl, Konrad, 2022. "Learning from Online Ratings," CEPR Discussion Papers 17006, C.E.P.R. Discussion Papers.
    2. M. Narciso, 2022. "The Unreliability of Online Review Mechanisms," Journal of Consumer Policy, Springer, vol. 45(3), pages 349-368, September.
    3. Sungsik Park & Woochoel Shin & Jinhong Xie, 2021. "The Fateful First Consumer Review," Marketing Science, INFORMS, vol. 40(3), pages 481-507, May.
    4. Lingfang (Ivy) Li & Steven Tadelis & Xiaolan Zhou, 2020. "Buying reputation as a signal of quality: Evidence from an online marketplace," RAND Journal of Economics, RAND Corporation, vol. 51(4), pages 965-988, December.
    5. Plé, Loïc & Demangeot, Catherine, 2020. "Social contagion of online and offline deviant behaviors and its value outcomes: The case of tourism ecosystems," Journal of Business Research, Elsevier, vol. 117(C), pages 886-896.
    6. Gesche, Tobias, 2018. "Reference Price Shifts and Customer Antagonism: Evidence from Reviews for Online Auctions," VfS Annual Conference 2018 (Freiburg, Breisgau): Digital Economy 181650, Verein für Socialpolitik / German Economic Association.
    7. Zhuang, Mengzhou & Cui, Geng & Peng, Ling, 2018. "Manufactured opinions: The effect of manipulating online product reviews," Journal of Business Research, Elsevier, vol. 87(C), pages 24-35.
    8. Vollaard, Ben & van Ours, Jan C., 2022. "Bias in expert product reviews," Journal of Economic Behavior & Organization, Elsevier, vol. 202(C), pages 105-118.
    9. Hung-Pin Shih & Pei-Chen Sung, 2021. "Addressing the Review-Based Learning and Private Information Approaches to Foster Platform Continuance," Information Systems Frontiers, Springer, vol. 23(3), pages 649-661, June.
    10. Surachartkumtonkun, Jiraporn (Nui) & Grace, Debra & Ross, Mitchell, 2021. "Unfair customer reviews: Third-party perceptions and managerial responses," Journal of Business Research, Elsevier, vol. 132(C), pages 631-640.
    11. Christoph Carnehl & Maximilian Schaefer & André Stenzel & Kevin Ducbao Tran, 2022. "Value for Money and Selection: How Pricing Affects Airbnb Ratings," Working Papers 684, IGIER (Innocenzo Gasparini Institute for Economic Research), Bocconi University.
    12. Wen Zhang & Qiang Wang & Jian Li & Zhenzhong Ma & Gokul Bhandari & Rui Peng, 2023. "What makes deceptive online reviews? A linguistic analysis perspective," Palgrave Communications, Palgrave Macmillan, vol. 10(1), pages 1-14, December.
    13. Apostolos Filippas & John J. Horton & Joseph M. Golden, 2022. "Reputation Inflation," Marketing Science, INFORMS, vol. 41(4), pages 733-745, July.
    14. Theodoros Lappas & Gaurav Sabnis & Georgios Valkanas, 2016. "The Impact of Fake Reviews on Online Visibility: A Vulnerability Assessment of the Hotel Industry," Information Systems Research, INFORMS, vol. 27(4), pages 940-961, December.
    15. Weijia Dai & Hyunjin Kim & Michael Luca, 2023. "Frontiers: Which Firms Gain from Digital Advertising? Evidence from a Field Experiment," Marketing Science, INFORMS, vol. 42(3), pages 429-439, May.
    16. Kim, Jong Min & Park, Keeyeon Ki-cheon & Mariani, Marcello & Wamba, Samuel Fosso, 2024. "Investigating reviewers' intentions to post fake vs. authentic reviews based on behavioral linguistic features," Technological Forecasting and Social Change, Elsevier, vol. 198(C).
    17. Dimitrios Tsekouras & Dominik Gutt & Irina Heimbach, 2024. "The robo bias in conversational reviews: How the solicitation medium anthropomorphism affects product rating valence and review helpfulness," Journal of the Academy of Marketing Science, Springer, vol. 52(6), pages 1651-1672, November.
    18. Krügel, Jan Philipp & Paetzel, Fabian, 2024. "The impact of fraud on reputation systems," Games and Economic Behavior, Elsevier, vol. 144(C), pages 329-354.
    19. Mardumyan, Anna & Siret, Iris, 2023. "When review verification does more harm than good: How certified reviews determine customer–brand relationship quality," Journal of Business Research, Elsevier, vol. 160(C).
    20. T. Tony Ke & Yuting Zhu, 2021. "Cheap Talk on Freelance Platforms," Management Science, INFORMS, vol. 67(9), pages 5901-5920, September.
