
Detecting anomalies in graph networks on digital markets

Author

Listed:
  • Agata Skorupka

Abstract

The study examines different graph-based methods of detecting anomalous activities on digital markets, proposing the most efficient way to increase market actors’ protection and reduce information asymmetry. Anomalies are defined as bots and fraudulent users (the latter may be either bots or real people). The methods are compared against each other and against state-of-the-art results from the literature, and a new algorithm is proposed. The goal is to find a method suitable for threat detection that is efficient in terms of both predictive performance and computational cost. It should scale well and remain robust to advances in the newest technologies. The article uses three publicly accessible graph-based datasets: one describing the Twitter social network (TwiBot-20) and two describing Bitcoin cryptocurrency markets (Bitcoin OTC and Bitcoin Alpha). In the former, an anomaly is defined as a bot, as opposed to a human user, whereas in the latter, an anomaly is a user who conducted a fraudulent transaction, which may (but does not have to) imply being a bot. The study shows that graph-based data is a better-performing predictor than text data. It compares different graph algorithms for extracting feature sets for anomaly detection models. It finds that methods based on nodes’ statistics result in better model performance than state-of-the-art graph embeddings. They also yield a significant improvement in computational efficiency, which often means reducing the running time by hours or enabling modeling on significantly larger graphs (usually not feasible in the case of embeddings). On that basis, the article proposes its own graph-based statistics algorithm. Furthermore, using embeddings requires two engineering choices: the type of embedding and its dimension. The research examines whether there are types of graph embeddings and dimensions that perform significantly better than others. The answer turned out to be dataset-specific: the choice needs to be tailored case by case, which adds even more engineering overhead to using embeddings (building a leaderboard over a grid of embedding instances, each of which takes hours to generate). This, again, speaks in favor of an approach based on nodes’ statistics: the proposed algorithm is efficient and makes this engineering overhead redundant.
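
To illustrate what "methods based on nodes’ statistics" can look like in practice, the following is a minimal Python sketch. It is not the algorithm proposed in the article, which the abstract does not specify. It assumes a directed edge list of (source, target, rating) tuples, loosely in the spirit of the Bitcoin OTC and Bitcoin Alpha trust graphs; the function name `node_statistics`, the feature names, and the toy data are all hypothetical. The point it shows is that such features can be gathered in a single pass over the edges, with no embedding type or dimension to tune.

```python
from collections import defaultdict


def node_statistics(edges):
    """Compute simple per-node features from a directed, rated edge list.

    edges: iterable of (source, target, rating) tuples (assumed format).
    Returns a dict mapping each node to a dict of four features.
    """
    out_degree = defaultdict(int)   # number of ratings a node gave
    received = defaultdict(list)    # ratings a node received
    nodes = set()

    # Single pass over the edges: cost grows linearly with graph size.
    for source, target, rating in edges:
        nodes.update((source, target))
        out_degree[source] += 1
        received[target].append(rating)

    features = {}
    for node in sorted(nodes):
        ratings = received.get(node, [])
        n = len(ratings)
        features[node] = {
            "in_degree": n,
            "out_degree": out_degree.get(node, 0),
            # Nodes with no incoming ratings get 0.0 for both averages.
            "mean_rating_received": sum(ratings) / n if n else 0.0,
            "negative_share_received": sum(r < 0 for r in ratings) / n if n else 0.0,
        }
    return features


# Hypothetical toy graph: three users rating each other.
toy_edges = [
    ("a", "b", 5),
    ("c", "b", -10),
    ("b", "a", 2),
    ("c", "a", 1),
    ("a", "c", -3),
]
stats = node_statistics(toy_edges)
for node, row in stats.items():
    print(node, row)
```

The resulting feature rows could then be passed to any tabular classifier as the input of an anomaly detection model; which statistics are actually informative would have to be checked on the dataset at hand.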

Suggested Citation

  • Agata Skorupka, 2024. "Detecting anomalies in graph networks on digital markets," PLOS ONE, Public Library of Science, vol. 19(12), pages 1-30, December.
  • Handle: RePEc:plo:pone00:0315849
    DOI: 10.1371/journal.pone.0315849

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0315849
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0315849&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0315849?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Hui, Xiang & Klein, Tobias & Stahl, Konrad, 2022. "Learning from Online Ratings," CEPR Discussion Papers 17006, C.E.P.R. Discussion Papers.
    2. M. Narciso, 2022. "The Unreliability of Online Review Mechanisms," Journal of Consumer Policy, Springer, vol. 45(3), pages 349-368, September.
    3. Balázs Kovács, 2024. "The Turing test of online reviews: Can we tell the difference between human-written and GPT-4-written online reviews?," Marketing Letters, Springer, vol. 35(4), pages 651-666, December.
    4. Sungsik Park & Woochoel Shin & Jinhong Xie, 2021. "The Fateful First Consumer Review," Marketing Science, INFORMS, vol. 40(3), pages 481-507, May.
    5. Lingfang (Ivy) Li & Steven Tadelis & Xiaolan Zhou, 2020. "Buying reputation as a signal of quality: Evidence from an online marketplace," RAND Journal of Economics, RAND Corporation, vol. 51(4), pages 965-988, December.
    6. Plé, Loïc & Demangeot, Catherine, 2020. "Social contagion of online and offline deviant behaviors and its value outcomes: The case of tourism ecosystems," Journal of Business Research, Elsevier, vol. 117(C), pages 886-896.
    7. Gesche, Tobias, 2018. "Reference Price Shifts and Customer Antagonism: Evidence from Reviews for Online Auctions," VfS Annual Conference 2018 (Freiburg, Breisgau): Digital Economy 181650, Verein für Socialpolitik / German Economic Association.
    8. Dominik Gutt & Jürgen Neumann & Steffen Zimmermann & Dennis Kundisch & Jianqing Chen, 2018. "Design of Review Systems - A Strategic Instrument to shape Online Review Behavior and Economic Outcomes," Working Papers Dissertations 42, Paderborn University, Faculty of Business Administration and Economics.
    9. Zhuang, Mengzhou & Cui, Geng & Peng, Ling, 2018. "Manufactured opinions: The effect of manipulating online product reviews," Journal of Business Research, Elsevier, vol. 87(C), pages 24-35.
    10. Erfan Rezvani & Christian Rojas, 2022. "Firm responsiveness to consumers' reviews: The effect on online reputation," Journal of Economics & Management Strategy, Wiley Blackwell, vol. 31(4), pages 898-922, November.
    11. Meoli, Michele & Vismara, Silvio, 2021. "Information manipulation in equity crowdfunding markets," Journal of Corporate Finance, Elsevier, vol. 67(C).
    12. Dominik Gutt & Philipp Herrmann & Mohammad S. Rahman, 2018. "Crowd-Driven Competitive Intelligence: Understanding the Relationship Between Local Market Competition and Online Rating Distributions," Working Papers Dissertations 41, Paderborn University, Faculty of Business Administration and Economics.
    13. Weijia (Daisy) Dai & Ginger Jin & Jungmin Lee & Michael Luca, 2018. "Aggregation of consumer ratings: an application to Yelp.com," Quantitative Marketing and Economics (QME), Springer, vol. 16(3), pages 289-339, September.
    14. Apostolos Filippas & John J. Horton & Richard J. Zeckhauser, 2020. "Owning, Using, and Renting: Some Simple Economics of the “Sharing Economy”," Management Science, INFORMS, vol. 66(9), pages 4152-4172, September.
    15. Chatterjee, Sheshadri & Chaudhuri, Ranjan & Kumar, Ajay & Lu Wang, Cheng & Gupta, Shivam, 2023. "Impacts of consumer cognitive process to ascertain online fake review: A cognitive dissonance theory approach," Journal of Business Research, Elsevier, vol. 154(C).
    16. Jason Greenberg & Daniel B. Sands & Gino Cattani & Joseph Porac, 2024. "Rating systems and increased heterogeneity in firm performance: Evidence from the New York City Restaurant Industry, 1994–2013," Strategic Management Journal, Wiley Blackwell, vol. 45(1), pages 36-65, January.
    17. Vollaard, Ben & van Ours, Jan C., 2022. "Bias in expert product reviews," Journal of Economic Behavior & Organization, Elsevier, vol. 202(C), pages 105-118.
    18. Hung-Pin Shih & Pei-Chen Sung, 2021. "Addressing the Review-Based Learning and Private Information Approaches to Foster Platform Continuance," Information Systems Frontiers, Springer, vol. 23(3), pages 649-661, June.
    19. Li Chen & Yiangos Papanastasiou, 2021. "Seeding the Herd: Pricing and Welfare Effects of Social Learning Manipulation," Management Science, INFORMS, vol. 67(11), pages 6734-6750, November.
    20. Travis Dyer & Eunjee Kim, 2021. "Anonymous Equity Research," Journal of Accounting Research, Wiley Blackwell, vol. 59(2), pages 575-611, May.
