Printed from https://ideas.repec.org/p/arx/papers/2201.10351.html

AI-based Re-identification of Behavioral Clickstream Data

Authors

  • Stefan Vamosi
  • Michael Platzer
  • Thomas Reutterer

Abstract

AI-based face recognition, i.e., the re-identification of individuals within images, is already a well-established technology for video surveillance, user authentication, tagging photos of friends, and similar tasks. This paper demonstrates that similar techniques can be applied to successfully re-identify individuals based purely on their behavioral patterns. In contrast to de-anonymization attacks based on record linkage, these methods do not require any overlap in data points between a released dataset and an identified auxiliary dataset. The mere resemblance of behavioral patterns between records is sufficient to correctly attribute behavioral data to identified individuals. Further, we demonstrate that data perturbation does not provide protection unless a significant share of data utility is destroyed. These findings call for serious caution when sharing actual behavioral data with third parties, as modern privacy regulations, such as the GDPR, define their scope based on the ability to re-identify. This also has strong implications for the marketing domain when dealing with potentially re-identifiable data sources such as shopping behavior, clickstream data, or cookies. We also demonstrate how synthetic data can offer a viable alternative that is shown to be resilient against the AI-based re-identification attacks we introduce.
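
The idea that resemblance alone can link records, without any shared data points, can be illustrated with a toy sketch. The abstract does not describe the authors' model, so the example below is not their method: it swaps the learned AI-based representation for a hand-built page-frequency profile compared by cosine similarity. All names (`profile`, `cosine`, `reidentify`, the page vocabulary, and the sample users) are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def profile(events, vocabulary):
    """Turn a list of click events into a normalized page-frequency vector."""
    counts = Counter(events)
    total = sum(counts[page] for page in vocabulary) or 1
    return [counts[page] / total for page in vocabulary]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def reidentify(released, auxiliary, vocabulary):
    """Map each pseudonymous released record to the most similar identified person."""
    aux_profiles = {name: profile(ev, vocabulary) for name, ev in auxiliary.items()}
    matches = {}
    for pseudo_id, events in released.items():
        p = profile(events, vocabulary)
        matches[pseudo_id] = max(
            aux_profiles, key=lambda name: cosine(p, aux_profiles[name])
        )
    return matches


# Hypothetical page vocabulary and toy data (assumptions, not from the paper).
VOCAB = ["home", "search", "cart", "news", "sports"]

# Identified auxiliary sessions, e.g. observed in an earlier period.
AUXILIARY = {
    "alice": ["home", "search", "cart", "search", "cart"],
    "bob": ["news", "sports", "sports", "news", "home"],
}

# Released pseudonymous sessions: different sessions, but similar habits.
RELEASED = {
    "user_17": ["sports", "news", "sports", "sports"],
    "user_42": ["search", "cart", "cart", "home", "search", "search"],
}

MATCHES = reidentify(RELEASED, AUXILIARY, VOCAB)
print(MATCHES)
```

In this toy setting the released sessions are not copies of the auxiliary ones, yet each pseudonym is attributed to the person with the closest habits. A real attack of the kind the abstract describes would presumably rely on a learned representation rather than raw page frequencies.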

Suggested Citation

  • Stefan Vamosi & Michael Platzer & Thomas Reutterer, 2022. "AI-based Re-identification of Behavioral Clickstream Data," Papers 2201.10351, arXiv.org.
  • Handle: RePEc:arx:papers:2201.10351

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2201.10351
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Matthew J. Schneider & Dawn Iacobucci, 2020. "Protecting survey data on a consumer level," Journal of Marketing Analytics, Palgrave Macmillan, vol. 8(1), pages 3-17, March.
    2. Shaobo Li & Matthew J. Schneider & Yan Yu & Sachin Gupta, 2023. "Reidentification Risk in Panel Data: Protecting for k -Anonymity," Information Systems Research, INFORMS, vol. 34(3), pages 1066-1088, September.
    3. Matthew J. Schneider & Shawn Mankad, 2021. "A Two-Stage Authorship Attribution Method Using Text and Structured Data for De-Anonymizing User-Generated Content," Customer Needs and Solutions, Springer;Institute for Sustainable Innovation and Growth (iSIG), vol. 8(3), pages 66-83, September.
    4. Wieringa, Jaap & Kannan, P.K. & Ma, Xiao & Reutterer, Thomas & Risselada, Hans & Skiera, Bernd, 2021. "Data analytics in a privacy-concerned world," Journal of Business Research, Elsevier, vol. 122(C), pages 915-925.
    5. Ming-Hui Huang & Roland T. Rust, 2021. "A strategic framework for artificial intelligence in marketing," Journal of the Academy of Marketing Science, Springer, vol. 49(1), pages 30-50, January.
    6. Guha, Abhijit & Grewal, Dhruv & Kopalle, Praveen K. & Haenlein, Michael & Schneider, Matthew J. & Jung, Hyunseok & Moustafa, Rida & Hegde, Dinesh R. & Hawkins, Gary, 2021. "How artificial intelligence will affect the future of retailing," Journal of Retailing, Elsevier, vol. 97(1), pages 28-41.
    7. Bandara, Ruwan & Fernando, Mario & Akter, Shahriar, 2020. "Explicating the privacy paradox: A qualitative inquiry of online shopping consumers," Journal of Retailing and Consumer Services, Elsevier, vol. 52(C).
    8. Ronny Behrens & Natasha Zhang Foutz & Michael Franklin & Jannis Funk & Fernanda Gutierrez-Navratil & Julian Hofmann & Ulrike Leibfried, 2021. "Leveraging analytics to produce compelling and profitable film content," Journal of Cultural Economics, Springer;The Association for Cultural Economics International, vol. 45(2), pages 171-211, June.
    9. Henner Gimpel & Dominikus Kleindienst & Niclas Nüske & Daniel Rau & Fabian Schmied, 2018. "The upside of data privacy – delighting customers by implementing data privacy measures," Electronic Markets, Springer;IIM University of St. Gallen, vol. 28(4), pages 437-452, November.
    10. Artur Strzelecki & Mariia Rizun, 2022. "Consumers’ Change in Trust and Security after a Personal Data Breach in Online Shopping," Sustainability, MDPI, vol. 14(10), pages 1-17, May.
    11. Rust, Roland T., 2020. "The future of marketing," International Journal of Research in Marketing, Elsevier, vol. 37(1), pages 15-26.
    12. Piyush Anand & Clarence Lee, 2023. "Using Deep Learning to Overcome Privacy and Scalability Issues in Customer Data Transfer," Marketing Science, INFORMS, vol. 42(1), pages 189-207, January.
    13. Sara Quach & Park Thaichon & Kelly D. Martin & Scott Weaven & Robert W. Palmatier, 2022. "Digital technologies: tensions in privacy and data," Journal of the Academy of Marketing Science, Springer, vol. 50(6), pages 1299-1323, November.
    14. Robert W. Palmatier & Andrew T. Crecelius, 2019. "The “first principles” of marketing strategy," AMS Review, Springer;Academy of Marketing Science, vol. 9(1), pages 5-26, June.
    15. Grewal, Dhruv & Guha, Abhijit & Satornino, Cinthia B. & Schweiger, Elisa B., 2021. "Artificial intelligence: The light and the darkness," Journal of Business Research, Elsevier, vol. 136(C), pages 229-236.
    16. Mingyung Kim & Eric T. Bradlow & Raghuram Iyengar, 2022. "Selecting Data Granularity and Model Specification Using the Scaled Power Likelihood with Multiple Weights," Marketing Science, INFORMS, vol. 41(4), pages 848-866, July.
    17. Elliot Shin Oblander & Sunil Gupta & Carl F. Mela & Russell S. Winer & Donald R. Lehmann, 2020. "The past, present, and future of customer management," Marketing Letters, Springer, vol. 31(2), pages 125-136, September.
