
From one-class to two-class classification by incorporating expert knowledge: Novelty detection in human behaviour

Author

Listed:
  • Oosterlinck, Dieter
  • Benoit, Dries F.
  • Baecke, Philippe

Abstract

One-class classification is the standard procedure for novelty detection. Novelty detection aims to identify observations that deviate from a determined normal behaviour. Only instances of one class are known, whereas so-called novelties are unlabelled. Traditional novelty detection applies methods from the field of outlier detection. These standard one-class classification approaches have limited performance in many real business cases. The traditional techniques were mainly developed for industrial problems such as machine condition monitoring. When they are applied to human behaviour, performance drops significantly. This paper proposes a method that improves on existing approaches by creating semi-synthetic novelties in order to have labelled data for the two classes. Expert knowledge is incorporated in the initial phase of this data generation process. The method was deployed on a real-life test case where the goal was to detect fraudulent subscriptions to a telecom family plan. This research demonstrates that the two-class expert model outperforms a one-class model on the semi-synthetic dataset. In a next step, the model was validated on a real dataset: a fraud detection team at the company manually checked the top predicted novelties. The results show that incorporating expert knowledge to transform a one-class problem into a two-class problem is a valuable method.
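
The idea described in the abstract, turning a one-class problem into a two-class problem by generating semi-synthetic novelties from expert knowledge, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two features (distinct addresses per plan, calls between plan members), the perturbation in `expert_rule`, the 20% share of novelties, and the nearest-centroid classifier, which merely stands in for whatever two-class model the authors used.

```python
import random
import statistics


def make_semi_synthetic(normal_rows, expert_rule, share=0.2, seed=0):
    """Return (rows, labels): real normal rows (label 0) plus perturbed copies (label 1).

    `expert_rule` is a hypothetical callable that encodes expert knowledge:
    it takes a normal row and a random generator and returns a modified row
    that an expert would regard as suspicious.
    """
    rng = random.Random(seed)
    n_novel = max(1, int(share * len(normal_rows)))
    picked = rng.sample(normal_rows, n_novel)
    novelties = [expert_rule(row, rng) for row in picked]
    rows = list(normal_rows) + novelties
    labels = [0] * len(normal_rows) + [1] * len(novelties)
    return rows, labels


def centroid(rows):
    """Column-wise mean of a list of equal-length rows."""
    return [statistics.mean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two rows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_two_class(rows, labels):
    """Nearest-centroid model: one centroid per class (a stand-in classifier)."""
    c0 = centroid([r for r, y in zip(rows, labels) if y == 0])
    c1 = centroid([r for r, y in zip(rows, labels) if y == 1])
    return c0, c1


def predict(model, row):
    """Return 1 (novelty) if the row is closer to the novelty centroid, else 0."""
    c0, c1 = model
    return 1 if sq_dist(row, c1) < sq_dist(row, c0) else 0


def expert_rule(row, rng):
    """Hypothetical expert rule: more distinct addresses, fewer calls between members."""
    addresses, internal_calls = row
    return [addresses + rng.uniform(2.0, 4.0), internal_calls * rng.uniform(0.1, 0.4)]


# Simulated "normal" family plans: [distinct addresses, calls between members].
data_rng = random.Random(42)
normal = [[data_rng.gauss(1.0, 0.3), data_rng.gauss(20.0, 4.0)] for _ in range(200)]

rows, labels = make_semi_synthetic(normal, expert_rule)
model = fit_two_class(rows, labels)

print(predict(model, [1.0, 20.0]))  # a typical plan
print(predict(model, [4.0, 3.0]))   # many addresses, few internal calls
```

In this sketch the only labelled data available at the start are the normal rows; the second class exists solely because the expert rule manufactures it. How well such a model transfers to real novelties depends on how faithfully the rule reflects actual fraudulent behaviour, which is why the abstract reports a separate validation on real data.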

Suggested Citation

  • Oosterlinck, Dieter & Benoit, Dries F. & Baecke, Philippe, 2020. "From one-class to two-class classification by incorporating expert knowledge: Novelty detection in human behaviour," European Journal of Operational Research, Elsevier, vol. 282(3), pages 1011-1024.
  • Handle: RePEc:eee:ejores:v:282:y:2020:i:3:p:1011-1024
    DOI: 10.1016/j.ejor.2019.10.015

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377221719308501
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.ejor.2019.10.015?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Lessmann, Stefan & Voß, Stefan, 2009. "A reference model for customer-centric data mining with support vector machines," European Journal of Operational Research, Elsevier, vol. 199(2), pages 520-530, December.
    2. Ashouri, F, 1993. "An expert system for predicting gas demand: A case study," Omega, Elsevier, vol. 21(3), pages 307-317, May.
    3. Larichev, Oleg & Asanov, Artyom & Naryzhny, Yevgeny, 2002. "Effectiveness evaluation of expert classification methods," European Journal of Operational Research, Elsevier, vol. 138(2), pages 260-273, April.
    4. Karatzoglou, Alexandros & Meyer, David & Hornik, Kurt, 2006. "Support Vector Machines in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 15(i09).
    5. Preyas S. Desai & Devavrat Purohit & Bo Zhou, 2018. "Allowing Consumers to Bundle Themselves: The Profitability of Family Plans," Marketing Science, INFORMS, vol. 37(6), pages 953-969, November.
    6. Wang, W. & Zhang, W., 2008. "An asset residual life prediction model based on expert judgments," European Journal of Operational Research, Elsevier, vol. 188(2), pages 496-505, July.
    7. Li, Heng & Li, Zhicheng & Li, Ling X. & Hu, Bin, 2000. "A production rescheduling expert simulation system," European Journal of Operational Research, Elsevier, vol. 124(2), pages 283-293, July.

    Citations


    Cited by:

    1. Mi, Yunlong & Quan, Pei & Shi, Yong & Wang, Zongrun, 2022. "Concept-cognitive computing system for dynamic classification," European Journal of Operational Research, Elsevier, vol. 301(1), pages 287-299.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Paolo Sorino & Maria Gabriella Caruso & Giovanni Misciagna & Caterina Bonfiglio & Angelo Campanella & Antonella Mirizzi & Isabella Franco & Antonella Bianco & Claudia Buongiorno & Rosalba Liuzzi & Ann, 2020. "Selecting the best machine learning algorithm to support the diagnosis of Non-Alcoholic Fatty Liver Disease: A meta learner study," PLOS ONE, Public Library of Science, vol. 15(10), pages 1-15, October.
    2. Gökçe Kiliçkaya & Tarik Küçükdeniz & Şakir Esnaf, 2021. "Fuzzy Clustering With Derivative–Free Search Algorithm for Location of Biogas Energy Systems," International Journal of Operations Research and Information Systems (IJORIS), IGI Global, vol. 12(4), pages 1-19, October.
    3. De Caigny, Arno & Coussement, Kristof & De Bock, Koen W., 2018. "A new hybrid classification algorithm for customer churn prediction based on logistic regression and decision trees," European Journal of Operational Research, Elsevier, vol. 269(2), pages 760-772.
    4. Na Tang & Maoxiang Yuan & Zhijun Chen & Jian Ma & Rui Sun & Yide Yang & Quanyuan He & Xiaowei Guo & Shixiong Hu & Junhua Zhou, 2023. "Machine Learning Prediction Model of Tuberculosis Incidence Based on Meteorological Factors and Air Pollutants," IJERPH, MDPI, vol. 20(5), pages 1-17, February.
    5. Brandner, Hubertus & Lessmann, Stefan & Voß, Stefan, 2013. "A memetic approach to construct transductive discrete support vector machines," European Journal of Operational Research, Elsevier, vol. 230(3), pages 581-595.
    6. Moro Russ A. & Härdle Wolfgang K. & Schäfer Dorothea, 2017. "Company rating with support vector machines," Statistics & Risk Modeling, De Gruyter, vol. 34(1-2), pages 55-67, June.
    7. Alejandra Duenas & Dobrila Petrovic, 2008. "An approach to predictive-reactive scheduling of parallel machines subject to disruptions," Annals of Operations Research, Springer, vol. 159(1), pages 65-82, March.
    8. Mario P. Brito & Ian G. J. Dawson, 2020. "Predicting the Validity of Expert Judgments in Assessing the Impact of Risk Mitigation Through Failure Prevention and Correction," Risk Analysis, John Wiley & Sons, vol. 40(10), pages 1928-1943, October.
    9. Framinan, Jose M. & Ruiz, Rubén, 2010. "Architecture of manufacturing scheduling systems: Literature review and an integrated proposal," European Journal of Operational Research, Elsevier, vol. 205(2), pages 237-246, September.
    10. Wang, John & Yan, Ruiliang & Hollister, Kimberly & Zhu, Dan, 2008. "A historic review of management science research in China," Omega, Elsevier, vol. 36(6), pages 919-932, December.
    11. Ana Patrícia Rocha & Hugo Miguel Pereira Choupina & Maria do Carmo Vilas-Boas & José Maria Fernandes & João Paulo Silva Cunha, 2018. "System for automatic gait analysis based on a single RGB-D camera," PLOS ONE, Public Library of Science, vol. 13(8), pages 1-24, August.
    12. Phichhang Ou & Hengshan Wang, 2009. "Prediction of Stock Market Index Movement by Ten Data Mining Techniques," Modern Applied Science, Canadian Center of Science and Education, vol. 3(12), pages 1-28, December.
    13. Zaumanis, Martins & Mallick, Rajib B. & Frank, Robert, 2014. "100% recycled hot mix asphalt: A review and analysis," Resources, Conservation & Recycling, Elsevier, vol. 92(C), pages 230-245.
    14. Luca Longo, 2018. "Experienced mental workload, perception of usability, their interaction and impact on task performance," PLOS ONE, Public Library of Science, vol. 13(8), pages 1-36, August.
    15. Blanquero, R. & Carrizosa, E. & Jiménez-Cordero, A. & Martín-Barragán, B., 2019. "Functional-bandwidth kernel for Support Vector Machine with Functional Data: An alternating optimization algorithm," European Journal of Operational Research, Elsevier, vol. 275(1), pages 195-207.
    16. Courage Kamusoko & Jonah Gamba & Hitomi Murakami, 2014. "Mapping Woodland Cover in the Miombo Ecosystem: A Comparison of Machine Learning Classifiers," Land, MDPI, vol. 3(2), pages 1-17, June.
    17. Marta Cabral & Dália Loureiro & Inês Flores-Colen & Dídia Covas, 2022. "A Distress-Based Condition Assessment Approach of Urban Water Assets Using Novel Deterioration Indices," Water Resources Management: An International Journal, Published for the European Water Resources Association (EWRA), Springer;European Water Resources Association (EWRA), vol. 36(3), pages 1075-1092, February.
    18. Gattermann-Itschert, Theresa & Thonemann, Ulrich W., 2021. "How training on multiple time slices improves performance in churn prediction," European Journal of Operational Research, Elsevier, vol. 295(2), pages 664-674.
    19. Yevseyeva, Iryna & Miettinen, Kaisa & Rasanen, Pekka, 2008. "Verbal ordinal classification with multicriteria decision aiding," European Journal of Operational Research, Elsevier, vol. 185(3), pages 964-983, March.
    20. Perthame, Emeline & Forbes, Florence & Deleforge, Antoine, 2018. "Inverse regression approach to robust nonlinear high-to-low dimensional mapping," Journal of Multivariate Analysis, Elsevier, vol. 163(C), pages 1-14.
