IDEAS home Printed from https://ideas.repec.org/a/gam/jeners/v16y2023i14p5405-d1195106.html

Enhancing Electricity Theft Detection through K-Nearest Neighbors and Logistic Regression Algorithms with Synthetic Minority Oversampling Technique: A Case Study on State Electricity Company (PLN) Customer Data

Author

Listed:
  • Yan Maraden

    (Department of Electrical Engineering, Universitas Indonesia, Depok 16424, Indonesia)

  • Gunawan Wibisono

    (Department of Electrical Engineering, Universitas Indonesia, Depok 16424, Indonesia)

  • I Gde Dharma Nugraha

    (Department of Electrical Engineering, Universitas Indonesia, Depok 16424, Indonesia)

  • Budi Sudiarto

    (Department of Electrical Engineering, Universitas Indonesia, Depok 16424, Indonesia)

  • Fauzan Hanif Jufri

    (Department of Electrical Engineering, Universitas Indonesia, Depok 16424, Indonesia)

  • Kazutaka

    (Department of Electrical Engineering, Universitas Indonesia, Depok 16424, Indonesia)

  • Anton Satria Prabuwono

    (Faculty of Computing and Information Technology in Rabigh, King Abdulaziz University, Rabigh 21911, Saudi Arabia)

Abstract

Electricity theft has caused massive losses and damage to electricity utilities. The damage degrades the quality of the electricity supply and increases the generation load. The losses fall not only on the utilities but also on legitimate users, who have to pay excessive electricity bills. A method for detecting electricity theft is therefore indispensable. Recently, machine learning algorithms have been used to develop models for detecting electricity theft. However, most algorithms suffer from imbalanced data, overfitting, and a lack of data. This paper therefore proposes a solution that applies an oversampling technique to the imbalanced dataset in order to address these problems and increase the accuracy of the developed model. The proposed method consists of a pre-processing step that removes empty values and extracts several parameters, followed by oversampling of the pre-processed data. Of the models developed for electricity theft detection on state electricity company customer data, logistic regression combined with oversampling performs best. In the experiment, logistic regression combined with the synthetic minority oversampling technique (SMOTE) achieves a training accuracy, testing accuracy, precision, recall, and F1-score of 98.97%, 98.7%, 95%, 99%, and 97%, respectively. The experiment also shows that the proposed solution outperforms existing methods.
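
As an illustration of the oversampling idea the abstract describes, the following is a minimal sketch of SMOTE-style interpolation together with the precision, recall, and F1 metrics it reports. It is not the authors' implementation: the function names, the toy feature values, and the choice of k are assumptions made for this example, and the classifier step (logistic regression or k-nearest neighbors) is omitted.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the positive (theft) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy data: 6 normal customers and 2 theft cases, 2 features each.
normal = [[1.0, 1.1], [0.9, 1.0], [1.2, 0.8], [1.1, 1.3], [0.8, 0.9], [1.0, 1.2]]
theft = [[3.0, 0.2], [3.4, 0.4]]

# Oversample the theft class until both classes have the same size.
balanced_theft = theft + smote(theft, n_new=len(normal) - len(theft), k=1)
```

In practice, oversampling is usually applied to the training split only, so that the test metrics are computed on real, untouched samples. Whether the paper follows that practice is not stated in the abstract.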

Suggested Citation

  • Yan Maraden & Gunawan Wibisono & I Gde Dharma Nugraha & Budi Sudiarto & Fauzan Hanif Jufri & Kazutaka & Anton Satria Prabuwono, 2023. "Enhancing Electricity Theft Detection through K-Nearest Neighbors and Logistic Regression Algorithms with Synthetic Minority Oversampling Technique: A Case Study on State Electricity Company (PLN) Cus," Energies, MDPI, vol. 16(14), pages 1-24, July.
  • Handle: RePEc:gam:jeners:v:16:y:2023:i:14:p:5405-:d:1195106

    Download full text from publisher

    File URL: https://www.mdpi.com/1996-1073/16/14/5405/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1996-1073/16/14/5405/
    Download Restriction: no


    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.