
Effective sampling for drift mitigation in machine learning using scenario selection: A microgrid case study

Author

Listed:
  • Darville, Joshua
  • Yavuz, Abdurrahman
  • Runsewe, Temitope
  • Celik, Nurcin

Abstract

Predictive modeling is pervasive across a variety of industries (e.g., driving patterns in insurance, fraud detection in finance, and forecasting in energy). However, the deterioration of prediction accuracy for most machine learning (ML) models due to drift is inevitable despite acceptable performance in a test setting. Drift is a natural occurrence as cumulative information changes over time, but it should be regulated to ensure an ML model’s prediction remains relevant in real time when decisions are being made. This study proposes a drift mitigation framework (DMF) to obtain an effective sample size in a scalable and unbiased manner for sustainable predictive performance from ML models, using microgrids (MGs) as a testbed. Scenario selection was used as an alternative sampling technique to obtain an effective sample from historical data contributing to probability theory. These scenarios were evaluated using a quantile-quantile (Q-Q) plot, Welch’s t-test, and relative entropy measures to ensure the obtained sample effectively captures the general population behavior. The results showed two of the three climate factors used in our experiment formed a nearly straight line on their Q-Q plots, suggesting the effective sample retained comparable statistical properties to theoretical distributions fitted to the original population and complete training data. At an alpha equal to 0.05, a hypothesis test was conducted with the null hypothesis claiming the means of reduced and complete data were statistically equivalent for ambient temperature, solar irradiance, and wind speed. The p-values obtained for each stochastic climate factor were 0.101, 0.296, and 0.413, suggesting insufficient evidence to reject the null hypothesis. Similarly, the resulting population stability index (PSI) values for each stochastic climate factor were 0.0175, 0.0352, and 0.0023, respectively. Hence, there was no significant shift between the population data and the effective sample across each factor.
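
Two of the checks named in the abstract, Welch's t-test and the population stability index, can be illustrated with a minimal sketch. It rests on assumptions that do not come from the paper: the data are synthetic Gaussian draws standing in for one climate factor, the effective sample is a plain random subsample rather than the authors' scenario selection, and PSI uses ten equal-width bins. The function names `welch_t` and `psi` are hypothetical.

```python
import math
import random
import statistics


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each mean; variances are not pooled.
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    diff = statistics.fmean(sample_a) - statistics.fmean(sample_b)
    t_stat = diff / math.sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, dof


def psi(population, sample, n_bins=10, eps=1e-6):
    """Population stability index over equal-width bins spanning the population."""
    lo, hi = min(population), max(population)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant population

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            # Clamp values outside the population range into the edge bins.
            idx = min(max(int((v - lo) / width), 0), n_bins - 1)
            counts[idx] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    expected = proportions(population)
    actual = proportions(sample)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))


# Assumed synthetic stand-in for one climate factor (not data from the study).
random.seed(0)
population = [random.gauss(25.0, 5.0) for _ in range(5000)]
reduced = random.sample(population, 500)

t_stat, dof = welch_t(reduced, population)
print(f"Welch t = {t_stat:.3f}, dof = {dof:.1f}")
print(f"PSI = {psi(population, reduced):.4f}")
```

Converting the t statistic into a p-value requires the Student t distribution, which the standard library does not provide; `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and returns the p-value directly. A common rule of thumb, not taken from the paper, reads PSI below 0.1 as no meaningful shift.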

Suggested Citation

  • Darville, Joshua & Yavuz, Abdurrahman & Runsewe, Temitope & Celik, Nurcin, 2023. "Effective sampling for drift mitigation in machine learning using scenario selection: A microgrid case study," Applied Energy, Elsevier, vol. 341(C).
  • Handle: RePEc:eee:appene:v:341:y:2023:i:c:s0306261923004129
    DOI: 10.1016/j.apenergy.2023.121048

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0306261923004129
    Download Restriction: Full text for ScienceDirect subscribers only



    Citations



    Cited by:

    1. Sijia Li & Arman Oshnoei & Frede Blaabjerg & Amjad Anvari-Moghaddam, 2023. "Hierarchical Control for Microgrids: A Survey on Classical and Machine Learning-Based Methods," Sustainability, MDPI, vol. 15(11), pages 1-22, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ariel Villalón & Carlos Muñoz & Javier Muñoz & Marco Rivera, 2023. "Fixed-Switching-Frequency Modulated Model Predictive Control for Islanded AC Microgrid Applications," Mathematics, MDPI, vol. 11(3), pages 1-27, January.
    2. Kim, SangYoun & Heo, SungKu & Nam, KiJeon & Woo, TaeYong & Yoo, ChangKyoo, 2023. "Flexible renewable energy planning based on multi-step forecasting of interregional electricity supply and demand: Graph-enhanced AI approach," Energy, Elsevier, vol. 282(C).
    3. Fei, Xin & Gülpınar, Nalân & Branke, Jürgen, 2019. "Efficient solution selection for two-stage stochastic programs," European Journal of Operational Research, Elsevier, vol. 277(3), pages 918-929.
    4. Jianpei Wen & Hanyu Jiang & Jie Song, 2019. "A Stochastic Queueing Model for Capacity Allocation in the Hierarchical Healthcare Delivery System," Asia-Pacific Journal of Operational Research (APJOR), World Scientific Publishing Co. Pte. Ltd., vol. 36(01), pages 1-24, February.
    5. Marius Ötting & Christian Deutscher & Carl Singleton & Luca De Angelis, 2022. "Gambling on Momentum," Economics Discussion Papers em-dp2022-10, Department of Economics, University of Reading.
      • Marius Otting & Christian Deutscher & Carl Singleton & Luca De Angelis, 2022. "Gambling on Momentum," Papers 2211.06052, arXiv.org.
    6. Capone, Martina & Guelpa, Elisa & Verda, Vittorio, 2021. "Multi-objective optimization of district energy systems with demand response," Energy, Elsevier, vol. 227(C).
    7. Wakui, Tetsuya & Akai, Kazuki & Yokoyama, Ryohei, 2022. "Shrinking and receding horizon approaches for long-term operational planning of energy storage and supply systems," Energy, Elsevier, vol. 239(PD).
    8. Saletti, Costanza & Morini, Mirko & Gambarotta, Agostino, 2022. "Smart management of integrated energy systems through co-optimization with long and short horizons," Energy, Elsevier, vol. 250(C).
    9. Andre Leippi & Markus Fleschutz & Michael D. Murphy, 2022. "A Review of EV Battery Utilization in Demand Response Considering Battery Degradation in Non-Residential Vehicle-to-Grid Scenarios," Energies, MDPI, vol. 15(9), pages 1-22, April.
    10. Lingxuan Liu & Leyuan Shi, 2019. "Simulation Optimization on Complex Job Shop Scheduling with Non-Identical Job Sizes," Asia-Pacific Journal of Operational Research (APJOR), World Scientific Publishing Co. Pte. Ltd., vol. 36(05), pages 1-26, October.
    11. Ann-Kathrin Klaas & Hans-Peter Beck, 2021. "A MILP Model for Revenue Optimization of a Compressed Air Energy Storage Plant with Electrolysis," Energies, MDPI, vol. 14(20), pages 1-21, October.
    12. Mellit, A. & Benghanem, M. & Kalogirou, S. & Massi Pavan, A., 2023. "An embedded system for remote monitoring and fault diagnosis of photovoltaic arrays using machine learning and the internet of things," Renewable Energy, Elsevier, vol. 208(C), pages 399-408.
    13. Angelos Patsidis & Adam Dyśko & Campbell Booth & Anastasios Oulis Rousis & Polyxeni Kalliga & Dimitrios Tzelepis, 2023. "Digital Architecture for Monitoring and Operational Analytics of Multi-Vector Microgrids Utilizing Cloud Computing, Advanced Virtualization Techniques, and Data Analytics Methods," Energies, MDPI, vol. 16(16), pages 1-19, August.
    14. Li, Na & Zhang, Yue & Teng, De & Kong, Nan, 2021. "Pareto optimization for control agreement in patient referral coordination," Omega, Elsevier, vol. 101(C).
    15. Capone, Martina & Guelpa, Elisa & Mancò, Giulia & Verda, Vittorio, 2021. "Integration of storage and thermal demand response to unlock flexibility in district multi-energy systems," Energy, Elsevier, vol. 237(C).
    16. Chao-Ming Huang & Shin-Ju Chen & Sung-Pei Yang & Hsin-Jen Chen, 2023. "One-Day-Ahead Hourly Wind Power Forecasting Using Optimized Ensemble Prediction Methods," Energies, MDPI, vol. 16(6), pages 1-22, March.
    17. Wang, Lijin & Fan, Weipeng & Jiang, Guoqian & Xie, Ping, 2023. "An efficient federated transfer learning framework for collaborative monitoring of wind turbines in IoE-enabled wind farms," Energy, Elsevier, vol. 284(C).
    18. Zhang, Hongyu & Tomasgard, Asgeir & Knudsen, Brage Rugstad & Svendsen, Harald G. & Bakker, Steffen J. & Grossmann, Ignacio E., 2022. "Modelling and analysis of offshore energy hubs," Energy, Elsevier, vol. 261(PA).
    19. Song, Zhe & Cao, Sunliang & Yang, Hongxing, 2023. "Assessment of solar radiation resource and photovoltaic power potential across China based on optimized interpretable machine learning model and GIS-based approaches," Applied Energy, Elsevier, vol. 339(C).
    20. Michael Perry & Hadi El-Amine, 2019. "Computational Efficiency in Multivariate Adversarial Risk Analysis Models," Decision Analysis, INFORMS, vol. 16(4), pages 314-332, December.
