Printed from https://ideas.repec.org/a/eee/appene/v297y2021ics0306261921005791.html

Problem of data imbalance in building energy load prediction: Concept, influence, and solution

Author

Listed:
  • Zhang, Chaobo
  • Li, Junyang
  • Zhao, Yang
  • Li, Tingting
  • Chen, Qi
  • Zhang, Xuejun
  • Qiu, Weikang

Abstract

Building energy systems operate under a wide range of conditions. The data available for some conditions can be far scarcer than the data for others. This is the so-called data imbalance problem: the volume of data differs across operating conditions. The problem is always ignored in the field of building energy load prediction. Three questions remain open: how to identify the various building operation conditions, how the problem affects prediction accuracy, and how to overcome it. To address these three questions, this study first proposes a clustering decision tree algorithm to identify the building operation conditions. The effects of data imbalance are then investigated by changing the proportions of model training samples drawn from the various operation conditions. Finally, a clustering decision tree-based multi-model prediction method is proposed to solve the data imbalance problem. One year of historical operational data from a public building is used to validate the multi-model method. The results show that the proposed method achieves better prediction performance than the conventional single model-based method. It decreases the mean absolute errors of energy load prediction using artificial neural networks, gradient boosting trees, random forests, and support vector regression by 9.83%, 6.71%, 1.32%, and 12.22% on average, respectively. It also increases the coefficients of determination of energy load prediction using the four algorithms by 8.47%, 4.59%, 0.26%, and 13.99% on average, respectively.
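
The contrast between a single model and a multi-model approach under data imbalance can be illustrated with a minimal sketch. It is not the authors' clustering decision tree method, whose details the abstract does not give. The sketch assumes that the operation condition of each sample is already known, uses synthetic data with two hypothetical conditions ("occupied" and "unoccupied"), and swaps in a one-feature least-squares line for the four learning algorithms named above. All names, load curves, and sample counts are assumptions made for illustration.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single input feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def predict(model, x):
    a, b = model
    return a * x + b


def mae(y_true, y_pred):
    """Mean absolute error."""
    return mean([abs(t - p) for t, p in zip(y_true, y_pred)])


def make_samples(rng, n, condition):
    # Hypothetical load curves: each condition has its own linear relation
    # between outdoor temperature and energy load (assumed, not from the paper).
    slope, base = (2.0, 10.0) if condition == "occupied" else (0.5, 3.0)
    rows = []
    for _ in range(n):
        temp = rng.uniform(20.0, 35.0)
        load = slope * temp + base + rng.gauss(0.0, 0.5)
        rows.append((condition, temp, load))
    return rows


rng = random.Random(0)
conditions = ("occupied", "unoccupied")

# Imbalanced training set: 950 samples from one condition, 50 from the other.
train = make_samples(rng, 950, "occupied") + make_samples(rng, 50, "unoccupied")
# Balanced test set, so both conditions count equally in the error.
test = make_samples(rng, 200, "occupied") + make_samples(rng, 200, "unoccupied")

# Single model: one fit over all training samples, dominated by the majority.
single = fit_line([t for _, t, _ in train], [y for _, _, y in train])

# Multi-model: one fit per condition; each sample is routed by its condition.
multi = {}
for cond in conditions:
    subset = [(t, y) for c, t, y in train if c == cond]
    multi[cond] = fit_line([t for t, _ in subset], [y for _, y in subset])

y_true = [y for _, _, y in test]
mae_single = mae(y_true, [predict(single, t) for _, t, _ in test])
mae_multi = mae(y_true, [predict(multi[c], t) for c, t, _ in test])

print(f"single-model MAE: {mae_single:.2f}")
print(f"multi-model MAE:  {mae_multi:.2f}")
```

In this toy setting the single model is pulled toward the majority condition and predicts the minority condition poorly, while the per-condition models do not share that bias. In practice the condition label would have to come from a condition-identification step, such as the clustering decision tree the study proposes.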

Suggested Citation

  • Zhang, Chaobo & Li, Junyang & Zhao, Yang & Li, Tingting & Chen, Qi & Zhang, Xuejun & Qiu, Weikang, 2021. "Problem of data imbalance in building energy load prediction: Concept, influence, and solution," Applied Energy, Elsevier, vol. 297(C).
  • Handle: RePEc:eee:appene:v:297:y:2021:i:c:s0306261921005791
    DOI: 10.1016/j.apenergy.2021.117139

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0306261921005791
    Download Restriction: Full text for ScienceDirect subscribers only



    Citations


    Cited by:

    1. Meng Wang & Junqi Yu & Meng Zhou & Wei Quan & Renyin Cheng, 2023. "Joint Forecasting Model for the Hourly Cooling Load and Fluctuation Range of a Large Public Building Based on GA-SVM and IG-SVM," Sustainability, MDPI, vol. 15(24), pages 1-23, December.
    2. Sun, Ying & Haghighat, Fariborz & Fung, Benjamin C.M., 2022. "Trade-off between accuracy and fairness of data-driven building and indoor environment models: A comparative study of pre-processing methods," Energy, Elsevier, vol. 239(PD).
    3. Zhiqiang Yin & Lin Shi & Junru Luo & Shoukun Xu & Yang Yuan & Xinxin Tan & Jiaqun Zhu, 2023. "Pump Feature Construction and Electrical Energy Consumption Prediction Based on Feature Engineering and LightGBM Algorithm," Sustainability, MDPI, vol. 15(1), pages 1-17, January.
    4. Di Natale, L. & Svetozarevic, B. & Heer, P. & Jones, C.N., 2022. "Physically Consistent Neural Networks for building thermal modeling: Theory and analysis," Applied Energy, Elsevier, vol. 325(C).
    5. Chen, Ruijun & Tsay, Yaw-Shyan & Zhang, Ting, 2023. "A multi-objective optimization strategy for building carbon emission from the whole life cycle perspective," Energy, Elsevier, vol. 262(PA).
    6. Fan, Cheng & Lei, Yutian & Sun, Yongjun & Piscitelli, Marco Savino & Chiosa, Roberto & Capozzoli, Alfonso, 2022. "Data-centric or algorithm-centric: Exploiting the performance of transfer learning for improving building energy predictions in data-scarce context," Energy, Elsevier, vol. 240(C).
    7. Görtz, J. & Jürgensen, J. & Stolz, D. & Wieprecht, S. & Terheiden, K., 2022. "Energy load prediction on structures and buildings-Effect of numerical model complexity on simulation of heat fluxes across the structure/environment interface," Applied Energy, Elsevier, vol. 326(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Venkatraj, V. & Dixit, M.K., 2022. "Challenges in implementing data-driven approaches for building life cycle energy assessment: A review," Renewable and Sustainable Energy Reviews, Elsevier, vol. 160(C).
    2. Gao, Lei & Liu, Tianyuan & Cao, Tao & Hwang, Yunho & Radermacher, Reinhard, 2021. "Comparing deep learning models for multi energy vectors prediction on multiple types of building," Applied Energy, Elsevier, vol. 301(C).
    3. Fan, Cheng & Sun, Yongjun & Xiao, Fu & Ma, Jie & Lee, Dasheng & Wang, Jiayuan & Tseng, Yen Chieh, 2020. "Statistical investigations of transfer learning-based methodology for short-term building energy predictions," Applied Energy, Elsevier, vol. 262(C).
    4. Wang, Ran & Lu, Shilei & Feng, Wei, 2020. "A novel improved model for building energy consumption prediction based on model integration," Applied Energy, Elsevier, vol. 262(C).
    5. Yue, Naihua & Caini, Mauro & Li, Lingling & Zhao, Yang & Li, Yu, 2023. "A comparison of six metamodeling techniques applied to multi building performance vectors prediction on gymnasiums under multiple climate conditions," Applied Energy, Elsevier, vol. 332(C).
    6. Jason Runge & Radu Zmeureanu, 2021. "A Review of Deep Learning Techniques for Forecasting Energy Use in Buildings," Energies, MDPI, vol. 14(3), pages 1-26, January.
    7. Wang, Zeyu & Liu, Jian & Zhang, Yuanxin & Yuan, Hongping & Zhang, Ruixue & Srinivasan, Ravi S., 2021. "Practical issues in implementing machine-learning models for building energy efficiency: Moving beyond obstacles," Renewable and Sustainable Energy Reviews, Elsevier, vol. 143(C).
    8. Fan, Cheng & Xiao, Fu & Song, Mengjie & Wang, Jiayuan, 2019. "A graph mining-based methodology for discovering and visualizing high-level knowledge for building energy management," Applied Energy, Elsevier, vol. 251(C), pages 1-1.
    9. Jason Runge & Radu Zmeureanu, 2019. "Forecasting Energy Use in Buildings Using Artificial Neural Networks: A Review," Energies, MDPI, vol. 12(17), pages 1-27, August.
    10. Liu, Jiangyan & Zhang, Qing & Dong, Zhenxiang & Li, Xin & Li, Guannan & Xie, Yi & Li, Kuining, 2021. "Quantitative evaluation of the building energy performance based on short-term energy predictions," Energy, Elsevier, vol. 223(C).
    11. Chen, Zhelun & O’Neill, Zheng & Wen, Jin & Pradhan, Ojas & Yang, Tao & Lu, Xing & Lin, Guanjing & Miyata, Shohei & Lee, Seungjae & Shen, Chou & Chiosa, Roberto & Piscitelli, Marco Savino & Capozzoli, , 2023. "A review of data-driven fault detection and diagnostics for building HVAC systems," Applied Energy, Elsevier, vol. 339(C).
    12. Yildiz, B. & Bilbao, J.I. & Sproul, A.B., 2017. "A review and analysis of regression and machine learning models on commercial building electricity load forecasting," Renewable and Sustainable Energy Reviews, Elsevier, vol. 73(C), pages 1104-1122.
    13. Saima Akhtar & Sulman Shahzad & Asad Zaheer & Hafiz Sami Ullah & Heybet Kilic & Radomir Gono & Michał Jasiński & Zbigniew Leonowicz, 2023. "Short-Term Load Forecasting Models: A Review of Challenges, Progress, and the Road Ahead," Energies, MDPI, vol. 16(10), pages 1-29, May.
    14. Liang, Xinbin & Liu, Zhuoxuan & Wang, Jie & Jin, Xinqiao & Du, Zhimin, 2023. "Uncertainty quantification-based robust deep learning for building energy systems considering distribution shift problem," Applied Energy, Elsevier, vol. 337(C).
    15. Ibrahim, Muhammad Sohail & Dong, Wei & Yang, Qiang, 2020. "Machine learning driven smart electric power systems: Current trends and new perspectives," Applied Energy, Elsevier, vol. 272(C).
    16. Wei, Ziqing & Zhang, Tingwei & Yue, Bao & Ding, Yunxiao & Xiao, Ran & Wang, Ruzhu & Zhai, Xiaoqiang, 2021. "Prediction of residential district heating load based on machine learning: A case study," Energy, Elsevier, vol. 231(C).
    17. Hyunsoo Kim & Jiseok Jeong & Changwan Kim, 2022. "Daily Peak-Electricity-Demand Forecasting Based on Residual Long Short-Term Network," Mathematics, MDPI, vol. 10(23), pages 1-17, November.
    18. Gao, Yuan & Miyata, Shohei & Akashi, Yasunori, 2023. "How to improve the application potential of deep learning model in HVAC fault diagnosis: Based on pruning and interpretable deep learning method," Applied Energy, Elsevier, vol. 348(C).
    19. Li, Sihui & Peng, Jinqing & Zou, Bin & Li, Bojia & Lu, Chujie & Cao, Jingyu & Luo, Yimo & Ma, Tao, 2021. "Zero energy potential of photovoltaic direct-driven air conditioners with considering the load flexibility of air conditioners," Applied Energy, Elsevier, vol. 304(C).
    20. Ng, Rong Wang & Begam, Kasim Mumtaj & Rajkumar, Rajprasad Kumar & Wong, Yee Wan & Chong, Lee Wai, 2021. "An improved self-organizing incremental neural network model for short-term time-series load prediction," Applied Energy, Elsevier, vol. 292(C).
