MMGPT4LF: Leveraging an optimized pre-trained GPT-2 model with multi-modal cross-attention for load forecasting

Author

Listed:
  • Gao, Mingyang
  • Zhou, Suyang
  • Gu, Wei
  • Wu, Zhi
  • Liu, Haiquan
  • Zhou, Aihua
  • Wang, Xinliang

Abstract

Accurate load forecasting is crucial for maintaining power system balance. Traditionally, forecasting relies on time series data such as historical loads and corresponding meteorological information. However, non-time-series data like news reports and holiday schedules can also significantly influence outcomes. Existing research primarily focuses on time series data and lacks effective handling of multi-modal inputs. Recent advances in Large Language Models (LLMs) demonstrate inherent advantages in capturing long-term dependencies and complex textual patterns, indicating their potential for load forecasting. Nevertheless, the application of LLMs in this field remains limited. Thus, to fill this gap, we propose MMGPT4LF, a model that combines the pre-trained GPT-2 model with multi-modal data inputs for load forecasting. Specifically, the model designs an additional time series input head to more effectively capture temporal dependencies, particularly the periodicity and long-term trends present in power load data. Furthermore, the model incorporates a Multi-Modal Cross-Attention (MMCA) mechanism, enabling efficient alignment and fusion of high-dimensional feature representations from both time series and textual inputs. Through this framework, MMGPT4LF not only enhances the effectiveness of multi-modal data fusion but also accurately handles the interactions between different modalities, thereby significantly improving load forecasting accuracy and the model’s generalization ability. Extensive experiments on two open-source load forecasting datasets, compared with nine advanced time series forecasting models, validate the effectiveness and accuracy of MMGPT4LF in load forecasting tasks.
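The abstract names a Multi-Modal Cross-Attention (MMCA) mechanism but does not give its equations. As a rough illustration only, the sketch below shows generic scaled dot-product cross-attention, in which time series features act as queries and text features act as keys and values. The function names, the toy feature vectors, and the choice of which modality supplies the queries are all assumptions made for this example; they are not taken from the paper, and the real MMCA may differ (for instance, by using learned projections, multiple heads, or attention in both directions).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention (hypothetical helper).

    queries: one feature vector per time step (time series modality).
    keys, values: one feature vector per token (text modality).
    Returns (fused, weights): fused has one vector per time step, and
    weights[i][j] says how much time step i attends to text token j.
    """
    dim = len(keys[0])
    scale = 1.0 / math.sqrt(dim)
    fused, weights = [], []
    for q in queries:
        # Similarity between this time step and every text token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of text value vectors for this time step.
        fused.append(
            [sum(wj * v[c] for wj, v in zip(w, values))
             for c in range(len(values[0]))]
        )
    return fused, weights


# Toy inputs (assumed for illustration): 3 time steps, 2 text tokens, dim 2.
ts_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_features = [[1.0, 0.0], [0.0, 1.0]]

fused, weights = cross_attention(ts_features, text_features, text_features)
for step, row in enumerate(weights):
    print(step, [round(x, 3) for x in row])
```

In this toy setup each time step attends more strongly to the text token whose feature vector it resembles, and the third time step, which resembles both tokens equally, splits its attention evenly between them.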

Suggested Citation

  • Gao, Mingyang & Zhou, Suyang & Gu, Wei & Wu, Zhi & Liu, Haiquan & Zhou, Aihua & Wang, Xinliang, 2025. "MMGPT4LF: Leveraging an optimized pre-trained GPT-2 model with multi-modal cross-attention for load forecasting," Applied Energy, Elsevier, vol. 392(C).
  • Handle: RePEc:eee:appene:v:392:y:2025:i:c:s0306261925006956
    DOI: 10.1016/j.apenergy.2025.125965

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0306261925006956
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.apenergy.2025.125965?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jonathan Berrisch & Michał Narajewski & Florian Ziel, 2022. "High-Resolution Peak Demand Estimation Using Generalized Additive Models and Deep Neural Networks," Papers 2203.03342, arXiv.org, revised Nov 2022.
    2. Wang, Ran & Lu, Shilei & Feng, Wei, 2020. "A novel improved model for building energy consumption prediction based on model integration," Applied Energy, Elsevier, vol. 262(C).
    3. Ng, Rong Wang & Begam, Kasim Mumtaj & Rajkumar, Rajprasad Kumar & Wong, Yee Wan & Chong, Lee Wai, 2021. "An improved self-organizing incremental neural network model for short-term time-series load prediction," Applied Energy, Elsevier, vol. 292(C).
    4. Thomas Steens & Jan-Simon Telle & Benedikt Hanke & Karsten von Maydell & Carsten Agert & Gian-Luca Di Modica & Bernd Engel & Matthias Grottke, 2021. "A Forecast-Based Load Management Approach for Commercial Buildings Demonstrated on an Integration of BEV," Energies, MDPI, vol. 14(12), pages 1-25, June.
    5. Li, Yiyan & Zhang, Si & Hu, Rongxing & Lu, Ning, 2021. "A meta-learning based distribution system load forecasting model selection framework," Applied Energy, Elsevier, vol. 294(C).
    6. Liu, Jiefeng & Zhang, Zhenhao & Fan, Xianhao & Zhang, Yiyi & Wang, Jiaqi & Zhou, Ke & Liang, Shuo & Yu, Xiaoyong & Zhang, Wei, 2022. "Power system load forecasting using mobility optimization and multi-task learning in COVID-19," Applied Energy, Elsevier, vol. 310(C).
    7. Lee, Juyong & Cho, Youngsang, 2022. "National-scale electricity peak load forecasting: Traditional, machine learning, or hybrid model?," Energy, Elsevier, vol. 239(PD).
    8. Li, Kangping & Wang, Fei & Mi, Zengqiang & Fotuhi-Firuzabad, Mahmoud & Duić, Neven & Wang, Tieqiang, 2019. "Capacity and output power estimation approach of individual behind-the-meter distributed photovoltaic system for demand response baseline estimation," Applied Energy, Elsevier, vol. 253(C), pages 1-1.
    9. Mingping Liu & Yangze Li & Jiangong Hu & Xiaolong Wu & Suhui Deng & Hongqiao Li, 2023. "A New Hybrid Model Based on SCINet and LSTM for Short-Term Power Load Forecasting," Energies, MDPI, vol. 17(1), pages 1-20, December.
    10. Li, Wenqiang & Gong, Guangcai & Fan, Houhua & Peng, Pei & Chun, Liang, 2020. "Meta-learning strategy based on user preferences and a machine recommendation system for real-time cooling load and COP forecasting," Applied Energy, Elsevier, vol. 270(C).
    11. Chen, Yongbao & Chen, Zhe & Xu, Peng & Li, Weilin & Sha, Huajing & Yang, Zhiwei & Li, Guowen & Hu, Chonghe, 2019. "Quantification of electricity flexibility in demand response: Office building case study," Energy, Elsevier, vol. 188(C).
    12. Alexandra L’Heureux & Katarina Grolinger & Miriam A. M. Capretz, 2022. "Transformer-Based Model for Electrical Load Forecasting," Energies, MDPI, vol. 15(14), pages 1-23, July.
    13. Bingjie Jin & Guihua Zeng & Zhilin Lu & Hongqiao Peng & Shuxin Luo & Xinhe Yang & Haojun Zhu & Mingbo Liu, 2022. "Hybrid LSTM–BPNN-to-BPNN Model Considering Multi-Source Information for Forecasting Medium- and Long-Term Electricity Peak Load," Energies, MDPI, vol. 15(20), pages 1-20, October.
    14. Wang, Kejun & Qi, Xiaoxia & Liu, Hongda & Song, Jiakang, 2018. "Deep belief network based k-means cluster approach for short-term wind power forecasting," Energy, Elsevier, vol. 165(PA), pages 840-852.
    15. Alhamwi, Alaa & Medjroubi, Wided & Vogt, Thomas & Agert, Carsten, 2018. "Modelling urban energy requirements using open source data and models," Applied Energy, Elsevier, vol. 231(C), pages 1100-1108.
    16. Ibrahim, Muhammad Sohail & Dong, Wei & Yang, Qiang, 2020. "Machine learning driven smart electric power systems: Current trends and new perspectives," Applied Energy, Elsevier, vol. 272(C).
    17. Wei, Ziqing & Zhang, Tingwei & Yue, Bao & Ding, Yunxiao & Xiao, Ran & Wang, Ruzhu & Zhai, Xiaoqiang, 2021. "Prediction of residential district heating load based on machine learning: A case study," Energy, Elsevier, vol. 231(C).
    18. Xiaojin Xie & Kangyang Luo & Zhixiang Yin & Guoqiang Wang, 2021. "Nonlinear Combinational Dynamic Transmission Rate Model and Its Application in Global COVID-19 Epidemic Prediction and Analysis," Mathematics, MDPI, vol. 9(18), pages 1-17, September.
    19. Hyunsoo Kim & Jiseok Jeong & Changwan Kim, 2022. "Daily Peak-Electricity-Demand Forecasting Based on Residual Long Short-Term Network," Mathematics, MDPI, vol. 10(23), pages 1-17, November.
    20. Li, Yanbin & Hu, Weikun & Zhang, Feng & Li, Yun, 2025. "Multi-objective collaborative operation optimization of park-level integrated energy system clusters considering green power forecasting and trading," Energy, Elsevier, vol. 319(C).
