
Partial Transfer Learning from Patch Transformer to Variate-Based Linear Forecasting Model

Author

Listed:
  • Le Hoang Anh

    (Department of Intelligent Electronics and Computer Engineering, Chonnam National University, Gwangju 61186, Republic of Korea
    These authors contributed equally to this work.)

  • Dang Thanh Vu

    (Research Center, AISeed Inc., Gwangju 61186, Republic of Korea
    These authors contributed equally to this work.)

  • Seungmin Oh

    (Department of Intelligent Electronics and Computer Engineering, Chonnam National University, Gwangju 61186, Republic of Korea)

  • Gwang-Hyun Yu

    (Department of Intelligent Electronics and Computer Engineering, Chonnam National University, Gwangju 61186, Republic of Korea)

  • Nguyen Bui Ngoc Han

    (Department of Electronic Convergence Engineering, Kwangwoon University, Seoul 01897, Republic of Korea)

  • Hyoung-Gook Kim

    (Department of Electronic Convergence Engineering, Kwangwoon University, Seoul 01897, Republic of Korea)

  • Jin-Sul Kim

    (Department of Intelligent Electronics and Computer Engineering, Chonnam National University, Gwangju 61186, Republic of Korea)

  • Jin-Young Kim

    (Department of Intelligent Electronics and Computer Engineering, Chonnam National University, Gwangju 61186, Republic of Korea)

Abstract

Transformer-based time series forecasting models use patch tokens to capture temporal patterns and variate tokens to learn dependencies across covariates. While patch tokens naturally lend themselves to self-supervised learning, variate tokens are better suited to linear forecasters because they help mitigate distribution drift. However, variate tokens preclude masked pretraining, since masking a variate token would mean masking an entire series. To close this gap, we propose LSPatch-T (Long–Short Patch Transfer), a framework that transfers knowledge from short-length patch tokens into full-length variate tokens. A key design choice is that we selectively transfer only a portion of the Transformer encoder, preserving the linear design of the downstream model. Additionally, we introduce a robust frequency loss to maintain consistency across different temporal ranges. Experimental results show that our approach outperforms Transformer-based baselines (Transformer, Informer, Crossformer, Autoformer, PatchTST, iTransformer) on three public datasets (ETT, Exchange, Weather), a promising step toward generalizing time series forecasting models.
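The page gives no implementation details beyond the abstract, so the sketch below is only an illustration of the two ideas it names, not the authors' code: (1) a partial transfer in which a small number of encoder blocks from a patch-token Transformer (PatchTST-style) are copied into a variate-token model (iTransformer-style tokenization) whose forecasting head remains linear, and (2) one plausible reading of the robust frequency loss. All class names, shapes, and hyperparameters are assumptions.

```python
# A minimal sketch, not the authors' implementation: a patch-token
# Transformer encoder is assumed to have been pretrained with masked
# patches; only its first few encoder blocks are copied into a
# variate-token model whose forecasting head stays linear.
import copy

import torch
import torch.nn as nn


class PatchTransformer(nn.Module):
    """Upstream model: each token is a short temporal patch."""

    def __init__(self, patch_len=16, d_model=128, n_heads=8, n_layers=3):
        super().__init__()
        self.embed = nn.Linear(patch_len, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, patch_len)  # reconstructs masked patches

    def forward(self, patches):                    # (B, num_patches, patch_len)
        return self.head(self.encoder(self.embed(patches)))


class VariateLinearForecaster(nn.Module):
    """Downstream model: each token is one full-length variate series,
    followed by a linear forecasting head."""

    def __init__(self, seq_len=96, pred_len=96, d_model=128):
        super().__init__()
        self.embed = nn.Linear(seq_len, d_model)   # one token per variate
        self.transferred = nn.Identity()           # replaced by partial transfer
        self.head = nn.Linear(d_model, pred_len)   # linear head is kept

    def forward(self, x):                          # x: (B, seq_len, n_vars)
        tokens = self.embed(x.transpose(1, 2))     # (B, n_vars, d_model)
        tokens = self.transferred(tokens)
        return self.head(tokens).transpose(1, 2)   # (B, pred_len, n_vars)


def partial_transfer(src, dst, n_blocks=1):
    """Copy only the first `n_blocks` pretrained encoder blocks into the
    downstream model (the selective transfer described in the abstract)."""
    blocks = [copy.deepcopy(b) for b in list(src.encoder.layers)[:n_blocks]]
    dst.transferred = nn.Sequential(*blocks)


if __name__ == "__main__":
    src = PatchTransformer()                 # assume this has been pretrained
    dst = VariateLinearForecaster()
    partial_transfer(src, dst, n_blocks=1)
    print(dst(torch.randn(4, 96, 7)).shape)  # torch.Size([4, 96, 7])
```

The abstract also mentions a robust frequency loss for consistency across temporal ranges; its exact form is not given on this page, so the following is just one plausible variant that adds a smooth-L1 penalty on rFFT amplitudes to an ordinary time-domain error.

```python
# Hypothetical reading of a "robust frequency loss" (the paper's exact
# formulation is not reproduced on this page): time-domain MSE plus a
# smooth-L1 penalty on the amplitude spectra of forecast and target.
import torch
import torch.nn.functional as F


def robust_frequency_loss(pred, target, alpha=0.5):
    """pred, target: (B, pred_len, n_vars); alpha weights the frequency term."""
    time_term = F.mse_loss(pred, target)
    pred_amp = torch.fft.rfft(pred, dim=1).abs()        # spectrum over time axis
    target_amp = torch.fft.rfft(target, dim=1).abs()
    freq_term = F.smooth_l1_loss(pred_amp, target_amp)  # robust to outlier bins
    return time_term + alpha * freq_term
```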

Suggested Citation

  • Le Hoang Anh & Dang Thanh Vu & Seungmin Oh & Gwang-Hyun Yu & Nguyen Bui Ngoc Han & Hyoung-Gook Kim & Jin-Sul Kim & Jin-Young Kim, 2024. "Partial Transfer Learning from Patch Transformer to Variate-Based Linear Forecasting Model," Energies, MDPI, vol. 17(24), pages 1-18, December.
  • Handle: RePEc:gam:jeners:v:17:y:2024:i:24:p:6452-:d:1549422

    Download full text from publisher

    File URL: https://www.mdpi.com/1996-1073/17/24/6452/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1996-1073/17/24/6452/
    Download Restriction: no

    References listed on IDEAS

    1. Salinas, David & Flunkert, Valentin & Gasthaus, Jan & Januschowski, Tim, 2020. "DeepAR: Probabilistic forecasting with autoregressive recurrent networks," International Journal of Forecasting, Elsevier, vol. 36(3), pages 1181-1191.
    2. Le Hoang Anh & Gwang-Hyun Yu & Dang Thanh Vu & Hyoung-Gook Kim & Jin-Young Kim, 2023. "DelayNet: Enhancing Temporal Feature Extraction for Electronic Consumption Forecasting with Delayed Dilated Convolution," Energies, MDPI, vol. 16(22), pages 1-18, November.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Spiliotis, Evangelos & Makridakis, Spyros & Kaltsounis, Anastasios & Assimakopoulos, Vassilios, 2021. "Product sales probabilistic forecasting: An empirical evaluation using the M5 competition data," International Journal of Production Economics, Elsevier, vol. 240(C).
    2. Andreas Lenk & Marcus Vogt & Christoph Herrmann, 2024. "An Approach to Predicting Energy Demand Within Automobile Production Using the Temporal Fusion Transformer Model," Energies, MDPI, vol. 18(1), pages 1-34, December.
    3. Bojer, Casper Solheim & Meldgaard, Jens Peder, 2021. "Kaggle forecasting competitions: An overlooked learning opportunity," International Journal of Forecasting, Elsevier, vol. 37(2), pages 587-603.
    4. Ying Shu & Chengfu Ding & Lingbing Tao & Chentao Hu & Zhixin Tie, 2023. "Air Pollution Prediction Based on Discrete Wavelets and Deep Learning," Sustainability, MDPI, vol. 15(9), pages 1-19, April.
    5. Wang, Shengjie & Kang, Yanfei & Petropoulos, Fotios, 2024. "Combining probabilistic forecasts of intermittent demand," European Journal of Operational Research, Elsevier, vol. 315(3), pages 1038-1048.
    6. Pesantez, Jorge E. & Li, Binbin & Lee, Christopher & Zhao, Zhizhen & Butala, Mark & Stillwell, Ashlynn S., 2023. "A Comparison Study of Predictive Models for Electricity Demand in a Diverse Urban Environment," Energy, Elsevier, vol. 283(C).
    7. Wen, Honglin & Pinson, Pierre & Gu, Jie & Jin, Zhijian, 2024. "Wind energy forecasting with missing values within a fully conditional specification framework," International Journal of Forecasting, Elsevier, vol. 40(1), pages 77-95.
    8. Anna Almosova & Niek Andresen, 2023. "Nonlinear inflation forecasting with recurrent neural networks," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 42(2), pages 240-259, March.
    9. Philippe Goulet Coulombe & Mikael Frenette & Karin Klieber, 2023. "From Reactive to Proactive Volatility Modeling with Hemisphere Neural Networks," Working Papers 23-04, Chair in macroeconomics and forecasting, University of Quebec in Montreal's School of Management, revised Nov 2023.
    10. Frison, Lilli & Gölzhäuser, Simon & Bitterling, Moritz & Kramer, Wolfgang, 2024. "Evaluating different artificial neural network forecasting approaches for optimizing district heating network operation," Energy, Elsevier, vol. 307(C).
    11. Jayesh Thaker & Robert Höller, 2022. "A Comparative Study of Time Series Forecasting of Solar Energy Based on Irradiance Classification," Energies, MDPI, vol. 15(8), pages 1-26, April.
    12. Liu, Chen & Wang, Chao & Tran, Minh-Ngoc & Kohn, Robert, 2025. "A long short-term memory enhanced realized conditional heteroskedasticity model," Economic Modelling, Elsevier, vol. 142(C).
    13. Kandaswamy Paramasivan & Brinda Subramani & Nandan Sudarsanam, 2022. "Counterfactual analysis of the impact of the first two waves of the COVID-19 pandemic on the reporting and registration of missing people in India," Palgrave Communications, Palgrave Macmillan, vol. 9(1), pages 1-14, December.
    14. Bojer, Casper Solheim, 2022. "Understanding machine learning-based forecasting methods: A decomposition framework and research opportunities," International Journal of Forecasting, Elsevier, vol. 38(4), pages 1555-1561.
    15. Sergio Consoli & Luca Tiozzo Pezzoli & Elisa Tosetti, 2022. "Neural forecasting of the Italian sovereign bond market with economic news," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(S2), pages 197-224, December.
    16. Wellens, Arnoud P. & Boute, Robert N. & Udenio, Maximiliano, 2024. "Simplifying tree-based methods for retail sales forecasting with explanatory variables," European Journal of Operational Research, Elsevier, vol. 314(2), pages 523-539.
    17. Pengfei Zhao & Haoren Zhu & Wilfred Siu Hung NG & Dik Lun Lee, 2024. "From GARCH to Neural Network for Volatility Forecast," Papers 2402.06642, arXiv.org.
    18. Li, Xixi & Yuan, Jingsong, 2024. "DeepTVAR: Deep learning for a time-varying VAR model with extension to integrated VAR," International Journal of Forecasting, Elsevier, vol. 40(3), pages 1123-1133.
    19. Zhen Zeng & Rachneet Kaur & Suchetha Siddagangappa & Saba Rahimi & Tucker Balch & Manuela Veloso, 2023. "Financial Time Series Forecasting using CNN and Transformer," Papers 2304.04912, arXiv.org.
    20. Chaokai Huang & Ning Du & Jiahan He & Na Li & Yifan Feng & Weihong Cai, 2023. "Multidimensional Feature-Based Graph Attention Networks and Dynamic Learning for Electricity Load Forecasting," Energies, MDPI, vol. 16(18), pages 1-17, September.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:gam:jeners:v:17:y:2024:i:24:p:6452-:d:1549422. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: MDPI Indexing Manager (email available below). General contact details of provider: https://www.mdpi.com .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.