
Partial Transfer Learning from Patch Transformer to Variate-Based Linear Forecasting Model

Author

Listed:
  • Le Hoang Anh

    (Department of Intelligent Electronics and Computer Engineering, Chonnam National University, Gwangju 61186, Republic of Korea
    These authors contributed equally to this work.)

  • Dang Thanh Vu

    (Research Center, AISeed Inc., Gwangju 61186, Republic of Korea
    These authors contributed equally to this work.)

  • Seungmin Oh

    (Department of Intelligent Electronics and Computer Engineering, Chonnam National University, Gwangju 61186, Republic of Korea)

  • Gwang-Hyun Yu

    (Department of Intelligent Electronics and Computer Engineering, Chonnam National University, Gwangju 61186, Republic of Korea)

  • Nguyen Bui Ngoc Han

    (Department of Electronic Convergence Engineering, Kwangwoon University, Seoul 01897, Republic of Korea)

  • Hyoung-Gook Kim

    (Department of Electronic Convergence Engineering, Kwangwoon University, Seoul 01897, Republic of Korea)

  • Jin-Sul Kim

    (Department of Intelligent Electronics and Computer Engineering, Chonnam National University, Gwangju 61186, Republic of Korea)

  • Jin-Young Kim

    (Department of Intelligent Electronics and Computer Engineering, Chonnam National University, Gwangju 61186, Republic of Korea)

Abstract

Transformer-based time series forecasting models use patch tokens to capture temporal patterns and variate tokens to learn dependencies across covariates. While patch tokens naturally lend themselves to self-supervised learning, variate tokens are better suited to linear forecasters because they help mitigate distribution drift. However, variate tokens preclude masked pretraining, since masking a variate token would mean masking an entire series. To close this gap, we propose LSPatch-T (Long–Short Patch Transfer), a framework that transfers knowledge from short-length patch tokens into full-length variate tokens. A key design choice is that we selectively transfer only a portion of the Transformer encoder, preserving the linear design of the downstream model. Additionally, we introduce a robust frequency loss to maintain consistency across different temporal ranges. Experimental results show that our approach outperforms Transformer-based baselines (Transformer, Informer, Crossformer, Autoformer, PatchTST, iTransformer) on three public datasets (ETT, Exchange, Weather), a promising step toward generalizing time series forecasting models.
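The page gives no implementation details beyond the abstract, so the sketch below is only an illustration of the two ideas it names, not the authors' code: (1) a partial transfer in which a small number of encoder blocks from a patch-token Transformer (PatchTST-style) are copied into a variate-token model (iTransformer-style tokenization) whose forecasting head remains linear, and (2) one plausible reading of the robust frequency loss. All class names, shapes, and hyperparameters are assumptions.

```python
# A minimal sketch, not the authors' implementation: a patch-token
# Transformer encoder is assumed to have been pretrained with masked
# patches; only its first few encoder blocks are copied into a
# variate-token model whose forecasting head stays linear.
import copy

import torch
import torch.nn as nn


class PatchTransformer(nn.Module):
    """Upstream model: each token is a short temporal patch."""

    def __init__(self, patch_len=16, d_model=128, n_heads=8, n_layers=3):
        super().__init__()
        self.embed = nn.Linear(patch_len, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, patch_len)  # reconstructs masked patches

    def forward(self, patches):                    # (B, num_patches, patch_len)
        return self.head(self.encoder(self.embed(patches)))


class VariateLinearForecaster(nn.Module):
    """Downstream model: each token is one full-length variate series,
    followed by a linear forecasting head."""

    def __init__(self, seq_len=96, pred_len=96, d_model=128):
        super().__init__()
        self.embed = nn.Linear(seq_len, d_model)   # one token per variate
        self.transferred = nn.Identity()           # replaced by partial transfer
        self.head = nn.Linear(d_model, pred_len)   # linear head is kept

    def forward(self, x):                          # x: (B, seq_len, n_vars)
        tokens = self.embed(x.transpose(1, 2))     # (B, n_vars, d_model)
        tokens = self.transferred(tokens)
        return self.head(tokens).transpose(1, 2)   # (B, pred_len, n_vars)


def partial_transfer(src, dst, n_blocks=1):
    """Copy only the first `n_blocks` pretrained encoder blocks into the
    downstream model (the selective transfer described in the abstract)."""
    blocks = [copy.deepcopy(b) for b in list(src.encoder.layers)[:n_blocks]]
    dst.transferred = nn.Sequential(*blocks)


if __name__ == "__main__":
    src = PatchTransformer()                 # assume this has been pretrained
    dst = VariateLinearForecaster()
    partial_transfer(src, dst, n_blocks=1)
    print(dst(torch.randn(4, 96, 7)).shape)  # torch.Size([4, 96, 7])
```

The abstract also mentions a robust frequency loss for consistency across temporal ranges; its exact form is not given on this page, so the following is just one plausible variant that adds a smooth-L1 penalty on rFFT amplitudes to an ordinary time-domain error.

```python
# Hypothetical reading of a "robust frequency loss" (the paper's exact
# formulation is not reproduced on this page): time-domain MSE plus a
# smooth-L1 penalty on the amplitude spectra of forecast and target.
import torch
import torch.nn.functional as F


def robust_frequency_loss(pred, target, alpha=0.5):
    """pred, target: (B, pred_len, n_vars); alpha weights the frequency term."""
    time_term = F.mse_loss(pred, target)
    pred_amp = torch.fft.rfft(pred, dim=1).abs()        # spectrum over time axis
    target_amp = torch.fft.rfft(target, dim=1).abs()
    freq_term = F.smooth_l1_loss(pred_amp, target_amp)  # robust to outlier bins
    return time_term + alpha * freq_term
```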

Suggested Citation

  • Le Hoang Anh & Dang Thanh Vu & Seungmin Oh & Gwang-Hyun Yu & Nguyen Bui Ngoc Han & Hyoung-Gook Kim & Jin-Sul Kim & Jin-Young Kim, 2024. "Partial Transfer Learning from Patch Transformer to Variate-Based Linear Forecasting Model," Energies, MDPI, vol. 17(24), pages 1-18, December.
  • Handle: RePEc:gam:jeners:v:17:y:2024:i:24:p:6452-:d:1549422

    Download full text from publisher

    File URL: https://www.mdpi.com/1996-1073/17/24/6452/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1996-1073/17/24/6452/
    Download Restriction: no

    References listed on IDEAS

    1. Salinas, David & Flunkert, Valentin & Gasthaus, Jan & Januschowski, Tim, 2020. "DeepAR: Probabilistic forecasting with autoregressive recurrent networks," International Journal of Forecasting, Elsevier, vol. 36(3), pages 1181-1191.
    2. Le Hoang Anh & Gwang-Hyun Yu & Dang Thanh Vu & Hyoung-Gook Kim & Jin-Young Kim, 2023. "DelayNet: Enhancing Temporal Feature Extraction for Electronic Consumption Forecasting with Delayed Dilated Convolution," Energies, MDPI, vol. 16(22), pages 1-18, November.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Spiliotis, Evangelos & Makridakis, Spyros & Kaltsounis, Anastasios & Assimakopoulos, Vassilios, 2021. "Product sales probabilistic forecasting: An empirical evaluation using the M5 competition data," International Journal of Production Economics, Elsevier, vol. 240(C).
    2. Andreas Lenk & Marcus Vogt & Christoph Herrmann, 2024. "An Approach to Predicting Energy Demand Within Automobile Production Using the Temporal Fusion Transformer Model," Energies, MDPI, vol. 18(1), pages 1-34, December.
    3. Bojer, Casper Solheim & Meldgaard, Jens Peder, 2021. "Kaggle forecasting competitions: An overlooked learning opportunity," International Journal of Forecasting, Elsevier, vol. 37(2), pages 587-603.
    4. Ying Shu & Chengfu Ding & Lingbing Tao & Chentao Hu & Zhixin Tie, 2023. "Air Pollution Prediction Based on Discrete Wavelets and Deep Learning," Sustainability, MDPI, vol. 15(9), pages 1-19, April.
    5. Wang, Shengjie & Kang, Yanfei & Petropoulos, Fotios, 2024. "Combining probabilistic forecasts of intermittent demand," European Journal of Operational Research, Elsevier, vol. 315(3), pages 1038-1048.
    6. Pesantez, Jorge E. & Li, Binbin & Lee, Christopher & Zhao, Zhizhen & Butala, Mark & Stillwell, Ashlynn S., 2023. "A Comparison Study of Predictive Models for Electricity Demand in a Diverse Urban Environment," Energy, Elsevier, vol. 283(C).
    7. Wen, Honglin & Pinson, Pierre & Gu, Jie & Jin, Zhijian, 2024. "Wind energy forecasting with missing values within a fully conditional specification framework," International Journal of Forecasting, Elsevier, vol. 40(1), pages 77-95.
    8. Anna Almosova & Niek Andresen, 2023. "Nonlinear inflation forecasting with recurrent neural networks," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 42(2), pages 240-259, March.
    9. Philippe Goulet Coulombe & Mikael Frenette & Karin Klieber, 2023. "From Reactive to Proactive Volatility Modeling with Hemisphere Neural Networks," Working Papers 23-04, Chair in macroeconomics and forecasting, University of Quebec in Montreal's School of Management, revised Nov 2023.
    10. Frison, Lilli & Gölzhäuser, Simon & Bitterling, Moritz & Kramer, Wolfgang, 2024. "Evaluating different artificial neural network forecasting approaches for optimizing district heating network operation," Energy, Elsevier, vol. 307(C).
    11. Jayesh Thaker & Robert Höller, 2022. "A Comparative Study of Time Series Forecasting of Solar Energy Based on Irradiance Classification," Energies, MDPI, vol. 15(8), pages 1-26, April.
    12. Liu, Chen & Wang, Chao & Tran, Minh-Ngoc & Kohn, Robert, 2025. "A long short-term memory enhanced realized conditional heteroskedasticity model," Economic Modelling, Elsevier, vol. 142(C).
    13. Kandaswamy Paramasivan & Brinda Subramani & Nandan Sudarsanam, 2022. "Counterfactual analysis of the impact of the first two waves of the COVID-19 pandemic on the reporting and registration of missing people in India," Palgrave Communications, Palgrave Macmillan, vol. 9(1), pages 1-14, December.
    14. Bojer, Casper Solheim, 2022. "Understanding machine learning-based forecasting methods: A decomposition framework and research opportunities," International Journal of Forecasting, Elsevier, vol. 38(4), pages 1555-1561.
    15. Sergio Consoli & Luca Tiozzo Pezzoli & Elisa Tosetti, 2022. "Neural forecasting of the Italian sovereign bond market with economic news," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(S2), pages 197-224, December.
    16. Wellens, Arnoud P. & Boute, Robert N. & Udenio, Maximiliano, 2024. "Simplifying tree-based methods for retail sales forecasting with explanatory variables," European Journal of Operational Research, Elsevier, vol. 314(2), pages 523-539.
    17. Pengfei Zhao & Haoren Zhu & Wilfred Siu Hung NG & Dik Lun Lee, 2024. "From GARCH to Neural Network for Volatility Forecast," Papers 2402.06642, arXiv.org.
    18. Li, Xixi & Yuan, Jingsong, 2024. "DeepTVAR: Deep learning for a time-varying VAR model with extension to integrated VAR," International Journal of Forecasting, Elsevier, vol. 40(3), pages 1123-1133.
    19. Zhen Zeng & Rachneet Kaur & Suchetha Siddagangappa & Saba Rahimi & Tucker Balch & Manuela Veloso, 2023. "Financial Time Series Forecasting using CNN and Transformer," Papers 2304.04912, arXiv.org.
    20. Chaokai Huang & Ning Du & Jiahan He & Na Li & Yifan Feng & Weihong Cai, 2023. "Multidimensional Feature-Based Graph Attention Networks and Dynamic Learning for Electricity Load Forecasting," Energies, MDPI, vol. 16(18), pages 1-17, September.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:gam:jeners:v:17:y:2024:i:24:p:6452-:d:1549422. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: MDPI Indexing Manager (email available below). General contact details of provider: https://www.mdpi.com .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.