Authors listed:
- Pedro Sobreiro
(Sports Science School of Rio Maior (ESDRM), Polytechnic Institute of Santarem, 2040-413 Rio Maior, Portugal;
Life Quality Research Centre (LQRC), Complexo Andaluz, Apartado 279, 2001-904 Santarem, Portugal)
- Domingos Martinho
(Instituto Superior de Gestão e Administração de Santarem (ISLA Santarem), Polytechnic University, Rua Dr. Teixeira Guedes, 31, 2000-029 Santarem, Portugal;
Research Centre for Business Sciences (NECE), Estrada do Sineiro 56, 6200-209 Covilha, Portugal)
- Rui Martins
(Instituto Superior de Gestão e Administração de Santarem (ISLA Santarem), Polytechnic University, Rua Dr. Teixeira Guedes, 31, 2000-029 Santarem, Portugal;
Research Centre for Business Sciences (NECE), Estrada do Sineiro 56, 6200-209 Covilha, Portugal)
- Ricardo Vardasca
(Instituto Superior de Gestão e Administração de Santarem (ISLA Santarem), Polytechnic University, Rua Dr. Teixeira Guedes, 31, 2000-029 Santarem, Portugal;
Institute of Science and Innovation in Mechanical and Industrial Engineering (INEGI), Universidade do Porto, Rua Dr. Roberto Frias 400, 4200-465 Porto, Portugal)
Abstract
Predicting profitable entry signals in Bitcoin markets remains challenging due to price volatility, the absence of fundamental valuation frameworks, and methodological pitfalls that are common in the literature. In this study, we evaluate five machine learning classifiers using a 37-feature hierarchical multi-timeframe pipeline with price-level-agnostic normalization across four temporal resolutions (15-min, 4-h, daily, and 3-day), spanning January 2020 to November 2025. Binary training labels were generated via majority-vote aggregation across 54 stop-loss/take-profit combinations, producing 6951 balanced samples (48.5% positive class). Five algorithms (Logistic Regression, Decision Tree, Random Forest, XGBoost, and LightGBM) are compared using expanding-window TimeSeriesSplit validation (5 folds). Random Forest achieved the highest cross-validated ROC-AUC (0.6086), with all models showing modest but consistent discriminative ability (range 0.57–0.61). Feature importance analysis identifies 4-hour Bollinger Band position and RSI as dominant predictors, with all timeframes contributing meaningfully. A true out-of-sample holdout on 1136 independently generated 2025 samples confirms generalization, with Logistic Regression achieving 0.6087 ROC-AUC. A subtle multi-timeframe look-ahead bias in higher-timeframe data alignment, which inflated performance by approximately 0.20 ROC-AUC points before correction, is identified and corrected. Event-driven backtesting on 2025 out-of-sample data yields a gross upper-bound return of +35.97% (185 trades, SL = 1%, TP = 2%, threshold = 0.7, Sharpe = 0.14) before transaction costs; after realistic round-trip fees, net returns are likely negligible. The central finding is that models with ROC-AUC ≈ 0.60 cannot reliably generate economically significant returns once transaction costs are accounted for.
The methodology provides a reproducible framework for ML-based binary classification studies requiring transparent, bias-corrected validation across diverse market regimes.
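The look-ahead bias mentioned in the abstract typically arises when a higher-timeframe bar (for example a 4-hour bar) is joined to lower-timeframe rows by its open time, so that rows inside the bar already see its final values. The abstract does not describe how the authors corrected it; the following is only a minimal sketch of one common remedy, assuming pandas, bars indexed by open time, and hypothetical names (`align_higher_timeframe`, `rsi_4h`, `available_at`) and toy values that do not come from the paper.

```python
import pandas as pd


def align_higher_timeframe(base: pd.DataFrame,
                           higher: pd.DataFrame,
                           bar_length: pd.Timedelta) -> pd.DataFrame:
    """Attach to each base row the latest higher-timeframe bar that has already closed.

    Assumption: both frames are indexed by bar *open* time, so a higher-timeframe
    bar only becomes observable at open time + bar_length.
    """
    higher = higher.copy()
    # Time at which the bar's values are actually known (its close time).
    higher["available_at"] = higher.index + bar_length

    left = base.sort_index().rename_axis("ts").reset_index()
    right = higher.sort_values("available_at")

    # Backward as-of join: for each base timestamp, take the most recent bar
    # whose close time is at or before that timestamp.
    merged = pd.merge_asof(left, right,
                           left_on="ts", right_on="available_at",
                           direction="backward")
    return merged.set_index("ts").drop(columns="available_at")


# Toy data (hypothetical values): 15-minute rows and three 4-hour bars.
base = pd.DataFrame(
    {"close": range(20)},
    index=pd.date_range("2025-01-01 00:00", periods=20, freq="15min"),
)
higher = pd.DataFrame(
    {"rsi_4h": [40.0, 55.0, 70.0]},
    index=pd.to_datetime(["2024-12-31 20:00", "2025-01-01 00:00", "2025-01-01 04:00"]),
)

aligned = align_higher_timeframe(base, higher, pd.Timedelta(hours=4))
```

In this toy example, the row at 01:00 receives the bar that opened at 20:00 the previous day, because the bar that opened at 00:00 does not close until 04:00. A join on open time would instead hand that row the 00:00 bar and leak up to four hours of future information.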
Suggested Citation
Pedro Sobreiro & Domingos Martinho & Rui Martins & Ricardo Vardasca, 2026.
"Multi-Timeframe Feature Engineering for Bitcoin Market Prediction: A Price-Level-Agnostic Machine Learning Approach,"
Forecasting, MDPI, vol. 8(3), pages 1-27, May.
Handle:
RePEc:gam:jforec:v:8:y:2026:i:3:p:40-:d:1945664