Transfer learning for high‐dimensional linear regression: Prediction, estimation and minimax optimality

Transfer learning for high‐dimensional linear regression: Prediction, estimation and minimax optimality

Author

Listed:

Sai Li
T. Tony Cai
Hongzhe Li

Abstract

This paper considers estimation and prediction of a high‐dimensional linear regression in the setting of transfer learning where, in addition to observations from the target model, auxiliary samples from different but possibly related regression models are available. When the set of informative auxiliary studies is known, an estimator and a predictor are proposed and their optimality is established. The optimal rates of convergence for prediction and estimation are faster than the corresponding rates without using the auxiliary samples. This implies that knowledge from the informative auxiliary samples can be transferred to improve the learning performance of the target problem. When the set of informative auxiliary samples is unknown, we propose a data‐driven procedure for transfer learning, called Trans‐Lasso, and show its robustness to non‐informative auxiliary samples and its efficiency in knowledge transfer. The proposed procedures are demonstrated in numerical studies and are applied to a dataset concerning the associations among gene expressions. It is shown that Trans‐Lasso leads to improved performance in gene expression prediction in a target tissue by incorporating data from multiple different tissues as auxiliary samples.

Suggested Citation

Sai Li & T. Tony Cai & Hongzhe Li, 2022. "Transfer learning for high‐dimensional linear regression: Prediction, estimation and minimax optimality," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(1), pages 149-173, February.

Handle: RePEc:bla:jorssb:v:84:y:2022:i:1:p:149-173
DOI: 10.1111/rssb.12479

Download full text from publisher

References listed on IDEAS

Zou, Hui, 2006. "The Adaptive Lasso and Its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 1418-1429, December.
Emma Pierson & the GTEx Consortium & Daphne Koller & Alexis Battle & Sara Mostafavi, 2015. "Sharing and Specificity of Co-expression Networks across 35 Human Tissues," PLOS Computational Biology, Public Library of Science, vol. 11(5), pages 1-19, May.
repec:abf:journl:v:31:y:2020:i:3:p:24261-24266 is not listed on IDEAS
Fan J. & Li R., 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
Tingni Sun & Cun-Hui Zhang, 2012. "Scaled sparse linear regression," Biometrika, Biometrika Trust, vol. 99(4), pages 879-898.
Patrick Danaher & Pei Wang & Daniela M. Witten, 2014. "The joint graphical lasso for inverse covariance estimation across multiple classes," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 76(2), pages 373-397, March.
Jianqing Fan & Jinchi Lv, 2008. "Sure independence screening for ultrahigh dimensional feature space," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(5), pages 849-911, November.

Full references (including those not matched with items on IDEAS)

Citations

Citations are extracted by the CitEc Project, subscribe to its RSS feed for this item.

Cited by:

Rui Yang & Yunquan Song, 2025. "Transfer learning with high-dimensional multiplicative models: least product relative error estimation approach," Statistical Papers, Springer, vol. 66(5), pages 1-18, August.
Xuan Chen & Yunquan Song, 2025. "Transfer learning for semiparametric varying coefficient spatial autoregressive models," Statistical Papers, Springer, vol. 66(2), pages 1-22, February.
Peng, Yanjin & Wang, Lei, 2025. "Heterogeneity-aware transfer learning for high-dimensional linear regression models," Computational Statistics & Data Analysis, Elsevier, vol. 206(C).
T. Tony Cai & Zijian Guo & Yin Xia, 2023. "Rejoinder on: statistical inference and large-scale multiple testing for high-dimensional regression models," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 32(4), pages 1187-1194, December.
Lin, Ziqian & Gao, Yuan & Wang, Feifei & Wang, Hansheng, 2025. "Testing sufficiency for transfer learning," Computational Statistics & Data Analysis, Elsevier, vol. 203(C).
Qu, Xinhao, 2025. "Automatic Transfer Learning for high-dimensional linear regression," Statistics & Probability Letters, Elsevier, vol. 224(C).
Hao Zeng & Wei Zhong & Xingbai Xu, 2024. "Transfer Learning for Spatial Autoregressive Models with Application to U.S. Presidential Election Prediction," Papers 2405.15600, arXiv.org, revised Sep 2024.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Peter Bühlmann & Jacopo Mandozzi, 2014. "High-dimensional variable screening and bias in subsequent inference, with an empirical comparison," Computational Statistics, Springer, vol. 29(3), pages 407-430, June.
He, Yong & Zhang, Liang & Ji, Jiadong & Zhang, Xinsheng, 2019. "Robust feature screening for elliptical copula regression model," Journal of Multivariate Analysis, Elsevier, vol. 173(C), pages 568-582.
Wenquan Cui & Jianjun Xu & Yuehua Wu, 2023. "A new reproducing kernel‐based nonlinear dimension reduction method for survival data," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 50(3), pages 1365-1390, September.
Xin Wang & Lingchen Kong & Liqun Wang, 2022. "Estimation of Error Variance in Regularized Regression Models via Adaptive Lasso," Mathematics, MDPI, vol. 10(11), pages 1-19, June.
Umberto Amato & Anestis Antoniadis & Italia De Feis & Irene Gijbels, 2021. "Penalised robust estimators for sparse and high-dimensional linear models," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(1), pages 1-48, March.
Shan Luo & Zehua Chen, 2014. "Sequential Lasso Cum EBIC for Feature Selection With Ultra-High Dimensional Feature Space," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(507), pages 1229-1240, September.
Shi Chen & Wolfgang Karl Hardle & Brenda L'opez Cabrera, 2020. "Regularization Approach for Network Modeling of German Power Derivative Market," Papers 2009.09739, arXiv.org.
Wang, Christina Dan & Chen, Zhao & Lian, Yimin & Chen, Min, 2022. "Asset selection based on high frequency Sharpe ratio," Journal of Econometrics, Elsevier, vol. 227(1), pages 168-188.
Anders Bredahl Kock, 2012. "On the Oracle Property of the Adaptive Lasso in Stationary and Nonstationary Autoregressions," CREATES Research Papers 2012-05, Department of Economics and Business Economics, Aarhus University.
Tang, Yanlin & Song, Xinyuan & Wang, Huixia Judy & Zhu, Zhongyi, 2013. "Variable selection in high-dimensional quantile varying coefficient models," Journal of Multivariate Analysis, Elsevier, vol. 122(C), pages 115-132.
Loann David Denis Desboulets, 2018. "A Review on Variable Selection in Regression Analysis," Econometrics, MDPI, vol. 6(4), pages 1-27, November.
- Loann David Denis Desboulets, 2018. "A Review on Variable Selection in Regression Analysis," Post-Print hal-01954386, HAL.
Li, Xinyi & Wang, Li & Nettleton, Dan, 2019. "Sparse model identification and learning for ultra-high-dimensional additive partially linear models," Journal of Multivariate Analysis, Elsevier, vol. 173(C), pages 204-228.
Jingyuan Liu & Runze Li & Rongling Wu, 2014. "Feature Selection for Varying Coefficient Models With Ultrahigh-Dimensional Covariates," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(505), pages 266-274, March.
Jianqing Fan & Yang Feng & Jiancheng Jiang & Xin Tong, 2016. "Feature Augmentation via Nonparametrics and Selection (FANS) in High-Dimensional Classification," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(513), pages 275-287, March.
Lee, Ji Hyung & Shi, Zhentao & Gao, Zhan, 2022. "On LASSO for predictive regression," Journal of Econometrics, Elsevier, vol. 229(2), pages 322-349.
- Ji Hyung Lee & Zhentao Shi & Zhan Gao, 2018. "On LASSO for Predictive Regression," Papers 1810.03140, arXiv.org, revised Feb 2021.
Christidis, Anthony-Alexander & Van Aelst, Stefan & Zamar, Ruben, 2025. "Multi-model subset selection," Computational Statistics & Data Analysis, Elsevier, vol. 203(C).
Victor Chernozhukov & Christian Hansen & Yuan Liao, 2015. "A lava attack on the recovery of sums of dense and sparse signals," CeMMAP working papers CWP56/15, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
- Victor Chernozhukov & Christian Hansen & Yuan Liao, 2015. "A lava attack on the recovery of sums of dense and sparse signals," CeMMAP working papers 56/15, Institute for Fiscal Studies.
- Victor Chernozhukov & Christian Hansen & Yuan Liao, 2015. "A lava attack on the recovery of sums of dense and sparse signals," CeMMAP working papers CWP05/15, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
- Victor Chernozhukov & Christian Hansen & Yuan Liao, 2015. "A lava attack on the recovery of sums of dense and sparse signals," Papers 1502.03155, arXiv.org, revised Mar 2015.
- Victor Chernozhukov & Christian Hansen & Yuan Liao, 2015. "A lava attack on the recovery of sums of dense and sparse signals," CeMMAP working papers 05/15, Institute for Fiscal Studies.
Lan Wang & Yichao Wu & Runze Li, 2012. "Quantile Regression for Analyzing Heterogeneity in Ultra-High Dimension," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(497), pages 214-222, March.
Jingxuan Luo & Lili Yue & Gaorong Li, 2023. "Overview of High-Dimensional Measurement Error Regression Models," Mathematics, MDPI, vol. 11(14), pages 1-22, July.
Canhong Wen & Xueqin Wang & Shaoli Wang, 2015. "Laplace Error Penalty-based Variable Selection in High Dimension," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 42(3), pages 685-700, September.

More about this item

Statistics

Access and download statistics

Corrections

All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:bla:jorssb:v:84:y:2022:i:1:p:149-173. See general information about how to correct material in RePEc.

If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Wiley Content Delivery (email available below). General contact details of provider: https://edirc.repec.org/data/rssssea.html .

Please note that corrections may take a couple of weeks to filter through the various RePEc services.

IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.

Browse Econ Literature

More features

Transfer learning for high‐dimensional linear regression: Prediction, estimation and minimax optimality

Author

Abstract

Suggested Citation

Download full text from publisher

References listed on IDEAS

Citations

Most related items

More about this item

Statistics

Corrections

More services and features

MyIDEAS

Author registration

Rankings

RePEc Genealogy

RePEc Biblio

MPRA

New papers by email

EconAcademics

Plagiarism

About RePEc

RePEc home

Blog

Help/FAQ

RePEc team

Participating archives

Privacy statement

Help us

Corrections

Volunteers

Get papers listed

Open a RePEc archive

Get RePEc data