
Communication-efficient model averaging prediction for massive data with asymptotic optimality

Author

Listed:
  • Xiaochao Xia

    (Chongqing University)

  • Sijin He

    (Chongqing University)

  • Naiwen Pang

    (Chongqing University)

Abstract

This paper focuses on model averaging prediction for massive datasets. Specifically, in the framework of Mallows model averaging, we propose two distributed approaches to estimate the parameters of each submodel and the weights in the final weighted estimator, respectively. The first approach is a one-shot procedure that aggregates the estimated parameters and weights from each local machine via simple averaging. The second approach is an iterative procedure that approximates the global loss by a surrogate loss in parameter estimation. The two proposed distributed estimators are communication-efficient: the former requires only one round of communication, and the latter requires two rounds of communication between the central and local machines for parameter estimation to achieve global statistical efficiency. To estimate the weight vector, two distributed algorithms are presented. Furthermore, we theoretically justify the two approaches by proving convergence rates and asymptotic normality. More importantly, we establish the asymptotic optimality of the distributed estimator of the weight vector in terms of the out-of-sample prediction error criterion. Finally, simulations and a real data analysis are carried out to illustrate the proposed methods.
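
The one-shot procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes nested linear submodels fitted by least squares, equal-sized data blocks, the standard Mallows criterion with the error variance estimated from the largest submodel, and a plain Frank-Wolfe solver for the weights. The function names (`local_fit`, `one_shot_average`, `predict`) and the simulated data are hypothetical.

```python
import numpy as np

def local_fit(X, y, sizes, n_iter=500):
    """One machine: OLS for each nested submodel, then Mallows weights."""
    n, p = X.shape
    betas = np.zeros((len(sizes), p))            # coefficients, zero-padded to length p
    for m, k in enumerate(sizes):
        betas[m, :k] = np.linalg.lstsq(X[:, :k], y, rcond=None)[0]
    resid = y[:, None] - X @ betas.T             # n x M matrix of submodel residuals
    # Error variance estimated from the largest submodel (assumed listed last).
    sigma2 = resid[:, -1] @ resid[:, -1] / (n - sizes[-1])
    G = resid.T @ resid
    pen = sigma2 * np.asarray(sizes, dtype=float)
    # Mallows criterion on the simplex: C(w) = w'Gw + 2 * pen'w.
    w = np.full(len(sizes), 1.0 / len(sizes))
    for _ in range(n_iter):                      # Frank-Wolfe with exact line search
        grad = 2.0 * (G @ w + pen)
        d = -w.copy()
        d[np.argmin(grad)] += 1.0                # direction towards the best vertex
        curv = 2.0 * (d @ G @ d)
        step = 1.0 if curv <= 1e-12 else float(np.clip(-(grad @ d) / curv, 0.0, 1.0))
        w = w + step * d
    return betas, w

def one_shot_average(X, y, sizes, n_machines):
    """Central machine: simple average of local coefficients and local weights."""
    parts = zip(np.array_split(X, n_machines), np.array_split(y, n_machines))
    fits = [local_fit(Xk, yk, sizes) for Xk, yk in parts]
    beta_bar = np.mean([b for b, _ in fits], axis=0)
    w_bar = np.mean([w for _, w in fits], axis=0)
    return beta_bar, w_bar

def predict(X_new, beta_bar, w_bar):
    """Weighted combination of the submodel predictions."""
    return X_new @ beta_bar.T @ w_bar

# Simulated example (hypothetical data, 10 machines with 600 observations each).
rng = np.random.default_rng(0)
X = rng.normal(size=(6000, 6))
beta_true = np.array([1.0, 0.8, 0.5, 0.2, 0.0, 0.0])
y = X @ beta_true + rng.normal(size=6000)
sizes = [1, 2, 3, 4, 5, 6]                       # nested submodels: first k regressors
beta_bar, w_bar = one_shot_average(X, y, sizes, n_machines=10)
y_hat = predict(X, beta_bar, w_bar)
```

Each machine sends its coefficient matrix and weight vector to the central machine once, which is why this variant needs a single round of communication. Simple averaging treats all machines equally, so the sketch assumes blocks of equal size.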

Suggested Citation

  • Xiaochao Xia & Sijin He & Naiwen Pang, 2025. "Communication-efficient model averaging prediction for massive data with asymptotic optimality," Statistical Papers, Springer, vol. 66(2), pages 1-45, February.
  • Handle: RePEc:spr:stpapr:v:66:y:2025:i:2:d:10.1007_s00362-025-01664-3
    DOI: 10.1007/s00362-025-01664-3

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00362-025-01664-3
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Haowen Bao & Zongwu Cai & Yuying Sun & Shouyang Wang, 2023. "Penalized Model Averaging for High Dimensional Quantile Regressions," WORKING PAPERS SERIES IN THEORETICAL AND APPLIED ECONOMICS 202302, University of Kansas, Department of Economics.
    2. Yuying Sun & Shaoxin Hong & Zongwu Cai, 2025. "State-Varying Model Averaging Prediction," WORKING PAPERS SERIES IN THEORETICAL AND APPLIED ECONOMICS 202507, University of Kansas, Department of Economics.
    3. Liao, Jun & Zou, Guohua, 2020. "Corrected Mallows criterion for model averaging," Computational Statistics & Data Analysis, Elsevier, vol. 144(C).
    4. Xianwen Sun & Lixin Zhang, 2024. "Model averaging estimation for nonparametric varying-coefficient models with multiplicative heteroscedasticity," Statistical Papers, Springer, vol. 65(3), pages 1375-1409, May.
    5. Yuying Sun & Shaoxin Hong & Zongwu Cai, 2023. "Optimal Local Model Averaging for Divergent-Dimensional Functional-Coefficient Regressions," WORKING PAPERS SERIES IN THEORETICAL AND APPLIED ECONOMICS 202309, University of Kansas, Department of Economics, revised Sep 2023.
    6. Wenchao Xu & Xinyu Zhang, 2024. "On Asymptotic Optimality of Least Squares Model Averaging When True Model Is Included," Papers 2411.09258, arXiv.org.
    7. Guozhi Hu & Weihu Cheng & Jie Zeng, 2023. "Optimal Model Averaging for Semiparametric Partially Linear Models with Censored Data," Mathematics, MDPI, vol. 11(3), pages 1-21, February.
    8. Jingwen Tu & Hu Yang & Chaohui Guo & Jing Lv, 2021. "Model averaging marginal regression for high dimensional conditional quantile prediction," Statistical Papers, Springer, vol. 62(6), pages 2661-2689, December.
    9. Zhang, Xinyu & Liu, Chu-An, 2023. "Model averaging prediction by K-fold cross-validation," Journal of Econometrics, Elsevier, vol. 235(1), pages 280-301.
    10. Yuan, Chaoxia & Fang, Fang & Ni, Lyu, 2022. "Mallows model averaging with effective model size in fragmentary data prediction," Computational Statistics & Data Analysis, Elsevier, vol. 173(C).
    11. Steven F. Lehrer & Tian Xie, 2022. "The Bigger Picture: Combining Econometrics with Analytics Improves Forecasts of Movie Success," Management Science, INFORMS, vol. 68(1), pages 189-210, January.
    12. Giuseppe De Luca & Jan Magnus & Franco Peracchi, 2022. "Asymptotic properties of the weighted average least squares (WALS) estimator," Tinbergen Institute Discussion Papers 22-022/III, Tinbergen Institute.
    13. Xianwen Sun & Lixin Zhang, 2024. "Jackknife model averaging for mixed-data kernel-weighted spline quantile regressions," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 87(7), pages 805-842, October.
    14. Ryan Greenaway-McGrevy & Kade Sorensen, 2021. "A spatial model averaging approach to measuring house prices," Journal of Spatial Econometrics, Springer, vol. 2(1), pages 1-32, December.
    15. Shi, Ruoyao, 2024. "An Averaging Estimator For Two-Step M-Estimation In Semiparametric Models," Econometric Theory, Cambridge University Press, vol. 40(3), pages 652-687, June.
    16. Fang, Fang & Yang, Qiwei & Tian, Wenling, 2022. "Cross-validation for selecting the penalty factor in least squares model averaging," Economics Letters, Elsevier, vol. 217(C).
    17. Haowen Bao & Zongwu Cai & Yuying Sun & Shouyang Wang, 2023. "Penalized Optimal Forecast Combination for Quantile Regressions," WORKING PAPERS SERIES IN THEORETICAL AND APPLIED ECONOMICS 202514, University of Kansas, Department of Economics, revised May 2025.
    18. Chen, Yi-Ting & Liu, Chu-An, 2023. "Model averaging for asymptotically optimal combined forecasts," Journal of Econometrics, Elsevier, vol. 235(2), pages 592-607.
    19. Xiaochao Xia, 2021. "Model averaging prediction for nonparametric varying-coefficient models with B-spline smoothing," Statistical Papers, Springer, vol. 62(6), pages 2885-2905, December.
    20. Zhao, Shangwei & Xie, Tian & Ai, Xin & Yang, Guangren & Zhang, Xinyu, 2023. "Correcting sample selection bias with model averaging for consumer demand forecasting," Economic Modelling, Elsevier, vol. 123(C).
