Printed from https://ideas.repec.org/a/bla/scjsta/v49y2022i4p1761-1790.html

Soft maximin estimation for heterogeneous data

Authors

Listed:
  • Adam Lund
  • Søren Wengel Mogensen
  • Niels Richard Hansen

Abstract

Extracting a common robust signal from data divided into heterogeneous groups is challenging when each group, in addition to the signal, contains large, unique variation components. Previously, maximin estimation was proposed as a robust method in the presence of heterogeneous noise. We propose soft maximin estimation as a computationally attractive alternative aimed at striking a balance between pooled estimation and (hard) maximin estimation. The soft maximin method provides a range of estimators, controlled by a parameter ζ > 0, that interpolates pooled least squares estimation and maximin estimation. By establishing relevant theoretical properties we argue that the soft maximin method is statistically sensible and computationally attractive. We demonstrate, on real and simulated data, that soft maximin estimation can offer improvements over both pooled OLS and hard maximin in terms of predictive performance and computational complexity. A time and memory efficient implementation is provided in the R package SMME available on CRAN.
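
The interpolation described in the abstract can be illustrated with a log-sum-exp smoothing of the worst-group objective. The sketch below is an illustration under stated assumptions, not the SMME implementation: the function names, the scalar-slope setup, and the toy data are hypothetical, and the explained-variance form is the one commonly used for maximin estimation rather than something stated in the abstract.

```python
import math


def explained_variance(beta, xs, ys):
    """Empirical explained variance of a scalar slope `beta` in one group.

    Assumed form: V(beta) = (2 * beta * sum(x*y) - beta**2 * sum(x*x)) / n.
    """
    n = len(xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return (2.0 * beta * sxy - beta * beta * sxx) / n


def soft_maximin_loss(beta, groups, zeta):
    """Log-sum-exp smoothing of max_g(-V_g(beta)), scaled by 1/zeta.

    Large zeta approaches the hard maximin objective -min_g V_g(beta);
    small zeta approaches -mean_g V_g(beta) plus the constant log(G)/zeta,
    i.e. a pooled objective with equal group weights.
    """
    if zeta <= 0:
        raise ValueError("zeta must be positive")
    scaled = [-zeta * explained_variance(beta, xs, ys) for xs, ys in groups]
    shift = max(scaled)  # subtract the maximum for numerical stability
    log_sum = shift + math.log(sum(math.exp(s - shift) for s in scaled))
    return log_sum / zeta


# Hypothetical toy data: two groups sharing x but with different slopes.
groups = [
    ([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]),
    ([1.0, 2.0, 3.0], [0.4, 1.1, 1.4]),
]

beta = 0.5
per_group = [explained_variance(beta, xs, ys) for xs, ys in groups]
hard_limit = soft_maximin_loss(beta, groups, zeta=200.0)
pooled_limit = soft_maximin_loss(beta, groups, zeta=1e-4) - math.log(len(groups)) / 1e-4
```

Here `hard_limit` can be compared with `-min(per_group)` and `pooled_limit` with `-sum(per_group) / len(per_group)` to see the two ends of the range that ζ controls.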

Suggested Citation

  • Adam Lund & Søren Wengel Mogensen & Niels Richard Hansen, 2022. "Soft maximin estimation for heterogeneous data," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 49(4), pages 1761-1790, December.
  • Handle: RePEc:bla:scjsta:v:49:y:2022:i:4:p:1761-1790
    DOI: 10.1111/sjos.12580

    Download full text from publisher

    File URL: https://doi.org/10.1111/sjos.12580
    Download Restriction: no


    References listed on IDEAS

    1. I. D. Currie & M. Durban & P. H. C. Eilers, 2006. "Generalized linear array models with applications to multidimensional smoothing," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 68(2), pages 259-280, April.
    2. Dominik Rothenhäusler & Nicolai Meinshausen & Peter Bühlmann & Jonas Peters, 2021. "Anchor regression: Heterogeneous data meet causality," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(2), pages 215-246, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ruoxuan Xiong & Allison Koenecke & Michael Powell & Zhu Shen & Joshua T. Vogelstein & Susan Athey, 2021. "Federated Causal Inference in Heterogeneous Observational Data," Papers 2107.11732, arXiv.org, revised Apr 2023.
    2. Lee, Dae-Jin & Durbán, María, 2009. "P-spline anova-type interaction models for spatio-temporal smoothing," DES - Working Papers. Statistics and Econometrics. WS ws093312, Universidad Carlos III de Madrid. Departamento de Estadística.
    3. Ludwig Bothmann & Michael Windmann & Göran Kauermann, 2016. "Realtime classification of fish in underwater sonar videos," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 65(4), pages 565-584, August.
    4. Welham, S.J. & Thompson, R., 2009. "A note on bimodality in the log-likelihood function for penalized spline mixed models," Computational Statistics & Data Analysis, Elsevier, vol. 53(4), pages 920-931, February.
    5. Alba Carballo & María Durbán & Dae-Jin Lee, 2021. "Out-of-Sample Prediction in Multidimensional P-Spline Models," Mathematics, MDPI, vol. 9(15), pages 1-23, July.
    6. E. Zanini & E. Eastoe & M. J. Jones & D. Randell & P. Jonathan, 2020. "Flexible covariate representations for extremes," Environmetrics, John Wiley & Sons, Ltd., vol. 31(5), August.
    7. Ahbab Mohammad Fazle Rabbi & Stefano Mazzuco, 2021. "Mortality Forecasting with the Lee–Carter Method: Adjusting for Smoothing and Lifespan Disparity," European Journal of Population, Springer;European Association for Population Studies, vol. 37(1), pages 97-120, March.
    8. Camarda, Carlo G., 2012. "MortalitySmooth: An R Package for Smoothing Poisson Counts with P-Splines," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 50(i01).
    9. Fangting Zhou & Kejun He & Yang Ni, 2023. "Individualized causal discovery with latent trajectory embedded Bayesian networks," Biometrics, The International Biometric Society, vol. 79(4), pages 3191-3202, December.
    10. repec:osf:socarx:4ewv3_v1 is not listed on IDEAS
    11. repec:wyi:journl:002174 is not listed on IDEAS
    12. Lee, Dae-Jin & Durbán, María, 2008. "Smooth-car mixed models for spatial count data," DES - Working Papers. Statistics and Econometrics. WS ws085820, Universidad Carlos III de Madrid. Departamento de Estadística.
    13. Diana Marcela Pérez-Valencia & María Xosé Rodríguez-Álvarez & Martin P. Boer & Fred A. van Eeuwijk, 2026. "A One-Stage Approach for the Spatio-temporal Analysis of High-Throughput Phenotyping Data," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 31(1), pages 98-120, March.
    14. Chelsey Hill & James Li & Matthew J. Schneider & Martin T. Wells, 2021. "The tensor auto‐regressive model," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 40(4), pages 636-652, July.
    15. Carlo G. Camarda & Paul H. C. Eilers & Jutta Gampe, 2017. "Modelling trends in digit preference patterns," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 66(5), pages 893-918, November.
    16. Raghupathi, Laks & Randell, David & Ewans, Kevin & Jonathan, Philip, 2016. "Fast computation of large scale marginal extremes with multi-dimensional covariates," Computational Statistics & Data Analysis, Elsevier, vol. 95(C), pages 243-258.
    17. Yang Liu & Weimeng Wang, 2024. "What Can We Learn from a Semiparametric Factor Analysis of Item Responses and Response Time? An Illustration with the PISA 2015 Data," Psychometrika, Springer;The Psychometric Society, vol. 89(2), pages 386-410, June.
    18. repec:hum:wpaper:sfb649dp2017-024 is not listed on IDEAS
    19. Militino, A.F. & Goicoa, T. & Ugarte, M.D., 2012. "Estimating the percentage of food expenditure in small areas using bias-corrected P-spline based estimators," Computational Statistics & Data Analysis, Elsevier, vol. 56(10), pages 2934-2948.
    20. Markus Reichstein & Vitus Benson & Jan Blunk & Gustau Camps-Valls & Felix Creutzig & Carina J. Fearnley & Boran Han & Kai Kornhuber & Nasim Rahaman & Bernhard Schölkopf & José María Tárraga & Ricardo , 2025. "Early warning of complex climate risk with integrated artificial intelligence," Nature Communications, Nature, vol. 16(1), pages 1-13, December.
    21. Ayma Anza, Diego Armando & Durbán, María & Lee, Dae-Jin & Van de Kassteele, Jan, 2016. "Modelling latent trends from spatio-temporally grouped data using composite link mixed models," DES - Working Papers. Statistics and Econometrics. WS 23448, Universidad Carlos III de Madrid. Departamento de Estadística.
    22. Philipp F. M. Baumann & Enzo Rossi & Alexander Volkmann, 2020. "What Drives Inflation and How: Evidence from Additive Mixed Models Selected by cAIC," Papers 2006.06274, arXiv.org, revised Aug 2022.
    23. Basile, Roberto & Durbán, María & Mínguez, Román & María Montero, Jose & Mur, Jesús, 2014. "Modeling regional economic dynamics: Spatial dependence, spatial heterogeneity and nonlinearities," Journal of Economic Dynamics and Control, Elsevier, vol. 48(C), pages 229-245.
