A review on design inspired subsampling for big data
DOI: 10.1007/s00362-022-01386-w
References listed on IDEAS
- Zhijian He & Art B. Owen, 2016. "Extensible grids: uniform sampling on a space filling curve," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 78(4), pages 917-931, September.
- Victor Chernozhukov & Alfred Galichon & Marc Hallin & Marc Henry, 2014. "Monge-Kantorovich Depth, Quantiles, Ranks, and Signs," Papers 1412.8434, arXiv.org, revised Sep 2015.
- Victor Chernozhukov & Alfred Galichon & Marc Hallin & Marc Henry, 2017. "Monge-Kantorovich Depth, Quantiles, Ranks, and Signs," SciencePo Working papers Main hal-03391975, HAL.
- Victor Chernozhukov & Alfred Galichon & Marc Hallin & Marc Henry, 2015. "Monge-Kantorovich Depth, Quantiles, Ranks and Signs," Working Papers ECARES ECARES 2015-02, ULB -- Universite Libre de Bruxelles.
- Victor Chernozhukov & Alfred Galichon & Marc Hallin & Marc Henry, 2015. "Monge-Kantorovich depth, quantiles, ranks and signs," CeMMAP working papers 04/15, Institute for Fiscal Studies.
- Victor Chernozhukov & Alfred Galichon & Marc Hallin & Marc Henry, 2015. "Monge-Kantorovich Depth, Quantiles, Ranks, and Signs," SciencePo Working papers Main hal-03460056, HAL.
- Victor Chernozhukov & Alfred Galichon & Marc Hallin & Marc Henry, 2017. "Monge-Kantorovich Depth, Quantiles, Ranks, and Signs," Post-Print hal-03391975, HAL.
- Victor Chernozhukov & Alfred Galichon & Marc Hallin & Marc Henry, 2015. "Monge-Kantorovich Depth, Quantiles, Ranks, and Signs," Working Papers hal-03460056, HAL.
- Victor Chernozhukov & Alfred Galichon & Marc Hallin & Marc Henry, 2015. "Monge-Kantorovich depth, quantiles, ranks and signs," CeMMAP working papers CWP57/15, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
- Victor Chernozhukov & Alfred Galichon & Marc Hallin & Marc Henry, 2015. "Monge-Kantorovich depth, quantiles, ranks and signs," CeMMAP working papers CWP04/15, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
- Victor Chernozhukov & Alfred Galichon & Marc Hallin & Marc Henry, 2015. "Monge-Kantorovich depth, quantiles, ranks and signs," CeMMAP working papers 57/15, Institute for Fiscal Studies.
- Sokbae Lee & Serena Ng, 2020. "An Econometric Perspective on Algorithmic Subsampling," Annual Review of Economics, Annual Reviews, vol. 12(1), pages 45-80, August.
- Sokbae Lee & Serena Ng, 2019. "An Econometric Perspective on Algorithmic Subsampling," Papers 1907.01954, arXiv.org, revised Apr 2020.
- Sokbae (Simon) Lee & Serena Ng, 2020. "An econometric perspective on algorithmic subsampling," CeMMAP working papers CWP18/20, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
- D. Pfeffermann & C. J. Skinner & D. J. Holmes & H. Goldstein & J. Rasbash, 1998. "Weighting for unequal selection probabilities in multilevel models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 60(1), pages 23-40.
- Xiong, Shifeng & Li, Guoying, 2008. "Some results on the convergence of conditional distributions," Statistics & Probability Letters, Elsevier, vol. 78(18), pages 3249-3253, December.
- Cheng Meng & Xinlian Zhang & Jingyi Zhang & Wenxuan Zhong & Ping Ma, 2020. "More efficient approximation of smoothing splines via space-filling basis selection," Biometrika, Biometrika Trust, vol. 107(3), pages 723-735.
- Jun Yu & HaiYing Wang, 2022. "Subdata selection algorithm for linear model discrimination," Statistical Papers, Springer, vol. 63(6), pages 1883-1906, December.
- Haojie Ren & Changliang Zou & Nan Chen & Runze Li, 2022. "Large-Scale Datastreams Surveillance via Pattern-Oriented-Sampling," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 117(538), pages 794-808, April.
- Matias Quiroz & Robert Kohn & Mattias Villani & Minh-Ngoc Tran, 2019. "Speeding Up MCMC by Efficient Data Subsampling," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 114(526), pages 831-843, April.
- Quiroz, Matias & Villani, Mattias & Kohn, Robert, 2015. "Speeding Up MCMC by Efficient Data Subsampling," Working Paper Series 297, Sveriges Riksbank (Central Bank of Sweden).
- Kohn, Robert & Quiroz, Matias & Tran, Minh-Ngoc & Villani, Mattias, 2016. "Speeding up MCMC by Efficient Data Subsampling," Working Papers 2123/16205, University of Sydney Business School, Discipline of Business Analytics.
- Fred J. Hickernell, 2002. "Uniform designs limit aliasing," Biometrika, Biometrika Trust, vol. 89(4), pages 893-904, December.
- Boivin, Jean & Ng, Serena, 2006. "Are more data always better for factor analysis?," Journal of Econometrics, Elsevier, vol. 132(1), pages 169-194, May.
- Jean Boivin & Serena Ng, 2003. "Are More Data Always Better for Factor Analysis?," NBER Working Papers 9829, National Bureau of Economic Research, Inc.
- Zhang, Haixiang & Wang, HaiYing, 2021. "Distributed subdata selection for big data via sampling-based approach," Computational Statistics & Data Analysis, Elsevier, vol. 153(C).
- Kuang, Kun & Xiong, Ruoxuan & Cui, Peng & Athey, Susan & Li, Bo, 2018. "Stable Predictions across Unknown Environments," Research Papers 3695, Stanford University, Graduate School of Business.
- Jun Yu & HaiYing Wang & Mingyao Ai & Huiming Zhang, 2022. "Optimal Distributed Subsampling for Maximum Quasi-Likelihood Estimators With Massive Data," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 117(537), pages 265-276, January.
- Yaping Wang & Fasheng Sun & Hongquan Xu, 2022. "On Design Orthogonality, Maximin Distance, and Projection Uniformity for Computer Experiments," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 117(537), pages 375-385, January.
- Haiying Wang & Yanyuan Ma, 2021. "Optimal subsampling for quantile regression in big data," Biometrika, Biometrika Trust, vol. 108(1), pages 99-112.
- Yaping Wang & Jianfeng Yang & Hongquan Xu, 2018. "On the connection between maximin distance designs and orthogonal designs," Biometrika, Biometrika Trust, vol. 105(2), pages 471-477.
- Serena Ng, 2017. "Opportunities and Challenges: Lessons from Analyzing Terabytes of Scanner Data," NBER Working Papers 23673, National Bureau of Economic Research, Inc.
Most related items
These are the items that most often cite the same works as this one and are cited by the same works as this one.
- Min Ren & Shengli Zhao & Mingqiu Wang & Xinbei Zhu, 2024. "Robust optimal subsampling based on weighted asymmetric least squares," Statistical Papers, Springer, vol. 65(4), pages 2221-2251, June.
- Jun Yu & HaiYing Wang, 2022. "Subdata selection algorithm for linear model discrimination," Statistical Papers, Springer, vol. 63(6), pages 1883-1906, December.
- Yue Chao & Lei Huang & Xuejun Ma & Jiajun Sun, 2024. "Optimal subsampling for modal regression in massive data," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 87(4), pages 379-409, May.
- Tao Zou & Xian Li & Xuan Liang & Hansheng Wang, 2021. "On the Subbagging Estimation for Massive Data," Papers 2103.00631, arXiv.org.
- Laurent Ferrara & Anna Simoni, 2023. "When are Google Data Useful to Nowcast GDP? An Approach via Preselection and Shrinkage," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 41(4), pages 1188-1202, October.
- Laurent Ferrara & Anna Simoni, 2019. "When are Google data useful to nowcast GDP? An approach via pre-selection and shrinkage," Working Papers 2019-04, Center for Research in Economics and Statistics.
- Laurent Ferrara & Anna Simoni, 2023. "When are Google Data Useful to Nowcast GDP? An Approach via Preselection and Shrinkage," Post-Print hal-03919944, HAL.
- Laurent Ferrara & Anna Simoni, 2020. "When are Google data useful to nowcast GDP? An approach via pre-selection and shrinkage," EconomiX Working Papers 2020-11, University of Paris Nanterre, EconomiX.
- Laurent Ferrara & Anna Simoni, 2019. "When are Google data useful to nowcast GDP? An approach via pre-selection and shrinkage," Working papers 717, Banque de France.
- Laurent Ferrara & Anna Simoni, 2020. "When are Google data useful to nowcast GDP? An approach via pre-selection and shrinkage," Papers 2007.00273, arXiv.org, revised Sep 2022.
- Laurent Ferrara & Anna Simoni, 2020. "When are Google data useful to nowcast GDP? An approach via pre-selection and shrinkage," Working Papers hal-04159714, HAL.
- Hongjian Shi & Mathias Drton & Marc Hallin & Fang Han, 2023. "Semiparametrically Efficient Tests of Multivariate Independence Using Center-Outward Quadrant, Spearman, and Kendall Statistics," Working Papers ECARES 2023-03, ULB -- Universite Libre de Bruxelles.
- Jun Yu & Jiaqi Liu & HaiYing Wang, 2023. "Information-based optimal subdata selection for non-linear models," Statistical Papers, Springer, vol. 64(4), pages 1069-1093, August.
- Gunsilius, Florian F., 2023. "A condition for the identification of multivariate models with binary instruments," Journal of Econometrics, Elsevier, vol. 235(1), pages 220-238.
- Florian F Gunsilius, 2025. "A primer on optimal transport for causal inference with observational data," Papers 2503.07811, arXiv.org, revised Mar 2025.
- Tianzhen Wang & Haixiang Zhang, 2022. "Optimal subsampling for multiplicative regression with massive data," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 76(4), pages 418-449, November.
- Alberto González-Sanz & Marc Hallin & Bodhisattva Sen, 2023. "Monotone Measure-Preserving Maps in Hilbert Spaces: Existence, Uniqueness, and Stability," Working Papers ECARES 2023-10, ULB -- Universite Libre de Bruxelles.
- Deng, Jiayi & Huang, Danyang & Ding, Yi & Zhu, Yingqiu & Jing, Bingyi & Zhang, Bo, 2024. "Subsampling spectral clustering for stochastic block models in large-scale networks," Computational Statistics & Data Analysis, Elsevier, vol. 189(C).
- Olivier Paul Faugeras & Ludger Rüschendorf, 2021. "Functional, randomized and smoothed multivariate quantile regions," Post-Print hal-03352330, HAL.
- Hudecová, Šárka & Šiman, Miroslav, 2024. "Stochastic hyperplane-based ranks and their use in multivariate portmanteau tests," Journal of Multivariate Analysis, Elsevier, vol. 204(C).
- Marcel Klatt & Axel Munk & Yoav Zemel, 2022. "Limit laws for empirical optimal solutions in random linear programs," Annals of Operations Research, Springer, vol. 315(1), pages 251-278, August.
- Sokbae Lee & Serena Ng, 2020. "An Econometric Perspective on Algorithmic Subsampling," Annual Review of Economics, Annual Reviews, vol. 12(1), pages 45-80, August.
- Sokbae Lee & Serena Ng, 2019. "An Econometric Perspective on Algorithmic Subsampling," Papers 1907.01954, arXiv.org, revised Apr 2020.
- Sokbae (Simon) Lee & Serena Ng, 2020. "An econometric perspective on algorithmic subsampling," CeMMAP working papers CWP18/20, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
- Marc Hallin & Hang Liu, 2022. "Center-outward Rank- and Sign-based VARMA Portmanteau Tests," Working Papers ECARES 2022-27, ULB -- Universite Libre de Bruxelles.
- Serena Ng & Susannah Scanlan, 2023. "Constructing High Frequency Economic Indicators by Imputation," Papers 2303.01863, arXiv.org, revised Oct 2023.
- Alfred Galichon, 2021. "The Unreasonable Effectiveness of Optimal Transport in Economics," SciencePo Working papers Main hal-03936221, HAL.
- Bing Guo & Xiao-Rong Li & Min-Qian Liu & Xue Yang, 2023. "Construction of orthogonal general sliced Latin hypercube designs," Statistical Papers, Springer, vol. 64(3), pages 987-1014, June.
More about this item
Keywords
Massive data; Optimal design; Orthogonal array; Space filling design
RePEc handle: RePEc:spr:stpapr:v:65:y:2024:i:2:d:10.1007_s00362-022-01386-w