More Data or Better Data? A Statistical Decision Problem


  • Jeff Dominitz
  • Charles F. Manski


When designing data collection, crucial questions arise regarding how much data to collect and how much effort to expend to enhance the quality of the collected data. To make choice of sample design a coherent subject of study, it is desirable to specify an explicit decision problem. We use the Wald framework of statistical decision theory to study allocation of a budget between two or more sampling processes. These processes all draw random samples from a population of interest and aim to collect data that are informative about the sample realizations of an outcome. They differ in the cost of data collection and the quality of the data obtained. One may incur lower cost per sample member but yield lower data quality than another. Increasing the allocation of budget to a low-cost process yields more data, while increasing the allocation to a high-cost process yields better data. We initially view the concept of “better data” abstractly and then fix attention on two important cases. In both cases, a high-cost sampling process accurately measures the outcome of each sample member. The cases differ in the data yielded by a low-cost process. In one, the low-cost process has non-response and in the other it provides a low-resolution interval measure of each sample member’s outcome. In these settings, we study minimax-regret sample design for prediction of a real-valued outcome under square loss; that is, design which minimizes maximum mean square error. The analysis imposes no assumptions that restrict the unobserved outcomes. Hence, the decision maker must cope with both the statistical imprecision of finite samples and the partial identification of the true state of nature.
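The trade-off between more data and better data can be illustrated with a minimal sketch. It is not the paper's derivation, and every function name in it is hypothetical. The simplifying assumptions are ours: outcomes lie in [0, 1]; the response probability p of the low-cost process is known; only the two pure designs are compared (the whole budget goes to one process); the high-cost predictor is the sample mean; and the low-cost predictor imputes 0.5 for each nonrespondent and then averages. Under square loss the regret of a prediction t is (t - mu)^2, where mu is the population mean, so expected regret equals the mean square error of the predictor. Under these assumptions the maximum MSE is 1/(4n) for the high-cost design, and (1 - p)^2/4 + p/(4n) for the low-cost design, where the first term is the worst-case squared bias from unrestricted nonrespondent outcomes and does not shrink as n grows.

```python
def max_mse_high_cost(n: int) -> float:
    """Worst-case MSE of the sample mean of n accurately measured outcomes in [0, 1]."""
    if n <= 0:
        return 0.25  # no data: predict 0.5, worst squared error is 0.25
    return 0.25 / n  # variance of an outcome in [0, 1] is at most 1/4


def max_mse_low_cost(n: int, p: float) -> float:
    """Worst-case MSE when nonrespondents are imputed as 0.5 (assumed predictor).

    p is the response probability, assumed known. Nonrespondent outcomes are
    unrestricted in [0, 1], so the bias can be as large as (1 - p) / 2.
    """
    if n <= 0:
        return 0.25
    worst_bias_sq = 0.25 * (1.0 - p) ** 2  # identification problem, independent of n
    worst_variance = 0.25 * p / n          # sampling imprecision, shrinks with n
    return worst_bias_sq + worst_variance


def compare_pure_designs(budget: int, cost_low: int, cost_high: int, p: float) -> dict:
    """Spend the whole budget on one process and compare worst-case MSE."""
    n_low = budget // cost_low
    n_high = budget // cost_high
    low = max_mse_low_cost(n_low, p)
    high = max_mse_high_cost(n_high)
    return {
        "n_low": n_low,
        "n_high": n_high,
        "max_mse_low": low,
        "max_mse_high": high,
        "preferred": "low-cost" if low < high else "high-cost",
    }


if __name__ == "__main__":
    # Illustrative numbers only: budget 100, unit costs 1 and 4.
    for p in (0.80, 0.95):
        print(p, compare_pure_designs(100, 1, 4, p))
```

With these illustrative numbers the sketch prefers the high-cost design at p = 0.80 and the low-cost design at p = 0.95. A mixed allocation, which the paper studies, would require a combined predictor and is outside this sketch.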

Suggested Citation

  • Jeff Dominitz & Charles F. Manski, 2017. "More Data or Better Data? A Statistical Decision Problem," Review of Economic Studies, Oxford University Press, vol. 84(4), pages 1583-1605.
  • Handle: RePEc:oup:restud:v:84:y:2017:i:4:p:1583-1605.


    Cited by:

    1. Battistin, Erich & De Nadai, Michele & Krishnan, Nandini, 2020. "The Insights and Illusions of Consumption Measurements," CEPR Discussion Papers 14730, C.E.P.R. Discussion Papers.
    2. Pedro Carneiro & Sokbae Lee & Daniel Wilhelm, 2020. "Optimal data collection for randomized control trials," The Econometrics Journal, Royal Economic Society, vol. 23(1), pages 1-31.
    3. Pëllumb Reshidi & Alessandro Lizzeri & Leeat Yariv & Jimmy H. Chan & Wing Suen, 2021. "Individual and Collective Information Acquisition: An Experimental Study," NBER Working Papers 29557, National Bureau of Economic Research, Inc.
    4. Charles F. Manski, 2019. "Statistical inference for statistical decisions," Papers 1909.06853.
    5. Masahiro Kato & Masaaki Imaizumi & Takuya Ishihara & Toru Kitagawa, 2023. "Asymptotically Minimax Optimal Fixed-Budget Best Arm Identification for Expected Simple Regret Minimization," Papers 2302.02988.
    6. Francesca Molinari, 2020. "Microeconometrics with Partial Identification," Papers 2004.11751.
    7. Daniel H. Weinberg & John M. Abowd & Robert F. Belli & Noel Cressie & David C. Folch & Scott H. Holan & Margaret C. Levenstein & Kristen M. Olson & Jerome P. Reiter & Matthew D. Shapiro & Jolene Smyth, 2017. "Effects of a Government-Academic Partnership: Has the NSF-Census Bureau Research Network Helped Improve the U.S. Statistical System?," Working Papers 17-59r, Center for Economic Studies, U.S. Census Bureau.
    8. Charles F. Manski, 2021. "Econometrics for Decision Making: Building Foundations Sketched by Haavelmo and Wald," Econometrica, Econometric Society, vol. 89(6), pages 2827-2853, November.
    9. Dominitz, Jeff & Manski, Charles F., 2022. "Minimax-regret sample design in anticipation of missing data, with application to panel data," Journal of Econometrics, Elsevier, vol. 226(1), pages 104-114.
    10. Francesca Molinari, 2019. "Econometrics with Partial Identification," CeMMAP working papers CWP25/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    11. Charles F. Manski, 2019. "Meta-Analysis for Medical Decisions," NBER Working Papers 25504, National Bureau of Economic Research, Inc.
    12. Charles F. Manski, 2022. "Inference with Imputed Data: The Allure of Making Stuff Up," Papers 2205.07388.
    13. Charles F. Manski, 2019. "Remarks on statistical inference for statistical decisions," CeMMAP working papers CWP06/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.

    More about this item


    Keywords:

    Sample design; statistical decision theory; minimax regret; missing data; point prediction

    JEL classification:

    • C44 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods: Special Topics - - - Operations Research; Statistical Decision Theory
    • C83 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Survey Methods; Sampling Methods

