Printed from https://ideas.repec.org/a/eee/ecosta/v25y2023icp93-109.html

On The Problem of Relevance in Statistical Inference

Author

Listed:
  • Mukhopadhyay, Subhadeep
  • Wang, Kaijun

Abstract

Given a large cohort of “similar” cases, one can construct an efficient statistical inference procedure by learning from the experience of others (also known as “borrowing strength” from the ensemble). But it is not obvious how to go about gathering strength when each piece of information is fuzzy – part of a massive database of heterogeneous cases. The danger is that borrowing information from irrelevant cases might heavily damage the quality of the inference! This raises some fundamental questions for big data inference: When (not) to borrow? Whom (not) to borrow? How (not) to borrow? These questions are at the heart of the “Problem of Relevance” in statistical inference – a puzzle that has remained too little addressed since its inception nearly half a century ago [Efron and Morris, J. Am. Stat. Assoc., 67, 337 (1972)]. A new model of large-scale inference is developed to tackle some of the unsettled issues that surround the relevance problem. Through examples, it is demonstrated how our new statistical perspective answers previously unanswerable questions in a realistic and feasible way.

Suggested Citation

  • Mukhopadhyay, Subhadeep & Wang, Kaijun, 2023. "On The Problem of Relevance in Statistical Inference," Econometrics and Statistics, Elsevier, vol. 25(C), pages 93-109.
  • Handle: RePEc:eee:ecosta:v:25:y:2023:i:c:p:93-109
    DOI: 10.1016/j.ecosta.2021.10.013

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S2452306221001283
    Download Restriction: Full text for ScienceDirect subscribers only. Contains open access articles

    File URL: https://libkey.io/10.1016/j.ecosta.2021.10.013?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    As the access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    1. Bradley Efron, 2016. "Empirical Bayes deconvolution estimates," Biometrika, Biometrika Trust, vol. 103(1), pages 1-20.
    2. Subhadeep Mukhopadhyay & Emanuel Parzen, 2020. "Nonparametric universal copula modeling," Applied Stochastic Models in Business and Industry, John Wiley & Sons, vol. 36(1), pages 77-94, January.
    3. Jiaying Gu & Roger Koenker, 2016. "On a Problem of Robbins," International Statistical Review, International Statistical Institute, vol. 84(2), pages 224-244, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Subhadeep Mukhopadhyay, 2023. "Modelplasticity and abductive decision making," Decisions in Economics and Finance, Springer;Associazione per la Matematica, vol. 46(1), pages 255-276, June.
    2. Subhadeep Mukhopadhyay, 2021. "A Maximum Entropy Copula Model for Mixed Data: Representation, Estimation, and Applications," Papers 2108.09438, arXiv.org, revised Aug 2022.
    3. Gribok, Andrei & Agarwal, Vivek & Yadav, Vaibhav, 2020. "Performance of empirical Bayes estimation techniques used in probabilistic risk assessment," Reliability Engineering and System Safety, Elsevier, vol. 201(C).
    4. Koen Jochmans & Martin Weidner, 2018. "Inference on a distribution from noisy draws," CeMMAP working papers CWP14/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    5. Zhang Qi & Xu Zheng & Lai Yutong, 2021. "An Empirical Bayes approach for the identification of long-range chromosomal interaction from Hi-C data," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 20(1), pages 1-15, February.
    6. Raffaella Giacomini & Sokbae Lee & Silvia Sarpietro, 2023. "A Robust Method for Microforecasting and Estimation of Random Effects," Working Paper Series WP 2023-26, Federal Reserve Bank of Chicago.
    7. Feng, Long & Dicker, Lee H., 2018. "Approximate nonparametric maximum likelihood for mixture models: A convex optimization approach to fitting arbitrary multivariate mixing distributions," Computational Statistics & Data Analysis, Elsevier, vol. 122(C), pages 80-91.
    8. Subhadeep Mukhopadhyay, 2022. "Modelplasticity and Abductive Decision Making," Papers 2203.03040, arXiv.org, revised Mar 2023.
    9. Eisenberg, Julia & Krühner, Paul, 2018. "The impact of negative interest rates on optimal capital injections," Insurance: Mathematics and Economics, Elsevier, vol. 82(C), pages 1-10.
    10. J. R. Lockwood & Katherine E. Castellano & Benjamin R. Shear, 2018. "Flexible Bayesian Models for Inferences From Coarsened, Group-Level Achievement Data," Journal of Educational and Behavioral Statistics, vol. 43(6), pages 663-692, December.
    11. Patrick Kline, 2023. "A Comment on: “Invidious Comparisons: Ranking and Selection as Compound Decisions” by Jiaying Gu and Roger Koenker," Econometrica, Econometric Society, vol. 91(1), pages 47-52, January.
    12. Roger Koenker, 2017. "Bayesian deconvolution: an R vinaigrette," CeMMAP working papers 38/17, Institute for Fiscal Studies.
    13. Subhadeep Mukhopadhyay, 2023. "Abductive Inference and C. S. Peirce: 150 Years Later," Journal of Quantitative Economics, Springer;The Indian Econometric Society (TIES), vol. 21(1), pages 123-149, March.
    14. Jiaying Gu & Roger Koenker, 2020. "Invidious Comparisons: Ranking and Selection as Compound Decisions," Papers 2012.12550, arXiv.org, revised Sep 2021.
    15. Manuel Arellano & Stéphane Bonhomme, 2023. "Recovering Latent Variables by Matching," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 118(541), pages 693-706, January.
    16. Jiaying Gu & Roger Koenker, 2018. "Nonparametric maximum likelihood methods for binary response models with random coefficients," Papers 1811.03329, arXiv.org, revised Jan 2020.
    17. Roger Koenker, 2017. "Bayesian deconvolution: an R vinaigrette," CeMMAP working papers CWP38/17, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    18. Patrick Kline & Evan K Rose & Christopher R Walters, 2022. "Systemic Discrimination Among Large U.S. Employers," The Quarterly Journal of Economics, Oxford University Press, vol. 137(4), pages 1963-2036.
    19. Patrick Kline & Evan K Rose & Christopher R Walters, 2023. "Systemic Discrimination Among Large U.S. Employers," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 137(4), pages 1963-2036.
    20. Roger Koenker & Jiaying Gu, 2019. "Minimalist G-modelling: A comment on Efron," CeMMAP working papers CWP13/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
