
Distribution-Free Predictive Inference for Regression

Author

Listed:
  • Jing Lei
  • Max G’Sell
  • Alessandro Rinaldo
  • Ryan J. Tibshirani
  • Larry Wasserman

Abstract

We develop a general framework for distribution-free predictive inference in regression, using conformal inference. The proposed methodology allows for the construction of a prediction band for the response variable using any estimator of the regression function. The resulting prediction band preserves the consistency properties of the original estimator under standard assumptions, while guaranteeing finite-sample marginal coverage even when these assumptions do not hold. We analyze and compare, both empirically and theoretically, the two major variants of our conformal framework: full conformal inference and split conformal inference, along with a related jackknife method. These methods offer different tradeoffs between statistical accuracy (length of resulting prediction intervals) and computational efficiency. As extensions, we develop a method for constructing valid in-sample prediction intervals called rank-one-out conformal inference, which has essentially the same computational efficiency as split conformal inference. We also describe an extension of our procedures for producing prediction bands with locally varying length, to adapt to heteroscedasticity in the data. Finally, we propose a model-free notion of variable importance, called leave-one-covariate-out or LOCO inference. Accompanying this article is an R package conformalInference that implements all of the proposals we have introduced. In the spirit of reproducibility, all of our empirical results can also be easily (re)generated using this package.
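To illustrate the split conformal idea mentioned in the abstract, the following is a minimal Python sketch, not the authors' R package conformalInference. It assumes a least-squares base estimator (any regression estimator could be substituted), an even random split of the data, and absolute residuals as conformity scores. The function name split_conformal and its parameters are hypothetical and chosen here only for illustration.

```python
import math
import numpy as np


def split_conformal(X, y, X_new, alpha=0.1, seed=0):
    """Sketch of split conformal prediction with a least-squares base fit.

    Hypothetical helper for illustration; returns lower and upper
    endpoints of the prediction interval at each row of X_new.
    """
    X, X_new = np.asarray(X, float), np.asarray(X_new, float)
    y = np.asarray(y, float)
    n = len(y)

    # Randomly split the data into a fitting half and a calibration half.
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    fit_idx, cal_idx = perm[: n // 2], perm[n // 2:]

    def design(Z):
        # Add an intercept column to the covariates.
        return np.column_stack([np.ones(len(Z)), Z])

    # Fit the base estimator on the fitting half only.
    coef, *_ = np.linalg.lstsq(design(X[fit_idx]), y[fit_idx], rcond=None)

    # Absolute residuals on the calibration half, sorted ascending.
    resid = np.sort(np.abs(y[cal_idx] - design(X[cal_idx]) @ coef))

    # Half-width: the k-th smallest residual, k = ceil((1 - alpha)(m + 1)).
    m = len(cal_idx)
    k = math.ceil((1 - alpha) * (m + 1))
    d = resid[k - 1] if k <= m else np.inf  # too few points: infinite band

    center = design(X_new) @ coef
    return center - d, center + d


# Usage on simulated data with heavy-tailed noise (an assumed toy setup).
rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(1000, 1))
y = 1.0 + 2.0 * X[:, 0] + rng.standard_t(df=3, size=1000)
lower, upper = split_conformal(X, y, X[:5], alpha=0.1)
```

The band has constant width because a single residual quantile is used everywhere. The locally varying extension described in the abstract would rescale the residuals by an estimate of their spread, which this sketch does not attempt.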

Suggested Citation

  • Jing Lei & Max G’Sell & Alessandro Rinaldo & Ryan J. Tibshirani & Larry Wasserman, 2018. "Distribution-Free Predictive Inference for Regression," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(523), pages 1094-1111, July.
  • Handle: RePEc:taf:jnlasa:v:113:y:2018:i:523:p:1094-1111
    DOI: 10.1080/01621459.2017.1307116

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1080/01621459.2017.1307116
    Download Restriction: Access to full text is restricted to subscribers.


    Citations



    Cited by:

    1. Wayne Xinwei Wan & Thies Lindenthal, 2023. "Testing machine learning systems in real estate," Real Estate Economics, American Real Estate and Urban Economics Association, vol. 51(3), pages 754-778, May.
    2. Bradley Efron, 2021. "Resampling Plans and the Estimation of Prediction Error," Stats, MDPI, vol. 4(4), pages 1-25, December.
    3. Acharki, Naoufal & Bertoncello, Antoine & Garnier, Josselin, 2023. "Robust prediction interval estimation for Gaussian processes by cross-validation method," Computational Statistics & Data Analysis, Elsevier, vol. 178(C).
    4. Lasanthi C. R. Pelawa Watagoda & David J. Olive, 2021. "Comparing six shrinkage estimators with large sample theory and asymptotically optimal prediction intervals," Statistical Papers, Springer, vol. 62(5), pages 2407-2431, October.
    5. Amini, Mostafa & Bagheri, Ali & Delen, Dursun, 2022. "Discovering injury severity risk factors in automobile crashes: A hybrid explainable AI framework for decision support," Reliability Engineering and System Safety, Elsevier, vol. 226(C).
    6. Pedro Delicado & Daniel Peña, 2023. "Understanding complex predictive models with ghost variables," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 32(1), pages 107-145, March.
    7. Yaxin Zhang & Tao Hu, 2022. "Ensemble Interval Prediction for Solar Photovoltaic Power Generation," Energies, MDPI, vol. 15(19), pages 1-30, September.
    8. Wheeler, Andrew Palmer & Gerell, Manne & Yoo, Youngmin, 2019. "Testing the Spatial Accuracy of Address Based Geocoding for Gun Shot Locations," SocArXiv hrtcf, Center for Open Science.
    9. Hu, Jianming & Luo, Qingxi & Tang, Jingwei & Heng, Jiani & Deng, Yuwen, 2022. "Conformalized temporal convolutional quantile regression networks for wind power interval forecasting," Energy, Elsevier, vol. 248(C).
    10. Brian D. Williamson & Peter B. Gilbert & Marco Carone & Noah Simon, 2021. "Nonparametric variable importance assessment using machine learning techniques," Biometrics, The International Biometric Society, vol. 77(1), pages 9-22, March.
    11. Victor Chernozhukov & Kaspar Wuthrich & Yinchu Zhu, 2019. "Distributional conformal prediction," Papers 1909.07889, arXiv.org, revised Aug 2021.
    12. Jiafeng Chen, 2023. "Synthetic Control as Online Linear Regression," Econometrica, Econometric Society, vol. 91(2), pages 465-491, March.
    13. Linwei Hu & Jie Chen & Joel Vaughan & Soroush Aramideh & Hanyu Yang & Kelly Wang & Agus Sudjianto & Vijayan N. Nair, 2021. "Supervised Machine Learning Techniques: An Overview with Applications to Banking," International Statistical Review, International Statistical Institute, vol. 89(3), pages 573-604, December.
    14. Solari, Aldo & Djordjilović, Vera, 2022. "Multi split conformal prediction," Statistics & Probability Letters, Elsevier, vol. 184(C).
    15. Lihua Lei & Emmanuel J. Candès, 2021. "Conformal inference of counterfactuals and individual treatment effects," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(5), pages 911-938, November.
    16. Fernando Delbianco & Fernando Tohmé, 2023. "Individualized Conformal," Working Papers 247, Red Nacional de Investigadores en Economía (RedNIE).
    17. Tengyuan Liang, 2022. "Universal prediction band via semi‐definite programming," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(4), pages 1558-1580, September.
    18. Xie, Haihan & Kong, Linglong, 2023. "Gaussian copula function-on-scalar regression in reproducing kernel Hilbert space," Journal of Multivariate Analysis, Elsevier, vol. 198(C).
    19. Linwei Hu & Jie Chen & Joel Vaughan & Hanyu Yang & Kelly Wang & Agus Sudjianto & Vijayan N. Nair, 2020. "Supervised Machine Learning Techniques: An Overview with Applications to Banking," Papers 2008.04059, arXiv.org.
    20. Mulubrhan G. Haile & Lingling Zhang & David J. Olive, 2024. "Predicting Random Walks and a Data-Splitting Prediction Region," Stats, MDPI, vol. 7(1), pages 1-11, January.
    21. Diquigiovanni, Jacopo & Fontana, Matteo & Vantini, Simone, 2022. "Conformal prediction bands for multivariate functional data," Journal of Multivariate Analysis, Elsevier, vol. 189(C).
    22. Zhang, Yingying & Shi, Chengchun & Luo, Shikai, 2023. "Conformal off-policy prediction," LSE Research Online Documents on Economics 118250, London School of Economics and Political Science, LSE Library.
    23. Varun Gupta & Christopher Jung & Georgy Noarov & Mallesh M. Pai & Aaron Roth, 2021. "Online Multivalid Learning: Means, Moments, and Prediction Intervals," Papers 2101.01739, arXiv.org.
    24. Leying Guan, 2023. "Localized conformal prediction: a generalized inference framework for conformal prediction," Biometrika, Biometrika Trust, vol. 110(1), pages 33-50.
