Printed from https://ideas.repec.org/p/arx/papers/2510.20372.html

Testing Most Influential Sets

Author

  • Lucas Darius Konrad
  • Nikolas Kuschnig

Abstract

Small subsets of data with disproportionate influence on model outcomes can have dramatic impacts on conclusions, with a few data points sometimes overturning key findings. While recent work has developed methods to identify these most influential sets, no formal theory exists to determine when their influence reflects genuine problems rather than natural sampling variation. We address this gap by developing a principled framework for assessing the statistical significance of most influential sets. Our theoretical results characterize the extreme value distributions of maximal influence and enable rigorous hypothesis tests for excessive influence, replacing current ad-hoc sensitivity checks. We demonstrate the practical value of our approach through applications across economics, biology, and machine learning benchmarks.

Suggested Citation

  • Lucas Darius Konrad & Nikolas Kuschnig, 2025. "Testing Most Influential Sets," Papers 2510.20372, arXiv.org, revised Oct 2025.
  • Handle: RePEc:arx:papers:2510.20372

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2510.20372
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Silius M. Vandeskog & Sara Martino & Daniela Castro-Camilo & Håvard Rue, 2022. "Modelling Sub-daily Precipitation Extremes with the Blended Generalised Extreme Value Distribution," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 27(4), pages 598-621, December.
    2. Guillou, Armelle & Padoan, Simone A. & Rizzelli, Stefano, 2018. "Inference for asymptotically independent samples of extremes," Journal of Multivariate Analysis, Elsevier, vol. 167(C), pages 114-135.
    3. Sweta Rai & Alexis Hoffman & Soumendra Lahiri & Douglas W. Nychka & Stephan R. Sain & Soutir Bandyopadhyay, 2024. "Fast parameter estimation of generalized extreme value distribution using neural networks," Environmetrics, John Wiley & Sons, Ltd., vol. 35(3), May.
    4. Rebeca Klamerick Lima & Felipe Sousa Quintino & Melquisadec Oliveira & Luan Carlos de Sena Monteiro Ozelim & Tiago A. da Fonseca & Pushpa Narayan Rathie, 2024. "Multicomponent Stress–Strength Reliability with Extreme Value Distribution Margins: Its Theory and Application to Hydrological Data," J, MDPI, vol. 7(4), pages 1-17, December.
    5. Combes, Catherine & Ng, Hon Keung Tony, 2022. "On parameter estimation for Amoroso family of distributions," Mathematics and Computers in Simulation (MATCOM), Elsevier, vol. 191(C), pages 309-327.
    6. Linda Mhalla & Valérie Chavez‐Demoulin & Debbie J. Dupuis, 2020. "Causal mechanism of extreme river discharges in the upper Danube basin network," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 69(4), pages 741-764, August.
    7. Jona Lilienthal & Leandra Zanger & Axel Bücher & Roland Fried, 2022. "A note on statistical tests for homogeneities in multivariate extreme value models for block maxima," Environmetrics, John Wiley & Sons, Ltd., vol. 33(7), November.

    More about this item

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics
