Pointwise comparison of two multivariate density functions

Authors

  • Martin L. Hazelton
  • Tilman M. Davies

Abstract

Testing the equality of two density functions based on independent samples is a classical problem in statistics. While the focus is often on global equality, it is also of interest to conduct local comparisons of density functions. Typically a type of Wald statistic is employed, where the local difference in densities is standardized by an estimate of the asymptotic standard error of that difference. We study the null distribution of this test statistic. The literature has suggested that this will be asymptotically standard normal, but we show that this is by no means always the case. In particular, when using bandwidth matrices of optimal order (for estimation), we prove that the asymptotic mean of this null distribution is nonzero when either the sample sizes differ, or when the Hessian matrices of the densities differ at the point where the densities are equal. In numerical studies we find the erroneous use of the standard normal null distribution in such cases can severely corrupt the test size. We show that these problems can be managed effectively by using common bandwidths when the Hessian matrices are equal, and applying adjusted undersmoothing bandwidth matrices when they are not.
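
To make the Wald-type statistic described above concrete, here is a minimal sketch. It is not taken from the paper: it assumes Gaussian kernels, the usual plug-in variance approximation R(K) f(x) / (n |H|^{1/2}) for a kernel density estimate, and the hypothetical function names `kde_at_point` and `wald_statistic`. The standardization the authors actually use may differ in detail.

```python
import numpy as np


def kde_at_point(x, sample, H):
    """Gaussian kernel density estimate at point x.

    sample has shape (n, d); H is a (d, d) symmetric positive definite
    bandwidth matrix.
    """
    sample = np.asarray(sample, dtype=float)
    d = sample.shape[1]
    diff = sample - np.asarray(x, dtype=float)
    H_inv = np.linalg.inv(H)
    # Quadratic form (X_i - x)^T H^{-1} (X_i - x) for every observation.
    quad = np.einsum("ij,jk,ik->i", diff, H_inv, diff)
    norm = (2 * np.pi) ** (-d / 2) * np.linalg.det(H) ** -0.5
    return norm * np.exp(-0.5 * quad).mean()


def wald_statistic(x, sample1, sample2, H1, H2):
    """Difference of two density estimates at x over its estimated standard error.

    Assumption: each estimate's variance is approximated by
    R(K) * f(x) / (n * |H|^{1/2}), with R(K) = (4 * pi)^(-d/2) for the
    Gaussian kernel and f(x) replaced by its estimate.
    """
    d = len(x)
    r_k = (4 * np.pi) ** (-d / 2)
    f1 = kde_at_point(x, sample1, H1)
    f2 = kde_at_point(x, sample2, H2)
    var1 = r_k * f1 / (len(sample1) * np.sqrt(np.linalg.det(H1)))
    var2 = r_k * f2 / (len(sample2) * np.sqrt(np.linalg.det(H2)))
    return (f1 - f2) / np.sqrt(var1 + var2)


# Illustrative use: two samples from the same bivariate normal density,
# equal sample sizes, and a common bandwidth matrix h^2 * I with h of the
# optimal order n^(-1/(d+4)) for estimation.
rng = np.random.default_rng(0)
n, d = 500, 2
sample_a = rng.standard_normal((n, d))
sample_b = rng.standard_normal((n, d))
H = n ** (-2 / (d + 4)) * np.eye(d)
t_stat = wald_statistic(np.zeros(d), sample_a, sample_b, H, H)
```

In this illustration the sample sizes, bandwidth matrices, and underlying densities are all equal, which is the setting the abstract identifies as unproblematic. With unequal sample sizes, or with Hessian matrices that differ at the test point, the abstract's result is that comparing such a statistic with standard normal quantiles can distort the test size.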

Suggested Citation

  • Martin L. Hazelton & Tilman M. Davies, 2022. "Pointwise comparison of two multivariate density functions," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 49(4), pages 1791-1810, December.
  • Handle: RePEc:bla:scjsta:v:49:y:2022:i:4:p:1791-1810
    DOI: 10.1111/sjos.12565


