
On the Performance of Kernel Estimators for High-Dimensional, Sparse Binary Data


  • Grund, B.
  • Hall, P.


We develop mathematical models for high-dimensional binary distributions, and apply them to the study of smoothing methods for sparse binary data. Specifically, we treat the kernel-type estimator developed by Aitchison and Aitken (Biometrika 63 (1976), 413-420). Our analysis is of an asymptotic nature. It permits a concise account of the way in which data dimension, data sparseness, and distribution smoothness interact to determine the over-all performance of smoothing methods. Previous work on this problem has been hampered by the requirement that the data dimension be fixed. Our approach allows dimension to increase with sample size, so that the theoretical model may accurately reflect the situations encountered in practice; e.g., approximately 20 dimensions and 40 data points. We compare the performance of kernel estimators with that of the cell frequency estimator, and describe the effectiveness of cross-validation.
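As a rough illustration of the two estimators the abstract compares, the sketch below assumes the commonly cited form of the Aitchison and Aitken kernel for binary vectors: each observation contributes a weight of lam for every coordinate that matches the query point and (1 - lam) for every coordinate that differs, with lam between 0.5 and 1. The function names and the parameter name `lam` are chosen here for illustration and are not taken from the paper.

```python
def hamming(x, y):
    """Number of coordinates in which two binary vectors differ."""
    return sum(a != b for a, b in zip(x, y))


def aitchison_aitken_estimate(x, data, lam):
    """Kernel estimate of P(X = x) from binary observations.

    Assumed form: the average over observations of
    lam ** (matches) * (1 - lam) ** (mismatches), with lam in [0.5, 1].
    """
    if not 0.5 <= lam <= 1.0:
        raise ValueError("lam must lie in [0.5, 1]")
    d = len(x)
    total = 0.0
    for obs in data:
        k = hamming(x, obs)  # mismatching coordinates
        total += lam ** (d - k) * (1.0 - lam) ** k
    return total / len(data)


def cell_frequency_estimate(x, data):
    """Proportion of observations exactly equal to x (no smoothing)."""
    return sum(tuple(obs) == tuple(x) for obs in data) / len(data)


if __name__ == "__main__":
    # Toy sample: 4 observations in 2 dimensions (hypothetical data).
    sample = [(0, 0), (0, 1), (1, 1), (0, 0)]
    for cell in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(cell,
              cell_frequency_estimate(cell, sample),
              aitchison_aitken_estimate(cell, sample, 0.8))
```

Under this assumed form, lam = 1 reproduces the cell frequency estimator and lam = 0.5 gives the uniform distribution over all cells, so the smoothing parameter interpolates between no smoothing and complete smoothing. With sparse data, most cells are empty and receive zero mass from the cell frequency estimator, whereas the kernel estimate assigns them positive mass borrowed from nearby observations.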

Suggested Citation

  • Grund, B. & Hall, P., 1993. "On the Performance of Kernel Estimators for High-Dimensional, Sparse Binary Data," Journal of Multivariate Analysis, Elsevier, vol. 44(2), pages 321-344, February.
  • Handle: RePEc:eee:jmvana:v:44:y:1993:i:2:p:321-344




    Cited by:

    1. Hsiao, Cheng & Li, Qi & Racine, Jeffrey S., 2007. "A consistent model specification test with mixed discrete and continuous data," Journal of Econometrics, Elsevier, vol. 140(2), pages 802-826, October.
    2. repec:wyi:journl:002074 is not listed on IDEAS
    3. Dakshina G. De Silva & Robert P. McComb & Anita R. Schiller, 2013. "Do production subsidies have a wage incidence in wind power?," Applied Economics, Taylor & Francis Journals, vol. 45(28), pages 3963-3972, October.
    4. Li, Qi & Racine, Jeff, 2003. "Nonparametric estimation of distributions with categorical and continuous data," Journal of Multivariate Analysis, Elsevier, vol. 86(2), pages 266-292, August.
    5. Racine, Jeff & Li, Qi, 2004. "Nonparametric estimation of regression functions with both categorical and continuous data," Journal of Econometrics, Elsevier, vol. 119(1), pages 99-130, March.
    6. Li, Qi & Maasoumi, Esfandiar & Racine, Jeffrey S., 2009. "A nonparametric test for equality of distributions with mixed categorical and continuous data," Journal of Econometrics, Elsevier, vol. 148(2), pages 186-200, February.
    7. Efromovich, Sam, 2011. "Nonparametric estimation of the anisotropic probability density of mixed variables," Journal of Multivariate Analysis, Elsevier, vol. 102(3), pages 468-481, March.
    8. Aerts, Marc & Augustyns, Ilse & Janssen, Paul, 1997. "Sparse consistency and smoothing for multinomial data," Statistics & Probability Letters, Elsevier, vol. 33(1), pages 41-48, April.
