Three Sides of Smoothing: Categorical Data Smoothing, Nonparametric Regression, and Density Estimation

Author

Listed:
  • Jeffrey S. Simonoff

Abstract

The past forty years have seen a great deal of research into the construction and properties of nonparametric estimates of smooth functions. This research has focused primarily on two sides of the smoothing problem: nonparametric regression and density estimation. Theoretical results for these two situations are similar, and multivariate density estimation was an early justification for the Nadaraya-Watson kernel regression estimator. A third, less well-explored strand of smoothing applications is the estimation of probabilities in categorical data. This paper explores the position of categorical data smoothing as a bridge between nonparametric regression and density estimation. Nonparametric regression provides a paradigm for the construction of effective categorical smoothing estimates, and use of an appropriate likelihood function yields cell probability estimates with many desirable properties. Such estimates can be used to construct regression estimates when one or more of the categorical variables are viewed as response variables. They also lead naturally to the construction of well-behaved density estimates using local or penalized likelihood estimation, which can then be used in a regression context. Several real data sets are used to illustrate these points.

Suggested Citation

  • Jeffrey S. Simonoff, 1998. "Three Sides of Smoothing: Categorical Data Smoothing, Nonparametric Regression, and Density Estimation," International Statistical Review, International Statistical Institute, vol. 66(2), pages 137-156, August.
  • Handle: RePEc:bla:istatr:v:66:y:1998:i:2:p:137-156
    DOI: 10.1111/j.1751-5823.1998.tb00411.x

    Download full text from publisher

    File URL: https://doi.org/10.1111/j.1751-5823.1998.tb00411.x
    Download Restriction: no


    Citations


    Cited by:

    1. Álvarez de Toledo, Pablo & Núñez, Fernando & Usabiaga, Carlos, 2020. "Matching in segmented labor markets: An analytical proposal based on high-dimensional contingency tables," Economic Modelling, Elsevier, vol. 93(C), pages 175-186.
    2. Ivy Liu & Alan Agresti, 2005. "The analysis of ordered categorical data: An overview and a survey of recent developments," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 14(1), pages 1-73, June.
    3. Mark S. Handcock & Paul L. Janssen, 2002. "Statistical Inference for the Relative Density," Sociological Methods & Research, vol. 30(3), pages 394-424, February.
    4. Jerzy Rydlewski & Małgorzata Snarska & Dominik Mielczarek & Daniel Kosiorowski, 2014. "Sparse Methods for Analysis of Sparse Multivariate Data From Big Economic Databases," Statistics in Transition new series, Główny Urząd Statystyczny (Polska), vol. 15(1), pages 111-132, January.
    5. Fujin Yi & Mengfei Zhou & Yu Yvette Zhang, 2020. "Value of Incorporating ENSO Forecast in Crop Insurance Programs," American Journal of Agricultural Economics, John Wiley & Sons, vol. 102(2), pages 439-457, March.
    6. Cai, Junmeng & Xu, Di & Dong, Zhujun & Yu, Xi & Yang, Yang & Banks, Scott W. & Bridgwater, Anthony V., 2018. "Processing thermogravimetric analysis data for isoconversional kinetic analysis of lignocellulosic biomass pyrolysis: Case study of corn stalk," Renewable and Sustainable Energy Reviews, Elsevier, vol. 82(P3), pages 2705-2715.
