Density estimation on a network

Author

Listed:
  • Liu, Yang
  • Ruppert, David

Abstract

A novel approach is proposed for density estimation on a network. Nonparametric density estimation on a network is formulated as a nonparametric regression problem by binning. Nonparametric regression using local polynomial kernel-weighted least squares has been studied rigorously, and its asymptotic properties make it superior to kernel estimators such as the Nadaraya–Watson estimator. When applied to a network, the best estimator near a vertex depends on the amount of smoothness at the vertex. Often, there are no compelling reasons to assume that a density will be continuous or discontinuous at a vertex, hence a data-driven approach is proposed. To estimate the density in a neighborhood of a vertex, a two-step procedure is proposed. The first step of this pretest estimator fits a separate local polynomial regression on each edge using data only on that edge, and then tests for equality of the estimates at the vertex. If the null hypothesis is not rejected, then the second step re-estimates the regression function in a small neighborhood of the vertex, subject to a joint equality constraint. Since the derivative of the density may be discontinuous at the vertex, a piecewise polynomial local regression estimate is used to model the change in slope. The special case of local piecewise linear regression is studied in detail and the leading bias and variance terms are derived using weighted least squares theory. The proposed approach will remove the bias near a vertex that has been noted for existing methods, which typically do not allow for discontinuity at vertices. For a fixed network, the proposed method scales sub-linearly with sample size, and it can be extended to regression and varying coefficient models on a network. The working of the proposed model is demonstrated by simulation studies and applications to a dendrite network dataset.
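
The binning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only a single edge, treated as the interval [lo, hi], and omits the vertex pretest and the constrained second step. The function names, the Epanechnikov kernel, the bin count, and the bandwidth are all assumptions chosen for illustration. Bin heights are treated as regression responses at the bin centers, and a line is fitted by kernel-weighted least squares around the evaluation point.

```python
import numpy as np


def binned_density_points(samples, lo, hi, n_bins):
    """Bin samples on [lo, hi]; return bin centers and normalized bin heights.

    Heights are counts / (n * bin_width), so they integrate to 1 over the
    edge when every sample lies in [lo, hi].
    """
    counts, edges = np.histogram(samples, bins=n_bins, range=(lo, hi))
    width = (hi - lo) / n_bins
    centers = edges[:-1] + width / 2.0
    heights = counts / (len(samples) * width)
    return centers, heights


def local_linear_fit(x0, centers, heights, bandwidth):
    """Local linear (kernel-weighted least squares) estimate at x0.

    Fits heights ~ a + b * (centers - x0) with Epanechnikov weights and
    returns the intercept a, which is the fitted value at x0.
    """
    u = (centers - x0) / bandwidth
    # Epanechnikov kernel (an assumed choice): zero weight outside |u| <= 1.
    weights = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)
    design = np.column_stack([np.ones_like(centers), centers - x0])
    # Weighted least squares via ordinary least squares on sqrt-weighted rows.
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(design * sw[:, None], heights * sw, rcond=None)
    return coef[0]


# Usage: estimate a uniform density at the endpoint of an edge, where a
# plain kernel estimator would typically show boundary bias.
rng = np.random.default_rng(0)
samples = rng.uniform(0.0, 1.0, size=5000)
centers, heights = binned_density_points(samples, 0.0, 1.0, n_bins=50)
print("estimate at endpoint 0.0:", local_linear_fit(0.0, centers, heights, 0.2))
```

Because the fit uses only data from one side of the endpoint, the same routine could be run separately on each edge meeting at a vertex and the resulting estimates compared, which is the flavor of the first step described above.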

Suggested Citation

  • Liu, Yang & Ruppert, David, 2021. "Density estimation on a network," Computational Statistics & Data Analysis, Elsevier, vol. 156(C).
  • Handle: RePEc:eee:csdana:v:156:y:2021:i:c:s016794732030219x
    DOI: 10.1016/j.csda.2020.107128

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S016794732030219X
    Download Restriction: Full text for ScienceDirect subscribers only.

    File URL: https://libkey.io/10.1016/j.csda.2020.107128?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Adrian Baddeley & Aruna Jammalamadaka & Gopalan Nair, 2014. "Multitype point process analysis of spines on the dendrite network of a neuron," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 63(5), pages 673-694, November.
    2. Greg McSwiggan & Adrian Baddeley & Gopalan Nair, 2017. "Kernel Density Estimation on a Linear Network," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics; Finnish Statistical Society; Norwegian Statistical Association; Swedish Statistical Association, vol. 44(2), pages 324-345, June.
    3. Hall, Peter & Wand, M. P., 1996. "On the Accuracy of Binned Kernel Density Estimators," Journal of Multivariate Analysis, Elsevier, vol. 56(2), pages 165-184, February.

    Citations

    Cited by:

    1. Guillaume Grégoire & Josée Fortin & Isa Ebtehaj & Hossein Bonakdari, 2022. "Novel Hybrid Statistical Learning Framework Coupled with Random Forest and Grasshopper Optimization Algorithm to Forecast Pesticide Use on Golf Courses," Agriculture, MDPI, vol. 12(7), pages 1-19, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Matthias Eckardt & Jorge Mateu, 2021. "Second-order and local characteristics of network intensity functions," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer; Sociedad de Estadística e Investigación Operativa, vol. 30(2), pages 318-340, June.
    2. Michel Harel & Jean-François Lenain & Joseph Ngatchou-Wandji, 2016. "Asymptotic behaviour of binned kernel density estimators for locally non-stationary random fields," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 28(2), pages 296-321, June.
    3. Holmström, Lasse, 2000. "The Accuracy and the Computational Complexity of a Multivariate Binned Kernel Density Estimator," Journal of Multivariate Analysis, Elsevier, vol. 72(2), pages 264-309, February.
    4. Adriano Z. Zambom & Ronaldo Dias, 2013. "A Review of Kernel Density Estimation with Applications to Econometrics," International Econometric Review (IER), Econometric Research Association, vol. 5(1), pages 20-42, April.
    5. Koo, Ja-Yong & Kooperberg, Charles, 2000. "Logspline density estimation for binned data," Statistics & Probability Letters, Elsevier, vol. 46(2), pages 133-147, January.
    6. Kristian Bjørn Hessellund & Ganggang Xu & Yongtao Guan & Rasmus Waagepetersen, 2022. "Second‐order semi‐parametric inference for multivariate log Gaussian Cox processes," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 71(1), pages 244-268, January.
    7. Jakob G. Rasmussen & Heidi S. Christensen, 2021. "Point Processes on Directed Linear Networks," Methodology and Computing in Applied Probability, Springer, vol. 23(2), pages 647-667, June.
    8. Meintanis, S. & Ushakov, N. G., 2004. "Binned goodness-of-fit tests based on the empirical characteristic function," Statistics & Probability Letters, Elsevier, vol. 69(3), pages 305-314, September.
    9. M. P. Wand & J. C. F. Yu, 2022. "Density estimation via Bayesian inference engines," AStA Advances in Statistical Analysis, Springer; German Statistical Society, vol. 106(2), pages 199-216, June.
    10. Slone, D.H., 2011. "Increasing accuracy of dispersal kernels in grid-based population models," Ecological Modelling, Elsevier, vol. 222(3), pages 573-579.
    11. Kozek, A. S. & Yin, J., 2004. "On Gauss quadrature and partial cross validation," Computational Statistics & Data Analysis, Elsevier, vol. 45(3), pages 431-448, April.
    12. Laura Anton-Sanchez & Pedro Larrañaga & Ruth Benavides-Piccione & Isabel Fernaud-Espinosa & Javier DeFelipe & Concha Bielza, 2017. "Three-dimensional spatial modeling of spines along dendritic networks in human cortical pyramidal neurons," PLOS ONE, Public Library of Science, vol. 12(6), pages 1-14, June.
    13. Tang, Qingguo & Karunamuni, Rohana J., 2016. "Fast and accurate computation for kernel estimators," Computational Statistics & Data Analysis, Elsevier, vol. 94(C), pages 49-62.
    14. Semeyutin, Artur & O’Neill, Robert, 2019. "A brief survey on the choice of parameters for: “Kernel density estimation for time series data”," The North American Journal of Economics and Finance, Elsevier, vol. 50(C).
    15. Sain, Stephan R., 2002. "Multivariate locally adaptive density estimation," Computational Statistics & Data Analysis, Elsevier, vol. 39(2), pages 165-186, April.
    16. Gonzalez-Manteiga, W. & Sanchez-Sellero, C. & Wand, M. P., 1996. "Accuracy of binned kernel functional approximations," Computational Statistics & Data Analysis, Elsevier, vol. 22(1), pages 1-16, June.
    17. Arnone, Eleonora & Ferraccioli, Federico & Pigolotti, Clara & Sangalli, Laura M., 2022. "A roughness penalty approach to estimate densities over two-dimensional manifolds," Computational Statistics & Data Analysis, Elsevier, vol. 174(C).
    18. Nicoletta D’Angelo & Giada Adelfio & Jorge Mateu, 2023. "Local inhomogeneous second-order characteristics for spatio-temporal point processes occurring on linear networks," Statistical Papers, Springer, vol. 64(3), pages 779-805, June.
    19. Jan G. de Gooijer & Ao Yuan, 2011. "Kernel-Smoothed Conditional Quantiles of Correlated Bivariate Discrete Data," Tinbergen Institute Discussion Papers 11-011/4, Tinbergen Institute.
    20. Gao, Wenwu & Wang, Jiecheng & Zhang, Ran, 2023. "Quasi-interpolation for multivariate density estimation on bounded domain," Mathematics and Computers in Simulation (MATCOM), Elsevier, vol. 203(C), pages 592-608.
