Printed from https://ideas.repec.org/p/zbw/fubsbe/336787.html

Mapping high-income taxpayers in Berlin using kernel-smoothed proportions from aggregated georeferenced data

Author

Listed:
  • Gril, Lorena
  • Rendtel, Ulrich

Abstract

Rare access to exact official geocoordinates opens new methodological possibilities for analyzing highly sensitive tax data. We explore the visualization potential of these data and systematically evaluate aggregation as an anonymization strategy, with particular attention to its methodological and analytical implications. The application is an analysis of high-income taxpayers in Berlin, Germany, with a focus on the presentation of regional shares. In addition to frequency maps, we analyze smoothed representations based on kernel density estimation and discuss their cartographic characteristics. Because individual-level data are highly sensitive, they are generally not published, and official statistics must anonymize them; this applies in particular to the group of high-income taxpayers. Using the exact data as a gold standard makes it possible to systematically analyze the distortions caused by aggregation, one of the most commonly used anonymization methods in official statistics. To correct these distortions, a measurement error model is employed that explicitly accounts for the aggregation process and produces smoothed kernel density estimates for interpretable cartographic representations. The measurement error model is also linked with census information to demonstrate a realistic application scenario. Local and global error measures are intended to substantiate empirically the improvement achieved by the measurement error model.
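To make the idea of a kernel-smoothed proportion concrete, the following is a minimal sketch, not the authors' method. It assumes that aggregated counts are placed at area centroids and that the regional share is estimated as the ratio of two Gaussian kernel sums (high-income taxpayers over all taxpayers). This is the naive treatment of aggregated data; the measurement error model described in the abstract is not implemented here. The function names, the isotropic Gaussian kernel, the fixed bandwidth, and all numbers are illustrative assumptions.

```python
import numpy as np


def gaussian_kernel_sum(grid, points, weights, bandwidth):
    """Weighted sum of isotropic 2-D Gaussian kernels evaluated at each grid point."""
    # Squared distances between every grid point and every data point: shape (G, P).
    diff = grid[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    kernel = np.exp(-0.5 * sq_dist / bandwidth ** 2) / (2.0 * np.pi * bandwidth ** 2)
    # Each point contributes its kernel scaled by its count.
    return kernel @ weights


def smoothed_proportion(grid, centroids, n_high, n_all, bandwidth):
    """Ratio of the smoothed high-income count to the smoothed total count."""
    numerator = gaussian_kernel_sum(grid, centroids, n_high, bandwidth)
    denominator = gaussian_kernel_sum(grid, centroids, n_all, bandwidth)
    # Return NaN where the smoothed total count is zero (far away from all data).
    return np.divide(
        numerator,
        denominator,
        out=np.full_like(numerator, np.nan),
        where=denominator > 0,
    )


# Invented toy data: three area centroids with total and high-income counts.
toy_centroids = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
toy_n_all = np.array([100.0, 200.0, 50.0])
toy_n_high = np.array([5.0, 40.0, 10.0])
toy_grid = np.array([[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]])

toy_shares = smoothed_proportion(
    toy_grid, toy_centroids, toy_n_high, toy_n_all, bandwidth=0.5
)
```

Because the same kernel and bandwidth are used in the numerator and the denominator, each smoothed share is a weighted average of the raw area shares and therefore stays between the smallest and largest of them. Collapsing all taxpayers of an area onto its centroid is exactly the kind of aggregation whose distortions the paper evaluates against the exact coordinates.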

Suggested Citation

  • Gril, Lorena & Rendtel, Ulrich, 2026. "Mapping high-income taxpayers in Berlin using kernel-smoothed proportions from aggregated georeferenced data," Discussion Papers 2026/2, Free University Berlin, School of Business & Economics.
  • Handle: RePEc:zbw:fubsbe:336787
    DOI: 10.17169/refubium-51220

    Download full text from publisher

    File URL: https://www.econstor.eu/bitstream/10419/336787/1/1960483811.pdf
    Download Restriction: no




