Online estimation of individual-level effects using streaming shrinkage factors

Author

Listed:
  • Ippel, L.
  • Kaptein, M.C.
  • Vermunt, J.K.

Abstract

It has become increasingly easy to collect data from individuals over long periods of time. Examples include smart-phone applications used to track movements with GPS, web-log data tracking individuals’ browsing behavior, and longitudinal (cohort) studies where many individuals are monitored over an extensive period of time. All these datasets cover a large number of individuals and collect data on the same individuals repeatedly, causing a nested structure in the data. Moreover, the data collection is never ‘finished’ as new data keep streaming in. It is well known that predictions of an individual-level effect that combine that individual’s data with the data of all other individuals are better, in terms of squared error, than those that use only the individual mean. However, when data are both nested and streaming, and the outcome variable is binary, computing these individual-level predictions can be computationally challenging. Five computationally efficient estimation methods, which do not revisit “old” data but do account for the nested data structure, are developed and evaluated. The methods are based on existing shrinkage factors. A shrinkage factor is used to predict an individual-level effect (i.e., the probability of scoring a 1) by weighing the individual mean and the mean over all data points. The performance of the existing and newly developed shrinkage factors is compared in a simulation study. While the existing methods differ in their prediction accuracy, the differences in accuracy between the novel shrinkage factors and the existing methods are extremely small. The novel methods are, however, computationally much more appealing.
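
The idea of weighing an individual mean against the mean over all data points, while never revisiting old observations, can be illustrated with a minimal Python sketch. This is not one of the five methods from the paper: the class name StreamingShrinkage, the constant k, and the weight n_j / (n_j + k) are assumptions chosen only to show the general shape of a streaming shrinkage predictor for binary outcomes.

```python
from collections import defaultdict


class StreamingShrinkage:
    """Online shrinkage predictor for binary outcomes (illustrative sketch only)."""

    def __init__(self, k=5.0):
        # k is a hypothetical tuning constant, not taken from the paper:
        # a larger k pulls predictions harder toward the overall mean.
        self.k = k
        self.n_total = 0              # number of observations seen so far
        self.sum_total = 0            # sum of all outcomes seen so far
        self.n = defaultdict(int)     # observations per individual
        self.s = defaultdict(int)     # sum of outcomes per individual

    def update(self, individual, y):
        """Absorb one (individual, outcome) pair; earlier data are never revisited."""
        self.n_total += 1
        self.sum_total += y
        self.n[individual] += 1
        self.s[individual] += y

    def predict(self, individual):
        """Weigh the individual mean against the mean over all data points."""
        if self.n_total == 0:
            raise ValueError("no data seen yet")
        grand_mean = self.sum_total / self.n_total
        n_j = self.n.get(individual, 0)
        if n_j == 0:
            # Unseen individual: fall back to the overall mean.
            return grand_mean
        # Assumed weight: approaches 1 as the individual's own data accumulate.
        weight = n_j / (n_j + self.k)
        individual_mean = self.s[individual] / n_j
        return weight * individual_mean + (1 - weight) * grand_mean
```

Each update touches only four running totals, so the cost per incoming data point stays constant regardless of how much data has already streamed in. An individual with few observations is predicted close to the overall mean, and the prediction moves toward the individual's own mean as more of their data arrive.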

Suggested Citation

  • Ippel, L. & Kaptein, M.C. & Vermunt, J.K., 2019. "Online estimation of individual-level effects using streaming shrinkage factors," Computational Statistics & Data Analysis, Elsevier, vol. 137(C), pages 16-32.
  • Handle: RePEc:eee:csdana:v:137:y:2019:i:c:p:16-32
    DOI: 10.1016/j.csda.2019.01.010

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947319300246
    Download Restriction: Full text for ScienceDirect subscribers only.


    References listed on IDEAS

    1. Ippel, L. & Kaptein, M.C. & Vermunt, J.K., 2016. "Estimating random-intercept models on data streams," Computational Statistics & Data Analysis, Elsevier, vol. 104(C), pages 169-182.
    2. Bates, Douglas & Mächler, Martin & Bolker, Ben & Walker, Steve, 2015. "Fitting Linear Mixed-Effects Models Using lme4," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 67(i01).
    3. Mirjam Moerbeek & Gerard J. P. Breukelen & Martijn P. F. Berger, 2003. "A Comparison of Estimation Methods for Multilevel Logistic Models," Computational Statistics, Springer, vol. 18(1), pages 19-37, March.
    4. Sophia Rabe-Hesketh & Anders Skrondal & Andrew Pickles, 2002. "Reliable estimation of generalized linear mixed models using adaptive quadrature," Stata Journal, StataCorp LLC, vol. 2(1), pages 1-21, February.
    5. Philippe Pébay & Timothy B. Terriberry & Hemanth Kolla & Janine Bennett, 2016. "Numerically stable, scalable formulas for parallel and online computation of higher-order multivariate central moments with arbitrary weights," Computational Statistics, Springer, vol. 31(4), pages 1305-1325, December.
    6. Wang, Li-Yu & Park, Cheolwoo & Yeon, Kyupil & Choi, Hosik, 2017. "Tracking concept drift using a constrained penalized regression combiner," Computational Statistics & Data Analysis, Elsevier, vol. 108(C), pages 52-69.

    Citations

    Cited by:

    1. Fabio Vieira & Roger Leenders & Joris Mulder, 2024. "Fast meta-analytic approximations for relational event models: applications to data streams and multilevel data," Journal of Computational Social Science, Springer, vol. 7(2), pages 1823-1859, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Øystein Sørensen & Anders M. Fjell & Kristine B. Walhovd, 2023. "Longitudinal Modeling of Age-Dependent Latent Traits with Generalized Additive Latent and Mixed Models," Psychometrika, Springer;The Psychometric Society, vol. 88(2), pages 456-486, June.
    2. Harold Doran, 2023. "A Collection of Numerical Recipes Useful for Building Scalable Psychometric Applications," Journal of Educational and Behavioral Statistics, vol. 48(1), pages 37-69, February.
    3. Steffen Nestler & Edgar Erdfelder, 2023. "Random Effects Multinomial Processing Tree Models: A Maximum Likelihood Approach," Psychometrika, Springer;The Psychometric Society, vol. 88(3), pages 809-829, September.
    4. Frith, Michael J., 2019. "Modelling taste heterogeneity regarding offence location choices," Journal of choice modelling, Elsevier, vol. 33(C).
    5. Chang Sik Kim & Jonathan Cairns & Valentina Quarantotti & Bogumil Kaczkowski & Yinhai Wang & Peter Konings & Xiang Zhang, 2024. "A statistical simulation model to guide the choices of analytical methods in arrayed CRISPR screen experiments," PLOS ONE, Public Library of Science, vol. 19(8), pages 1-16, August.
    6. Madeleine Seale & Annamaria Kiss & Simone Bovio & Ignazio Maria Viola & Enrico Mastropaolo & Arezki Boudaoud & Naomi Nakayama, 2022. "Dandelion pappus morphing is actuated by radially patterned material swelling," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    7. JANSSENS, Jochen & DE CORTE, Annelies & SÖRENSEN, Kenneth, 2016. "Water distribution network design optimisation with respect to reliability," Working Papers 2016007, University of Antwerp, Faculty of Business and Economics.
    8. Marion Chatelain, 2023. "Endogeic Earthworms Avoid Soil Mimicking Metal Pollution Levels in Urban Parks," Sustainability, MDPI, vol. 15(15), pages 1-17, July.
    9. Silva Larson & Anne (Giger)-Dray & Tina Cornioley & Manithaythip Thephavanh & Phomma Thammavong & Sisavan Vorlasan & John G. Connell & Magnus Moglia & Peter Case & Kim S. Alexander & Pascal Perez, 2020. "A Game-Based Approach to Exploring Gender Differences in Smallholder Decisions to Change Farming Practices: White Rice Production in Laos," Sustainability, MDPI, vol. 12(16), pages 1-22, August.
    10. repec:ebl:ecbull:v:3:y:2008:i:42:p:1-13 is not listed on IDEAS
    11. Matthew O. Gribble & Karen Bandeen-Roche & Mary A. Fox, 2015. "Determinants of Exposure to Fragranced Product Chemical Mixtures in a Sample of Twins," IJERPH, MDPI, vol. 12(2), pages 1-21, January.
    12. Teruaki Kido & Yuko Yotsumoto & Masamichi J. Hayashi, 2025. "Hierarchical representations of relative numerical magnitudes in the human frontoparietal cortex," Nature Communications, Nature, vol. 16(1), pages 1-15, December.
    13. Sławomir Kujawski & Agnieszka Kujawska & Mariusz Kozakiewicz & Djordje G. Jakovljevic & Błażej Stankiewicz & Julia L. Newton & Kornelia Kędziora-Kornatowska & Paweł Zalewski, 2022. "Effects of Sitting Callisthenic Balance and Resistance Exercise Programs on Cognitive Function in Older Participants," IJERPH, MDPI, vol. 19(22), pages 1-18, November.
    14. Raymond Hernandez & Elizabeth A. Pyatak & Cheryl L. P. Vigen & Haomiao Jin & Stefan Schneider & Donna Spruijt-Metz & Shawn C. Roll, 2021. "Understanding Worker Well-Being Relative to High-Workload and Recovery Activities across a Whole Day: Pilot Testing an Ecological Momentary Assessment Technique," IJERPH, MDPI, vol. 18(19), pages 1-17, October.
    15. Christopher Hassall & Michael Nisbet & Evan Norcliffe & He Wang, 2024. "The Potential Health Benefits of Urban Tree Planting Suggested through Immersive Environments," Land, MDPI, vol. 13(3), pages 1-12, February.
    16. Jie Zhao & Ji Chen & Damien Beillouin & Hans Lambers & Yadong Yang & Pete Smith & Zhaohai Zeng & Jørgen E. Olesen & Huadong Zang, 2022. "Global systematic review with meta-analysis reveals yield advantage of legume-based rotations and its drivers," Nature Communications, Nature, vol. 13(1), pages 1-9, December.
    17. Elisabeth Beckmann & Lukas Olbrich & Joseph Sakshaug, 2024. "Multivariate assessment of interviewer-related errors in a cross-national economic survey (Lukas Olbrich, Elisabeth Beckmann, Joseph W. Sakshaug)," Working Papers 253, Oesterreichische Nationalbank (Austrian Central Bank).
    18. Ellen Poel & Owen O'donnell & Eddy Doorslaer, 2009. "What explains the rural-urban gap in infant mortality: Household or community characteristics?," Demography, Springer;Population Association of America (PAA), vol. 46(4), pages 827-850, November.
    19. Migchelbrink, Koen & Raymaekers, Pieter, 2023. "Nudging people to pay their parking fines on time. Evidence from a cluster-randomized field experiment," Journal of Behavioral and Experimental Economics (formerly The Journal of Socio-Economics), Elsevier, vol. 105(C).
    20. repec:ags:aaea22:335993 is not listed on IDEAS
    21. F J Heather & D Z Childs & A M Darnaude & J L Blanchard, 2018. "Using an integral projection model to assess the effect of temperature on the growth of gilthead seabream Sparus aurata," PLOS ONE, Public Library of Science, vol. 13(5), pages 1-19, May.
    22. Francisco Ruiz-Raya & Jose C Noguera & Alberto Velando, 2022. "Light received by embryos promotes postnatal junior phenotypes in a seabird," Behavioral Ecology, International Society for Behavioral Ecology, vol. 33(6), pages 1047-1057.
