Printed from https://ideas.repec.org/a/gam/jmathe/v8y2020i11p1942-d439496.html

Variational Inference over Nonstationary Data Streams for Exponential Family Models

Author

Listed:
  • Andrés R. Masegosa

    (Department of Mathematics and Center for Development and Transfer of Mathematical Research to Industry (CDTIME), University of Almería, 04120 Almería, Spain)

  • Darío Ramos-López

    (Department of Applied Mathematics, Materials Science and Engineering, and Electronic Technology, Rey Juan Carlos University, 28933 Móstoles, Spain)

  • Antonio Salmerón

    (Department of Mathematics and Center for Development and Transfer of Mathematical Research to Industry (CDTIME), University of Almería, 04120 Almería, Spain)

  • Helge Langseth

    (Department of Computer Science, Norwegian University of Science and Technology, 7491 Trondheim, Norway)

  • Thomas D. Nielsen

    (Department of Computer Science, Aalborg University, 9220 Aalborg, Denmark)

Abstract

In many modern data analysis problems, the available data is not static but instead arrives as a stream. Performing Bayesian inference on a data stream is challenging for several reasons. First, it requires continuous model updating and the ability to handle a posterior distribution conditioned on an unbounded data set. Second, the underlying data distribution may drift from one time step to another, so the classic i.i.d. (independent and identically distributed) or data exchangeability assumption no longer holds. In this paper, we present an approximate Bayesian inference approach using variational methods that addresses these issues for conjugate exponential family models with latent variables. Our proposal makes use of a novel scheme based on hierarchical priors to explicitly model temporal changes of the model parameters. We show how this approach induces an exponential forgetting mechanism with adaptive forgetting rates. The method is able to capture the smoothness of the concept drift, ranging from no drift to abrupt drift. The proposed variational inference scheme maintains the computational efficiency of variational methods over conjugate models, which is critical in streaming settings. The approach is validated on four different domains (energy, finance, geolocation, and text) using four real-world data sets.
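
As a rough illustration of what exponential forgetting means in a conjugate model, the sketch below applies a fixed forgetting rate to a Beta-Bernoulli posterior. It is not the method of the paper: the paper describes adaptive rates derived from hierarchical priors, whereas here the rate `rho` is a hand-picked constant, and the names `BetaPosterior` and `update_with_forgetting` are hypothetical helpers introduced only for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class BetaPosterior:
    """Beta(alpha, beta) belief over a Bernoulli success probability."""
    alpha: float = 1.0
    beta: float = 1.0

    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)


def update_with_forgetting(
    post: BetaPosterior,
    batch: Sequence[int],
    rho: float,
    prior: Optional[BetaPosterior] = None,
) -> BetaPosterior:
    """One streaming step with a fixed forgetting rate rho in [0, 1].

    Assumption for this sketch: the previous posterior is pulled back
    towards the prior by interpolating its pseudo-counts, and the new
    batch is then absorbed with the usual conjugate update.
    rho = 1 keeps all past evidence; rho = 0 discards it entirely.
    """
    if prior is None:
        prior = BetaPosterior(1.0, 1.0)
    # Discount old evidence: weight rho on the past, 1 - rho on the prior.
    alpha = rho * post.alpha + (1.0 - rho) * prior.alpha
    beta = rho * post.beta + (1.0 - rho) * prior.beta
    # Standard Beta-Bernoulli conjugate update with the new batch counts.
    ones = sum(batch)
    zeros = len(batch) - ones
    return BetaPosterior(alpha + ones, beta + zeros)


def run_stream(batches: Sequence[Sequence[int]], rho: float) -> BetaPosterior:
    """Process the batches in order, starting from a uniform Beta(1, 1)."""
    post = BetaPosterior()
    for batch in batches:
        post = update_with_forgetting(post, batch, rho)
    return post


# A stream whose success rate drops abruptly from 1 to 0.
stream = [[1] * 10] * 20 + [[0] * 10] * 2
static = run_stream(stream, rho=1.0)      # no forgetting
forgetful = run_stream(stream, rho=0.9)   # exponential forgetting
print(f"rho=1.0 mean: {static.mean():.3f}")
print(f"rho=0.9 mean: {forgetful.mean():.3f}")
```

With `rho = 1.0` the update reduces to ordinary Bayesian updating under the i.i.d. assumption, so the estimate reacts slowly after the change. With `rho < 1` evidence that is k steps old is weighted by `rho ** k`, so the estimate follows the drift more quickly at the cost of a noisier posterior when the stream is in fact stationary.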

Suggested Citation

  • Andrés R. Masegosa & Darío Ramos-López & Antonio Salmerón & Helge Langseth & Thomas D. Nielsen, 2020. "Variational Inference over Nonstationary Data Streams for Exponential Family Models," Mathematics, MDPI, vol. 8(11), pages 1-27, November.
  • Handle: RePEc:gam:jmathe:v:8:y:2020:i:11:p:1942-:d:439496

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/8/11/1942/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/8/11/1942/
    Download Restriction: no

    References listed on IDEAS

    1. David M. Blei & Alp Kucukelbir & Jon D. McAuliffe, 2017. "Variational Inference: A Review for Statisticians," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(518), pages 859-877, April.
    2. Ibrahim J.G. & Chen M-H. & Sinha D., 2003. "On Optimality Properties of the Power Prior," Journal of the American Statistical Association, American Statistical Association, vol. 98, pages 204-213, January.
    3. Michael E. Tipping & Christopher M. Bishop, 1999. "Probabilistic Principal Component Analysis," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 61(3), pages 611-622.
    4. Kostas Triantafyllopoulos, 2009. "Inference of Dynamic Generalized Linear Models: On‐Line Computation and Appraisal," International Statistical Review, International Statistical Institute, vol. 77(3), pages 430-450, December.
    5. Xi Chen & Kaoru Irie & David Banks & Robert Haslinger & Jewell Thomas & Mike West, 2018. "Scalable Bayesian Modeling, Monitoring, and Analysis of Dynamic Network Flow Data," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(522), pages 519-533, April.

    Citations


    Cited by:

    1. Wang, Yu & Liu, Qiufa & Lu, Wenjian & Peng, Yizhen, 2023. "A general time-varying Wiener process for degradation modeling and RUL estimation under three-source variability," Reliability Engineering and System Safety, Elsevier, vol. 232(C).
    2. Antonio Salmerón, 2022. "Comments on: Hybrid semiparametric Bayesian networks," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(2), pages 331-334, June.
    3. Krzysztof Drachal & Daniel González Cortés, 2022. "Estimation of Lockdowns’ Impact on Well-Being in Selected Countries: An Application of Novel Bayesian Methods and Google Search Queries Data," IJERPH, MDPI, vol. 20(1), pages 1-24, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Nolan, Tui H. & Richardson, Sylvia & Ruffieux, Hélène, 2025. "Efficient Bayesian functional principal component analysis of irregularly-observed multivariate curves," Computational Statistics & Data Analysis, Elsevier, vol. 203(C).
    2. Jianhua Zhao & Changchun Shang & Shulan Li & Ling Xin & Philip L. H. Yu, 2025. "Choosing the number of factors in factor analysis with incomplete data via a novel hierarchical Bayesian information criterion," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 19(1), pages 209-235, March.
    3. Mike West, 2020. "Bayesian forecasting of multivariate time series: scalability, structure uncertainty and decisions," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 72(1), pages 1-31, February.
    4. José A. Perusquía & Jim E. Griffin & Cristiano Villa, 2022. "Bayesian Models Applied to Cyber Security Anomaly Detection Problems," International Statistical Review, International Statistical Institute, vol. 90(1), pages 78-99, April.
    5. Shen Liu & Hongyan Liu, 2021. "Tagging Items Automatically Based on Both Content Information and Browsing Behaviors," INFORMS Journal on Computing, INFORMS, vol. 33(3), pages 882-897, July.
    6. Xin Xu & Yang Lu & Yupeng Zhou & Zhiguo Fu & Yanjie Fu & Minghao Yin, 2021. "An Information-Explainable Random Walk Based Unsupervised Network Representation Learning Framework on Node Classification Tasks," Mathematics, MDPI, vol. 9(15), pages 1-14, July.
    7. Luo, Nanyu & Ji, Feng & Han, Yuting & He, Jinbo & Zhang, Xiaoya, 2024. "Fitting item response theory models using deep learning computational frameworks," OSF Preprints tjxab, Center for Open Science.
    8. Matteo Barigozzi & Marc Hallin, 2023. "Dynamic Factor Models: a Genealogy," Papers 2310.17278, arXiv.org, revised Jan 2024.
    9. Chen, Andrew Y. & McCoy, Jack, 2024. "Missing values handling for machine learning portfolios," Journal of Financial Economics, Elsevier, vol. 155(C).
    10. Wang, Shao-Hsuan & Huang, Su-Yun, 2022. "Perturbation theory for cross data matrix-based PCA," Journal of Multivariate Analysis, Elsevier, vol. 190(C).
    11. Liu, Jie & Ye, Zifeng & Chen, Kun & Zhang, Panpan, 2024. "Variational Bayesian inference for bipartite mixed-membership stochastic block model with applications to collaborative filtering," Computational Statistics & Data Analysis, Elsevier, vol. 189(C).
    12. Djohan Bonnet & Tifenn Hirtzlin & Atreya Majumdar & Thomas Dalgaty & Eduardo Esmanhotto & Valentina Meli & Niccolo Castellani & Simon Martin & Jean-François Nodin & Guillaume Bourgeois & Jean-Michel P, 2023. "Bringing uncertainty quantification to the extreme-edge with memristor-based Bayesian neural networks," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    13. Wentao Qu & Xianchao Xiu & Huangyue Chen & Lingchen Kong, 2023. "A Survey on High-Dimensional Subspace Clustering," Mathematics, MDPI, vol. 11(2), pages 1-39, January.
    14. Seokhyun Chung & Raed Al Kontar & Zhenke Wu, 2022. "Weakly Supervised Multi-output Regression via Correlated Gaussian Processes," INFORMS Journal on Data Science, INFORMS, vol. 1(2), pages 115-137, October.
    15. Gary Koop & Dimitris Korobilis, 2023. "Bayesian Dynamic Variable Selection In High Dimensions," International Economic Review, Department of Economics, University of Pennsylvania and Osaka University Institute of Social and Economic Research Association, vol. 64(3), pages 1047-1074, August.
    16. Ziqi Zhang & Xinye Zhao & Mehak Bindra & Peng Qiu & Xiuwei Zhang, 2024. "scDisInFact: disentangled learning for integration and prediction of multi-batch multi-condition single-cell RNA-sequencing data," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    17. Dimitris Korobilis & Davide Pettenuzzo, 2020. "Machine Learning Econometrics: Bayesian algorithms and methods," Working Papers 2020_09, Business School - Economics, University of Glasgow.
    18. Jan Prüser & Florian Huber, 2024. "Nonlinearities in macroeconomic tail risk through the lens of big data quantile regressions," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 39(2), pages 269-291, March.
    19. Bansal, Prateek & Krueger, Rico & Graham, Daniel J., 2021. "Fast Bayesian estimation of spatial count data models," Computational Statistics & Data Analysis, Elsevier, vol. 157(C).
    20. Jiaju Miao & Pawel Polak, 2023. "Online Ensemble of Models for Optimal Predictive Performance with Applications to Sector Rotation Strategy," Papers 2304.09947, arXiv.org.
