
Coresets for Time Series Clustering

Authors

Lingxiao Huang, K. Sudhir, Nisheeth Vishnoi

Abstract

We study the problem of constructing coresets for clustering problems with time series data. This problem has gained importance across many fields, including biology, medicine, and economics, due to the proliferation of sensors for real-time measurement and a rapid drop in storage costs. In particular, we consider the setting where the time series data on N entities is generated from a Gaussian mixture model with autocorrelations over k clusters in R^d. Our main contribution is an algorithm to construct coresets for the maximum likelihood objective for this mixture model. Our algorithm is efficient, and, under a mild assumption on the covariance matrices of the Gaussians, the size of the coreset is independent of the number of entities N and the number of observations for each entity, and depends only polynomially on k, d and 1/ε, where ε is the error parameter. We empirically assess the performance of our coresets with synthetic data.
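
To make the notion of a coreset concrete: in the usual sense, it is a small weighted subset of the data whose objective value stays within a (1 ± ε) factor of the full-data objective for every choice of model parameters. The sketch below is only an illustration under simplifying assumptions and is not the paper's construction: it uses scalar series (d = 1), AR(1) noise, a squared-distance cost in place of the maximum likelihood objective, and plain uniform sampling of entities. All function names and parameter values are hypothetical.

```python
import random

def simulate_panel(n_entities, n_obs, means, rho, sigma, seed=0):
    """Draw one scalar time series per entity from a mixture with AR(1) noise.

    Assumption for illustration: each entity picks a cluster mean uniformly at
    random, and its noise follows e_t = rho * e_{t-1} + Normal(0, sigma).
    """
    rng = random.Random(seed)
    panel = []
    for _ in range(n_entities):
        mu = rng.choice(means)
        err, series = 0.0, []
        for _ in range(n_obs):
            err = rho * err + rng.gauss(0.0, sigma)
            series.append(mu + err)
        panel.append(series)
    return panel

def clustering_cost(panel, weights, centers):
    """Weighted sum over entities of the squared distance to the best center."""
    total = 0.0
    for series, w in zip(panel, weights):
        total += w * min(sum((x - c) ** 2 for x in series) for c in centers)
    return total

def uniform_weighted_subset(panel, m, seed=1):
    """Sample m entities without replacement; weight each by N / m."""
    rng = random.Random(seed)
    idx = rng.sample(range(len(panel)), m)
    w = len(panel) / m
    return [panel[i] for i in idx], [w] * m

means = [-2.0, 2.0]
panel = simulate_panel(n_entities=2000, n_obs=20, means=means, rho=0.5, sigma=1.0)
full_cost = clustering_cost(panel, [1.0] * len(panel), means)
subset, weights = uniform_weighted_subset(panel, m=200)
approx_cost = clustering_cost(subset, weights, means)
rel_err = abs(approx_cost - full_cost) / full_cost
print(f"relative error of the weighted subset: {rel_err:.3f}")
```

The comparison above checks a single set of centers. A coreset guarantee is stronger, since the approximation must hold simultaneously for all parameter choices, which uniform sampling does not ensure in general.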

Suggested Citation

  • Lingxiao Huang & K. Sudhir & Nisheeth Vishnoi, 2021. "Coresets for Time Series Clustering," Cowles Foundation Discussion Papers 2310, Cowles Foundation for Research in Economics, Yale University.
  • Handle: RePEc:cwl:cwldpp:2310

    Download full text from publisher

    File URL: https://cowles.yale.edu/sites/default/files/files/pub/d23/d2310.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Peter Arcidiacono & John Bailey Jones, 2003. "Finite Mixture Distributions, Sequential Likelihood and the EM Algorithm," Econometrica, Econometric Society, vol. 71(3), pages 933-946, May.
    2. Lingxiao Huang & K. Sudhir & Nisheeth K. Vishnoi, 2020. "Coresets for Regressions with Panel Data," Papers 2011.00981, arXiv.org, revised Nov 2020.
    3. Joong-Ho Won & Johan Lim & Seung-Jean Kim & Bala Rajaratnam, 2013. "Condition-number-regularized covariance estimation," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 75(3), pages 427-450, June.
    4. Pesaran, M. Hashem, 2015. "Time Series and Panel Data Econometrics," OUP Catalogue, Oxford University Press, number 9780198759980.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lingxiao Huang & K. Sudhir & Nisheeth K. Vishnoi, 2021. "Coresets for Time Series Clustering," Papers 2110.15263, arXiv.org.
    2. Hannart, Alexis & Naveau, Philippe, 2014. "Estimating high dimensional covariance matrices: A new look at the Gaussian conjugate framework," Journal of Multivariate Analysis, Elsevier, vol. 131(C), pages 149-162.
    3. Ghosh, Soumya Kanti & Nath, Hiranya K., 2023. "What determines private and household savings in India?," International Review of Economics & Finance, Elsevier, vol. 86(C), pages 639-651.
    4. Francisco Javier Forcadell & Fernando Úbeda, 2022. "Individual entrepreneurial orientation and performance: the mediating role of international entrepreneurship," International Entrepreneurship and Management Journal, Springer, vol. 18(2), pages 875-900, June.
    5. Emmanuel Anyigbah & Yusheng Kong & Bless Kofi Edziah & Ahotovi Thomas Ahoto & Wilhelmina Seyome Ahiaku, 2023. "Board Characteristics and Corporate Sustainability Reporting: Evidence from Chinese Listed Companies," Sustainability, MDPI, vol. 15(4), pages 1-26, February.
    6. Orazio Attanasio & Sarah Cattan & Emla Fitzsimons & Costas Meghir & Marta Rubio-Codina, 2020. "Estimating the Production Function for Human Capital: Results from a Randomized Controlled Trial in Colombia," American Economic Review, American Economic Association, vol. 110(1), pages 48-85, January.
    7. Bowei Guo & Giorgio Castagneto Gissey, 2019. "Cost Pass-through in the British Wholesale Electricity Market: Implications of Brexit and the ETS reform," Working Papers EPRG1937, Energy Policy Research Group, Cambridge Judge Business School, University of Cambridge.
    8. Polemis, Michael & Tselekounis, Markos, 2019. "Does deregulation drive innovation intensity? Lessons learned from the OECD telecommunications sector," MPRA Paper 92770, University Library of Munich, Germany.
    9. Marco Costanigro & Yuko Onozaka, 2020. "A Belief‐Preference Model of Choice for Experience and Credence Goods," Journal of Agricultural Economics, Wiley Blackwell, vol. 71(1), pages 70-95, February.
    10. Kristin Forbes & Ida Hjortsoe & Tsvetelina Nenova, 2020. "International Evidence on Shock-Dependent Exchange Rate Pass-Through," IMF Economic Review, Palgrave Macmillan; International Monetary Fund, vol. 68(4), pages 721-763, December.
    11. Jad Beyhum & Eric Gautier, 2020. "Factor and factor loading augmented estimators for panel regression," Working Papers hal-02957008, HAL.
    12. Attanasio, Orazio & Cattan, Sarah & Fitzsimons, Emla & Meghir, Costas & Rubio-Codina, Marta, 2015. "Estimating the Production Function for Human Capital: Results from a Randomized Control Trial in Colombia," IZA Discussion Papers 8856, Institute of Labor Economics (IZA).
    13. Finja Lena Kind & Jennifer Zeppenfeld & Rainer Lueg, 2023. "The impact of chief executive officer narcissism on environmental, social, and governance reporting," Business Strategy and the Environment, Wiley Blackwell, vol. 32(7), pages 4448-4466, November.
    14. Chakraborty, Saptorshee Kanto & Mazzanti, Massimiliano, 2021. "Renewable electricity and economic growth relationship in the long run: Panel data econometric evidence from the OECD," Structural Change and Economic Dynamics, Elsevier, vol. 59(C), pages 330-341.
    15. Peter Arcidiacono & Holger Sieg & Frank Sloan, 2007. "Living Rationally Under The Volcano? An Empirical Analysis Of Heavy Drinking And Smoking," International Economic Review, Department of Economics, University of Pennsylvania and Osaka University Institute of Social and Economic Research Association, vol. 48(1), pages 37-65, February.
    16. Belke Ansgar & Dreger Christian, 2019. "Did Interest Rates at the Zero Lower Bound Affect Lending of Commercial Banks? Evidence for the Euro Area," Journal of Economics and Statistics (Jahrbuecher fuer Nationaloekonomie und Statistik), De Gruyter, vol. 239(5-6), pages 841-860, October.
    17. Temple, Jonathan & Van de Sijpe, Nicolas, 2017. "Foreign aid and domestic absorption," Journal of International Economics, Elsevier, vol. 108(C), pages 431-443.
    18. Schulz, Jan & Mayerhoffer, Daniel M., 2021. "A network approach to consumption," BERG Working Paper Series 173, Bamberg University, Bamberg Economic Research Group.
    19. Artur Tarassow, 2017. "Forecasting growth of U.S. aggregate and household-sector M2 after 2000 using economic uncertainty measures," Macroeconomics and Finance Series 201702, University of Hamburg, Department of Socioeconomics.
    20. Dario Laudati & M. Hashem Pesaran, 2023. "Identifying the effects of sanctions on the Iranian economy using newspaper coverage," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 38(3), pages 271-294, April.
