
Coresets for Time Series Clustering

Author

Listed:
  • Lingxiao Huang
  • K. Sudhir
  • Nisheeth K. Vishnoi

Abstract

We study the problem of constructing coresets for clustering problems with time series data. This problem has gained importance across many fields including biology, medicine, and economics due to the proliferation of sensors facilitating real-time measurement and the rapid drop in storage costs. In particular, we consider the setting where the time series data on $N$ entities is generated from a Gaussian mixture model with autocorrelations over $k$ clusters in $\mathbb{R}^d$. Our main contribution is an algorithm to construct coresets for the maximum likelihood objective for this mixture model. Our algorithm is efficient, and under a mild boundedness assumption on the covariance matrices of the underlying Gaussians, the size of the coreset is independent of the number of entities $N$ and the number of observations for each entity, and depends only polynomially on $k$, $d$ and $1/\varepsilon$, where $\varepsilon$ is the error parameter. We empirically assess the performance of our coreset with synthetic data.
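
To make the idea of a coreset concrete, the following is a minimal sketch and not the paper's construction. It assumes a much simpler setup than the abstract describes: each entity's series is a cluster mean plus AR(1) Gaussian noise, the objective is a k-means-style surrogate cost rather than the maximum likelihood objective, and the "coreset" is a plain uniform sample of entities with weight N/m each. All function names and parameters here are hypothetical and chosen for illustration only.

```python
import math
import random


def simulate_entity(mean, rho, sigma, T, rng):
    """One entity: x_t = mean + e_t, with AR(1) noise e_t = rho * e_{t-1} + N(0, sigma^2)."""
    e = [0.0] * len(mean)
    series = []
    for _ in range(T):
        e = [rho * e_j + rng.gauss(0.0, sigma) for e_j in e]
        series.append([m_j + e_j for m_j, e_j in zip(mean, e)])
    return series


def entity_cost(series, centers):
    """Surrogate cost of one entity: summed squared distance to its best single center."""
    return min(
        sum(sum((x_j - c_j) ** 2 for x_j, c_j in zip(x, c)) for x in series)
        for c in centers
    )


def total_cost(entities, weights, centers):
    """Weighted clustering cost over a collection of entities."""
    return sum(w * entity_cost(s, centers) for s, w in zip(entities, weights))


def uniform_weighted_sample(entities, m, rng):
    """Stand-in for a coreset: m entities sampled without replacement, each weighted N/m."""
    n = len(entities)
    idx = rng.sample(range(n), m)
    return [entities[i] for i in idx], [n / m] * m


# Assumed toy configuration: k = 2 clusters in R^2, N = 200 entities, T = 20 observations.
rng = random.Random(0)
true_means = [[0.0, 0.0], [5.0, 5.0]]
entities = [
    simulate_entity(true_means[i % 2], rho=0.5, sigma=1.0, T=20, rng=rng)
    for i in range(200)
]

full = total_cost(entities, [1.0] * len(entities), true_means)
subset, weights = uniform_weighted_sample(entities, 50, rng)
approx = total_cost(subset, weights, true_means)
relative_error = abs(approx - full) / full
print(f"full cost {full:.1f}, weighted-sample cost {approx:.1f}, relative error {relative_error:.3f}")
```

Uniform sampling gives no worst-case guarantee over all choices of centers; a coreset in the sense of the abstract is a weighted subset whose cost stays within a $(1 \pm \varepsilon)$ factor of the full cost for every admissible parameter setting, which generally requires a more careful, non-uniform construction.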

Suggested Citation

  • Lingxiao Huang & K. Sudhir & Nisheeth K. Vishnoi, 2021. "Coresets for Time Series Clustering," Papers 2110.15263, arXiv.org.
  • Handle: RePEc:arx:papers:2110.15263

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2110.15263
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Pesaran, M. Hashem, 2015. "Time Series and Panel Data Econometrics," OUP Catalogue, Oxford University Press, number 9780198759980.
    2. Peter Arcidiacono & John Bailey Jones, 2003. "Finite Mixture Distributions, Sequential Likelihood and the EM Algorithm," Econometrica, Econometric Society, vol. 71(3), pages 933-946, May.
    3. Lingxiao Huang & K. Sudhir & Nisheeth K. Vishnoi, 2020. "Coresets for Regressions with Panel Data," Papers 2011.00981, arXiv.org, revised Nov 2020.
    4. Joong-Ho Won & Johan Lim & Seung-Jean Kim & Bala Rajaratnam, 2013. "Condition-number-regularized covariance estimation," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 75(3), pages 427-450, June.

    Citations


    Cited by:

    1. T. Tony Ke & K. Sudhir, 2023. "Privacy Rights and Data Security: GDPR and Personal Data Markets," Management Science, INFORMS, vol. 69(8), pages 4389-4412, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lingxiao Huang & K. Sudhir & Nisheeth Vishnoi, 2021. "Coresets for Time Series Clustering," Cowles Foundation Discussion Papers 2310, Cowles Foundation for Research in Economics, Yale University.
    2. Claudia García-García & Catalina B. García-García & Román Salmerón, 2021. "Confronting collinearity in environmental regression models: evidence from world data," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(3), pages 895-926, September.
    3. Hannart, Alexis & Naveau, Philippe, 2014. "Estimating high dimensional covariance matrices: A new look at the Gaussian conjugate framework," Journal of Multivariate Analysis, Elsevier, vol. 131(C), pages 149-162.
    4. Ghosh, Soumya Kanti & Nath, Hiranya K., 2023. "What determines private and household savings in India?," International Review of Economics & Finance, Elsevier, vol. 86(C), pages 639-651.
    5. Chudik, Alexander & Pesaran, M. Hashem, 2019. "Mean group estimation in presence of weakly cross-correlated estimators," Economics Letters, Elsevier, vol. 175(C), pages 101-105.
    6. Pouta, Eija & Myyra, Sami & Hanninen, Harri, 2009. "Heterogeneous farmland owners: two approaches for objective based classification," 2009 Conference, August 16-22, 2009, Beijing, China 50787, International Association of Agricultural Economists.
    7. Francisco Javier Forcadell & Fernando Úbeda, 2022. "Individual entrepreneurial orientation and performance: the mediating role of international entrepreneurship," International Entrepreneurship and Management Journal, Springer, vol. 18(2), pages 875-900, June.
    8. Emmanuel Anyigbah & Yusheng Kong & Bless Kofi Edziah & Ahotovi Thomas Ahoto & Wilhelmina Seyome Ahiaku, 2023. "Board Characteristics and Corporate Sustainability Reporting: Evidence from Chinese Listed Companies," Sustainability, MDPI, vol. 15(4), pages 1-26, February.
    9. Orazio Attanasio & Sarah Cattan & Emla Fitzsimons & Costas Meghir & Marta Rubio-Codina, 2020. "Estimating the Production Function for Human Capital: Results from a Randomized Controlled Trial in Colombia," American Economic Review, American Economic Association, vol. 110(1), pages 48-85, January.
    10. Fabien Postel-Vinay & Hélène Turon, 2007. "The Public Pay Gap in Britain: Small Differences That (Don't?) Matter," Economic Journal, Royal Economic Society, vol. 117(523), pages 1460-1503, October.
    11. Bowei Guo & Giorgio Castagneto Gissey, 2019. "Cost Pass-through in the British Wholesale Electricity Market: Implications of Brexit and the ETS reform," Working Papers EPRG1937, Energy Policy Research Group, Cambridge Judge Business School, University of Cambridge.
    12. Polemis, Michael & Tselekounis, Markos, 2019. "Does deregulation drive innovation intensity? Lessons learned from the OECD telecommunications sector," MPRA Paper 92770, University Library of Munich, Germany.
    13. Arestis, Philip & Ferreiro, Jesus & Gomez, Carmen, 2023. "Does employment protection legislation affect employment and unemployment?," Economic Modelling, Elsevier, vol. 126(C).
    14. Marco Costanigro & Yuko Onozaka, 2020. "A Belief‐Preference Model of Choice for Experience and Credence Goods," Journal of Agricultural Economics, Wiley Blackwell, vol. 71(1), pages 70-95, February.
    15. Elisa Cavatorta & Ron P. Smith, 2017. "Factor Models in Panels with Cross-sectional Dependence: An Application to the Extended SIPRI Military Expenditure Data," Defence and Peace Economics, Taylor & Francis Journals, vol. 28(4), pages 437-456, July.
    16. Magali BEFFY & Thierry KAMIONKA, 2010. "Public-private Wage Gaps : Is Civil-servant Human Capital Sector-specific ?," Working Papers 2010-55, Center for Research in Economics and Statistics.
    17. Catalán, Mario & Hoffmaister, Alexander W., 2022. "When banks punch back: Macrofinancial feedback loops in stress tests," Journal of International Money and Finance, Elsevier, vol. 124(C).
    18. Kristin Forbes & Ida Hjortsoe & Tsvetelina Nenova, 2020. "International Evidence on Shock-Dependent Exchange Rate Pass-Through," IMF Economic Review, Palgrave Macmillan;International Monetary Fund, vol. 68(4), pages 721-763, December.
    19. Raslan Alzubi & Mustafa Caglayan & Kostas Mouratidis, 2017. "The Risk-Taking Channel in the US: A GVAR Approach," Working Papers 2017009, The University of Sheffield, Department of Economics.
    20. Thomas H. W. Ziesemer, 2022. "Foreign R&D spillovers to the USA and strategic reactions," Applied Economics, Taylor & Francis Journals, vol. 54(37), pages 4274-4291, August.
