IDEAS home Printed from https://ideas.repec.org/a/spr/compst/v38y2023i1d10.1007_s00180-022-01238-z.html
   My bibliography  Save this article

Labeled optimal partitioning

Author

Listed:
  • Toby Dylan Hocking

    (Northern Arizona University)

  • Anuraag Srivastava

    (Northern Arizona University)

Abstract

In data sequences measured over space or time, an important problem is accurate detection of abrupt changes. In partially labeled data, it is important to correctly predict presence/absence of changes in positive/negative labeled regions, in both the train and test sets. One existing dynamic programming algorithm is designed for prediction in unlabeled test regions (and ignores the labels in the train set); another is for accurate fitting of train labels (but does not predict changepoints in unlabeled test regions). We resolve these issues by proposing a new optimal changepoint detection model that is guaranteed to fit the labels in the train data, and can also provide predictions of unlabeled changepoints in test data. We propose a new dynamic programming algorithm, Labeled Optimal Partitioning, and we provide a formal proof that it solves the resulting non-convex optimization problem. We provide theoretical and empirical analysis of the time complexity of our algorithm, in terms of the number of labels and the size of the data sequence to segment. Finally, we provide empirical evidence that our algorithm is more accurate than the existing baselines, in terms of train and test label error.

Suggested Citation

  • Toby Dylan Hocking & Anuraag Srivastava, 2023. "Labeled optimal partitioning," Computational Statistics, Springer, vol. 38(1), pages 461-480, March.
  • Handle: RePEc:spr:compst:v:38:y:2023:i:1:d:10.1007_s00180-022-01238-z
    DOI: 10.1007/s00180-022-01238-z
    as

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00180-022-01238-z
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s00180-022-01238-z?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
    ---><---

    As the access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    as
    1. Yao, Yi-Ching, 1988. "Estimating the number of change-points via Schwarz' criterion," Statistics & Probability Letters, Elsevier, vol. 6(3), pages 181-189, February.
    2. Nancy R. Zhang & David O. Siegmund, 2007. "A Modified Bayes Information Criterion with Applications to the Analysis of Comparative Genomic Hybridization Data," Biometrics, The International Biometric Society, vol. 63(1), pages 22-32, March.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Davis, Richard A. & Hancock, Stacey A. & Yao, Yi-Ching, 2016. "On consistency of minimum description length model selection for piecewise autoregressions," Journal of Econometrics, Elsevier, vol. 194(2), pages 360-368.
    2. Kurozumi, Eiji & Tuvaandorj, Purevdorj, 2011. "Model selection criteria in multivariate models with multiple structural changes," Journal of Econometrics, Elsevier, vol. 164(2), pages 218-238, October.
    3. Fryzlewicz, Piotr, 2020. "Detecting possibly frequent change-points: Wild Binary Segmentation 2 and steepest-drop model selection," LSE Research Online Documents on Economics 103430, London School of Economics and Political Science, LSE Library.
    4. Neil Kellard & Denise Osborn & Jerry Coakley & Alastair R. Hall & Denise R. Osborn & Nikolaos Sakkas, 2015. "Structural Break Inference Using Information Criteria in Models Estimated by Two-Stage Least Squares," Journal of Time Series Analysis, Wiley Blackwell, vol. 36(5), pages 741-762, September.
    5. Chulwoo Han & Abderrahim Taamouti, 2017. "Partial Structural Break Identification," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 79(2), pages 145-164, April.
    6. Jaromír Antoch & Daniela Jarušková, 2013. "Testing for multiple change points," Computational Statistics, Springer, vol. 28(5), pages 2161-2183, October.
    7. Yoshiyuki Ninomiya, 2015. "Change-point model selection via AIC," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 67(5), pages 943-961, October.
    8. Sang Gil Kang & Woo Dong Lee & Yongku Kim, 2021. "Bayesian Multiple Change-Points Detection in a Normal Model with Heterogeneous Variances," Computational Statistics, Springer, vol. 36(2), pages 1365-1390, June.
    9. Shi, Xuesheng & Gallagher, Colin & Lund, Robert & Killick, Rebecca, 2022. "A comparison of single and multiple changepoint techniques for time series data," Computational Statistics & Data Analysis, Elsevier, vol. 170(C).
    10. Alastair R. Hall & Denise R. Osborn & Nikolaos Sakkas, 2013. "Inference on Structural Breaks using Information Criteria," Manchester School, University of Manchester, vol. 81, pages 54-81, October.
    11. Jian Li & Fugee Tsung & Changliang Zou, 2013. "Directional change‐point detection for process control with multivariate categorical data," Naval Research Logistics (NRL), John Wiley & Sons, vol. 60(2), pages 160-173, March.
    12. Young Hoon Lee, 2009. "The Impact of Postseason Restructuring on the Competitive Balance and Fan Demand in Major League Baseball," Journal of Sports Economics, , vol. 10(3), pages 219-235, June.
    13. Wang, Lu & Zhou, Ruichao & Wu, Jianhong, 2021. "Determining the number of breaks in large dimensional factor models with structural changes," Economics Letters, Elsevier, vol. 199(C).
    14. Hayley Jang & Young Hoon Lee & Rodney Fort, 2019. "Winning In Professional Team Sports: Historical Moments," Economic Inquiry, Western Economic Association International, vol. 57(1), pages 103-120, January.
    15. Devi, P. Indira & Shanmugam, K.R. & Jayasree, M.G., 2012. "Compensating Wages for Occupational Risks of Farm Workers in India," Indian Journal of Agricultural Economics, Indian Society of Agricultural Economics, vol. 67(2), pages 1-12.
    16. Gil-Alana, Luis A. & Dadgar, Yadollah & Nazari, Rouhollah, 2020. "An analysis of the OPEC and non-OPEC position in the World Oil Market: A fractionally integrated approach," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 541(C).
    17. Altansukh, Gantungalag & Becker, Ralf & Bratsiotis, George J. & Osborn, Denise R., 2017. "What is the Globalisation of Inflation?," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 74, pages 1-27.
    18. Boldea, Otilia & Hall, Alastair R., 2013. "Estimation and inference in unstable nonlinear least squares models," Journal of Econometrics, Elsevier, vol. 172(1), pages 158-167.
    19. Lee, Chung-Bow, 1996. "Nonparametric multiple change-point estimators," Statistics & Probability Letters, Elsevier, vol. 27(4), pages 295-304, May.
    20. Kayhan, Selim & Adiguzel, Uğur & Bayat, Tayfur & Lebe, Fuat, 2010. "Causality Relationship between Real GDP and Electricity Consumption in Romania (2001-2010)," Journal for Economic Forecasting, Institute for Economic Forecasting, vol. 0(4), pages 169-183, December.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:spr:compst:v:38:y:2023:i:1:d:10.1007_s00180-022-01238-z. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Sonal Shukla or Springer Nature Abstracting and Indexing (email available below). General contact details of provider: http://www.springer.com .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.