
Algorithms for Generalized Clusterwise Linear Regression

Author

Listed:
  • Young Woong Park

    (Cox School of Business, Southern Methodist University, Dallas, Texas 75275)

  • Yan Jiang

    (Sears Holdings Corporation, Hoffman Estates, Illinois 60179)

  • Diego Klabjan

    (Department of Industrial Engineering and Management Sciences, Northwestern University, Evanston, Illinois 60208)

  • Loren Williams

    (Ernst & Young LLP, Atlanta, Georgia 30308)

Abstract

Clusterwise linear regression (CLR), a clustering problem intertwined with regression, finds clusters of entities such that the overall sum of squared errors from regressions performed over these clusters is minimized, where each cluster may have different variances. We generalize the CLR problem by allowing each entity to have more than one observation and refer to this as generalized CLR. For solving generalized CLR, we propose an exact mathematical programming-based approach relying on column generation, a column generation–based heuristic algorithm that clusters predefined groups of entities, a metaheuristic genetic algorithm with an adapted Lloyd’s algorithm for K-means clustering, a two-stage approach, and a modified algorithm of Späth [Späth (1979) Algorithm 39: Clusterwise linear regression. Computing 22(4):367–373]. We examine the performance of our algorithms on a stock-keeping unit (SKU) clustering problem employed in forecasting halo and cannibalization effects in promotions using real-world retail data from a large supermarket chain. In the SKU clustering problem, the retailer needs to cluster SKUs based on their seasonal effects in response to promotions. The seasonal effects result from regressions, performed over the generated clusters, whose predictors are promotion mechanisms and seasonal dummies. We compare the performance of all proposed algorithms for the SKU problem with real-world and synthetic data.
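
To make the problem statement concrete, here is a minimal sketch of an alternating heuristic for generalized CLR, in the spirit of Lloyd's algorithm: refit an ordinary least-squares model per cluster, then move each entity (with all of its observations) to the cluster whose model gives it the smallest sum of squared errors. This is not the authors' implementation. The names fit_ols and generalized_clr, the random initial partition, and the synthetic data are assumptions made for illustration only.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y ~ X (X must already contain an intercept column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def generalized_clr(X, y, entity, k, n_iter=50, seed=0):
    """Alternating heuristic (hypothetical helper, not the paper's code).

    X: (n, p) design matrix, y: (n,) responses,
    entity: (n,) integer entity id in 0..m-1 for each observation,
    k: number of clusters.
    Returns (cluster label per entity, list of k coefficient vectors, total SSE).
    """
    rng = np.random.default_rng(seed)
    m = int(entity.max()) + 1
    p = X.shape[1]
    labels = rng.integers(0, k, size=m)      # random initial partition of entities
    betas = [np.zeros(p) for _ in range(k)]

    def refit():
        # Regression step: fit each non-empty cluster on all observations of its entities.
        for c in range(k):
            mask = labels[entity] == c
            if mask.any():
                betas[c] = fit_ols(X[mask], y[mask])

    for _ in range(n_iter):
        refit()
        # Assignment step: squared residual of every observation under every cluster
        # model, summed per entity so that an entity always moves as a whole.
        sq = np.stack([(y - X @ b) ** 2 for b in betas], axis=1)   # shape (n, k)
        cost = np.zeros((m, k))
        np.add.at(cost, entity, sq)
        new_labels = cost.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels

    refit()                                  # make coefficients match the final labels
    B = np.stack(betas)                      # shape (k, p)
    pred = np.einsum("ij,ij->i", X, B[labels[entity]])
    sse = float(((y - pred) ** 2).sum())
    return labels, betas, sse

# Synthetic example (assumed data): 10 entities with 5 observations each,
# generated from two different linear relationships plus small noise.
rng = np.random.default_rng(1)
entity = np.repeat(np.arange(10), 5)
x = rng.uniform(-1, 1, size=entity.size)
X = np.column_stack([np.ones_like(x), x])
true_beta = np.where(entity[:, None] < 5, [1.0, 2.0], [-3.0, -1.0])
y = (X * true_beta).sum(axis=1) + 0.05 * rng.normal(size=x.size)

labels, betas, sse = generalized_clr(X, y, entity, k=2)
print(labels, sse)
```

Like any alternating scheme of this kind, the sketch only reaches a local optimum that depends on the initial partition, which is one reason exact and metaheuristic approaches such as those listed in the abstract are of interest.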

Suggested Citation

  • Young Woong Park & Yan Jiang & Diego Klabjan & Loren Williams, 2017. "Algorithms for Generalized Clusterwise Linear Regression," INFORMS Journal on Computing, INFORMS, vol. 29(2), pages 301-317, May.
  • Handle: RePEc:inm:orijoc:v:29:y:2017:i:2:p:301-317
    DOI: 10.1287/ijoc.2016.0729

    Download full text from publisher

    File URL: https://doi.org/10.1287/ijoc.2016.0729
    Download Restriction: no


    References listed on IDEAS

    1. Vanderbeck, F. & Wolsey, L. A., 1996. "An exact algorithm for IP column generation," LIDAM Reprints CORE 1242, Université catholique de Louvain, Center for Operations Research and Econometrics (CORE).
    2. Carbonneau, Réal A. & Caporossi, Gilles & Hansen, Pierre, 2011. "Globally optimal clusterwise regression by mixed logical-quadratic programming," European Journal of Operational Research, Elsevier, vol. 212(1), pages 213-222, July.
    3. Cynthia Barnhart & Ellis L. Johnson & George L. Nemhauser & Martin W. P. Savelsbergh & Pamela H. Vance, 1998. "Branch-and-Price: Column Generation for Solving Huge Integer Programs," Operations Research, INFORMS, vol. 46(3), pages 316-329, June.
    4. Wayne DeSarbo & Richard Oliver & Arvind Rangaswamy, 1989. "A simulated annealing methodology for clusterwise linear regression," Psychometrika, Springer;The Psychometric Society, vol. 54(4), pages 707-736, September.
    5. Ingrassia, Salvatore & Minotti, Simona C. & Punzo, Antonio, 2014. "Model-based clustering via linear cluster-weighted models," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 159-182.
    6. Lau, Kin-nam & Leung, Pui-lam & Tse, Ka-kit, 1999. "A mathematical programming approach to clusterwise regression model and its extensions," European Journal of Operational Research, Elsevier, vol. 116(3), pages 640-652, August.
    7. Dimitris Bertsimas & Romy Shioda, 2007. "Classification and Regression via Integer Optimization," Operations Research, INFORMS, vol. 55(2), pages 252-271, April.
    8. Réal Carbonneau & Gilles Caporossi & Pierre Hansen, 2014. "Globally Optimal Clusterwise Regression By Column Generation Enhanced with Heuristics, Sequencing and Ending Subset Optimization," Journal of Classification, Springer;The Classification Society, vol. 31(2), pages 219-241, July.

    Citations


    Cited by:

    1. Shutong Chen & Weijun Xie, 2022. "On Cluster-Aware Supervised Learning: Frameworks, Convergent Algorithms, and Applications," INFORMS Journal on Computing, INFORMS, vol. 34(1), pages 481-502, January.
    2. Joki, Kaisa & Bagirov, Adil M. & Karmitsa, Napsu & Mäkelä, Marko M. & Taheri, Sona, 2020. "Clusterwise support vector linear regression," European Journal of Operational Research, Elsevier, vol. 287(1), pages 19-35.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Réal Carbonneau & Gilles Caporossi & Pierre Hansen, 2014. "Globally Optimal Clusterwise Regression By Column Generation Enhanced with Heuristics, Sequencing and Ending Subset Optimization," Journal of Classification, Springer;The Classification Society, vol. 31(2), pages 219-241, July.
    2. Joki, Kaisa & Bagirov, Adil M. & Karmitsa, Napsu & Mäkelä, Marko M. & Taheri, Sona, 2020. "Clusterwise support vector linear regression," European Journal of Operational Research, Elsevier, vol. 287(1), pages 19-35.
    3. Bagirov, Adil M. & Ugon, Julien & Mirzayeva, Hijran, 2013. "Nonsmooth nonconvex optimization approach to clusterwise linear regression problems," European Journal of Operational Research, Elsevier, vol. 229(1), pages 132-142.
    4. Maenhout, Broos & Vanhoucke, Mario, 2010. "A hybrid scatter search heuristic for personalized crew rostering in the airline industry," European Journal of Operational Research, Elsevier, vol. 206(1), pages 155-167, October.
    5. Omid Shahvari & Rasaratnam Logendran & Madjid Tavana, 2022. "An efficient model-based branch-and-price algorithm for unrelated-parallel machine batching and scheduling problems," Journal of Scheduling, Springer, vol. 25(5), pages 589-621, October.
    6. Melchiori, Anna & Sgalambro, Antonino, 2020. "A branch and price algorithm to solve the Quickest Multicommodity k-splittable Flow Problem," European Journal of Operational Research, Elsevier, vol. 282(3), pages 846-857.
    7. Marc Peeters & Zeger Degraeve, 2004. "The Co-Printing Problem: A Packing Problem with a Color Constraint," Operations Research, INFORMS, vol. 52(4), pages 623-638, August.
    8. Ehrgott, Matthias & Tind, Jørgen, 2009. "Column generation with free replicability in DEA," Omega, Elsevier, vol. 37(5), pages 943-950, October.
    9. Sung, Inkyung & Lee, Taesik, 2016. "Optimal allocation of emergency medical resources in a mass casualty incident: Patient prioritization by column generation," European Journal of Operational Research, Elsevier, vol. 252(2), pages 623-634.
    10. Ojeong Kwon & Kyungsik Lee & Donghan Kang & Sungsoo Park, 2007. "A branch‐and‐price algorithm for a targeting problem," Naval Research Logistics (NRL), John Wiley & Sons, vol. 54(7), pages 732-741, October.
    11. Adil M. Bagirov & Julien Ugon & Hijran G. Mirzayeva, 2015. "Nonsmooth Optimization Algorithm for Solving Clusterwise Linear Regression Problems," Journal of Optimization Theory and Applications, Springer, vol. 164(3), pages 755-780, March.
    12. Beliën, Jeroen & Demeulemeester, Erik, 2008. "A branch-and-price approach for integrating nurse and surgery scheduling," European Journal of Operational Research, Elsevier, vol. 189(3), pages 652-668, September.
    13. Huisman, D. & Jans, R.F. & Peeters, M. & Wagelmans, A.P.M., 2003. "Combining Column Generation and Lagrangian Relaxation," ERIM Report Series Research in Management ERS-2003-092-LIS, Erasmus Research Institute of Management (ERIM), ERIM is the joint research institute of the Rotterdam School of Management, Erasmus University and the Erasmus School of Economics (ESE) at Erasmus University Rotterdam.
    14. Sarah Root & Amy Cohn, 2008. "A novel modeling approach for express package carrier planning," Naval Research Logistics (NRL), John Wiley & Sons, vol. 55(7), pages 670-683, October.
    15. Marco E. Lübbecke & Jacques Desrosiers, 2005. "Selected Topics in Column Generation," Operations Research, INFORMS, vol. 53(6), pages 1007-1023, December.
    16. Nur Kaynar & Auyon Siddiq, 2023. "Estimating Effects of Incentive Contracts in Online Labor Platforms," Management Science, INFORMS, vol. 69(4), pages 2106-2126, April.
    17. Cynthia Barnhart & Christopher A. Hane & Pamela H. Vance, 2000. "Using Branch-and-Price-and-Cut to Solve Origin-Destination Integer Multicommodity Flow Problems," Operations Research, INFORMS, vol. 48(2), pages 318-326, April.
    18. Ioachim, Irina & Desrosiers, Jacques & Soumis, Francois & Belanger, Nicolas, 1999. "Fleet assignment and routing with schedule synchronization constraints," European Journal of Operational Research, Elsevier, vol. 119(1), pages 75-90, November.
    19. Hu, Qian & Zhu, Wenbin & Qin, Hu & Lim, Andrew, 2017. "A branch-and-price algorithm for the two-dimensional vector packing problem with piecewise linear cost function," European Journal of Operational Research, Elsevier, vol. 260(1), pages 70-80.
    20. Guglielmo Lulli & Suvrajeet Sen, 2004. "A Branch-and-Price Algorithm for Multistage Stochastic Integer Programming with Application to Stochastic Batch-Sizing Problems," Management Science, INFORMS, vol. 50(6), pages 786-796, June.
