IDEAS home Printed from https://ideas.repec.org/a/eee/ejores/v303y2022i1p337-353.html
   My bibliography  Save this article

Data-driven mixed-Integer linear programming-based optimisation for efficient failure detection in large-scale distributed systems

Author

Listed:
  • Er-Rahmadi, Btissam
  • Ma, Tiejun

Abstract

Failure detectors (FDs) are fundamental building blocks for distributed systems. An FD detects whether a process has crashed or not based on the reception of heartbeats’ messages sent by this process over a communication channel. A key challenge of FDs is to tune their parameters to achieve optimal performance which satisfies the desired system requirements. This is challenging due to the complexities of large-scale networks. Existing FDs ignore such optimisation and adopt ad-hoc parameters. In this paper, we propose a new Mixed Integer Linear Programming (MILP) optimisation-based FD algorithm. We obtain the MILP formulation via piecewise linearisation relaxations. The MILP involves obtaining optimal FD parameters that meet the optimal trade-off between its performance metrics requirements, network conditions and system parameters. The MILP maximises our FD’s accuracy under bounded failure detection time while considering network and system conditions as constraints. The MILP’s solution represents optimised FD parameters that maximise FD’s expected performance. To adapt to real-time network changes, our proposed MILP-based FD fits the probability distribution of heartbeats’ inter-arrivals. To address our FD scalability challenge in large-scale systems where the MILP model needs to compute approximate optimal solutions quickly, we also propose a heuristic algorithm. To test our proposed approach, we adopt Amazon Cloud as a realistic testing environment and develop a simulator for robustness tests. Our results show consistent improvement of overall FD performance and scalability. To the best of our knowledge, this is the first attempt to combine the MILP-based optimisation modelling with FD to achieve performance guarantees.

Suggested Citation

  • Er-Rahmadi, Btissam & Ma, Tiejun, 2022. "Data-driven mixed-Integer linear programming-based optimisation for efficient failure detection in large-scale distributed systems," European Journal of Operational Research, Elsevier, vol. 303(1), pages 337-353.
  • Handle: RePEc:eee:ejores:v:303:y:2022:i:1:p:337-353
    DOI: 10.1016/j.ejor.2022.02.006
    as

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377221722000947
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.ejor.2022.02.006?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
    ---><---

    As the access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    as
    1. Steeger, Gregory & Rebennack, Steffen, 2017. "Dynamic convexification within nested Benders decomposition using Lagrangian relaxation: An application to the strategic bidding problem," European Journal of Operational Research, Elsevier, vol. 257(2), pages 669-686.
    2. David G. Luenberger & Yinyu Ye, 2016. "Linear and Nonlinear Programming," International Series in Operations Research and Management Science, Springer, edition 4, number 978-3-319-18842-3, December.
    3. Björn Geißler & Oliver Kolb & Jens Lang & Günter Leugering & Alexander Martin & Antonio Morsi, 2011. "Mixed integer linear models for the optimization of dynamical transport networks," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 73(3), pages 339-362, June.
    4. Zhen Liu & Rhonda Righter, 1998. "Optimal Load Balancing on Distributed Homogeneous Unreliable Processors," Operations Research, INFORMS, vol. 46(4), pages 563-573, August.
    5. Gullhav, Anders N. & Cordeau, Jean-François & Hvattum, Lars Magnus & Nygreen, Bjørn, 2017. "Adaptive large neighborhood search heuristics for multi-tier service deployment problems in clouds," European Journal of Operational Research, Elsevier, vol. 259(3), pages 829-846.
    6. Juan Pablo Vielma & Shabbir Ahmed & George Nemhauser, 2010. "Mixed-Integer Models for Nonseparable Piecewise-Linear Optimization: Unifying Framework and Extensions," Operations Research, INFORMS, vol. 58(2), pages 303-315, April.
    7. Evan Fields & Carolina Osorio & Tianli Zhou, 2021. "A Data-Driven Method for Reconstructing a Distribution from a Truncated Sample with an Application to Inferring Car-Sharing Demand," Transportation Science, INFORMS, vol. 55(3), pages 616-636, May.
    8. Steffen Rebennack, 2016. "Computing tight bounds via piecewise linear functions through the example of circle cutting problems," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 84(1), pages 3-57, August.
    9. Wang Chi Cheung & David Simchi-Levi & He Wang, 2017. "Technical Note—Dynamic Pricing and Demand Learning with Limited Price Experimentation," Operations Research, INFORMS, vol. 65(6), pages 1722-1731, December.
    10. Jiaxi Liu & Zhibo Wu & Jin Wu & Jian Dong & Yao Zhao & Dongxin Wen, 2017. "A Weibull distribution accrual failure detector for cloud computing," PLOS ONE, Public Library of Science, vol. 12(3), pages 1-16, March.
    11. Florian L. Kutzner & Daniel Read & Neil Stewart & Gordon Brown, 2017. "Choosing the Devil You Don’t Know: Evidence for Limited Sensitivity to Sample Size–Based Uncertainty When It Offers an Advantage," Management Science, INFORMS, vol. 63(5), pages 1519-1528, May.
    12. Tan, K.C. & Chiam, S.C. & Mamun, A.A. & Goh, C.K., 2009. "Balancing exploration and exploitation with adaptive variation for evolutionary multi-objective optimization," European Journal of Operational Research, Elsevier, vol. 197(2), pages 701-713, September.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Kazda, Kody & Li, Xiang, 2024. "A linear programming approach to difference-of-convex piecewise linear approximation," European Journal of Operational Research, Elsevier, vol. 312(2), pages 493-511.
    2. Steffen Rebennack, 2016. "Computing tight bounds via piecewise linear functions through the example of circle cutting problems," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 84(1), pages 3-57, August.
    3. Codas, Andrés & Camponogara, Eduardo, 2012. "Mixed-integer linear optimization for optimal lift-gas allocation with well-separator routing," European Journal of Operational Research, Elsevier, vol. 217(1), pages 222-231.
    4. Christensen, Tue R.L. & Labbé, Martine, 2015. "A branch-cut-and-price algorithm for the piecewise linear transportation problem," European Journal of Operational Research, Elsevier, vol. 245(3), pages 645-655.
    5. Jon Lee & Daphne Skipper & Emily Speakman & Luze Xu, 2023. "Gaining or Losing Perspective for Piecewise-Linear Under-Estimators of Convex Univariate Functions," Journal of Optimization Theory and Applications, Springer, vol. 196(1), pages 1-35, January.
    6. Jason Rhuggenaath & Alp Akcay & Yingqian Zhang & Uzay Kaymak, 2022. "Setting Reserve Prices in Second-Price Auctions with Unobserved Bids," INFORMS Journal on Computing, INFORMS, vol. 34(6), pages 2950-2967, November.
    7. Mengying Xue & Tianhu Deng & Zuo‐Jun Max Shen, 2019. "Optimizing natural gas pipeline transmission with nonuniform elevation: A new initialization approach," Naval Research Logistics (NRL), John Wiley & Sons, vol. 66(7), pages 547-564, October.
    8. Xuejun Zhao & Ruihao Zhu & William B. Haskell, 2022. "Learning to Price Supply Chain Contracts against a Learning Retailer," Papers 2211.04586, arXiv.org.
    9. Sandra S. Y. Tan & Antonios Varvitsiotis & Vincent Y. F. Tan, 2021. "Analysis of Optimization Algorithms via Sum-of-Squares," Journal of Optimization Theory and Applications, Springer, vol. 190(1), pages 56-81, July.
    10. Michael K. McWilliam & Antariksh C. Dicholkar & Frederik Zahle & Taeseong Kim, 2022. "Post-Optimum Sensitivity Analysis with Automatically Tuned Numerical Gradients Applied to Swept Wind Turbine Blades," Energies, MDPI, vol. 15(9), pages 1-19, April.
    11. Devine, Mel T. & Siddiqui, Sauleh, 2023. "Strategic investment decisions in an oligopoly with a competitive fringe: An equilibrium problem with equilibrium constraints approach," European Journal of Operational Research, Elsevier, vol. 306(3), pages 1473-1494.
    12. Gel, Esma S. & Salman, F. Sibel, 2022. "Dynamic ordering decisions with approximate learning of supply yield uncertainty," International Journal of Production Economics, Elsevier, vol. 243(C).
    13. repec:hal:wpspec:info:hdl:2441/4lhe3u3c38ojohjlcbfaupcjr is not listed on IDEAS
    14. Victor Martínez‐de‐Albéniz & Arnau Planas & Stefano Nasini, 2020. "Using Clickstream Data to Improve Flash Sales Effectiveness," Production and Operations Management, Production and Operations Management Society, vol. 29(11), pages 2508-2531, November.
    15. Hu Wang & Di Li & Changbin Jiang & Yuxiang Zhang, 2023. "Exploring the Interactive Relationship between Retailers’ Free Shipping Decisions and Manufacturers’ Product Sales in Digital Retailing," Sustainability, MDPI, vol. 15(17), pages 1-19, August.
    16. Eduardo Rauh Müller & Eduardo Camponogara & Laio Oriel Seman & Eduardo Otte Hülse & Bruno Ferreira Vieira & Luis Kin Miyatake & Alex Furtado Teixeira, 2022. "Short-term steady-state production optimization of offshore oil platforms: wells with dual completion (gas-lift and ESP) and flow assurance," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 30(1), pages 152-180, April.
    17. Hedlund, Jonas, 2017. "Bayesian persuasion by a privately informed sender," Journal of Economic Theory, Elsevier, vol. 167(C), pages 229-268.
    18. Lambert, Mathieu & Hassani, Rachid, 2023. "Diesel genset optimization in remote microgrids," Applied Energy, Elsevier, vol. 340(C).
    19. Han-Lin Li & Yao-Huei Huang & Shu-Cherng Fang, 2017. "Linear Reformulation of Polynomial Discrete Programming for Fast Computation," INFORMS Journal on Computing, INFORMS, vol. 29(1), pages 108-122, February.
    20. Andreas Bärmann & Alexander Martin & Oskar Schneider, 2021. "Efficient Formulations and Decomposition Approaches for Power Peak Reduction in Railway Traffic via Timetabling," Transportation Science, INFORMS, vol. 55(3), pages 747-767, May.
    21. Hua, Hao & Hovestadt, Ludger & Tang, Peng & Li, Biao, 2019. "Integer programming for urban design," European Journal of Operational Research, Elsevier, vol. 274(3), pages 1125-1137.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:eee:ejores:v:303:y:2022:i:1:p:337-353. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Catherine Liu (email available below). General contact details of provider: http://www.elsevier.com/locate/eor .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.