
Warm-spare provisioning computing network with switching failure, common cause failure, vacation interruption, and synchronized reneging

Author

Listed:
  • Shekhar, Chandra
  • Kumar, Neeraj
  • Gupta, Amit
  • Kumar, Amit
  • Varshney, Shreekant

Abstract

The loss of a node in a computing network, whether unplanned (for example, a hardware failure) or planned (for example, an upgrade outage), can degrade network performance or reduce redundancy. To lower this risk, a spare node is kept powered on and visible on the network. In this paper, we study a Markovian computing network with warm-spare node provisioning in which a spare node may fail while switching from the standby state to the operating state. In addition to switching failure and common cause failure, we consider a realistic and economical modified multiple working vacation policy for the maintenance server and synchronized reneging of failed nodes. Reliability characteristics of the computing network for I/O operations are derived from transient-state probabilities, which are computed using the theory of quasi-birth-and-death processes, Laplace transforms, and eigenvalues and eigenvectors. The reliability characteristics are analyzed critically, and numerical results are presented in tables and graphs. Concluding remarks and future scope are also included.
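
As a rough illustration of what "reliability from transient-state probabilities" means, the sketch below evaluates a toy three-state warm-spare Markov chain. It is not the paper's model: the state space, the rate names (lam, nu, mu, lam_c), the switching failure probability q, and all default values are assumptions made for this example, and vacations and reneging are left out. The abstract names Laplace transforms and eigenvalue methods; this sketch swaps in uniformization, a different standard way to obtain transient probabilities of a continuous-time Markov chain, because it needs only the Python standard library.

```python
import math

def build_generator(lam, nu, mu, lam_c, q):
    """Generator matrix of a toy 3-state warm-spare model (assumed, not the paper's).

    State 0: active node and warm spare both up.
    State 1: one node up, the other under repair.
    State 2: system failed (absorbing, so 1 - p2(t) is the reliability).
    """
    to_one = lam * (1.0 - q) + nu      # successful switchover, or spare fails
    to_fail = lam * q + lam_c          # switching failure, or common cause failure
    return [
        [-(to_one + to_fail), to_one, to_fail],
        [mu, -(mu + lam), lam],
        [0.0, 0.0, 0.0],
    ]

def transient_probs(Q, p0, t, eps=1e-12, max_terms=100000):
    """Transient distribution p(t) of a CTMC by uniformization."""
    n = len(Q)
    rate = max(-Q[i][i] for i in range(n))   # uniformization rate
    if rate <= 0.0 or t <= 0.0:
        return list(p0)
    # Embedded discrete-time chain P = I + Q / rate
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / rate for j in range(n)]
         for i in range(n)]
    weight = math.exp(-rate * t)             # Poisson weight for k = 0
    vec = list(p0)
    result = [weight * v for v in vec]
    total = weight
    for k in range(1, max_terms):
        if 1.0 - total <= eps:               # remaining Poisson mass is negligible
            break
        vec = [sum(vec[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= rate * t / k
        total += weight
        result = [r + weight * v for r, v in zip(result, vec)]
    return result

def reliability(t, lam=0.5, nu=0.1, mu=2.0, lam_c=0.05, q=0.1):
    """Probability that the toy system has not reached the failed state by time t."""
    Q = build_generator(lam, nu, mu, lam_c, q)
    return 1.0 - transient_probs(Q, [1.0, 0.0, 0.0], t)[2]
```

Calling reliability(5.0) returns the probability that this toy system survives to time 5 under the assumed rates, and raising q lowers that value because more failures of the active node lead directly to the failed state. The plain exp(-rate * t) starting weight is adequate for small rate * t only; a production implementation would need a numerically safer scheme for long horizons.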

Suggested Citation

  • Shekhar, Chandra & Kumar, Neeraj & Gupta, Amit & Kumar, Amit & Varshney, Shreekant, 2020. "Warm-spare provisioning computing network with switching failure, common cause failure, vacation interruption, and synchronized reneging," Reliability Engineering and System Safety, Elsevier, vol. 199(C).
  • Handle: RePEc:eee:reensy:v:199:y:2020:i:c:s0951832019310385
    DOI: 10.1016/j.ress.2020.106910

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0951832019310385
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.ress.2020.106910?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Haque, Lani & Armstrong, Michael J., 2007. "A survey of the machine interference problem," European Journal of Operational Research, Elsevier, vol. 179(2), pages 469-482, June.
    2. Shekhar, Chandra & Kumar, Amit & Varshney, Shreekant, 2020. "Load sharing redundant repairable systems with switching and reboot delay," Reliability Engineering and System Safety, Elsevier, vol. 193(C).
    3. Kuo, Ching-Chang & Ke, Jau-Chuan, 2016. "Comparative analysis of standby systems with unreliable server and switching failure," Reliability Engineering and System Safety, Elsevier, vol. 145(C), pages 74-82.
    4. Ammar, Sherif I., 2015. "Transient analysis of an M/M/1 queue with impatient behavior and multiple vacations," Applied Mathematics and Computation, Elsevier, vol. 260(C), pages 97-105.
    5. Fiems, Dieter & Maertens, Tom & Bruneel, Herwig, 2008. "Queueing systems with different types of server interruptions," European Journal of Operational Research, Elsevier, vol. 188(3), pages 838-845, August.
    6. Levitin, Gregory & Xing, Liudong & Amari, Suprasad V. & Dai, Yuanshun, 2013. "Reliability of non-repairable phased-mission systems with propagated failures," Reliability Engineering and System Safety, Elsevier, vol. 119(C), pages 218-228.
    7. Sztrik, J. & Bunday, B. D., 1993. "Machine interference problem with a random environment," European Journal of Operational Research, Elsevier, vol. 65(2), pages 259-269, March.
    8. Shin, Yang Woo, 2015. "Algorithmic approach to Markovian multi-server retrial queues with vacations," Applied Mathematics and Computation, Elsevier, vol. 250(C), pages 287-297.
    9. Hsu, Ying-Lin & Ke, Jau-Chuan & Liu, Tzu-Hsin, 2011. "Standby system with general repair, reboot delay, switching failure and unreliable repair facility—A statistical standpoint," Mathematics and Computers in Simulation (MATCOM), Elsevier, vol. 81(11), pages 2400-2413.
    10. Madhu Jain & Chandra Shekhar & Shalini Shukla, 2013. "Queueing analysis of two unreliable servers machining system with switching and common cause failure," International Journal of Mathematics in Operational Research, Inderscience Enterprises Ltd, vol. 5(4), pages 508-536.
    11. Chandra Shekhar & Shreekant Varshney & Amit Kumar, 2020. "Reliability and Vacation: The Critical Issue," Springer Series in Reliability Engineering, in: Mangey Ram & Hoang Pham (ed.), Advances in Reliability Analysis and its Applications, pages 251-292, Springer.
    12. Liu, Baoliang & Cui, Lirong & Wen, Yanqing & Shen, Jingyuan, 2015. "A cold standby repairable system with working vacations and vacation interruption following Markovian arrival process," Reliability Engineering and System Safety, Elsevier, vol. 142(C), pages 1-8.
    13. Chandra Shekhar & Madhu Jain & Ather Aziz Raina, 2017. "Transient analysis of machining system with spare provisioning and geometric reneging," International Journal of Mathematics in Operational Research, Inderscience Enterprises Ltd, vol. 11(3), pages 396-421.
    14. Coit, David W. & Chatwattanasiri, Nida & Wattanapongsakorn, Naruemon & Konak, Abdullah, 2015. "Dynamic k-out-of-n system reliability with component partnership," Reliability Engineering and System Safety, Elsevier, vol. 138(C), pages 82-92.

    Citations



    Cited by:

    1. Gao, Shan, 2023. "Reliability analysis and optimization for a redundant system with dependent failures and variable repair rates," Mathematics and Computers in Simulation (MATCOM), Elsevier, vol. 208(C), pages 637-659.
    2. Jafary, Bentolhoda & Mele, Andrew & Fiondella, Lance, 2020. "Component-based system reliability subject to positive and negative correlation," Reliability Engineering and System Safety, Elsevier, vol. 202(C).
    3. Carpitella, Silvia & Mzougui, Ilyas & Benítez, Julio & Carpitella, Fortunato & Certa, Antonella & Izquierdo, Joaquín & La Cascia, Marco, 2021. "A risk evaluation framework for the best maintenance strategy: The case of a marine salt manufacture firm," Reliability Engineering and System Safety, Elsevier, vol. 205(C).
    4. Gao, Shan & Wang, Jinting & Zhang, Jie, 2023. "Reliability analysis of a redundant series system with common cause failures and delayed vacation," Reliability Engineering and System Safety, Elsevier, vol. 239(C).
    5. Yang, Dong-Yuh & Wu, Chia-Huang, 2021. "Evaluation of the availability and reliability of a standby repairable system incorporating imperfect switchovers and working breakdowns," Reliability Engineering and System Safety, Elsevier, vol. 207(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ke, Jau-Chuan & Liu, Tzu-Hsin & Yang, Dong-Yuh, 2018. "Modeling of machine interference problem with unreliable repairman and standbys imperfect switchover," Reliability Engineering and System Safety, Elsevier, vol. 174(C), pages 12-18.
    2. Sharifi, Mani & Taghipour, Sharareh & Abhari, Abdolreza, 2021. "Inspection interval optimization for a k-out-of-n load sharing system under a hybrid mixed redundancy strategy," Reliability Engineering and System Safety, Elsevier, vol. 213(C).
    3. Yang, Dong-Yuh & Tsao, Chih-Lung, 2019. "Reliability and availability analysis of standby systems with working vacations and retrial of failed components," Reliability Engineering and System Safety, Elsevier, vol. 182(C), pages 46-55.
    4. Shekhar, Chandra & Kumar, Amit & Varshney, Shreekant, 2020. "Load sharing redundant repairable systems with switching and reboot delay," Reliability Engineering and System Safety, Elsevier, vol. 193(C).
    5. Pedram Sahba & Bariş Balciog̃lu & Dragan Banjevic, 2013. "Analysis of the finite‐source multiclass priority queue with an unreliable server and setup time," Naval Research Logistics (NRL), John Wiley & Sons, vol. 60(4), pages 331-342, June.
    6. Gao, Shan & Wang, Jinting & Zhang, Jie, 2023. "Reliability analysis of a redundant series system with common cause failures and delayed vacation," Reliability Engineering and System Safety, Elsevier, vol. 239(C).
    7. Madhu Jain & Chandra Shekhar & Rakesh Kumar Meena, 2019. "Performance analysis and control F-policy for fault-tolerant system with working vacation," OPSEARCH, Springer;Operational Research Society of India, vol. 56(2), pages 409-431, June.
    8. Pedram Sahba & Barış Balcıog̃lu & Dragan Banjevic, 2022. "The impact of disruption characteristics on the performance of a server," Annals of Operations Research, Springer, vol. 317(1), pages 239-252, October.
    9. Quintanilha, Igor M. & Elias, Vitor R.M. & da Silva, Felipe B. & Fonini, Pedro A.M. & da Silva, Eduardo A.B. & Netto, Sergio L. & Apolinário, José A. & de Campos, Marcello L.R. & Martins, Wallace A., 2021. "A fault detector/classifier for closed-ring power generators using machine learning," Reliability Engineering and System Safety, Elsevier, vol. 212(C).
    10. Ye, Xiong-Fei & Zhang, Yi & Harutoshi, Ogai & Kim, Chul-Woo, 2019. "Hierarchical probability and risk assessment for K-out-of-N system in hierarchy," Reliability Engineering and System Safety, Elsevier, vol. 189(C), pages 242-260.
    11. Andrei Sleptchenko & M. Eric Johnson, 2015. "Maintaining Secure and Reliable Distributed Control Systems," INFORMS Journal on Computing, INFORMS, vol. 27(1), pages 103-117, February.
    12. Manickam Vadivukarasi & Kaliappan Kalidass, 2021. "Discussion on the transient behavior of single server Markovian multiple variant vacation queues," Operations Research and Decisions, Wroclaw University of Science and Technology, Faculty of Management, vol. 31(1), pages 123-146.
    13. Du, Shijia & Zeng, Zhiguo & Cui, Lirong & Kang, Rui, 2017. "Reliability analysis of Markov history-dependent repairable systems with neglected failures," Reliability Engineering and System Safety, Elsevier, vol. 159(C), pages 134-142.
    14. Matsuoka, Takeshi, 2023. "Reliability analysis of a BWR plant system at startup stage - analysis by the GO-FLOW methodology with consideration of loop structures and phased mission problem -," Reliability Engineering and System Safety, Elsevier, vol. 233(C).
    15. R. Sudhesh & P. Savitha & S. Dharmaraja, 2017. "Transient analysis of a two-heterogeneous servers queue with system disaster, server repair and customers’ impatience," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 25(1), pages 179-205, April.
    16. Levitin, Gregory & Finkelstein, Maxim, 2018. "Optimal mission abort policy for systems in a random environment with variable shock rate," Reliability Engineering and System Safety, Elsevier, vol. 169(C), pages 11-17.
    17. Cai, Zhiqiang & Si, Shubin & Sun, Shudong & Li, Caitao, 2016. "Optimization of linear consecutive-k-out-of-n system with a Birnbaum importance-based genetic algorithm," Reliability Engineering and System Safety, Elsevier, vol. 152(C), pages 248-258.
    18. Sahba, Pedram & Balcıoğlu, Barış, 2011. "The impact of transportation delays on repairshop capacity pooling and spare part inventories," European Journal of Operational Research, Elsevier, vol. 214(3), pages 674-682, November.
    19. Haque, Lani & Armstrong, Michael J., 2007. "A survey of the machine interference problem," European Journal of Operational Research, Elsevier, vol. 179(2), pages 469-482, June.
    20. Kuo, Ching-Chang & Ke, Jau-Chuan, 2016. "Comparative analysis of standby systems with unreliable server and switching failure," Reliability Engineering and System Safety, Elsevier, vol. 145(C), pages 74-82.
