
Superposed semi-Markov decision process with application to optimal maintenance systems

Author

Listed:
  • Jianmin Shi

    (Wuhan University
    Haitong Securities)

Abstract

This paper investigates the superposition problem of two or more individual semi-Markov decision processes (SMDPs). The sequential decision process obtained by superposing individual SMDPs is no longer an SMDP and cannot be handled by routine iterative algorithms, but its state space can be expanded to obtain a hybrid-state SMDP. Using this hybrid-state SMDP as an auxiliary process, and inspired by the Robbins–Monro algorithm underlying reinforcement learning methods, we propose an iterative algorithm that combines dynamic programming and reinforcement learning to numerically solve the superposed sequential decision problem. As an illustrative example, we apply the superposition model and algorithm to the optimal maintenance problem of a two-component independent parallel system.
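
For readers unfamiliar with the Robbins–Monro algorithm mentioned in the abstract, the following is a minimal sketch of the basic stochastic-approximation update on a toy problem (estimating a mean from noisy samples). It is not the paper's algorithm and does not model an SMDP. The function name `robbins_monro_mean`, the step-size choice a_n = 1/n, and the Gaussian test data are all assumptions made for illustration.

```python
import random


def robbins_monro_mean(samples, x0=0.0):
    """Toy Robbins-Monro iteration: x_{n+1} = x_n + a_n * (sample_n - x_n).

    Assumption: step sizes a_n = 1/n, which satisfy the usual conditions
    sum(a_n) = infinity and sum(a_n^2) < infinity. With this choice the
    iterate equals the running sample mean.
    """
    x = x0
    for n, sample in enumerate(samples, start=1):
        step = 1.0 / n
        # Move the estimate a fraction of the way toward the noisy observation.
        x += step * (sample - x)
    return x


if __name__ == "__main__":
    # Hypothetical data: noisy observations centred on 3.0.
    rng = random.Random(0)
    draws = [rng.gauss(3.0, 1.0) for _ in range(10_000)]
    print(robbins_monro_mean(draws))
```

Reinforcement learning updates such as Q-learning follow the same pattern, with the noisy sample replaced by a sampled reward plus a discounted value estimate. How the paper combines this with dynamic programming on the hybrid-state SMDP is described only in the full text.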

Suggested Citation

  • Jianmin Shi, 2025. "Superposed semi-Markov decision process with application to optimal maintenance systems," Journal of Combinatorial Optimization, Springer, vol. 49(3), pages 1-19, April.
  • Handle: RePEc:spr:jcomop:v:49:y:2025:i:3:d:10.1007_s10878-025-01272-9
    DOI: 10.1007/s10878-025-01272-9

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10878-025-01272-9
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



