Printed from https://ideas.repec.org/a/eee/reensy/v191y2019ics0951832018313309.html

Managing engineering systems with large state and action spaces through deep reinforcement learning

Author

Listed:
  • Andriotis, C.P.
  • Papakonstantinou, K.G.

Abstract

Decision-making for engineering systems management can be efficiently formulated using Markov Decision Processes (MDPs) or Partially Observable MDPs (POMDPs). Typical MDP/POMDP solution procedures use offline knowledge about the environment and provide detailed policies for relatively small systems with tractable state and action spaces. However, in large multi-component systems the dimensions of these spaces easily explode, as system states and actions scale exponentially with the number of components, whereas environment dynamics are difficult to describe explicitly for the entire system and may often be accessible only through computationally expensive numerical simulators. To address these issues, this work introduces an integrated Deep Reinforcement Learning (DRL) framework. It develops the Deep Centralized Multi-agent Actor Critic (DCMAC), an off-policy actor-critic DRL algorithm that directly probes the state/belief space of the underlying MDP/POMDP, providing efficient life-cycle policies for large multi-component systems operating in high-dimensional spaces. Apart from using deep network approximators to parametrize complex functions over vast state spaces, DCMAC also adopts a factorized representation of the system actions, and is thus able to designate individualized component- and subsystem-level decisions while maintaining a centralized value function for the entire system. DCMAC compares well against Deep Q-Network and exact solutions, where applicable, and outperforms optimized baseline policies that incorporate time-based, condition-based, and periodic inspection and maintenance considerations.
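The scaling argument and the factorized action representation described in the abstract can be illustrated with a small sketch. This is not the authors' DCMAC implementation: the component count, the numbers of states and actions per component, the feature vector, and the random linear "actor" and "critic" weights are all hypothetical and chosen only for illustration. The sketch assumes each component has the same number of states and actions, so the joint spaces grow as a power of the component count, while a factorized policy needs only one small output head per component.

```python
import numpy as np


def joint_space_sizes(n_components, states_per_component, actions_per_component):
    """Joint state/action counts when every component combination is enumerated."""
    n_states = states_per_component ** n_components
    n_actions = actions_per_component ** n_components
    return n_states, n_actions


def factorized_output_size(n_components, actions_per_component):
    """Number of policy outputs when each component gets its own action head."""
    return n_components * actions_per_component


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def sample_factorized_action(features, actor_weights, rng):
    """Sample one action per component from independent per-component heads.

    features:      shape (d,), a shared (centralized) system representation
    actor_weights: shape (n_components, n_actions, d), hypothetical linear heads
    """
    logits = actor_weights @ features          # shape (n_components, n_actions)
    probs = softmax(logits, axis=-1)
    actions = np.array([rng.choice(probs.shape[1], p=p) for p in probs])
    return actions, probs


def centralized_value(features, critic_weights):
    """A single scalar value for the whole system (hypothetical linear critic)."""
    return float(critic_weights @ features)


if __name__ == "__main__":
    n, s, a, d = 10, 4, 3, 8                   # illustrative sizes only
    print(joint_space_sizes(n, s, a))          # (1048576, 59049)
    print(factorized_output_size(n, a))        # 30

    rng = np.random.default_rng(0)
    features = rng.normal(size=d)
    actions, probs = sample_factorized_action(
        features, rng.normal(size=(n, a, d)), rng
    )
    print(actions.shape, probs.shape)          # (10,) (10, 3)
    print(centralized_value(features, rng.normal(size=d)))
```

With these illustrative sizes, enumerating joint actions would require 59,049 outputs, whereas the factorized heads need 30. The factorization used here treats component actions as conditionally independent given the shared features, which is an assumption of this sketch.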

Suggested Citation

  • Andriotis, C.P. & Papakonstantinou, K.G., 2019. "Managing engineering systems with large state and action spaces through deep reinforcement learning," Reliability Engineering and System Safety, Elsevier, vol. 191(C).
  • Handle: RePEc:eee:reensy:v:191:y:2019:i:c:s0951832018313309
    DOI: 10.1016/j.ress.2019.04.036

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0951832018313309
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.ress.2019.04.036?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Andriotis, C.P. & Papakonstantinou, K.G., 2021. "Deep reinforcement learning driven inspection and maintenance planning under incomplete information and constraints," Reliability Engineering and System Safety, Elsevier, vol. 212(C).
    2. Alaswad, Suzan & Xiang, Yisha, 2017. "A review on condition-based maintenance optimization models for stochastically deteriorating system," Reliability Engineering and System Safety, Elsevier, vol. 157(C), pages 54-63.
    3. Fauriat, William & Zio, Enrico, 2020. "Optimization of an aperiodic sequential inspection and condition-based maintenance policy driven by value of information," Reliability Engineering and System Safety, Elsevier, vol. 204(C).
    4. de Jonge, Bram & Scarf, Philip A., 2020. "A review on maintenance optimization," European Journal of Operational Research, Elsevier, vol. 285(3), pages 805-824.
    5. Mancuso, A. & Compare, M. & Salo, A. & Zio, E., 2021. "Optimal Prognostics and Health Management-driven inspection and maintenance strategies for industrial systems," Reliability Engineering and System Safety, Elsevier, vol. 210(C).
    6. Xu, Zhaoyi & Saleh, Joseph Homer, 2021. "Machine learning for reliability engineering and safety applications: Review of current status and future opportunities," Reliability Engineering and System Safety, Elsevier, vol. 211(C).
    7. Song, Chaolin & Zhang, Chi & Shafieezadeh, Abdollah & Xiao, Rucheng, 2022. "Value of information analysis in non-stationary stochastic decision environments: A reliability-assisted POMDP approach," Reliability Engineering and System Safety, Elsevier, vol. 217(C).
    8. Esposito, Nicola & Mele, Agostino & Castanier, Bruno & Giorgio, Massimiliano, 2023. "A hybrid maintenance policy for a deteriorating unit in the presence of three forms of variability," Reliability Engineering and System Safety, Elsevier, vol. 237(C).
    9. Kamariotis, Antonios & Tatsis, Konstantinos & Chatzi, Eleni & Goebel, Kai & Straub, Daniel, 2024. "A metric for assessing and optimizing data-driven prognostic algorithms for predictive maintenance," Reliability Engineering and System Safety, Elsevier, vol. 242(C).
    10. Lee, Juseong & Mitici, Mihaela, 2020. "An integrated assessment of safety and efficiency of aircraft maintenance strategies using agent-based modelling and stochastic Petri nets," Reliability Engineering and System Safety, Elsevier, vol. 202(C).
    11. Giorgio, Massimiliano & Pulcini, Gianpaolo, 2024. "The effect of model misspecification of the bounded transformed gamma process on maintenance optimization," Reliability Engineering and System Safety, Elsevier, vol. 241(C).
    12. Zou, Guang & Faber, Michael Havbro & González, Arturo & Banisoleiman, Kian, 2021. "Computing the value of information from periodic testing in holistic decision making under uncertainty," Reliability Engineering and System Safety, Elsevier, vol. 206(C).
    13. Lee, Juseong & Mitici, Mihaela, 2022. "Multi-objective design of aircraft maintenance using Gaussian process learning and adaptive sampling," Reliability Engineering and System Safety, Elsevier, vol. 218(PA).
    14. Yuan, Xian-Xun & Higo, Eishiro & Pandey, Mahesh D., 2021. "Estimation of the value of an inspection and maintenance program: A Bayesian gamma process model," Reliability Engineering and System Safety, Elsevier, vol. 216(C).
    15. Dinh, Duc-Hanh & Do, Phuc & Iung, Benoit, 2022. "Multi-level opportunistic predictive maintenance for multi-component systems with economic dependence and assembly/disassembly impacts," Reliability Engineering and System Safety, Elsevier, vol. 217(C).
    16. Karabağ, Oktay & Eruguz, Ayse Sena & Basten, Rob, 2020. "Integrated optimization of maintenance interventions and spare part selection for a partially observable multi-component system," Reliability Engineering and System Safety, Elsevier, vol. 200(C).
    17. Huynh, K.T., 2021. "An adaptive predictive maintenance model for repairable deteriorating systems using inverse Gaussian degradation process," Reliability Engineering and System Safety, Elsevier, vol. 213(C).
    18. Lozano, Jorge-Mario & Zuluaga, Santiago & Sánchez-Silva, Mauricio, 2020. "Developing flexible management strategies in infrastructure: The sequential expansion problem for infrastructure analysis (SEPIA)," Reliability Engineering and System Safety, Elsevier, vol. 200(C).
    19. Nguyen, Khanh T.P. & Medjaher, Kamal, 2019. "A new dynamic predictive maintenance framework using deep learning for failure prognostics," Reliability Engineering and System Safety, Elsevier, vol. 188(C), pages 251-262.
    20. Memarzadeh, Milad & Pozzi, Matteo, 2016. "Value of information in sequential decision making: Component inspection, permanent monitoring and system-level scheduling," Reliability Engineering and System Safety, Elsevier, vol. 154(C), pages 137-151.
