
Adapting Reinforcement Learning Treatment Policies Using Limited Data to Personalize Critical Care

Authors

Listed:
  • Matt Baucum

    (Department of Business Analytics, Information Systems & Supply Chain, Florida State University, Tallahassee, Florida 32306)

  • Anahita Khojandi

    (Department of Industrial and Systems Engineering, University of Tennessee, Knoxville, Tennessee 37996)

  • Rama Vasudevan

    (Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831)

  • Robert Davis

    (University of Tennessee Health Science Center, Memphis, Tennessee 38163)

Abstract

Reinforcement learning (RL) demonstrates promise for developing effective treatment policies in critical care settings. However, existing RL methods often require large and comprehensive patient data sets and do not readily lend themselves to settings in which certain patient subpopulations are severely underrepresented. In this study, we develop a new method, noisy Bayesian policy updates (NBPU), for selecting high-performing reinforcement learning–based treatment policies for underrepresented patient subpopulations using limited observations. Our method uses variational inference to learn a probability distribution over treatment policies based on a reference patient subpopulation for which sufficient data are available. It then exploits limited data from an underrepresented patient subpopulation to update this probability distribution and adapts its recommendations to this subpopulation. We demonstrate our method’s utility on a data set of ICU patients receiving intravenous blood anticoagulant medication. Our results show that NBPU outperforms state-of-the-art methods in terms of both selecting effective treatment policies for patients with nontypical clinical characteristics and predicting the corresponding policies’ performance for these patients.
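
The abstract describes a two-stage workflow: learn a distribution over treatment policies from a data-rich reference subpopulation, then update that distribution with a small sample from an underrepresented subpopulation. The sketch below illustrates only this general Bayesian-update idea in Python; it stands in a simple conjugate-normal fit for the paper's variational inference step, and every name and number in it (reference_returns, target_returns, obs_noise_var, transfer_var, the synthetic policy values) is a hypothetical assumption for illustration, not taken from the paper.

    import numpy as np

    # Illustrative sketch only (not the paper's NBPU implementation): keep a
    # Gaussian belief over each candidate policy's value, fit it on abundant
    # reference-subpopulation returns, then Bayesian-update that belief with
    # a handful of observations from an underrepresented subpopulation.
    rng = np.random.default_rng(0)
    n_policies = 5

    # Stage 1 (stand-in for variational inference): a Gaussian "prior" over
    # each policy's value, estimated from 500 synthetic reference returns.
    reference_returns = [rng.normal(loc=mu, scale=1.0, size=500)
                         for mu in (0.2, 0.5, 0.8, 0.6, 0.3)]
    prior_mean = np.array([r.mean() for r in reference_returns])
    prior_var = np.array([r.var(ddof=1) / r.size for r in reference_returns])

    # "Noisy" prior: widen the reference-fit variance to reflect that the
    # new subpopulation may differ from the reference one (assumed 0.25).
    transfer_var = 0.25
    prior_var = prior_var + transfer_var

    # Stage 2: only 10 observed returns per policy from the underrepresented
    # subpopulation, whose hypothetical true values differ from the reference.
    target_truth = np.array([0.1, 0.3, 0.4, 0.9, 0.2])
    obs_noise_var = 1.0
    target_returns = [rng.normal(mu, np.sqrt(obs_noise_var), size=10)
                      for mu in target_truth]

    # Conjugate-normal posterior update of each policy's estimated value.
    posterior_mean = np.empty(n_policies)
    posterior_var = np.empty(n_policies)
    for k in range(n_policies):
        n = target_returns[k].size
        precision = 1.0 / prior_var[k] + n / obs_noise_var
        posterior_var[k] = 1.0 / precision
        posterior_mean[k] = posterior_var[k] * (
            prior_mean[k] / prior_var[k]
            + target_returns[k].sum() / obs_noise_var)

    best = int(np.argmax(posterior_mean))
    print(f"Selected policy {best}: posterior value "
          f"{posterior_mean[best]:.2f} +/- {np.sqrt(posterior_var[best]):.2f}")

With ample reference data the prior on each policy's value is tight, so the added transfer variance is what lets ten target observations meaningfully shift the ranking; how NBPU actually balances the reference-fit distribution against the limited new data is specified in the paper, not here.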

Suggested Citation

  • Matt Baucum & Anahita Khojandi & Rama Vasudevan & Robert Davis, 2022. "Adapting Reinforcement Learning Treatment Policies Using Limited Data to Personalize Critical Care," INFORMS Journal on Data Science, INFORMS, vol. 1(1), pages 27-49, April.
  • Handle: RePEc:inm:orijds:v:1:y:2022:i:1:p:27-49
    DOI: 10.1287/ijds.2022.0015

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/ijds.2022.0015
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Preil, Deniz & Krapp, Michael, 2022. "Bandit-based inventory optimisation: Reinforcement learning in multi-echelon supply chains," International Journal of Production Economics, Elsevier, vol. 252(C).
    2. Wenjuan Fan & Yang Zong & Subodha Kumar, 2022. "Optimal treatment of chronic kidney disease with uncertainty in obtaining a transplantable kidney: an MDP based approach," Annals of Operations Research, Springer, vol. 316(1), pages 269-302, September.
    3. Wei Chen & Yixin Lu & Liangfei Qiu & Subodha Kumar, 2021. "Designing Personalized Treatment Plans for Breast Cancer," Information Systems Research, INFORMS, vol. 32(3), pages 932-949, September.
    4. Kaustav Das & Nicolas Klein, 2020. "Do Stronger Patents Lead to Faster Innovation? The Effect of Duplicative Search," Discussion Papers in Economics 20/03, Division of Economics, School of Business, University of Leicester.
    5. Mintz, Yonatan & Aswani, Anil & Kaminsky, Philip & Flowers, Elena & Fukuoka, Yoshimi, 2023. "Behavioral analytics for myopic agents," European Journal of Operational Research, Elsevier, vol. 310(2), pages 793-811.
    6. Hao Zhang, 2022. "Analytical Solution to a Discrete-Time Model for Dynamic Learning and Decision Making," Management Science, INFORMS, vol. 68(8), pages 5924-5957, August.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:inm:orijds:v:1:y:2022:i:1:p:27-49. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to register here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Chris Asher (email available below). General contact details of provider: https://edirc.repec.org/data/inforea.html .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.