
Deep spectral Q-learning with application to mobile health

Authors

  • Gao, Yuhe
  • Shi, Chengchun
  • Song, Rui

Abstract

Dynamic treatment regimes assign personalized treatments to patients sequentially over time based on their baseline information and time-varying covariates. In mobile health applications, these covariates are typically collected at different frequencies over a long time horizon. In this paper, we propose a deep spectral Q-learning algorithm, which integrates principal component analysis (PCA) with deep Q-learning to handle the mixed frequency data. In theory, we prove that the mean return under the estimated optimal policy converges to that under the optimal one and establish its rate of convergence. The usefulness of our proposal is further illustrated via simulations and an application to a diabetes dataset.
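
The idea of combining PCA with Q-learning for mixed frequency covariates can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes simulated data, a binary action, and a ridge-regularized linear Q-function standing in for the deep Q-network described in the abstract. All function names, dimensions, and the reward model are hypothetical choices made for illustration.

```python
import numpy as np

def fit_pca(X, k):
    """Return the column mean and top-k principal directions of X (rows = samples)."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def compress_state(low, high, mean, components):
    """Concatenate low-frequency covariates with PCA scores of the high-frequency series."""
    scores = (high - mean) @ components.T
    return np.hstack([low, scores])

def features(S, A, n_actions):
    """Action-specific linear features: one block [1, s] per action."""
    n, d = S.shape
    phi = np.zeros((n, n_actions * (d + 1)))
    base = np.hstack([np.ones((n, 1)), S])
    for a in range(n_actions):
        rows = A == a
        phi[rows, a * (d + 1):(a + 1) * (d + 1)] = base[rows]
    return phi

def q_values(S, w, n_actions):
    """Q(s, a) for every action, shape (n, n_actions)."""
    return np.column_stack([
        features(S, np.full(len(S), a), n_actions) @ w for a in range(n_actions)
    ])

def fitted_q_iteration(S, A, R, S_next, n_actions=2, gamma=0.9, n_iter=50, ridge=1e-3):
    """Linear fitted Q-iteration (an assumed stand-in for a deep Q-network)."""
    phi = features(S, A, n_actions)
    gram = phi.T @ phi + ridge * np.eye(phi.shape[1])
    w = np.zeros(phi.shape[1])
    for _ in range(n_iter):
        # Bellman target: reward plus discounted best next-state value.
        target = R + gamma * q_values(S_next, w, n_actions).max(axis=1)
        w = np.linalg.solve(gram, phi.T @ target)
    return w

# Simulated transitions (hypothetical): 2 low-frequency covariates and a
# 48-point high-frequency series observed between decision times.
rng = np.random.default_rng(0)
n, T_high, k = 500, 48, 3
low = rng.normal(size=(n, 2))
high = rng.normal(size=(n, T_high)).cumsum(axis=1)
low_next = rng.normal(size=(n, 2))
high_next = rng.normal(size=(n, T_high)).cumsum(axis=1)
A = rng.integers(0, 2, size=n)
R = low[:, 0] * (2 * A - 1) + 0.1 * rng.normal(size=n)

mean, components = fit_pca(high, k)
S = compress_state(low, high, mean, components)
S_next = compress_state(low_next, high_next, mean, components)
w = fitted_q_iteration(S, A, R, S_next)
policy = q_values(S, w, 2).argmax(axis=1)  # greedy action for each observed state
```

Reducing the 48-point series to three scores before fitting keeps the state dimension fixed regardless of how densely the high-frequency covariate is sampled, which is the role the abstract assigns to PCA.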

Suggested Citation

  • Gao, Yuhe & Shi, Chengchun & Song, Rui, 2023. "Deep spectral Q-learning with application to mobile health," LSE Research Online Documents on Economics 119445, London School of Economics and Political Science, LSE Library.
  • Handle: RePEc:ehl:lserod:119445

    Download full text from publisher

    File URL: http://eprints.lse.ac.uk/119445/
    File Function: Open access version.
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zhen Li & Jie Chen & Eric Laber & Fang Liu & Richard Baumgartner, 2023. "Optimal Treatment Regimes: A Review and Empirical Comparison," International Statistical Review, International Statistical Institute, vol. 91(3), pages 427-463, December.
    2. Shi, Chengchun & Luo, Shikai & Le, Yuan & Zhu, Hongtu & Song, Rui, 2022. "Statistically efficient advantage learning for offline reinforcement learning in infinite horizons," LSE Research Online Documents on Economics 115598, London School of Economics and Political Science, LSE Library.
    3. Chengchun Shi & Sheng Zhang & Wenbin Lu & Rui Song, 2022. "Statistical inference of the value function for reinforcement learning in infinite‐horizon settings," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(3), pages 765-793, July.
    4. Shi, Chengchun & Wan, Runzhe & Song, Ge & Luo, Shikai & Zhu, Hongtu & Song, Rui, 2023. "A multiagent reinforcement learning framework for off-policy evaluation in two-sided markets," LSE Research Online Documents on Economics 117174, London School of Economics and Political Science, LSE Library.
    5. Zhang, Yingying & Shi, Chengchun & Luo, Shikai, 2023. "Conformal off-policy prediction," LSE Research Online Documents on Economics 118250, London School of Economics and Political Science, LSE Library.
    6. Pan Zhao & Yifan Cui, 2023. "A Semiparametric Instrumented Difference-in-Differences Approach to Policy Learning," Papers 2310.09545, arXiv.org.
    7. Zhou, Yunzhe & Qi, Zhengling & Shi, Chengchun & Li, Lexin, 2023. "Optimizing pessimism in dynamic treatment regimes: a Bayesian learning approach," LSE Research Online Documents on Economics 118233, London School of Economics and Political Science, LSE Library.
    8. Shi, Chengchun & Zhang, Shengxing & Lu, Wenbin & Song, Rui, 2022. "Statistical inference of the value function for reinforcement learning in infinite-horizon settings," LSE Research Online Documents on Economics 110882, London School of Economics and Political Science, LSE Library.
    9. Hyung Park & Eva Petkova & Thaddeus Tarpey & R. Todd Ogden, 2023. "Functional additive models for optimizing individualized treatment rules," Biometrics, The International Biometric Society, vol. 79(1), pages 113-126, March.
    10. Jingxiang Chen & Yufeng Liu & Donglin Zeng & Rui Song & Yingqi Zhao & Michael R. Kosorok, 2016. "Comment," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(515), pages 942-947, July.
    11. Xin Qiu & Donglin Zeng & Yuanjia Wang, 2018. "Estimation and evaluation of linear individualized treatment rules to guarantee performance," Biometrics, The International Biometric Society, vol. 74(2), pages 517-528, June.
    12. Shosei Sakaguchi, 2021. "Estimation of Optimal Dynamic Treatment Assignment Rules under Policy Constraints," Papers 2106.05031, arXiv.org, revised Apr 2024.
    13. Shosei Sakaguchi, 2024. "Robust Learning for Optimal Dynamic Treatment Regimes with Observational Data," Papers 2404.00221, arXiv.org.
    14. Baqun Zhang & Min Zhang, 2018. "C‐learning: A new classification framework to estimate optimal dynamic treatment regimes," Biometrics, The International Biometric Society, vol. 74(3), pages 891-899, September.
    15. Yuqian Zhang & Weijie Ji & Jelena Bradic, 2021. "Dynamic treatment effects: high-dimensional inference under model misspecification," Papers 2111.06818, arXiv.org, revised Jun 2023.
    16. Yunan Wu & Lan Wang, 2021. "Resampling‐based confidence intervals for model‐free robust inference on optimal treatment regimes," Biometrics, The International Biometric Society, vol. 77(2), pages 465-476, June.
    17. Kristin A. Linn & Eric B. Laber & Leonard A. Stefanski, 2017. "Interactive Q-Learning for Quantiles," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(518), pages 638-649, April.
    18. Zhang, Xiaoke & Xue, Wu & Wang, Qiyue, 2021. "Covariate balancing functional propensity score for functional treatments in cross-sectional observational studies," Computational Statistics & Data Analysis, Elsevier, vol. 163(C).
    19. Hyung G. Park & Danni Wu & Eva Petkova & Thaddeus Tarpey & R. Todd Ogden, 2023. "Bayesian Index Models for Heterogeneous Treatment Effects on a Binary Outcome," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 15(2), pages 397-418, July.
    20. Rebecca Hager & Anastasios A. Tsiatis & Marie Davidian, 2018. "Optimal two‐stage dynamic treatment regimes from a classification perspective with censored survival data," Biometrics, The International Biometric Society, vol. 74(4), pages 1180-1192, December.

    More about this item

    Keywords

    dynamic treatment regimes; mixed frequency data; principal component analysis; reinforcement learning;

    JEL classification:

    • C1 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General

