
Using machine learning to assess the predictive potential of standardized nursing data for home healthcare case-mix classification


  • Maud H. Korte

    (Dutch Healthcare Authority (NZa)
    Tilburg University)

  • Gertjan S. Verhoeven

    (Dutch Healthcare Authority (NZa)
    Tilburg University)

  • Arianne M. J. Elissen

    (Maastricht University)

  • Silke F. Metzelthin

    (Maastricht University)

  • Dirk Ruwaard

    (Maastricht University)

  • Misja C. Mikkers

    (Dutch Healthcare Authority (NZa)
    Tilburg University)


Background: The Netherlands is currently investigating the feasibility of moving from fee-for-service to prospective payments for home healthcare, which would require a suitable case-mix system. In 2017, health insurers mandated a preliminary case-mix system as a first step towards generating information on client differences in relation to care use. Home healthcare providers have also increasingly adopted standardized nursing terminology (SNT) as part of their electronic health records (EHRs), providing novel data for predictive modelling.

Objective: To explore the predictive potential of SNT data for improvement of the existing preliminary Dutch case-mix classification for home healthcare utilization.

Methods: We extracted client-level data from the EHRs of a large home healthcare provider, including data from the existing Dutch case-mix system, SNT data (specifically, NANDA-I) and the hours of home healthcare provided. We evaluated the predictive accuracy of the case-mix system and the SNT data separately, and combined, using the machine learning algorithm Random Forest.

Results: The case-mix system had a predictive performance of 22.4% cross-validated R-squared and 6.2% cross-validated Cumming’s Prediction Measure (CPM). Adding SNT data led to a substantial relative improvement in predicting home healthcare hours, yielding 32.1% R-squared and 15.4% CPM.

Discussion: The existing preliminary Dutch case-mix system distinguishes client needs to some degree, but not sufficiently. The results indicate that routinely collected SNT data contain sufficient additional predictive value to warrant further research for use in case-mix system design.
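The two accuracy measures reported in the Results can be illustrated with a minimal sketch. It assumes the commonly used definitions: R-squared as one minus the ratio of squared prediction error to squared deviation from the mean, and CPM as one minus the ratio of absolute prediction error to absolute deviation from the mean. The abstract does not spell out either formula, so both definitions are assumptions here, as are the function names and the toy data. In a cross-validated setting, each prediction would come from a model fitted without that client.

```python
from statistics import fmean


def r_squared(observed, predicted):
    """1 - SSE/SST, with SST measured around the mean of the observed values."""
    mean_obs = fmean(observed)
    sse = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    sst = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - sse / sst


def cpm(observed, predicted):
    """1 - sum|y - p| / sum|y - mean(y)| (assumed definition of CPM)."""
    mean_obs = fmean(observed)
    abs_err = sum(abs(y - p) for y, p in zip(observed, predicted))
    abs_dev = sum(abs(y - mean_obs) for y in observed)
    return 1.0 - abs_err / abs_dev


def relative_gain(before, after):
    """Relative improvement of a score, e.g. 0.5 means 50% higher."""
    return (after - before) / before


# Hypothetical hours of care for five clients and held-out predictions.
hours = [2.0, 4.0, 6.0, 8.0, 10.0]
predicted = [3.0, 4.0, 6.0, 7.0, 9.0]

print(round(r_squared(hours, predicted), 3))  # 0.925
print(round(cpm(hours, predicted), 3))        # 0.75
```

Applied to the figures in the Results, relative_gain(22.4, 32.1) is about 0.43 for R-squared and relative_gain(6.2, 15.4) is about 1.48 for CPM. CPM is based on absolute rather than squared errors, so it is less sensitive than R-squared to a few clients with very high hours.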

Suggested Citation

  • Maud H. Korte & Gertjan S. Verhoeven & Arianne M. J. Elissen & Silke F. Metzelthin & Dirk Ruwaard & Misja C. Mikkers, 2020. "Using machine learning to assess the predictive potential of standardized nursing data for home healthcare case-mix classification," The European Journal of Health Economics, Springer;Deutsche Gesellschaft für Gesundheitsökonomie (DGGÖ), vol. 21(8), pages 1121-1129, November.
  • Handle: RePEc:spr:eujhec:v:21:y:2020:i:8:d:10.1007_s10198-020-01213-9
    DOI: 10.1007/s10198-020-01213-9

    More about this item


    Keywords:

    Case-mix; Home care; Electronic health records; Machine learning; Predictive modelling

    JEL classification:

    • I13 - Health, Education, and Welfare - Health - Health Insurance, Public and Private
    • I11 - Health, Education, and Welfare - Health - Analysis of Health Care Markets
    • I18 - Health, Education, and Welfare - Health - Government Policy; Regulation; Public Health
    • C53 - Mathematical and Quantitative Methods - Econometric Modeling - Forecasting and Prediction Models; Simulation Methods
