
Disjunctive Rule Lists

Authors

  • Ronilo Ragodos (Tippie College of Business, University of Iowa, Iowa City, Iowa 52242)

  • Tong Wang (Tippie College of Business, University of Iowa, Iowa City, Iowa 52242)

Abstract

In this study, we present an interpretable model for regression, the disjunctive rule list (DisRL). The research is motivated by the increasing need for model interpretability, especially in high-stakes domains such as medicine, where decisions directly affect people. DisRL is a generalized form of rule lists. A DisRL model consists of a list of disjunctive rules embedded in an if-else logic structure that stratifies the data space. Traditional decision trees and other rule list models in the literature stratify the feature space with single itemsets (an itemset is a conjunction of conditions); by contrast, each disjunctive rule in DisRL uses a set of itemsets to collectively cover a subregion of the feature space. In addition, a DisRL model is constructed under a global objective that balances predictive performance and model complexity. To train a DisRL model, we devise a hierarchical stochastic local search algorithm that exploits the properties of DisRL’s unique structure to improve search efficiency. The algorithm adopts the main structure of simulated annealing and customizes the proposal strategy for faster convergence. The algorithm also uses a prefix bound to narrow the search to a subset of the search space, effectively pruning it at each iteration. An ablation study shows the effectiveness of this pruning strategy. Experiments on public benchmark datasets demonstrate that DisRL outperforms baseline interpretable models, including decision trees and other rule-based regressors.
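
The structure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class names (DisjunctiveRule, DisRL), the feature names (age, bmi, smoker), the constant prediction per rule, and the form of the objective (mean squared error plus a penalty lam per itemset) are all assumptions made for illustration, since the abstract does not specify them. The sketch only shows how an ordered if-else list of rules, each a disjunction of itemsets (conjunctions of conditions), assigns a prediction to a data point.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, float]
Condition = Callable[[Row], bool]
Itemset = List[Condition]  # an itemset is a conjunction of conditions


@dataclass
class DisjunctiveRule:
    itemsets: List[Itemset]  # the rule fires if ANY itemset is satisfied
    prediction: float        # assumed: one constant prediction per rule

    def covers(self, row: Row) -> bool:
        # Disjunction over itemsets, conjunction within each itemset.
        return any(all(cond(row) for cond in itemset) for itemset in self.itemsets)


@dataclass
class DisRL:
    rules: List[DisjunctiveRule]  # evaluated in order (if / else-if ...)
    default: float                # final "else" prediction

    def predict(self, row: Row) -> float:
        for rule in self.rules:
            if rule.covers(row):
                return rule.prediction
        return self.default


def objective(model: DisRL, rows: List[Row], targets: List[float], lam: float = 0.01) -> float:
    # Hypothetical global objective: fit (MSE) plus complexity (itemset count).
    mse = sum((model.predict(r) - y) ** 2 for r, y in zip(rows, targets)) / len(rows)
    n_itemsets = sum(len(rule.itemsets) for rule in model.rules)
    return mse + lam * n_itemsets


# Toy model with made-up features and values.
model = DisRL(
    rules=[
        DisjunctiveRule(
            itemsets=[
                [lambda r: r["age"] > 60, lambda r: r["bmi"] > 30],  # age>60 AND bmi>30
                [lambda r: r["smoker"] == 1],                        # OR smoker
            ],
            prediction=9.0,
        ),
        DisjunctiveRule(itemsets=[[lambda r: r["age"] > 40]], prediction=5.0),
    ],
    default=2.0,
)

rows = [
    {"age": 65, "bmi": 32, "smoker": 0},
    {"age": 30, "bmi": 20, "smoker": 1},
    {"age": 50, "bmi": 20, "smoker": 0},
    {"age": 30, "bmi": 20, "smoker": 0},
]
predictions = [model.predict(r) for r in rows]
```

In this toy model the first rule covers a point through either of its two itemsets, which a single-itemset rule list would need two separate rules to express. A search procedure such as simulated annealing would then propose changes to the rules and compare candidate models by an objective of this general kind.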

Suggested Citation

  • Ronilo Ragodos & Tong Wang, 2022. "Disjunctive Rule Lists," INFORMS Journal on Computing, INFORMS, vol. 34(6), pages 3259-3276, November.
  • Handle: RePEc:inm:orijoc:v:34:y:2022:i:6:p:3259-3276
    DOI: 10.1287/ijoc.2022.1242

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/ijoc.2022.1242
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. C. P. Stephens & W. Baritompa, 1998. "Global Optimization Requires Global Information," Journal of Optimization Theory and Applications, Springer, vol. 96(3), pages 575-588, March.
    2. Stoica, R.S. & Gregori, P. & Mateu, J., 2005. "Simulated annealing and object point processes: Tools for analysis of spatial patterns," Stochastic Processes and their Applications, Elsevier, vol. 115(11), pages 1860-1882, November.
    3. Emmanuel Jordy Menvouta & Jolien Ponnet & Robin Van Oirbeek & Tim Verdonck, 2022. "mCube: Multinomial Micro-level reserving Model," Papers 2212.00101, arXiv.org.
    4. George Kapetanios, 2005. "Variable Selection using Non-Standard Optimisation of Information Criteria," Working Papers 533, Queen Mary University of London, School of Economics and Finance.
    5. Souvik Das & Ashwin Aravind & Ashish Cherukuri & Debasish Chatterjee, 2022. "Near-optimal solutions of convex semi-infinite programs via targeted sampling," Annals of Operations Research, Springer, vol. 318(1), pages 129-146, November.
    6. Fernandez Martinez, Roberto & Lostado Lorza, Ruben & Santos Delgado, Ana Alexandra & Piedra, Nelson, 2021. "Use of classification trees and rule-based models to optimize the funding assignment to research projects: A case study of UTPL," Journal of Informetrics, Elsevier, vol. 15(1).
    7. Höppner, Sebastiaan & Stripling, Eugen & Baesens, Bart & Broucke, Seppe vanden & Verdonck, Tim, 2020. "Profit driven decision trees for churn prediction," European Journal of Operational Research, Elsevier, vol. 284(3), pages 920-933.
    8. Pirlot, Marc, 1996. "General local search methods," European Journal of Operational Research, Elsevier, vol. 92(3), pages 493-511, August.
    9. Steinhofel, K. & Albrecht, A. & Wong, C. K., 1999. "Two simulated annealing-based heuristics for the job shop scheduling problem," European Journal of Operational Research, Elsevier, vol. 118(3), pages 524-548, November.
    10. Löwe, Matthias, 1997. "On the invariant measure of non-reversible simulated annealing," Statistics & Probability Letters, Elsevier, vol. 36(2), pages 189-193, December.
    11. Miclo, Laurent, 1995. "Remarques sur l'ergodicité des algorithmes de recuit simulé sur un graphe," Stochastic Processes and their Applications, Elsevier, vol. 58(2), pages 329-360, August.
    12. Kapetanios, George, 2006. "Cluster analysis of panel data sets using non-standard optimisation of information criteria," Journal of Economic Dynamics and Control, Elsevier, vol. 30(8), pages 1389-1408, August.
    13. Antonio Jiménez-Martín & Alfonso Mateos & Josefa Z. Hernández, 2021. "Aluminium Parts Casting Scheduling Based on Simulated Annealing," Mathematics, MDPI, vol. 9(7), pages 1-18, March.
    14. Van Buer, Michael G. & Woodruff, David L. & Olson, Rick T., 1999. "Solving the medium newspaper production/distribution problem," European Journal of Operational Research, Elsevier, vol. 115(2), pages 237-253, June.
    15. Zhang Lihao & Ye Zeyang & Deng Yuefan, 2019. "Parallel MCMC methods for global optimization," Monte Carlo Methods and Applications, De Gruyter, vol. 25(3), pages 227-237, September.
    16. Yiyo Kuo, 2014. "Design method using hybrid of line-type and circular-type routes for transit network system optimization," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 22(2), pages 600-613, July.
    17. Hajko, Vladimír, 2017. "The failure of Energy-Economy Nexus: A meta-analysis of 104 studies," Energy, Elsevier, vol. 125(C), pages 771-787.
    18. Broekmeulen, Rob A. C. M. & van Weert, Arjen & Saedt, Anton P. H., 2002. "Comparing three alternative optimisation methods for the treatment planning of bulbs," Agricultural Systems, Elsevier, vol. 72(1), pages 59-71, April.
    19. F. R. B. Cruz & A. R. Duarte & G. L. Souza, 2018. "Multi-objective performance improvements of general finite single-server queueing networks," Journal of Heuristics, Springer, vol. 24(5), pages 757-781, October.
    20. Susan Athey & Stefan Wager, 2021. "Policy Learning With Observational Data," Econometrica, Econometric Society, vol. 89(1), pages 133-161, January.
