IDEAS home Printed from https://ideas.repec.org/a/spr/compst/v37y2022i5d10.1007_s00180-022-01195-7.html

Oblique decision tree induction by cross-entropy optimization based on the von Mises–Fisher distribution

Author

Listed:
  • Ferdinand Bollwein

    (Clausthal University of Technology)

  • Stephan Westphal

    (Clausthal University of Technology)

Abstract

Oblique decision trees recursively divide the feature space by using splits based on linear combinations of attributes. Compared to their univariate counterparts, which use only a single attribute per split, they are often smaller and more accurate. A common approach to learning decision trees is to introduce splits on a training set iteratively in a top-down manner; however, determining a single optimal oblique split is, in general, computationally intractable. Therefore, one has to rely on heuristics to find near-optimal splits. In this paper, we adapt the cross-entropy optimization method to tackle this problem. The approach is motivated geometrically by the observation that equivalent oblique splits can be interpreted as connected regions on a unit hypersphere which are defined by the samples in the training data. In each iteration, the algorithm samples multiple candidate solutions from this hypersphere using the von Mises–Fisher distribution, which is parameterized by a mean direction and a concentration parameter. These parameters are then updated based on the best-performing samples such that, when the algorithm terminates, a high probability mass is assigned to a region of near-optimal solutions. Our experimental results show that the proposed method is well-suited for the induction of compact and accurate oblique decision trees in a short amount of time.
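
The loop described in the abstract (sample candidate splits from the hypersphere, keep the best ones, refit the mean direction and concentration) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: splits are scored by weighted Gini impurity, samples are drawn with Wood's rejection scheme, the concentration is refit with a widely used closed-form approximation and capped at `kappa_max`, and all function and parameter names (`sample_vmf`, `split_impurity`, `ce_oblique_split`, `elite_frac`, and so on) are hypothetical.

```python
import math
import random


def sample_vmf(mu, kappa, rng):
    """Draw one unit vector from vMF(mu, kappa) using Wood's rejection scheme."""
    p = len(mu)  # ambient dimension, must be >= 2; mu must have unit length
    b = (p - 1) / (2 * kappa + math.sqrt(4 * kappa ** 2 + (p - 1) ** 2))
    x0 = (1 - b) / (1 + b)
    c = kappa * x0 + (p - 1) * math.log(1 - x0 ** 2)
    while True:  # sample t, the component of the result along mu
        z = rng.betavariate((p - 1) / 2, (p - 1) / 2)
        t = (1 - (1 + b) * z) / (1 - (1 - b) * z)
        u = 1.0 - rng.random()  # lies in (0, 1], so log(u) is defined
        if kappa * t + (p - 1) * math.log(1 - x0 * t) - c >= math.log(u):
            break
    # Uniform random direction orthogonal to mu (project a Gaussian vector).
    v = [rng.gauss(0, 1) for _ in range(p)]
    dot = sum(vi * mi for vi, mi in zip(v, mu))
    v = [vi - dot * mi for vi, mi in zip(v, mu)]
    norm = math.sqrt(sum(vi * vi for vi in v))
    s = math.sqrt(max(0.0, 1 - t * t))
    return [t * mi + s * vi / norm for mi, vi in zip(mu, v)]


def gini(counts):
    """Gini impurity of a dict mapping class label -> count."""
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return 1.0 - sum((k / n) ** 2 for k in counts.values())


def split_impurity(w, X, y):
    """Weighted Gini impurity of the split w[:-1] . x + w[-1] <= 0 (assumed criterion)."""
    left, right = {}, {}
    for x, label in zip(X, y):
        value = sum(wi * xi for wi, xi in zip(w, x)) + w[-1]
        side = left if value <= 0 else right
        side[label] = side.get(label, 0) + 1
    n = len(y)
    return sum(sum(s.values()) / n * gini(s) for s in (left, right))


def ce_oblique_split(X, y, n_samples=100, elite_frac=0.1, n_iter=30,
                     kappa_max=1e3, seed=0):
    """Cross-entropy search over unit vectors (weights plus offset)."""
    rng = random.Random(seed)
    p = len(X[0]) + 1  # one weight per attribute plus the offset
    mu, kappa = [1.0] + [0.0] * (p - 1), 0.0  # kappa = 0: uniform start
    n_elite = max(2, int(elite_frac * n_samples))
    best_val, best_w = float("inf"), None
    for _ in range(n_iter):
        cands = [sample_vmf(mu, kappa, rng) for _ in range(n_samples)]
        scored = sorted(((split_impurity(w, X, y), w) for w in cands),
                        key=lambda pair: pair[0])
        if scored[0][0] < best_val:
            best_val, best_w = scored[0]
        elite = [w for _, w in scored[:n_elite]]
        # Refit: mean direction is the normalized resultant of the elite set.
        r = [sum(col) for col in zip(*elite)]
        r_norm = math.sqrt(sum(ri * ri for ri in r))
        if r_norm == 0.0:
            continue
        mu = [ri / r_norm for ri in r]
        # Closed-form approximation of the concentration (an assumption here).
        r_bar = min(r_norm / n_elite, 1 - 1e-9)
        kappa = min(kappa_max, r_bar * (p - r_bar ** 2) / (1 - r_bar ** 2))
    return best_w, best_val


# Toy data: two classes separable by a line that is not axis-parallel.
X = [(0, 0), (1, 0), (0, 1), (0.5, 0.5), (3, 3), (4, 2), (2, 4), (3, 4)]
y = [0, 0, 0, 0, 1, 1, 1, 1]
w, impurity = ce_oblique_split(X, y)
print("split vector:", w, "impurity:", impurity)
```

Because a split is unchanged when its weight vector and offset are scaled by a positive constant, the sketch searches only over unit vectors, which is what makes a distribution on the hypersphere a natural sampling model. Note that `w` and `-w` induce the same partition with the sides swapped, so early iterations may collect elite samples from two antipodal regions before the distribution concentrates on one of them.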

Suggested Citation

  • Ferdinand Bollwein & Stephan Westphal, 2022. "Oblique decision tree induction by cross-entropy optimization based on the von Mises–Fisher distribution," Computational Statistics, Springer, vol. 37(5), pages 2203-2229, November.
  • Handle: RePEc:spr:compst:v:37:y:2022:i:5:d:10.1007_s00180-022-01195-7
    DOI: 10.1007/s00180-022-01195-7

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00180-022-01195-7
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Blanquero, Rafael & Carrizosa, Emilio & Molero-Río, Cristina & Romero Morales, Dolores, 2020. "Sparsity in optimal randomized classification trees," European Journal of Operational Research, Elsevier, vol. 284(1), pages 255-272.
    2. Rubinstein, Reuven Y., 1997. "Optimization of computer simulation models with rare events," European Journal of Operational Research, Elsevier, vol. 99(1), pages 89-112, May.
    3. Pieter-Tjerk de Boer & Dirk Kroese & Shie Mannor & Reuven Rubinstein, 2005. "A Tutorial on the Cross-Entropy Method," Annals of Operations Research, Springer, vol. 134(1), pages 19-67, February.
    4. Reuven Rubinstein, 1999. "The Cross-Entropy Method for Combinatorial and Continuous Optimization," Methodology and Computing in Applied Probability, Springer, vol. 1(2), pages 127-190, September.
    5. Suvrit Sra, 2012. "A short note on parameter approximation for von Mises-Fisher distributions: and a fast implementation of I_s(x)," Computational Statistics, Springer, vol. 27(1), pages 177-190, March.
    6. Gary Ulrich, 1984. "Computer Generation of Distributions on the M‐Sphere," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 33(2), pages 158-163, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mattrand, C. & Bourinet, J.-M., 2014. "The cross-entropy method for reliability assessment of cracked structures subjected to random Markovian loads," Reliability Engineering and System Safety, Elsevier, vol. 123(C), pages 171-182.
    2. Nguyen, Hoa T.M. & Chow, Andy H.F. & Ying, Cheng-shuo, 2021. "Pareto routing and scheduling of dynamic urban rail transit services with multi-objective cross entropy method," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 156(C).
    3. Hao Su & Qun Niu & Zhile Yang, 2023. "Optimal Power Flow Using Improved Cross-Entropy Method," Energies, MDPI, vol. 16(14), pages 1-33, July.
    4. Dirk P. Kroese & Sergey Porotsky & Reuven Y. Rubinstein, 2006. "The Cross-Entropy Method for Continuous Multi-Extremal Optimization," Methodology and Computing in Applied Probability, Springer, vol. 8(3), pages 383-407, September.
    5. Jiaqiao Hu & Michael C. Fu & Steven I. Marcus, 2007. "A Model Reference Adaptive Search Method for Global Optimization," Operations Research, INFORMS, vol. 55(3), pages 549-568, June.
    6. R. Y. Rubinstein, 2005. "A Stochastic Minimum Cross-Entropy Method for Combinatorial Optimization and Rare-event Estimation," Methodology and Computing in Applied Probability, Springer, vol. 7(1), pages 5-50, March.
    7. Kin-Ping Hui, 2011. "Cooperative Cross-Entropy method for generating entangled networks," Annals of Operations Research, Springer, vol. 189(1), pages 205-214, September.
    8. Mathieu Balesdent & Jérôme Morio & Loïc Brevault, 2016. "Rare Event Probability Estimation in the Presence of Epistemic Uncertainty on Input Probability Distribution Parameters," Methodology and Computing in Applied Probability, Springer, vol. 18(1), pages 197-216, March.
    9. K.-P. Hui & N. Bean & M. Kraetzl & Dirk Kroese, 2005. "The Cross-Entropy Method for Network Reliability Estimation," Annals of Operations Research, Springer, vol. 134(1), pages 101-118, February.
    10. Fahimnia, Behnam & Sarkis, Joseph & Eshragh, Ali, 2015. "A tradeoff model for green supply chain planning: A leanness-versus-greenness analysis," Omega, Elsevier, vol. 54(C), pages 173-190.
    11. Joshua C. C. Chan & Liana Jacobi & Dan Zhu, 2022. "An automated prior robustness analysis in Bayesian model comparison," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 37(3), pages 583-602, April.
    12. Reuven Y. Rubinstein, 2006. "How Many Needles are in a Haystack, or How to Solve #P-Complete Counting Problems Fast," Methodology and Computing in Applied Probability, Springer, vol. 8(1), pages 5-51, March.
    13. Ad Ridder & Bruno Tuffin, 2012. "Probabilistic Bounded Relative Error Property for Learning Rare Event Simulation Techniques," Tinbergen Institute Discussion Papers 12-103/III, Tinbergen Institute.
    14. Satyajith Amaran & Nikolaos V. Sahinidis & Bikram Sharda & Scott J. Bury, 2016. "Simulation optimization: a review of algorithms and applications," Annals of Operations Research, Springer, vol. 240(1), pages 351-380, May.
    15. Caballero, Rafael & Hernández-Díaz, Alfredo G. & Laguna, Manuel & Molina, Julián, 2015. "Cross entropy for multiobjective combinatorial optimization problems with linear relaxations," European Journal of Operational Research, Elsevier, vol. 243(2), pages 362-368.
    16. Ad Ridder, 2004. "Importance Sampling Simulations of Markovian Reliability Systems using Cross Entropy," Tinbergen Institute Discussion Papers 04-018/4, Tinbergen Institute.
    17. Tito Homem-de-Mello, 2007. "A Study on the Cross-Entropy Method for Rare-Event Probability Estimation," INFORMS Journal on Computing, INFORMS, vol. 19(3), pages 381-394, August.
    18. Masoud Esmaeilikia & Behnam Fahimnia & Joseph Sarkis & Kannan Govindan & Arun Kumar & John Mo, 2016. "A tactical supply chain planning model with multiple flexibility options: an empirical evaluation," Annals of Operations Research, Springer, vol. 244(2), pages 429-454, September.
    19. Fahimnia, Behnam & Sarkis, Joseph & Choudhary, Alok & Eshragh, Ali, 2015. "Tactical supply chain planning under a carbon tax policy scheme: A case study," International Journal of Production Economics, Elsevier, vol. 164(C), pages 206-215.
    20. Ali Eshragh & Jerzy Filar & Michael Haythorpe, 2011. "A hybrid simulation-optimization algorithm for the Hamiltonian cycle problem," Annals of Operations Research, Springer, vol. 189(1), pages 103-125, September.
