
A mixed integer programming algorithm for minimizing the training sample misclassification cost in two-group classification

Authors

  • A. Duarte Silva
  • Antonie Stam

Abstract

In this paper, we introduce the Divide and Conquer (D&C) algorithm, a computationally attractive algorithm for determining classification rules that minimize the training sample misclassification cost in two-group classification. Such a classification rule can be derived using mixed integer programming (MIP) techniques. However, it is well documented that the complexity of MIP-based classification problems grows exponentially with the size of the training sample and the number of attributes describing the observations, so that special-purpose algorithms are required to solve even small problems within a reasonable computational time. The D&C algorithm derives its name from the fact that it relies, among other things, on partitioning the problem into smaller, more easily handled sub-problems, which renders it substantially faster than previously proposed algorithms. The D&C algorithm solves the problem to the exact optimal solution (i.e., it is not a heuristic that approximates the solution) and allows for the analysis of much larger training samples than previous methods. For instance, our computational experiments indicate that, on average, the D&C algorithm solves problems with 2 attributes and 500 observations more than 3 times faster, and problems with 5 attributes and 100 observations over 50 times faster, than Soltysik and Yarnold's software, which may be the fastest existing algorithm. We believe that the D&C algorithm contributes significantly to the field of classification analysis, because it substantially widens the array of data sets that can be analyzed meaningfully using methods that require MIP techniques, in particular methods that seek to minimize the misclassification cost in the training sample. The programs implementing the D&C algorithm are available from the authors upon request. Copyright Kluwer Academic Publishers 1997
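
To make the objective concrete, the following minimal Python sketch finds, by exhaustive search, the cutoff that minimizes the total training sample misclassification cost when each observation has a single attribute. It is an illustration only: it is neither the D&C algorithm nor an MIP formulation from the paper. The function name best_threshold_rule, the cost parameters cost1 and cost2, and the sample data are all hypothetical.

```python
from typing import List, Tuple


def best_threshold_rule(
    x1: List[float],
    x2: List[float],
    cost1: float = 1.0,
    cost2: float = 1.0,
) -> Tuple[float, float, int]:
    """Return (total_cost, cutoff, direction) with minimal training cost.

    x1, x2  : attribute values of the group 1 and group 2 training samples.
    cost1   : cost of misclassifying one group 1 observation (assumed).
    cost2   : cost of misclassifying one group 2 observation (assumed).
    direction = +1 means "assign to group 1 if x <= cutoff";
    direction = -1 means "assign to group 1 if x > cutoff".
    """
    values = sorted(set(x1) | set(x2))
    # Only cutoffs between consecutive distinct values (plus one below and
    # one above all data) can change which observations are misclassified.
    candidates = [values[0] - 1.0]
    candidates += [(a + b) / 2.0 for a, b in zip(values, values[1:])]
    candidates.append(values[-1] + 1.0)

    best = None
    for c in candidates:
        for direction in (1, -1):
            if direction == 1:
                miss1 = sum(1 for v in x1 if v > c)   # group 1 sent to group 2
                miss2 = sum(1 for v in x2 if v <= c)  # group 2 sent to group 1
            else:
                miss1 = sum(1 for v in x1 if v <= c)
                miss2 = sum(1 for v in x2 if v > c)
            total = cost1 * miss1 + cost2 * miss2
            if best is None or total < best[0]:
                best = (total, c, direction)
    return best


# Hypothetical training data: one group 1 observation (6.0) overlaps group 2.
group1 = [1.0, 2.0, 3.0, 6.0]
group2 = [4.0, 5.0, 7.0, 8.0]

equal_costs = best_threshold_rule(group1, group2)
costly_group1 = best_threshold_rule(group1, group2, cost1=5.0, cost2=1.0)
print(equal_costs)
print(costly_group1)
```

With equal costs the search accepts one misclassified group 1 observation; raising cost1 moves the cutoff so that no group 1 observation is misclassified. Exhaustive search of this kind is only practical for a single attribute, which is consistent with the abstract's point that problems with more attributes and larger samples call for special-purpose exact algorithms.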

Suggested Citation

  • A. Duarte Silva & Antonie Stam, 1997. "A mixed integer programming algorithm for minimizing the training sample misclassification cost in two-group classification," Annals of Operations Research, Springer, vol. 74(0), pages 129-157, November.
  • Handle: RePEc:spr:annopr:v:74:y:1997:i:0:p:129-157:10.1023/a:1018962102794
    DOI: 10.1023/A:1018962102794

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1023/A:1018962102794
    Download Restriction: Access to full text is restricted to subscribers.


    Citations

    Cited by:

    1. Parag Pendharkar & Marvin Troutt, 2014. "Interactive classification using data envelopment analysis," Annals of Operations Research, Springer, vol. 214(1), pages 125-141, March.
    2. Asparoukhov, Ognian K. & Krzanowski, Wojtek J., 2001. "A comparison of discriminant procedures for binary variables," Computational Statistics & Data Analysis, Elsevier, vol. 38(2), pages 139-160, December.
    3. Pendharkar, Parag C. & Troutt, Marvin D., 2011. "DEA based dimensionality reduction for classification problems satisfying strict non-satiety assumption," European Journal of Operational Research, Elsevier, vol. 212(1), pages 155-163, July.
    4. Pedro Duarte Silva, A., 2017. "Optimization approaches to Supervised Classification," European Journal of Operational Research, Elsevier, vol. 261(2), pages 772-788.
    5. Loucopoulos, Constantine, 2001. "Three-group classification with unequal misclassification costs: a mathematical programming approach," Omega, Elsevier, vol. 29(3), pages 291-297, June.
