
A penalized criterion for variable selection in classification

Author

Listed:
  • Mary-Huard, Tristan
  • Robin, Stéphane
  • Daudin, Jean-Jacques

Abstract

In this paper, the problem of variable selection in classification is considered. On the basis of recent developments in model selection theory, we provide a criterion based on penalized empirical risk, where the penalization explicitly takes into account the number of variables of the considered models. Moreover, we give an oracle-type inequality that non-asymptotically guarantees the performance of the resulting classification rule. We discuss the optimality of the proposed criterion and present an application of the main result to backward and forward selection procedures.
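
As a rough illustration of the idea in the abstract, the sketch below runs a forward selection and keeps the subset that minimizes empirical risk plus a penalty that grows with the number of selected variables. Everything specific here is an assumption made for the example and is not taken from the paper: the nearest-centroid classifier, the function names, the constant `c`, and the penalty shape `c * sqrt(k * (1 + log p) / n)`. This page does not state the paper's actual penalty.

```python
import numpy as np


def nearest_centroid_error(X, y, cols):
    """Training misclassification rate of a nearest-centroid rule on columns `cols`.

    `y` must hold non-negative integer class labels.
    """
    if not cols:
        # With no variables, predict the majority class for every point.
        majority = np.bincount(y).argmax()
        return float(np.mean(y != majority))
    Z = X[:, cols]
    classes = np.unique(y)
    centroids = np.stack([Z[y == c].mean(axis=0) for c in classes])
    # Squared Euclidean distance from every point to every class centroid.
    dists = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    pred = classes[dists.argmin(axis=1)]
    return float(np.mean(pred != y))


def penalty(k, p, n, c=1.0):
    """Hypothetical penalty, increasing in the number k of selected variables.

    Illustrative only; this is NOT the penalty derived in the paper.
    """
    return c * np.sqrt(k * (1.0 + np.log(p)) / n)


def forward_select(X, y, c=1.0):
    """Greedy forward selection scored by empirical risk + penalty(k)."""
    n, p = X.shape
    selected, remaining = [], list(range(p))
    best_subset = []
    best_crit = nearest_centroid_error(X, y, []) + penalty(0, p, n, c)
    while remaining:
        # Add the variable that lowers the empirical risk the most.
        err, j = min((nearest_centroid_error(X, y, selected + [j]), j)
                     for j in remaining)
        selected.append(j)
        remaining.remove(j)
        # Compare nested models of different sizes with the penalized criterion.
        crit = err + penalty(len(selected), p, n, c)
        if crit < best_crit:
            best_crit, best_subset = crit, list(selected)
    return best_subset, best_crit
```

Calling `forward_select(X, y)` with an `(n, p)` array `X` and integer labels `y` returns the selected column indices and the criterion value. A backward procedure would follow the same pattern, starting from all `p` variables and removing one at a time.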

Suggested Citation

  • Mary-Huard, Tristan & Robin, Stéphane & Daudin, Jean-Jacques, 2007. "A penalized criterion for variable selection in classification," Journal of Multivariate Analysis, Elsevier, vol. 98(4), pages 695-705, April.
  • Handle: RePEc:eee:jmvana:v:98:y:2007:i:4:p:695-705

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0047-259X(06)00092-3
    Download Restriction: Full text for ScienceDirect subscribers only

    References listed on IDEAS

    1. Peter L. Bartlett & Stéphane Boucheron & Gábor Lugosi, 2000. "Model selection and error estimation," Economics Working Papers 508, Department of Economics and Business, Universitat Pompeu Fabra.
    2. C. E. McHenry, 1978. "Computation of a Best Subset in Multivariate Analysis," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 27(3), pages 291-296, November.

    Citations

    Cited by:

    1. A. Iduseri & J. E. Osemwenkhae, 2018. "A New Approach for Improving Classification Accuracy in Predictive Discriminant Analysis," Annals of Data Science, Springer, vol. 5(3), pages 339-357, September.
    2. Maugis, C. & Celeux, G. & Martin-Magniette, M.-L., 2011. "Variable selection in model-based discriminant analysis," Journal of Multivariate Analysis, Elsevier, vol. 102(10), pages 1374-1387, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Fischer, Aurélie, 2010. "Quantization and clustering with Bregman divergences," Journal of Multivariate Analysis, Elsevier, vol. 101(9), pages 2207-2221, October.
    2. Eric Mbakop & Max Tabord‐Meehan, 2021. "Model Selection for Treatment Choice: Penalized Welfare Maximization," Econometrica, Econometric Society, vol. 89(2), pages 825-848, March.
    3. Thomas M. Russell, 2020. "Policy Transforms and Learning Optimal Policies," Papers 2012.11046, arXiv.org.
    4. Daudin, Jean-Jacques & Mary-Huard, Tristan, 2008. "Estimation of the conditional risk in classification: The swapping method," Computational Statistics & Data Analysis, Elsevier, vol. 52(6), pages 3220-3232, February.
    5. Duarte Silva, António Pedro, 2001. "Efficient Variable Screening for Multivariate Analysis," Journal of Multivariate Analysis, Elsevier, vol. 76(1), pages 35-62, January.
    6. Cipollini, Francesca & Oneto, Luca & Coraddu, Andrea & Murphy, Alan John & Anguita, Davide, 2018. "Condition-based maintenance of naval propulsion systems: Data analysis with minimal feedback," Reliability Engineering and System Safety, Elsevier, vol. 177(C), pages 12-23.
    7. Hutter, Marcus & Tran, Minh-Ngoc, 2010. "Model selection with the Loss Rank Principle," Computational Statistics & Data Analysis, Elsevier, vol. 54(5), pages 1288-1306, May.
    8. Olivier Bousquet, 2003. "New approaches to statistical learning theory," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 55(2), pages 371-389, June.
    9. Alessio Sancetta, 2010. "Bootstrap model selection for possibly dependent and heterogeneous data," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 62(3), pages 515-546, June.
    10. Jiun-Hua Su, 2019. "Model Selection in Utility-Maximizing Binary Prediction," Papers 1903.00716, arXiv.org, revised Jul 2020.
    11. Adam B. Kashlak & John A. D. Aston & Richard Nickl, 2019. "Inference on Covariance Operators via Concentration Inequalities: k-sample Tests, Classification, and Clustering via Rademacher Complexities," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 81(1), pages 214-243, February.
