IDEAS home Printed from https://ideas.repec.org/p/arx/papers/math-0703811.html
   My bibliography  Save this paper

Suboptimality of Penalized Empirical Risk Minimization in Classification

Author

Listed:
  • Guillaume Lecu'e

    (PMA)

Abstract

Let $\cF$ be a set of $M$ classification procedures with values in $[-1,1]$. Given a loss function, we want to construct a procedure which mimics at the best possible rate the best procedure in $\cF$. This fastest rate is called optimal rate of aggregation. Considering a continuous scale of loss functions with various types of convexity, we prove that optimal rates of aggregation can be either $((\log M)/n)^{1/2}$ or $(\log M)/n$. We prove that, if all the $M$ classifiers are binary, the (penalized) Empirical Risk Minimization procedures are suboptimal (even under the margin/low noise condition) when the loss function is somewhat more than convex, whereas, in that case, aggregation procedures with exponential weights achieve the optimal rate of aggregation.

Suggested Citation

  • Guillaume Lecu'e, 2007. "Suboptimality of Penalized Empirical Risk Minimization in Classification," Papers math/0703811, arXiv.org.
  • Handle: RePEc:arx:papers:math/0703811
    as

    Download full text from publisher

    File URL: http://arxiv.org/pdf/math/0703811
    File Function: Latest version
    Download Restriction: no
    ---><---

    References listed on IDEAS

    as
    1. Bartlett, Peter L. & Jordan, Michael I. & McAuliffe, Jon D., 2006. "Convexity, Classification, and Risk Bounds," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 138-156, March.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Steffen Borgwardt & Rafael M. Frongillo, 2019. "Power Diagram Detection with Applications to Information Elicitation," Journal of Optimization Theory and Applications, Springer, vol. 181(1), pages 184-196, April.
    2. Adam N. Elmachtoub & Paul Grigas, 2022. "Smart “Predict, then Optimize”," Management Science, INFORMS, vol. 68(1), pages 9-26, January.
    3. Nam Ho-Nguyen & Fatma Kılınç-Karzan, 2022. "Risk Guarantees for End-to-End Prediction and Optimization Processes," Management Science, INFORMS, vol. 68(12), pages 8680-8698, December.
    4. Ghysels, Eric & Babii, Andrii & Chen, Xi & Kumar, Rohit, 2020. "Binary Choice with Asymmetric Loss in a Data-Rich Environment: Theory and an Application to Racial Justice," CEPR Discussion Papers 15418, C.E.P.R. Discussion Papers.
    5. Christmann, Andreas & Steinwart, Ingo & Hubert, Mia, 2006. "Robust Learning from Bites for Data Mining," Technical Reports 2006,03, Technische Universität Dortmund, Sonderforschungsbereich 475: Komplexitätsreduktion in multivariaten Datenstrukturen.
    6. Xiang Zhang & Yichao Wu & Lan Wang & Runze Li, 2016. "Variable selection for support vector machines in moderately high dimensions," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 78(1), pages 53-76, January.
    7. Yaoyao Xu & Menggang Yu & Ying‐Qi Zhao & Quefeng Li & Sijian Wang & Jun Shao, 2015. "Regularized outcome weighted subgroup identification for differential treatment effects," Biometrics, The International Biometric Society, vol. 71(3), pages 645-653, September.
    8. Pierre Alquier & Vincent Cottet & Guillaume Lecué, 2017. "Estimation bounds and sharp oracle inequalities of regularized procedures with Lipschitz loss functions," Working Papers 2017-30, Center for Research in Economics and Statistics.
    9. Andrew Bennett & Nathan Kallus, 2020. "Efficient Policy Learning from Surrogate-Loss Classification Reductions," Papers 2002.05153, arXiv.org.
    10. Christmann, Andreas & Steinwart, Ingo & Hubert, Mia, 2007. "Robust learning from bites for data mining," Computational Statistics & Data Analysis, Elsevier, vol. 52(1), pages 347-361, September.
    11. Steinwart, Ingo & Hush, Don & Scovel, Clint, 2009. "Learning from dependent observations," Journal of Multivariate Analysis, Elsevier, vol. 100(1), pages 175-194, January.
    12. Xiaotong Shen & Lifeng Wang, 2007. "Discussion of ``2004 IMS Medallion Lecture: Local Rademacher complexities and oracle inequalities in risk minimization'' by V. Koltchinskii," Papers 0708.0121, arXiv.org.
    13. Peter L. Bartlett & Shahar Mendelson, 2007. "Discussion of "2004 IMS Medallion Lecture: Local Rademacher complexities and oracle inequalities in risk minimization" by V. Koltchinskii," Papers 0708.0089, arXiv.org.
    14. Toru Kitagawa & Shosei Sakaguchi & Aleksey Tetenov, 2021. "Constrained Classification and Policy Learning," Papers 2106.12886, arXiv.org, revised Jul 2023.
    15. Hung Yi Lee & Charles Hernandez & Hongcheng Liu, 2023. "Regularized sample average approximation for high-dimensional stochastic optimization under low-rankness," Journal of Global Optimization, Springer, vol. 85(2), pages 257-282, February.
    16. Jun-ya Gotoh & Stan Uryasev, 2017. "Support vector machines based on convex risk functions and general norms," Annals of Operations Research, Springer, vol. 249(1), pages 301-328, February.
    17. Rubin Daniel B. & van der Laan Mark J., 2012. "Statistical Issues and Limitations in Personalized Medicine Research with Clinical Trials," The International Journal of Biostatistics, De Gruyter, vol. 8(1), pages 1-20, July.
    18. Zhiyu Ma & Shaowen Yao & Liwen Wu & Song Gao & Yunqi Zhang, 2022. "Hateful Memes Detection Based on Multi-Task Learning," Mathematics, MDPI, vol. 10(23), pages 1-16, November.
    19. Lee, Yoonkyung & Wang, Rui, 2015. "Does modeling lead to more accurate classification?: A study of relative efficiency in linear classification," Journal of Multivariate Analysis, Elsevier, vol. 133(C), pages 232-250.
    20. Aakil M. Caunhye & Douglas Alem, 2023. "Practicable robust stochastic optimization under divergence measures with an application to equitable humanitarian response planning," OR Spectrum: Quantitative Approaches in Management, Springer;Gesellschaft für Operations Research e.V., vol. 45(3), pages 759-806, September.

    More about this item

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:arx:papers:math/0703811. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: arXiv administrators (email available below). General contact details of provider: http://arxiv.org/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.