
Convolution smoothing and online updating estimation for support vector machine

Authors

  • Kangning Wang (Shandong Technology and Business University)
  • Xiaoqing Meng (Shandong Technology and Business University)
  • Xiaofei Sun (Shandong Technology and Business University)

Abstract

The support vector machine (SVM) is a powerful statistical learning tool for binary classification. In real applications, streaming data are common: they arrive in batches and their cumulative size is unbounded. Because of the memory constraints of a single computer, the classical SVM, which must be fitted to the entire dataset at once, is unsuitable. Furthermore, the non-smoothness of the hinge loss in SVM makes the optimization computationally expensive. To overcome these issues, we first develop a convolution smoothing approach that yields a smooth and convex approximation to SVM. We then propose an online updating SVM, in which the estimators are renewed using only the current data batch and summary statistics of the historical data. In theory, we prove that the convolution-smoothed SVM approximates SVM adequately and that the two are asymptotically equivalent in inference. Furthermore, the online updating SVM achieves the same efficiency as the classical SVM applied to the entire dataset. Numerical experiments on both synthetic and real data validate the new methods.
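
As a rough illustration of the convolution smoothing idea, the sketch below convolves the hinge loss with a Gaussian kernel, a case where the smoothed loss has a closed form. The Gaussian kernel, the bandwidth symbol h, and all function names are assumptions made for this example; the abstract does not state which kernel or bandwidth the authors use, so this should not be read as their implementation.

```python
import math


def hinge(u):
    """Classical hinge loss max(0, 1 - u) for the margin u = y * f(x)."""
    return max(0.0, 1.0 - u)


def _phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _Phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def smoothed_hinge(u, h):
    """Hinge loss convolved with a Gaussian kernel of bandwidth h > 0.

    Closed form of E[max(0, 1 - u + h * Z)] with Z ~ N(0, 1).
    The Gaussian kernel is an assumption of this sketch.
    """
    t = 1.0 - u
    return t * _Phi(t / h) + h * _phi(t / h)


def smoothed_hinge_grad(u, h):
    """First derivative in u: -Phi((1 - u) / h), defined for every u."""
    return -_Phi((1.0 - u) / h)


def smoothed_hinge_hess(u, h):
    """Second derivative in u: phi((1 - u) / h) / h, which is non-negative."""
    return _phi((1.0 - u) / h) / h


if __name__ == "__main__":
    # The gap to the hinge loss shrinks as the bandwidth h decreases.
    grid = [k / 100.0 for k in range(-200, 401)]
    for h in (0.5, 0.1, 0.01):
        gap = max(smoothed_hinge(u, h) - hinge(u) for u in grid)
        print(f"h={h}: max gap on grid = {gap:.5f}")
```

Under this kernel the smoothed loss is twice differentiable and convex, with a non-negative second derivative everywhere, so gradient-based or Newton-type solvers can be applied directly. Its largest deviation from the hinge loss occurs at u = 1 and equals h / sqrt(2π), which vanishes as h goes to 0.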

Suggested Citation

  • Kangning Wang & Xiaoqing Meng & Xiaofei Sun, 2025. "Convolution smoothing and online updating estimation for support vector machine," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 34(1), pages 288-323, March.
  • Handle: RePEc:spr:testjl:v:34:y:2025:i:1:d:10.1007_s11749-024-00959-1
    DOI: 10.1007/s11749-024-00959-1



