Printed from https://ideas.repec.org/a/spr/advdac/v12y2018i4d10.1007_s11634-015-0227-5.html

Ensemble of a subset of kNN classifiers

Author

Listed:
  • Asma Gul

    (University of Essex
    Shaheed Benazir Bhutto Women University)

  • Aris Perperoglou

    (University of Essex)

  • Zardad Khan

    (University of Essex
    Abdul Wali Khan University)

  • Osama Mahmoud

    (University of Essex)

  • Miftahuddin Miftahuddin

    (University of Essex)

  • Werner Adler

    (University of Erlangen-Nuremberg)

  • Berthold Lausen

    (University of Essex)

Abstract

Combining multiple classifiers, known as ensemble methods, can substantially improve the prediction performance of learning algorithms, especially in the presence of non-informative features in the data sets. We propose an ensemble of a subset of kNN classifiers, ESkNN, for classification tasks, constructed in two steps. First, we choose classifiers based on their individual performance, measured by out-of-sample accuracy. The selected classifiers are then combined sequentially, starting from the best model, and assessed for collective performance on a validation data set. We evaluate our method on benchmark data sets, both with their original features and with some added non-informative features. The results are compared with the usual kNN, bagged kNN, random kNN, the multiple feature subset method, random forest and support vector machines. Our experimental comparisons on benchmark classification problems and simulated data sets reveal that the proposed ensemble gives better classification performance than the usual kNN and its ensembles, and performs comparably to random forest and support vector machines.
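
The two-step construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: each candidate kNN model is built on a random feature subset, "out-of-sample accuracy" is computed on a separate held-out split, a model is kept in step two only if it strictly improves the ensemble's validation accuracy, and the ensemble votes by simple majority. The names `make_data`, `knn_predict`, `ensemble_predict` and `fit_esknn_sketch` are hypothetical.

```python
import random
from collections import Counter

def make_data(n, p=6, seed=0):
    """Toy data (assumed): feature 0 is informative, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(3.0 * label, 1.0)] + [rng.gauss(0.0, 1.0) for _ in range(p - 1)])
        y.append(label)
    return X, y

def knn_predict(train_X, train_y, x, feats, k=3):
    """Majority vote among the k nearest training points, using only `feats`."""
    dists = sorted(
        (sum((row[f] - x[f]) ** 2 for f in feats), label)
        for row, label in zip(train_X, train_y)
    )
    return Counter(label for _, label in dists[:k]).most_common(1)[0][0]

def accuracy(preds, y):
    return sum(p == t for p, t in zip(preds, y)) / len(y)

def ensemble_predict(models, train_X, train_y, X, k=3):
    """Majority vote over kNN models, each defined by its feature subset."""
    out = []
    for x in X:
        votes = Counter(knn_predict(train_X, train_y, x, feats, k) for feats in models)
        out.append(votes.most_common(1)[0][0])
    return out

def fit_esknn_sketch(fit_X, fit_y, test_X, test_y, val_X, val_y,
                     n_models=30, n_feats=2, top=10, k=3, seed=0):
    rng = random.Random(seed)
    p = len(fit_X[0])
    # Step 1: score every candidate model on held-out data and keep the best ones.
    candidates = [tuple(sorted(rng.sample(range(p), n_feats))) for _ in range(n_models)]
    scored = []
    for feats in candidates:
        preds = [knn_predict(fit_X, fit_y, x, feats, k) for x in test_X]
        scored.append((accuracy(preds, test_y), feats))
    scored.sort(key=lambda s: s[0], reverse=True)
    ranked = [feats for _, feats in scored[:top]]
    # Step 2: start from the best model and add the next one only if the
    # ensemble's accuracy on the validation set improves (assumed rule).
    selected = [ranked[0]]
    best = accuracy(ensemble_predict(selected, fit_X, fit_y, val_X, k), val_y)
    for feats in ranked[1:]:
        trial = selected + [feats]
        acc = accuracy(ensemble_predict(trial, fit_X, fit_y, val_X, k), val_y)
        if acc > best:
            selected, best = trial, acc
    return selected, best

X, y = make_data(160, seed=1)
selected, val_acc = fit_esknn_sketch(X[:80], y[:80], X[80:120], y[80:120], X[120:], y[120:])
print("selected feature subsets:", selected)
print("validation accuracy:", val_acc)
```

In this sketch the three data splits play distinct roles: one supplies the neighbours, one ranks the individual models, and one decides which models enter the final ensemble. Noise-only feature subsets tend to rank low in step one, which is one plausible reading of why such a selection helps when non-informative features are present.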

Suggested Citation

  • Asma Gul & Aris Perperoglou & Zardad Khan & Osama Mahmoud & Miftahuddin Miftahuddin & Werner Adler & Berthold Lausen, 2018. "Ensemble of a subset of kNN classifiers," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 12(4), pages 827-840, December.
  • Handle: RePEc:spr:advdac:v:12:y:2018:i:4:d:10.1007_s11634-015-0227-5
    DOI: 10.1007/s11634-015-0227-5

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11634-015-0227-5
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s11634-015-0227-5?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    References listed on IDEAS

    1. Hothorn, Torsten & Lausen, Berthold, 2005. "Bundling classifiers by bagging trees," Computational Statistics & Data Analysis, Elsevier, vol. 49(4), pages 1068-1078, June.
    2. Peter Hall & Richard J. Samworth, 2005. "Properties of bagged nearest neighbour classifiers," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(3), pages 363-379, June.
    3. Zhiliang Liu & Xiaomin Zhao & Ming Zuo & Hongbing Xu, 2014. "Feature selection for fault level diagnosis of planetary gearboxes," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 8(4), pages 377-401, December.
    4. Ludwig Lausser & Christoph Müssel & Alexander Melkozerov & Hans Kestler, 2014. "Identifying predictive hubs to condense the training set of k-nearest neighbour classifiers," Computational Statistics, Springer, vol. 29(1), pages 81-95, February.

    Citations



    Cited by:

    1. Leonie Selk & Jan Gertheiss, 2023. "Nonparametric regression and classification with functional, categorical, and mixed covariates," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 17(2), pages 519-543, June.
    2. Khadidja Henni & Pierre-Yves Louis & Brigitte Vannier & Ahmed Moussa, 2020. "Is-ClusterMPP: clustering algorithm through point processes and influence space towards high-dimensional data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 14(3), pages 543-570, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Petersen, Maya L. & Molinaro, Annette M. & Sinisi, Sandra E. & van der Laan, Mark J., 2007. "Cross-validated bagged learning," Journal of Multivariate Analysis, Elsevier, vol. 98(9), pages 1693-1704, October.
    2. Adler, Werner & Lausen, Berthold, 2009. "Bootstrap estimated true and false positive rates and ROC curve," Computational Statistics & Data Analysis, Elsevier, vol. 53(3), pages 718-729, January.
    3. De Bock, Koen W. & Coussement, Kristof & Van den Poel, Dirk, 2010. "Ensemble classification based on generalized additive models," Computational Statistics & Data Analysis, Elsevier, vol. 54(6), pages 1535-1546, June.
    4. Will Wei Sun & Xingye Qiao & Guang Cheng, 2016. "Stabilized Nearest Neighbor Classifier and its Statistical Properties," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(515), pages 1254-1265, July.
    5. Rokach, Lior, 2009. "Taxonomy for characterizing ensemble methods in classification tasks: A review and annotated bibliography," Computational Statistics & Data Analysis, Elsevier, vol. 53(12), pages 4046-4072, October.
    6. Kusiak, Andrew & Zheng, Haiyang & Song, Zhe, 2009. "Models for monitoring wind farm power," Renewable Energy, Elsevier, vol. 34(3), pages 583-590.
    7. Stefan Lessmann & Stefan Voß, 2010. "Customer-Centric Decision Support," Business & Information Systems Engineering: The International Journal of WIRTSCHAFTSINFORMATIK, Springer;Gesellschaft für Informatik e.V. (GI), vol. 2(2), pages 79-93, April.
    8. Chung, Dongjun & Kim, Hyunjoong, 2015. "Accurate ensemble pruning with PL-bagging," Computational Statistics & Data Analysis, Elsevier, vol. 83(C), pages 1-13.
    9. Cholaquidis, Alejandro & Fraiman, Ricardo & Kalemkerian, Juan & Llop, Pamela, 2016. "A nonlinear aggregation type classifier," Journal of Multivariate Analysis, Elsevier, vol. 146(C), pages 269-281.
    10. Pere Marti-Puig & Alejandro Blanco-M & Juan José Cárdenas & Jordi Cusidó & Jordi Solé-Casals, 2019. "Feature Selection Algorithms for Wind Turbine Failure Prediction," Energies, MDPI, vol. 12(3), pages 1-18, January.
    11. Harald Binder & Hans Kestler & Matthias Schmid, 2014. "Proceedings of Reisensburg 2011," Computational Statistics, Springer, vol. 29(1), pages 1-2, February.
    12. Diogo Menezes & Mateus Mendes & Jorge Alexandre Almeida & Torres Farinha, 2020. "Wind Farm and Resource Datasets: A Comprehensive Survey and Overview," Energies, MDPI, vol. 13(18), pages 1-24, September.
    13. Zhang, Chun-Xia & Zhang, Jiang-She & Zhang, Gai-Ying, 2009. "Using Boosting to prune Double-Bagging ensembles," Computational Statistics & Data Analysis, Elsevier, vol. 53(4), pages 1218-1231, February.
    14. Wei-Yin Loh, 2014. "Fifty Years of Classification and Regression Trees," International Statistical Review, International Statistical Institute, vol. 82(3), pages 329-348, December.
    15. Croux, Christophe & Joossens, Kristel & Lemmens, Aurelie, 2007. "Trimmed bagging," Computational Statistics & Data Analysis, Elsevier, vol. 52(1), pages 362-368, September.
    16. Adler, Werner & Brenning, Alexander & Potapov, Sergej & Schmid, Matthias & Lausen, Berthold, 2011. "Ensemble classification of paired data," Computational Statistics & Data Analysis, Elsevier, vol. 55(5), pages 1933-1941, May.
    17. E. Emary & Hossam M. Zawbaa & Aboul Ella Hassanien & B. Parv, 2017. "Multi-objective retinal vessel localization using flower pollination search algorithm with pattern search," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 11(3), pages 611-627, September.
    18. Timothy I. Cannings & Richard J. Samworth, 2017. "Random-projection ensemble classification," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(4), pages 959-1035, September.
