Printed from https://ideas.repec.org/a/spr/annopr/v348y2025i1d10.1007_s10479-023-05271-z.html

Compactness score: a fast filter method for unsupervised feature selection

Authors
  • Peican Zhu

    (Northwestern Polytechnical University (NWPU))

  • Xin Hou

    (Northwestern Polytechnical University (NWPU))

  • Keke Tang

    (Cyberspace Institute of Advanced Technology, Guangzhou University)

  • Zhen Wang

    (School of Cybersecurity, Northwestern Polytechnical University (NWPU))

  • Feiping Nie

    (Northwestern Polytechnical University (NWPU))

Abstract

The rapid development of the big data era generates huge amounts of data every day in various fields. Because these data are large-scale and high-dimensional, it is often difficult to make good decisions in practical applications, so efficient big data analysis methods are urgently needed. In feature engineering, feature selection is an important research topic; its goal is to select “excellent” features from the candidates. Feature selection not only reduces dimensionality but also improves the computational efficiency and performance of the model. In many classification tasks, researchers have found that samples from the same class tend to lie close to each other; thus, local compactness is of great importance when evaluating a feature. Based on this observation, we propose a fast unsupervised feature selection algorithm, named Compactness Score (CSUFS), to select the desired features. To demonstrate the advantages of the proposed algorithm, we perform extensive experiments on several public data sets. In these experiments, feature subsets selected by several different algorithms are applied to a clustering task. Clustering performance is measured by two well-known evaluation metrics, while efficiency is reflected by the corresponding running time. The results show that the proposed algorithm is more accurate and efficient than existing ones.
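
The abstract does not state the CSUFS formula, so the following is only a minimal sketch of what a filter-style, per-feature local compactness score might look like. Everything in it is an assumption made for illustration, not the authors' method: the function names `compactness_scores` and `select_features`, the neighbourhood size `k`, the min-max scaling, and the choice to sum each sample's distance to its `k` nearest neighbours along a single feature.

```python
from typing import List, Sequence


def compactness_scores(X: Sequence[Sequence[float]], k: int = 2) -> List[float]:
    """Illustrative per-feature local compactness (lower = more compact).

    Assumption: this is NOT the paper's formula. Each feature is min-max
    scaled, and each sample contributes the summed absolute distance to its
    k nearest neighbours measured along that single feature.
    """
    n = len(X)
    d = len(X[0])
    scores = []
    for j in range(d):
        col = [row[j] for row in X]
        lo, hi = min(col), max(col)
        span = hi - lo
        if span == 0:
            # A constant feature cannot separate samples; rank it last.
            scores.append(float("inf"))
            continue
        scaled = sorted((v - lo) / span for v in col)
        total = 0.0
        for i, v in enumerate(scaled):
            # In 1-D, the k nearest neighbours lie within k positions
            # of the sample in sorted order, so a small window suffices.
            window = scaled[max(0, i - k): i] + scaled[i + 1: i + 1 + k]
            dists = sorted(abs(v - w) for w in window)
            total += sum(dists[:k])
        scores.append(total / n)
    return scores


def select_features(X: Sequence[Sequence[float]], m: int, k: int = 2) -> List[int]:
    """Return the indices of the m features with the lowest score."""
    scores = compactness_scores(X, k)
    return sorted(range(len(scores)), key=lambda j: scores[j])[:m]
```

Because each feature is scored independently after one sort, a score of this kind needs no labels and no iterative optimisation, which is the general appeal of filter methods. For example, on a toy data set whose first feature forms two tight groups and whose second feature is evenly spread, `select_features(X, 1)` would be expected to prefer the first feature.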

Suggested Citation

  • Peican Zhu & Xin Hou & Keke Tang & Zhen Wang & Feiping Nie, 2025. "Compactness score: a fast filter method for unsupervised feature selection," Annals of Operations Research, Springer, vol. 348(1), pages 299-315, May.
  • Handle: RePEc:spr:annopr:v:348:y:2025:i:1:d:10.1007_s10479-023-05271-z
    DOI: 10.1007/s10479-023-05271-z

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10479-023-05271-z
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

