
Multivariate filter methods for feature selection with the γ-metric

Authors
  • Nicolas Ngo

    (AMU - Aix Marseille Université, INSERM - Institut National de la Santé et de la Recherche Médicale, SESSTIM - U1252 INSERM - Aix Marseille Univ - UMR 259 IRD - Sciences Economiques et Sociales de la Santé & Traitement de l'Information Médicale - IRD - Institut de Recherche pour le Développement - AMU - Aix Marseille Université - INSERM - Institut National de la Santé et de la Recherche Médicale, ISSPAM - Institut des sciences de la santé publique [Marseille])

  • Pierre Michel

    (AMU - Aix Marseille Université, CNRS - Centre National de la Recherche Scientifique, AMSE - Aix-Marseille Sciences Economiques - EHESS - École des hautes études en sciences sociales - AMU - Aix Marseille Université - ECM - École Centrale de Marseille - CNRS - Centre National de la Recherche Scientifique)

  • Roch Giorgi

    (AMU - Aix Marseille Université, APHM - Assistance Publique - Hôpitaux de Marseille, INSERM - Institut National de la Santé et de la Recherche Médicale, SESSTIM - U1252 INSERM - Aix Marseille Univ - UMR 259 IRD - Sciences Economiques et Sociales de la Santé & Traitement de l'Information Médicale - IRD - Institut de Recherche pour le Développement - AMU - Aix Marseille Université - INSERM - Institut National de la Santé et de la Recherche Médicale, ISSPAM - Institut des sciences de la santé publique [Marseille], TIMONE - Hôpital de la Timone [CHU - APHM], BiosTIC - Biostatistique et technologies de l'information et de la communication (BioSTIC) - [Hôpital de la Timone - APHM] - APHM - Assistance Publique - Hôpitaux de Marseille - TIMONE - Hôpital de la Timone [CHU - APHM], IRD [Occitanie] - Institut de Recherche pour le Développement)

Abstract

Background: The γ-metric value is generally used as the importance score of a feature (or a set of features) in a classification context. This study aimed to go further by creating a new methodology for multivariate feature selection for classification, whereby the γ-metric is associated with a specific search direction (and therefore a specific stopping criterion). As three search directions are used, we effectively created three distinct methods.

Methods: We assessed the performance of our new methodology through a simulation study, comparing it against more conventional methods. Classification performance indicators, number of selected features, stability and execution time were used to evaluate the performance of the methods. We also evaluated how well the proposed methodology selected relevant features for the detection of atrial fibrillation (AF), a cardiac arrhythmia.

Results: In both the simulation study and the AF detection task, our methods were able to select informative features and maintain a good level of predictive performance; however, with strong correlation and large datasets, the γ-metric-based methods were less effective at excluding non-informative features.

Conclusions: The results highlighted that the forward search direction combines well with the γ-metric as an evaluation function. However, with the backward search direction, the feature selection algorithm could fall into a local optimum and can be improved.
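As a rough illustration of how an evaluation function can be paired with a forward search direction and a stopping criterion, here is a minimal Python sketch. It is not the authors' implementation. The abstract does not define the γ-metric, so the function `separation_score` below is a stand-in (the classical trace(Sw^-1 Sb) class-separation criterion), and the names `forward_selection` and `min_gain`, the relative-gain stopping rule, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def separation_score(X, y, subset):
    """Stand-in evaluation function (NOT the γ-metric): trace(Sw^-1 Sb),
    the ratio of between-class to within-class scatter for a feature subset."""
    Xs = X[:, subset]
    overall_mean = Xs.mean(axis=0)
    d = len(subset)
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for label in np.unique(y):
        Xc = Xs[y == label]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        Sw += centered.T @ centered
        diff = (mean_c - overall_mean).reshape(-1, 1)
        Sb += len(Xc) * (diff @ diff.T)
    Sw += 1e-8 * np.eye(d)  # small ridge so Sw stays invertible
    return float(np.trace(np.linalg.solve(Sw, Sb)))


def forward_selection(X, y, score_fn, min_gain=0.05):
    """Greedy forward search: start from the empty set and add the feature
    that most improves score_fn; stop when the relative gain falls below
    min_gain (a hypothetical stopping criterion chosen for this sketch)."""
    remaining = list(range(X.shape[1]))
    selected = []
    best_score = 0.0
    while remaining:
        # Score every candidate subset obtained by adding one feature.
        score, j = max((score_fn(X, y, selected + [k]), k) for k in remaining)
        if selected and score - best_score < min_gain * best_score:
            break  # no candidate improves the subset enough
        selected.append(j)
        remaining.remove(j)
        best_score = score
    return selected, best_score


# Synthetic data: features 0 and 1 carry the class signal, 2 to 4 are noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 5))
X[:, 0] += 2.0 * y
X[:, 1] += 2.0 * y

selected, score = forward_selection(X, y, separation_score)
print("selected features:", sorted(selected), "score:", round(score, 3))
```

A backward search would instead start from the full feature set and remove one feature at a time, which changes both the starting point and the natural stopping rule; that dependence is presumably why the abstract ties each search direction to its own stopping criterion.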

Suggested Citation

  • Nicolas Ngo & Pierre Michel & Roch Giorgi, 2024. "Multivariate filter methods for feature selection with the γ-metric," Post-Print hal-04848056, HAL.
  • Handle: RePEc:hal:journl:hal-04848056
    DOI: 10.1186/s12874-024-02426-9
    Note: View the original document on HAL open archive server: https://hal.science/hal-04848056v1

    Download full text from publisher

    File URL: https://hal.science/hal-04848056v1/document
    Download Restriction: no




