Printed from https://ideas.repec.org/a/gam/jmathe/v9y2021i11p1226-d563759.html

An Improved Machine Learning-Based Employees Attrition Prediction Framework with Emphasis on Feature Selection

Author

Listed:
  • Saeed Najafi-Zangeneh

    (Industrial Engineering Department, Amirkabir University of Technology, Tehran 15875-4413, Iran)

  • Naser Shams-Gharneh

    (Industrial Engineering Department, Amirkabir University of Technology, Tehran 15875-4413, Iran)

  • Ali Arjomandi-Nezhad

    (Industrial Engineering and Productivity Research Center, Amirkabir University of Technology, Tehran 15875-4413, Iran)

  • Sarfaraz Hashemkhani Zolfani

    (School of Engineering, Catholic University of the North, Larrondo 1281, 1780000 Coquimbo, Chile)

Abstract

Companies seek ways to retain their professional employees in order to avoid additional recruiting and training costs. Predicting whether a particular employee is likely to leave helps the company take preventive action. Unlike physical systems, human resource problems cannot be described by a closed-form analytical formula, so machine learning approaches are well suited to this task. This paper presents a three-stage (pre-processing, processing, post-processing) framework for attrition prediction, with an IBM HR dataset as the case study. Because the dataset contains several features, the “max-out” feature selection method is proposed for dimension reduction in the pre-processing stage and applied to the IBM HR dataset. The coefficient of each feature in the logistic regression model indicates the importance of that feature for attrition prediction. The results show that the “max-out” feature selection method improves the F1-score. Finally, the validity of the parameters is checked by training the model on multiple bootstrap datasets; the mean and standard deviation of each parameter are then analyzed to assess the confidence in the model’s parameters and their stability. The small standard deviation of the parameters indicates that the model is stable and is more likely to generalize well.
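
The bootstrap stability check described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic (not the IBM HR dataset), the logistic regression is a plain gradient-descent fit written with the Python standard library, and all function names and settings (`fit_logistic`, `bootstrap_coefficients`, the learning rate, the number of resamples) are assumptions made for illustration. The “max-out” feature selection step is not reproduced because the abstract does not define it.

```python
import math
import random
import statistics


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, x):
    # w = [bias, w_1, ..., w_d]; x = [x_1, ..., x_d]
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.5, epochs=200):
    """Fit logistic regression by batch gradient descent (illustrative only)."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def make_synthetic(n, rng):
    # Hypothetical data: feature 0 drives the label, feature 1 is pure noise.
    X, y = [], []
    for _ in range(n):
        x0, x1 = rng.gauss(0, 1), rng.gauss(0, 1)
        X.append([x0, x1])
        y.append(1 if rng.random() < sigmoid(2.0 * x0 - 0.5) else 0)
    return X, y


def bootstrap_coefficients(X, y, n_boot, rng):
    """Refit on resamples drawn with replacement; return per-parameter mean and std."""
    n = len(X)
    fits = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        fits.append(fit_logistic([X[i] for i in idx], [y[i] for i in idx]))
    columns = list(zip(*fits))  # one tuple of fitted values per parameter
    means = [statistics.mean(c) for c in columns]
    stds = [statistics.stdev(c) for c in columns]
    return means, stds


rng = random.Random(0)
X, y = make_synthetic(200, rng)
w = fit_logistic(X, y)
preds = [1 if predict_proba(w, xi) >= 0.5 else 0 for xi in X]
means, stds = bootstrap_coefficients(X, y, n_boot=10, rng=rng)

print("training F1:", round(f1_score(y, preds), 3))
for name, m, s in zip(["bias", "feature 0", "feature 1"], means, stds):
    print(f"{name}: mean={m:.3f} std={s:.3f}")
```

In this sketch, a parameter whose standard deviation across resamples is small relative to its mean is one whose estimate barely changes when the training data change, which is the sense of stability the abstract refers to. The F1-score here is computed on the training data purely to show the formula; a held-out set would be needed to judge generalization.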

Suggested Citation

  • Saeed Najafi-Zangeneh & Naser Shams-Gharneh & Ali Arjomandi-Nezhad & Sarfaraz Hashemkhani Zolfani, 2021. "An Improved Machine Learning-Based Employees Attrition Prediction Framework with Emphasis on Feature Selection," Mathematics, MDPI, vol. 9(11), pages 1-14, May.
  • Handle: RePEc:gam:jmathe:v:9:y:2021:i:11:p:1226-:d:563759

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/9/11/1226/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/9/11/1226/
    Download Restriction: no

    References listed on IDEAS

    1. Daly, Andrew & Dekker, Thijs & Hess, Stephane, 2016. "Dummy coding vs effects coding for categorical variables: Clarifications and extensions," Journal of choice modelling, Elsevier, vol. 21(C), pages 36-41.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lehmann, Nico & Sloot, Daniel & Schüle, Christopher & Ardone, Armin & Fichtner, Wolf, 2023. "The motivational drivers behind consumer preferences for regional electricity – Results of a choice experiment in Southern Germany," Energy Economics, Elsevier, vol. 120(C).
    2. Jasper Grashuis & Theodoros Skevas & Michelle S. Segovia, 2020. "Grocery Shopping Preferences during the COVID-19 Pandemic," Sustainability, MDPI, vol. 12(13), pages 1-10, July.
    3. Carole Ropars-Collet & Philippe Goffe & Qods Lefnatsa, 2021. "Does catch-and-release increase the recreational value of rivers? The case of salmon fishing," Review of Agricultural, Food and Environmental Studies, Springer, vol. 102(4), pages 393-424, December.
    4. Jaung, Wanggi, 2022. "Digital forest recreation in the metaverse: Opportunities and challenges," Technological Forecasting and Social Change, Elsevier, vol. 185(C).
    5. Hammerle, Mara & White, Lee V. & Sturmberg, Bjorn, 2023. "Solar for renters: Investigating investor perspectives of barriers and policies," Energy Policy, Elsevier, vol. 174(C).
    6. Hansen, Line Block & Termansen, Mette & Hasler, Berit, 2017. "Effectiveness Of Markets In Nitrogen Abatement: A Danish Case Study," 2017 International Congress, August 28-September 1, 2017, Parma, Italy 260887, European Association of Agricultural Economists.
    7. Buckell, John & White, Justin S. & Shang, Ce, 2020. "Can incentive-compatibility reduce hypothetical bias in smokers’ experimental choice behavior? A randomized discrete choice experiment," Journal of choice modelling, Elsevier, vol. 37(C).
    8. Krah, Kwabena & Michelson, Hope & Perge, Emilie & Jindal, Rohit, 2019. "Constraints to adopting soil fertility management practices in Malawi: A choice experiment approach," World Development, Elsevier, vol. 124(C), pages 1-1.
    9. Cordula Hinkes & Inken Christoph-Schulz, 2020. "No Palm Oil or Certified Sustainable Palm Oil? Heterogeneous Consumer Preferences and the Role of Information," Sustainability, MDPI, vol. 12(18), pages 1-26, September.
    10. Varela, Elsa & Kallas, Zein, 2022. "Extensive Mediterranean agroecosystems and their linked traditional breeds: Societal demand for the conservation of the Majorcan black pig," Land Use Policy, Elsevier, vol. 112(C).
    11. Schmid, Basil & Molloy, Joseph & Peer, Stefanie & Jokubauskaite, Simona & Aschauer, Florian & Hössinger, Reinhard & Gerike, Regine & Jara-Diaz, Sergio R. & Axhausen, Kay W., 2021. "The value of travel time savings and the value of leisure in Zurich: Estimation, decomposition and policy implications," Transportation Research Part A: Policy and Practice, Elsevier, vol. 150(C), pages 186-215.
    12. Peyron, Christine & Pélissier, Aurore & Béjean, Sophie, 2018. "Preference heterogeneity with respect to whole genome sequencing. A discrete choice experiment among parents of children with rare genetic diseases," Social Science & Medicine, Elsevier, vol. 214(C), pages 125-132.
    13. Carole Ropars-Collet & Philippe Le Goffe, 2020. "Economic evaluation of catch-and-release salmon fishing: impact on anglers’ willingness to pay," Working Papers hal-02441505, HAL.
    14. Grammatikopoulou, Ioanna & Badura, Tomas & Vačkářová, Davina, 2020. "Public preferences for post 2020 agri-environmental policy in the Czech Republic: A choice experiment approach," Land Use Policy, Elsevier, vol. 99(C).
    15. Schmid, Basil & Jokubauskaite, Simona & Aschauer, Florian & Peer, Stefanie & Hössinger, Reinhard & Gerike, Regine & Jara-Diaz, Sergio R. & Axhausen, Kay W., 2019. "A pooled RP/SP mode, route and destination choice model to investigate mode and user-type effects in the value of travel time savings," Transportation Research Part A: Policy and Practice, Elsevier, vol. 124(C), pages 262-294.
    16. Dolores Garrido & Rosa Karina Gallardo, 2022. "Are improvements in convenience good enough for consumers to prefer new food processing technologies?," Agribusiness, John Wiley & Sons, Ltd., vol. 38(1), pages 73-92, January.
    17. Ana Margarita Larranaga & Julián Arellana & Luis Ignacio Rizzi & Orlando Strambi & Helena Beatriz Bettella Cybis, 2019. "Using best–worst scaling to identify barriers to walkability: a study of Porto Alegre, Brazil," Transportation, Springer, vol. 46(6), pages 2347-2379, December.
    18. Ropars‑Collet, Carole & Le Goffe, Philippe & Lefnatsa, Qods, 2021. "Does catch‑and‑release increase the recreational value of rivers? The case of salmon fishing," Review of Agricultural, Food and Environmental Studies, Institut National de la Recherche Agronomique (INRA), vol. 102(4), September.
    19. Jivesh Ujjwal & Ranja Bandyopadhyaya, 2023. "Development of Pedestrian Level of Service (PLOS) model and satisfaction perception rating models for pedestrian infrastructure for mixed land-use urban areas," Transportation, Springer, vol. 50(2), pages 355-381, April.
    20. Carole Ropars-Collet & Philippe Le Goffe & Qods Lefnatsa, 2021. "Does catch-and-release increase the recreational value of rivers? The case of salmon fishing," Post-Print hal-03342732, HAL.


    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.