IDEAS home Printed from https://ideas.repec.org/a/ers/journl/vxxviiy2024ispecialbp320-332.html

Imputing Data Gaps in Economic Surveys Using Fuzzy Sets and Artificial Intelligence Technique

Author

Listed:
  • Adam Kiersztyn
  • Krystyna Kiersztyn
  • Korneliusz Pylak
  • Jakub Bis
  • Michal Dolecki
  • Anna Zelazna

Abstract

Purpose: This paper develops a novel approach to impute data gaps in economic surveys. In contrast to classical methods relying on statistical analysis of survey data, more advanced prediction techniques combined with fuzzy sets are applied to effectively address missing data.

Design/Methodology/Approach: The paper proposes an unconventional approach that integrates advanced prediction methods with fuzzy sets for imputing missing data. The effectiveness of the method is tested on the extensive dataset from the Polish Panel Survey (POLPAN), which was conducted every five years from 1988 to 2018. The survey contains a wide range of questions asked over successive waves, enabling a comprehensive analysis of the method for imputing data gaps.

Findings: The results of numerical experiments show that the proposed method performs highly effectively, regardless of the proportion of observations assigned to the training set. Some methods, such as Support Vector Machine (SVM), did not prove suitable for imputing this dataset. The choice and number of explanatory variables play a crucial role in the method's effectiveness, with cases where a single variable was sufficient for accurate imputation.

Practical Implications: The proposed method offers practical applications for improving data quality in economic surveys, especially in large-scale longitudinal surveys like POLPAN. It provides new insights into handling missing data and optimizing the selection of explanatory variables, which can enhance the robustness of imputation techniques in complex surveys.

Originality/Value: This paper contributes an original and valuable approach by combining advanced prediction techniques with fuzzy sets, providing a highly effective tool for imputing missing data. This unconventional method offers new avenues for further research in economic surveys and beyond.

Suggested Citation

  • Adam Kiersztyn & Krystyna Kiersztyn & Korneliusz Pylak & Jakub Bis & Michal Dolecki & Anna Zelazna, 2024. "Imputing Data Gaps in Economic Surveys Using Fuzzy Sets and Artificial Intelligence Technique," European Research Studies Journal, European Research Studies Journal, vol. XXVII(Special B), pages 320-332.
  • Handle: RePEc:ers:journl:v:xxvii:y:2024:i:specialb:p:320-332

    Download full text from publisher

    File URL: https://ersj.eu/journal/3492/download
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Clara H. Mulder & Michael Wagner, 2001. "The Connections between Family Formation and First-time Home Ownership in the Context of West Germany and the Netherlands," European Journal of Population, Springer;European Association for Population Studies, vol. 17(2), pages 137-164, June.
    2. Bissan Ghaddar & Ignacio Gómez-Casares & Julio González-Díaz & Brais González-Rodríguez & Beatriz Pateiro-López & Sofía Rodríguez-Ballesteros, 2023. "Learning for Spatial Branching: An Algorithm Selection Approach," INFORMS Journal on Computing, INFORMS, vol. 35(5), pages 1024-1043, September.
    3. Nahushananda Chakravarthy H G & Karthik M Seenappa & Sujay Raghavendra Naganna & Dayananda Pruthviraja, 2023. "Machine Learning Models for the Prediction of the Compressive Strength of Self-Compacting Concrete Incorporating Incinerated Bio-Medical Waste Ash," Sustainability, MDPI, vol. 15(18), pages 1-22, September.
    4. Wen, Shaoting & Buyukada, Musa & Evrendilek, Fatih & Liu, Jingyong, 2020. "Uncertainty and sensitivity analyses of co-combustion/pyrolysis of textile dyeing sludge and incense sticks: Regression and machine-learning models," Renewable Energy, Elsevier, vol. 151(C), pages 463-474.
    5. Spiliotis, Evangelos & Makridakis, Spyros & Kaltsounis, Anastasios & Assimakopoulos, Vassilios, 2021. "Product sales probabilistic forecasting: An empirical evaluation using the M5 competition data," International Journal of Production Economics, Elsevier, vol. 240(C).
    6. Ralph Hippe & Maciej Jakubowski & Luisa De Sousa Lobo Borges de Araujo, 2018. "Regional inequalities in PISA: the case of Italy and Spain," JRC Research Reports JRC109057, Joint Research Centre.
    7. Kusiak, Andrew & Zheng, Haiyang & Song, Zhe, 2009. "On-line monitoring of power curves," Renewable Energy, Elsevier, vol. 34(6), pages 1487-1493.
    8. Zhu, Siying & Zhu, Feng, 2019. "Cycling comfort evaluation with instrumented probe bicycle," Transportation Research Part A: Policy and Practice, Elsevier, vol. 129(C), pages 217-231.
    9. Ian Smith, 2012. "Reinterpreting the economics of extramarital affairs," Review of Economics of the Household, Springer, vol. 10(3), pages 319-343, September.
    10. Silke L. Schneider, 2022. "The classification of education in surveys: a generalized framework for ex-post harmonization," Quality & Quantity: International Journal of Methodology, Springer, vol. 56(3), pages 1829-1866, June.
    11. Fenella Fleischmann & Jaap Dronkers, 2010. "Unemployment among immigrants in European labour markets: an analysis of origin and destination effects," Work, Employment & Society, British Sociological Association, vol. 24(2), pages 337-354, June.
    12. Cao, Jason & Tao, Tao, 2025. "Can an identified environmental correlate of car ownership serve as a practical planning tool?," Transportation Research Part A: Policy and Practice, Elsevier, vol. 191(C).
    13. Dursun Delen & Hamed M. Zolbanin & Durand Crosby & David Wright, 2021. "To imprison or not to imprison: an analytics model for drug courts," Annals of Operations Research, Springer, vol. 303(1), pages 101-124, August.
    14. Chiswick, Barry R. & Wang, Zhiling, 2019. "Social Contacts, Dutch Language Proficiency and Immigrant Economic Performance in the Netherlands," GLO Discussion Paper Series 419, Global Labor Organization (GLO).
    15. Doruk Cengiz & Arindrajit Dube & Attila S. Lindner & David Zentler-Munro, 2021. "Seeing Beyond the Trees: Using Machine Learning to Estimate the Impact of Minimum Wages on Labor Market Outcomes," NBER Working Papers 28399, National Bureau of Economic Research, Inc.
    16. Zhou, Jing & Li, Wei & Wang, Jiaxin & Ding, Shuai & Xia, Chengyi, 2019. "Default prediction in P2P lending from high-dimensional data based on machine learning," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 534(C).
    17. Guangjun Shen & Chuanchuan Zhang, 2024. "Economic Development and Social Integration of Migrants in China," China & World Economy, Institute of World Economics and Politics, Chinese Academy of Social Sciences, vol. 32(1), pages 1-20, January.
    18. Lu, Yingjie & Li, Tao & Hu, Hui & Zeng, Xuemei, 2023. "Short-term prediction of reference crop evapotranspiration based on machine learning with different decomposition methods in arid areas of China," Agricultural Water Management, Elsevier, vol. 279(C).
    19. Michele Raitano & Francesco Vona, 2013. "Peer heterogeneity, school tracking and students' performances: evidence from PISA 2006," Applied Economics, Taylor & Francis Journals, vol. 45(32), pages 4516-4532, November.
    20. Bohdan M. Pavlyshenko, 2019. "Machine-Learning Models for Sales Time Series Forecasting," Data, MDPI, vol. 4(1), pages 1-11, January.

    More about this item

    Keywords

    Missing value imputation; fuzzy sets; field surveys; POLPAN.

    JEL classification:

    • C6 - Mathematical and Quantitative Methods - - Mathematical Methods; Programming Models; Mathematical and Simulation Modeling
    • C8 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs
    • C83 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Survey Methods; Sampling Methods
    • D7 - Microeconomics - - Analysis of Collective Decision-Making
