
An efficient Bayes classifier for word classification: an application on the EU Recovery and Resilience Plans

Author

Listed:
  • Limosani, Michele
  • Millemaci, Emanuele
  • Mustica, Paolo

Abstract

This paper proposes the Prior Adaptive Bayes (PAB) classifier, a new algorithm for assigning the words in a text to their respective topics. It adapts the Bayes classifier by using, as the prior probabilities of the classes, the posterior probabilities associated with the adjacent words. Simulations show an improvement of more than 20% over the standard Bayes classifier. The PAB classifier is applied to the Recovery and Resilience Plans (RRPs) of the 27 European Union member states to evaluate their alignment with the environmental dimension of the Sustainable Development Goals (SDGs) relative to the socioeconomic one. The results show that the attention countries pay to the pro-environment SDGs increases with the funds per capita assigned, the gap in environmental endowment, and tourist attractiveness. Finally, the environmental dimension is positively associated with the available GDP growth projections for the next few years.
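The idea of reusing a neighbouring word's posterior as the next word's prior can be illustrated with a minimal sketch. This is not the authors' implementation: the left-to-right pass, the smoothing floor `FLOOR`, the class names `env` and `soc`, the toy likelihoods, and all function names are assumptions made only for illustration.

```python
from typing import Dict, List

Probs = Dict[str, float]
FLOOR = 1e-6  # assumed likelihood for words unseen in a class


def normalize(scores: Probs) -> Probs:
    """Rescale non-negative scores so they sum to one."""
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


def bayes_posterior(word: str, prior: Probs, likelihood: Dict[str, Probs]) -> Probs:
    """P(class | word), proportional to P(word | class) * P(class)."""
    return normalize(
        {c: likelihood[c].get(word, FLOOR) * prior[c] for c in prior}
    )


def classify_standard(words: List[str], prior: Probs,
                      likelihood: Dict[str, Probs]) -> List[str]:
    """Standard Bayes: every word is classified with the same fixed prior."""
    labels = []
    for w in words:
        post = bayes_posterior(w, prior, likelihood)
        labels.append(max(post, key=post.get))
    return labels


def classify_adaptive(words: List[str], prior: Probs,
                      likelihood: Dict[str, Probs]) -> List[str]:
    """Adaptive variant (assumed left-to-right): the posterior of the
    previous word becomes the prior of the current word."""
    labels = []
    current_prior = dict(prior)
    for w in words:
        post = bayes_posterior(w, current_prior, likelihood)
        labels.append(max(post, key=post.get))
        current_prior = post  # carry context to the next word
    return labels


# Toy data (hypothetical): "investment" is ambiguous on its own.
prior = {"env": 0.5, "soc": 0.5}
likelihood = {
    "env": {"solar": 0.30, "investment": 0.10},
    "soc": {"solar": 0.01, "investment": 0.12},
}
words = ["solar", "investment"]

print(classify_standard(words, prior, likelihood))  # ['env', 'soc']
print(classify_adaptive(words, prior, likelihood))  # ['env', 'env']
```

In this toy case the ambiguous word "investment" follows "solar", so the adaptive variant inherits a strong environmental prior and labels it `env`, whereas the fixed-prior classifier labels it `soc`. A practical version would likely need to temper the carried-over prior so that it does not collapse towards a single class over long texts; how the paper handles this is not stated in the abstract.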

Suggested Citation

  • Limosani, Michele & Millemaci, Emanuele & Mustica, Paolo, 2023. "An efficient Bayes classifier for word classification: an application on the EU Recovery and Resilience Plans," MPRA Paper 119875, University Library of Munich, Germany.
  • Handle: RePEc:pra:mprapa:119875

    Download full text from publisher

    File URL: https://mpra.ub.uni-muenchen.de/119875/1/MPRA_paper_119875.pdf
    File Function: original version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Limosani, Michele & Millemaci, Emanuele & Mustica, Paolo, 2025. "Do green policies enhance short-term economic growth? Assessing EU Recovery and Resilience Plans through the lens of Sustainable Development Goals," Economic Modelling, Elsevier, vol. 147(C).
    2. Mark Kattenberg & Bas Scheer & Jurre Thiel, 2023. "Causal forests with fixed effects for treatment effect heterogeneity in difference-in-differences," CPB Discussion Paper 452, CPB Netherlands Bureau for Economic Policy Analysis.
    3. repec:osf:socarx:qzm7y_v1 is not listed on IDEAS
    4. Verhagen, Mark D., 2023. "Using machine learning to monitor the equity of large-scale policy interventions: The Dutch decentralisation of the Social Domain," SocArXiv qzm7y, Center for Open Science.
    5. Joshua B. Gilbert & Zachary Himmelsbach & James Soland & Mridul Joshi & Benjamin W. Domingue, 2024. "Estimating Heterogeneous Treatment Effects with Item-Level Outcome Data: Insights from Item Response Theory," Papers 2405.00161, arXiv.org, revised Jan 2025.
    6. Augustine Denteh & Helge Liebert, 2022. "Who Increases Emergency Department Use? New Insights from the Oregon Health Insurance Experiment," CESifo Working Paper Series 9664, CESifo.
    7. Tanner Regan & Giorgio Chiovelli & Stelios Michalopoulos & Elias Papaioannou, 2023. "Illuminating Africa?," Working Papers 2023-11, The George Washington University, Institute for International Economic Policy.
    8. Khudri, Md Mohsan & Hussey, Andrew, 2024. "Breastfeeding and Child Development Outcomes across Early Childhood and Adolescence: Doubly Robust Estimation with Machine Learning," IZA Discussion Papers 17080, Institute of Labor Economics (IZA).
    9. Sophie-Charlotte Klose & Johannes Lederer, 2020. "A Pipeline for Variable Selection and False Discovery Rate Control With an Application in Labor Economics," Papers 2006.12296, arXiv.org, revised Jun 2020.
    10. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    11. Labib Shami & Teddy Lazebnik, 2024. "Implementing Machine Learning Methods in Estimating the Size of the Non-observed Economy," Computational Economics, Springer;Society for Computational Economics, vol. 63(4), pages 1459-1476, April.
    12. Hurmeranta, Risto & Lyytikäinen, Teemu, 2025. "Nominal Loss Aversion in the Housing Market and Household Mobility," Working Papers 178, VATT Institute for Economic Research.
    13. Chen, Ruoyu & Jiang, Hanchen & Quintero, Luis E., 2023. "Measuring the value of rent stabilization and understanding its implications for racial inequality: Evidence from New York City," Regional Science and Urban Economics, Elsevier, vol. 103(C).
    14. Dang, Hai-Anh & Carleto, Gero & Gourlay, Sydney & Abanokova, Kseniya, 2023. "Addressing Soil Quality Data Gaps with Imputation: Evidence from Ethiopia and Uganda," 2023 Annual Meeting, July 23-25, Washington D.C. 335648, Agricultural and Applied Economics Association.
    15. Dangxing Chen & Luyao Zhang, 2023. "Monotonicity for AI ethics and society: An empirical study of the monotonic neural additive model in criminology, education, health care, and finance," Papers 2301.07060, arXiv.org.
    16. Ballestar, María Teresa & Mir, Miguel Cuerdo & Pedrera, Luis Miguel Doncel & Sainz, Jorge, 2024. "Effectiveness of tutoring at school: A machine learning evaluation," Technological Forecasting and Social Change, Elsevier, vol. 199(C).
    17. Combes, Pierre-Philippe & Gobillon, Laurent & Zylberberg, Yanos, 2022. "Urban economics in a historical perspective: Recovering data with machine learning," Regional Science and Urban Economics, Elsevier, vol. 94(C).
    18. Barzin, Samira & Avner, Paolo & Maruyama Rentschler, Jun Erik & O’Clery, Neave, 2022. "Where Are All the Jobs? A Machine Learning Approach for High Resolution Urban Employment Prediction in Developing Countries," Policy Research Working Paper Series 9979, The World Bank.
    19. Arenas, Andreu & Calsamiglia, Caterina, 2022. "Gender Differences in High-Stakes Performance and College Admission Policies," IZA Discussion Papers 15550, Institute of Labor Economics (IZA).
    20. Tsang, Andrew, 2021. "Uncovering Heterogeneous Regional Impacts of Chinese Monetary Policy," MPRA Paper 110703, University Library of Munich, Germany.

    More about this item

    JEL classification:

    • C82 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Methodology for Collecting, Estimating, and Organizing Macroeconomic Data; Data Access
    • H22 - Public Economics - - Taxation, Subsidies, and Revenue - - - Incidence
    • O44 - Economic Development, Innovation, Technological Change, and Growth - - Economic Growth and Aggregate Productivity - - - Environment and Growth
