Printed from https://ideas.repec.org/a/gam/jfinte/v4y2025i3p33-d1704651.html

A Transparent House Price Prediction Framework Using Ensemble Learning, Genetic Algorithm-Based Tuning, and ANOVA-Based Feature Analysis

Author

Listed:
  • Mohammed Ibrahim Hussain

    (Department of Computer Science and Engineering, Bangladesh University, Dhaka 1000, Bangladesh)

  • Arslan Munir

    (Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA)

  • Mohammad Mamun

    (Department of Computer Science and Engineering, Bangladesh University, Dhaka 1000, Bangladesh)

  • Safiul Haque Chowdhury

    (Department of Computer Science and Engineering, Bangladesh University, Dhaka 1000, Bangladesh)

  • Nazim Uddin

    (Department of ICT, Chandpur Science and Technology University, Chandpur 3600, Bangladesh)

  • Muhammad Minoar Hossain

    (Department of Computer Science and Engineering, Bangladesh University, Dhaka 1000, Bangladesh; Department of Computer Science and Engineering, Mawlana Bhashani Science and Technology University, Tangail 1902, Bangladesh)

Abstract

House price prediction is crucial in real estate for informed decision-making. This paper presents an automated prediction system that combines genetic algorithms (GA) for feature optimization and Analysis of Variance (ANOVA) for statistical analysis. We apply and compare five ensemble machine learning (ML) models: Extreme Gradient Boosting Regression (XGBR), Random Forest Regression (RFR), Categorical Boosting Regression (CBR), Adaptive Boosting Regression (ADBR), and Gradient Boosted Decision Trees Regression (GBDTR). The primary dataset contains 1000 samples with eight features, and a secondary dataset of 3865 samples is used for external validation. Our integrated approach identifies Categorical Boosting with GA (CBRGA) as the best performer, achieving an R² of 0.9973 and outperforming existing state-of-the-art methods. ANOVA-based analysis further enhances model interpretability and performance by isolating key factors such as square footage and lot size. To ensure robustness and transparency, we conduct 10-fold cross-validation and employ explainable AI techniques such as Shapley Additive Explanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME), providing insights into model decision-making and feature importance.
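
The evaluation protocol named in the abstract (10-fold cross-validation scored by the coefficient of determination) can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption made for illustration: the data are synthetic, the single `sqft` feature and the price formula are hypothetical, and a one-variable least-squares line stands in for the ensemble models, so this is not the authors' pipeline and its scores are unrelated to the reported result.

```python
import random
from statistics import mean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for an ensemble model)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def k_fold_r2(xs, ys, k=10, seed=0):
    """Return the R^2 obtained on each of k held-out folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    # Fold i takes every k-th shuffled index starting at position i.
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        test = set(held_out)
        train = [i for i in idx if i not in test]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [a + b * xs[i] for i in held_out]
        scores.append(r2_score([ys[i] for i in held_out], preds))
    return scores


# Synthetic data (hypothetical): price driven by square footage plus noise.
rng = random.Random(42)
sqft = [rng.uniform(500, 4000) for _ in range(1000)]
price = [50_000 + 200 * s + rng.gauss(0, 20_000) for s in sqft]

scores = k_fold_r2(sqft, price, k=10)
print(f"mean R^2 over {len(scores)} folds: {mean(scores):.4f}")
```

Each fold's model is fit only on the other nine folds, so every score reflects held-out data; averaging the ten scores gives a less split-dependent estimate than a single train/test partition.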

Suggested Citation

  • Mohammed Ibrahim Hussain & Arslan Munir & Mohammad Mamun & Safiul Haque Chowdhury & Nazim Uddin & Muhammad Minoar Hossain, 2025. "A Transparent House Price Prediction Framework Using Ensemble Learning, Genetic Algorithm-Based Tuning, and ANOVA-Based Feature Analysis," FinTech, MDPI, vol. 4(3), pages 1-26, July.
  • Handle: RePEc:gam:jfinte:v:4:y:2025:i:3:p:33-:d:1704651

Download full text from publisher

  • File URL: https://www.mdpi.com/2674-1032/4/3/33/pdf
    Download Restriction: no

  • File URL: https://www.mdpi.com/2674-1032/4/3/33/
    Download Restriction: no

