
Automated Assessment of Housing Quality with the Use of Wordscores Algorithm

Author

Listed:
  • Michal Hebdzynski

Abstract

The aim of this paper is to address the problem of unavailable or inaccurate information on housing quality in existing data sources. This problem may bias the results of hedonic analyses of the market conducted for macroprudential and statistical purposes. We address it by proposing a supervised machine learning framework based on the Wordscores algorithm. We ask whether the quality of an apartment can be assessed reliably and automatically, based solely on the textual description in its listing posted on an internet advertisement site. The accuracy of the method was tested on Polish-language apartment sales and rental listings from 2019-2021. The obtained point estimates of the quality level show a high correlation with human assessments. The results indicate that the Wordscores algorithm is 71% effective in categorizing apartments for rent into three quality groups: low, medium and high. For apartments for sale, the effectiveness equals 64%. The study indicates that the textual descriptions in apartment listings convey usable, yet most often unused, information on housing quality. Using the output of the method may increase the accuracy of market analyses and thus lead to a better understanding of the market. The relative ease of applying the algorithm and its high interpretability make the proposed method advantageous over the already developed, more econometrically sophisticated approaches.
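
As an illustration of the general idea, the following is a minimal Python sketch of the standard Wordscores procedure: word scores are learned from reference texts with known scores, and a new text is scored by the frequency-weighted mean of the scores of its words. It is not the paper's implementation. The function names, the English example descriptions, the reference scores and the cutoff values in `to_group` are all hypothetical, and the abstract does not state how the three quality groups are delimited.

```python
from collections import Counter


def relative_freqs(tokens):
    """Return the relative frequency of each token in a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def fit_wordscores(reference_texts, reference_scores):
    """Learn one score per word from reference texts with known scores.

    For each word, the probability of each reference text given the word
    is proportional to the word's relative frequency in that text; the
    word score is the expected reference score under those probabilities.
    """
    freqs = [relative_freqs(text.lower().split()) for text in reference_texts]
    vocabulary = set().union(*freqs)
    word_scores = {}
    for word in vocabulary:
        per_text = [f.get(word, 0.0) for f in freqs]
        total = sum(per_text)
        word_scores[word] = sum(
            share / total * score
            for share, score in zip(per_text, reference_scores)
        )
    return word_scores


def score_text(text, word_scores):
    """Score an unlabelled text as the frequency-weighted mean word score.

    Words that never occurred in the reference texts are ignored.
    Returns None when the text shares no words with the references.
    """
    tokens = [t for t in text.lower().split() if t in word_scores]
    if not tokens:
        return None
    freqs = relative_freqs(tokens)
    return sum(share * word_scores[word] for word, share in freqs.items())


def to_group(score, low_cut, high_cut):
    """Map a point estimate to a quality group (cutoffs are assumptions)."""
    if score < low_cut:
        return "low"
    if score > high_cut:
        return "high"
    return "medium"


# Hypothetical reference listings scored by a human: -1 = low, 1 = high.
references = ["old damp dark", "new bright renovated"]
scores = [-1.0, 1.0]
word_scores = fit_wordscores(references, scores)

estimate = score_text("new bright old", word_scores)
print(to_group(estimate, low_cut=-0.2, high_cut=0.2))
```

In this sketch the learned word scores can be inspected directly, which is one way to read the interpretability mentioned in the abstract.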

Suggested Citation

  • Michal Hebdzynski, 2022. "Automated Assessment of Housing Quality with the Use of Wordscores Algorithm," ERES 2022_114, European Real Estate Society (ERES).
  • Handle: RePEc:arz:wpaper:2022_114

    Download full text from publisher

    File URL: https://eres.architexturez.net/doc/eres-id-eres2022-114
    Download Restriction: no

    More about this item

    Keywords

    Hedonic methods; Housing quality; Supervised machine learning; Textual analysis

    JEL classification:

    • R3 - Urban, Rural, Regional, Real Estate, and Transportation Economics - - Real Estate Markets, Spatial Production Analysis, and Firm Location

