Printed from https://ideas.repec.org/a/kap/compec/v57y2021i2d10.1007_s10614-020-09973-5.html

A New Appraisal Model of Second-Hand Housing Prices in China’s First-Tier Cities Based on Machine Learning Algorithms

Author

Listed:
  • Lulin Xu

    (China University of Geosciences (Wuhan))

  • Zhongwu Li

    (China University of Geosciences (Wuhan))

Abstract

Accurate appraisal of second-hand housing prices plays an important role in second-hand housing transactions, mortgages and risk assessment. Machine learning, which is increasingly applied in finance and economics, can also be used to upgrade traditional appraisal methods for second-hand housing. A large number of appraisal indicators and price data on second-hand housing in Beijing, Shanghai, Guangzhou and Shenzhen, four first-tier cities in China, can be obtained with web crawlers. The geographical location information of the housing can then be visualized with GIS technology, and its descriptive text can be processed with natural language processing. Finally, combined with other numerical and categorical indicators, a second-hand housing appraisal model based on a two-tier stacking framework is constructed, using random forest, adaptive boosting, gradient boosting decision tree, light gradient boosting machine and extreme gradient boosting as base models and a back propagation neural network as the meta-model. The training results show that the machine learning models improve accuracy significantly compared with multiple linear regression and spatial econometric models, and that the stacking model performs better than the standalone machine learning models.
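
The two-tier stacking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no code, data or hyperparameters. To stay dependency-free, the sketch assumes small stand-in learners (two one-feature regression lines and a k-nearest-neighbour average in place of the five tree ensembles, and a ridge-regularised linear blend in place of the back propagation neural network). The two features (scaled floor area and distance to the city centre), the synthetic prices, and all function names are hypothetical.

```python
import random
import statistics

def fit_linear_1d(j):
    """Return a trainer for a one-feature least-squares line on column j (stand-in base model)."""
    def fit(X, y):
        xs = [r[j] for r in X]
        mx, my = statistics.fmean(xs), statistics.fmean(y)
        sxx = sum((x - mx) ** 2 for x in xs)
        slope = sum((x - mx) * (t - my) for x, t in zip(xs, y)) / sxx
        return lambda r: my + slope * (r[j] - mx)
    return fit

def fit_knn(k):
    """Return a trainer for a k-nearest-neighbour average (stand-in base model)."""
    def fit(X, y):
        def predict(r):
            order = sorted(range(len(X)),
                           key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], r)))
            return statistics.fmean([y[i] for i in order[:k]])
        return predict
    return fit

def solve(A, b):
    """Solve A w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_meta(Z, y, ridge=1e-6):
    """Tier 2 stand-in: ridge-regularised linear blend of base predictions, with intercept."""
    Z1 = [[1.0] + row for row in Z]
    d = len(Z1[0])
    A = [[sum(r[i] * r[j] for r in Z1) + (ridge if i == j else 0.0) for j in range(d)]
         for i in range(d)]
    b = [sum(r[i] * t for r, t in zip(Z1, y)) for i in range(d)]
    w = solve(A, b)
    return lambda z: w[0] + sum(wi * zi for wi, zi in zip(w[1:], z))

def fit_stack(X, y, trainers, n_folds=5):
    """Tier 1 yields out-of-fold predictions; tier 2 is trained on those predictions."""
    n = len(X)
    oof = [[0.0] * len(trainers) for _ in range(n)]
    for f in range(n_folds):
        hold = [i for i in range(n) if i % n_folds == f]
        keep = [i for i in range(n) if i % n_folds != f]
        Xk, yk = [X[i] for i in keep], [y[i] for i in keep]
        for m, trainer in enumerate(trainers):
            model = trainer(Xk, yk)
            for i in hold:
                oof[i][m] = model(X[i])
    meta = fit_meta(oof, y)
    bases = [trainer(X, y) for trainer in trainers]  # refit on all training rows
    return lambda r: meta([base(r) for base in bases])

def rmse(pred, y):
    return (sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)) ** 0.5

# Synthetic, hypothetical data: [scaled floor area, scaled distance to centre] -> price.
rng = random.Random(0)
X = [[rng.uniform(0.3, 2.0), rng.uniform(0.0, 3.0)] for _ in range(300)]
y = [5.0 + 3.0 * a - 1.5 * d + rng.gauss(0.0, 0.2) for a, d in X]
X_train, y_train, X_test, y_test = X[:240], y[:240], X[240:], y[240:]

trainers = [fit_linear_1d(0), fit_linear_1d(1), fit_knn(5)]
stacked = fit_stack(X_train, y_train, trainers)
stacked_rmse = rmse([stacked(r) for r in X_test], y_test)
baseline_rmse = rmse([statistics.fmean(y_train)] * len(y_test), y_test)
print(f"stacked RMSE {stacked_rmse:.3f} vs mean-only baseline {baseline_rmse:.3f}")
```

Training the meta-model on out-of-fold predictions, rather than on predictions the base models make for their own training rows, keeps the second tier from simply rewarding whichever base model overfits most.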

Suggested Citation

  • Lulin Xu & Zhongwu Li, 2021. "A New Appraisal Model of Second-Hand Housing Prices in China’s First-Tier Cities Based on Machine Learning Algorithms," Computational Economics, Springer;Society for Computational Economics, vol. 57(2), pages 617-637, February.
  • Handle: RePEc:kap:compec:v:57:y:2021:i:2:d:10.1007_s10614-020-09973-5
    DOI: 10.1007/s10614-020-09973-5

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10614-020-09973-5
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    References listed on IDEAS

    1. Prashant Das & Patrick Smith & Paul Gallimore, 2018. "Pricing Extreme Attributes in Commercial Real Estate: the Case of Hotel Transactions," The Journal of Real Estate Finance and Economics, Springer, vol. 57(2), pages 264-296, August.
    2. Marcel A. J. Theebe, 2004. "Planes, Trains, and Automobiles: The Impact of Traffic Noise on House Prices," The Journal of Real Estate Finance and Economics, Springer, vol. 28(2-3), pages 209-234, March.
    3. XingYu Fu & JinHong Du & YiFeng Guo & MingWen Liu & Tao Dong & XiuWen Duan, 2018. "A Machine Learning Framework for Stock Selection," Papers 1806.01743, arXiv.org, revised Aug 2018.
    4. Fabian Taigel & Anselme K. Tueno & Richard Pibernik, 2018. "Privacy-preserving condition-based forecasting using machine learning," Journal of Business Economics, Springer, vol. 88(5), pages 563-592, July.
    5. Dickson K. W. Chiu & Yves T. F. Yueh & Ho-fung Leung & Patrick C. K. Hung, 2009. "Towards ubiquitous tourist service coordination and process integration: A collaborative travel agent system architecture with semantic web services," Information Systems Frontiers, Springer, vol. 11(3), pages 241-256, July.
    6. Michael Friendly, 2002. "Corrgrams: Exploratory Displays for Correlation Matrices," The American Statistician, American Statistical Association, vol. 56, pages 316-324, November.
    7. Hal R. Varian, 2014. "Big Data: New Tricks for Econometrics," Journal of Economic Perspectives, American Economic Association, vol. 28(2), pages 3-28, Spring.
    8. Periklis Gogas & Theophilos Papadimitriou & Maria Matthaiou & Efthymia Chrysanthidou, 2015. "Yield Curve and Recession Forecasting in a Machine Learning Framework," Computational Economics, Springer;Society for Computational Economics, vol. 45(4), pages 635-645, April.
    9. Lynn Wu & Erik Brynjolfsson, 2015. "The Future of Prediction: How Google Searches Foreshadow Housing Prices and Sales," NBER Chapters, in: Economic Analysis of the Digital Economy, pages 89-118, National Bureau of Economic Research, Inc.
    10. Marcelo C. Medeiros & Gabriel F. R. Vasconcelos & Álvaro Veiga & Eduardo Zilberman, 2021. "Forecasting Inflation in a Data-Rich Environment: The Benefits of Machine Learning Methods," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 39(1), pages 98-119, January.
    11. Jianrong Yao & Jiarui Chen & June Wei & Yuangao Chen & Shuiqing Yang, 2019. "The relationship between soft information in loan titles and online peer-to-peer lending: evidence from RenRenDai platform," Electronic Commerce Research, Springer, vol. 19(1), pages 111-129, March.
    12. Juncong Guo & Xi Qu, 2019. "Spatial interactive effects on housing prices in Shanghai and Beijing," Regional Science and Urban Economics, Elsevier, vol. 76(C), pages 147-160.

    Citations

    Cited by:

    1. Cankun Wei & Meichen Fu & Li Wang & Hanbing Yang & Feng Tang & Yuqing Xiong, 2022. "The Research Development of Hedonic Price Model-Based Real Estate Appraisal in the Era of Big Data," Land, MDPI, vol. 11(3), pages 1-30, February.
    2. Raul-Tomas Mora-Garcia & Maria-Francisca Cespedes-Lopez & V. Raul Perez-Sanchez, 2022. "Housing Price Prediction Using Machine Learning Algorithms in COVID-19 Times," Land, MDPI, vol. 11(11), pages 1-32, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jon Ellingsen & Vegard H. Larsen & Leif Anders Thorsrud, 2020. "News Media vs. FRED-MD for Macroeconomic Forecasting," CESifo Working Paper Series 8639, CESifo.
    2. Joonas Tuhkuri, 2016. "Forecasting Unemployment with Google Searches," ETLA Working Papers 35, The Research Institute of the Finnish Economy.
    3. Emmanuel Sirimal Silva & Hossein Hassani & Dag Øivind Madsen & Liz Gee, 2019. "Googling Fashion: Forecasting Fashion Consumer Behaviour Using Google Trends," Social Sciences, MDPI, vol. 8(4), pages 1-23, April.
    4. Emanuel Kohlscheen, 2022. "Quantifying the Role of Interest Rates, the Dollar and Covid in Oil Prices," Papers 2208.14254, arXiv.org, revised Oct 2022.
    5. Byron Botha & Rulof Burger & Kevin Kotzé & Neil Rankin & Daan Steenkamp, 2023. "Big data forecasting of South African inflation," Empirical Economics, Springer, vol. 65(1), pages 149-188, July.
    6. Andrew J. Patton & Yasin Simsek, 2023. "Generalized Autoregressive Score Trees and Forests," Papers 2305.18991, arXiv.org.
    7. Nawazish Mirza & Syed Kumail Abbas Rizvi & Bushra Naqvi & Muhammad Umar, 2024. "Inflation prediction in emerging economies: Machine learning and FX reserves integration for enhanced forecasting," International Review of Financial Analysis, Elsevier, vol. 94(C).
    8. Knut Lehre Seip & Dan Zhang, 2021. "The Yield Curve as a Leading Indicator: Accuracy and Timing of a Parsimonious Forecasting Model," Forecasting, MDPI, vol. 3(2), pages 1-16, May.
    9. Gustavo Silva Araujo & Wagner Piazza Gaglianone, 2023. "Machine learning methods for inflation forecasting in Brazil: New contenders versus classical models," Latin American Journal of Central Banking (previously Monetaria), Elsevier, vol. 4(2).
    10. Rama K. Malladi, 2024. "Benchmark Analysis of Machine Learning Methods to Forecast the U.S. Annual Inflation Rate During a High-Decile Inflation Period," Computational Economics, Springer;Society for Computational Economics, vol. 64(1), pages 335-375, July.
    11. Michele Lenza & Inès Moutachaker & Joan Paredes, 2023. "Density forecasts of inflation: a quantile regression forest approach," CEPR Discussion Papers 18298, C.E.P.R. Discussion Papers.
    12. Houcine Senoussi, 2021. "Inflation and Inflation Uncertainty in Growth Model of Barro: An Application of Random Forest Method," International Econometric Review (IER), Econometric Research Association, vol. 13(1), pages 4-23, March.
    13. Hamdy Ahmad Aly Alhendawy & Mohammed Galal Abdallah Mostafa & Mohamed Ibrahim Elgohari & Ibrahim Abdalla Abdelraouf Mohamed & Nabil Medhat Arafat Mahmoud & Mohamed Ahmed Mohamed Mater, 2023. "Determinants of Renewable Energy Production in Egypt New Approach: Machine Learning Algorithms," International Journal of Energy Economics and Policy, Econjournals, vol. 13(6), pages 679-689, November.
    14. Jaehyun Yoon, 2021. "Forecasting of Real GDP Growth Using Machine Learning Models: Gradient Boosting and Random Forest Approach," Computational Economics, Springer;Society for Computational Economics, vol. 57(1), pages 247-265, January.
    15. Felipe Leal & Carlos Molina & Eduardo Zilberman, 2020. "Proyección de la Inflación en Chile con Métodos de Machine Learning," Working Papers Central Bank of Chile 860, Central Bank of Chile.
    16. Nikoleta Anesti & Eleni Kalamara & George Kapetanios, 2021. "Forecasting UK GDP growth with large survey panels," Bank of England working papers 923, Bank of England.
    17. Guilherme Lindenmeyer & Pedro Pablo Skorin & Hudson da Silva Torrent, 2021. "Using boosting for forecasting electric energy consumption during a recession: a case study for the Brazilian State Rio Grande do Sul," Letters in Spatial and Resource Sciences, Springer, vol. 14(2), pages 111-128, August.
    18. Xin Li & Bing Pan & Rob Law & Xiankai Huang, 2017. "Forecasting tourism demand with composite search index," Tourism Management, Elsevier, vol. 59(C), pages 57-66.
    19. Daniel Wochner, 2020. "Dynamic Factor Trees and Forests – A Theory-led Machine Learning Framework for Non-Linear and State-Dependent Short-Term U.S. GDP Growth Predictions," KOF Working papers 20-472, KOF Swiss Economic Institute, ETH Zurich.
