
Real estate price estimation in French cities using geocoding and machine learning

Authors

Listed:
  • Dieudonné Tchuente

    (Toulouse Business School)

  • Serge Nyawa

    (Toulouse Business School)

Abstract

This paper reviews real estate price estimation in France, a market that has received little attention. We compare seven popular machine learning techniques and propose an approach that quantifies the relevance of location features in real estate price estimation at both high and fine levels of granularity. We take advantage of a newly available open dataset, provided by the French government, that contains 5 years of historical real estate transaction data. At the high level of granularity, we find large differences in the models’ predictive power between cities with medium and high standards of living (precision differences beyond 70% in some cases). At the fine level of granularity, we use geocoding to add precise geographical location features to the inputs of the machine learning algorithms. This substantially improves the models’ forecasting power relative to models trained without these features (improvements beyond 50% for some forecasting error measures). Our results also reveal that neural networks and random forests outperform the other methods when geocoding features are not accounted for, while random forest, AdaBoost and gradient boosting perform well when geocoding features are considered. These results can be of particular interest for identifying opportunities in the real estate market through price prediction. They can also serve as a basis for price assessment in revenue management for durable and non-replenishable products such as real estate.
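
The abstract reports improvements "for some forecasting error measures" without naming them here. As a rough illustration only, the sketch below assumes three common measures (MAE, RMSE and MAPE) and shows how a relative improvement between a model trained without geocoding features and one trained with them could be computed. The function names and all prices are made up for the example and are not taken from the paper.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    ratios = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * ratios / len(actual)


def relative_improvement(baseline_error, new_error):
    """Percentage reduction of an error measure relative to a baseline."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# Hypothetical sale prices in euros (illustrative values, not data from the paper).
actual = [200_000.0, 350_000.0, 500_000.0, 150_000.0]
# Hypothetical predictions of a model trained without geocoding features.
pred_without_geo = [260_000.0, 290_000.0, 420_000.0, 190_000.0]
# Hypothetical predictions of a model trained with geocoding features.
pred_with_geo = [220_000.0, 330_000.0, 470_000.0, 160_000.0]

# Compare the two sets of predictions measure by measure.
for name, measure in (("MAE", mae), ("RMSE", rmse), ("MAPE", mape)):
    baseline = measure(actual, pred_without_geo)
    improved = measure(actual, pred_with_geo)
    gain = relative_improvement(baseline, improved)
    print(f"{name}: {baseline:.2f} -> {improved:.2f} ({gain:.1f}% lower)")
```

A relative improvement of this kind is measure-specific, which is consistent with the abstract's wording that the gains exceed 50% only for some error measures.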

Suggested Citation

  • Dieudonné Tchuente & Serge Nyawa, 2022. "Real estate price estimation in French cities using geocoding and machine learning," Annals of Operations Research, Springer, vol. 308(1), pages 571-608, January.
  • Handle: RePEc:spr:annopr:v:308:y:2022:i:1:d:10.1007_s10479-021-03932-5
    DOI: 10.1007/s10479-021-03932-5

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10479-021-03932-5
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.




