Printed from https://ideas.repec.org/p/wvu/wpaper/15-34.html

Textual Analysis in Real Estate

Author

Listed:
  • Adam Nowak

    (West Virginia University, Department of Economics)

  • Patrick Smith

    (Georgia State University, J. Mack Robinson College of Business)

Abstract

This paper incorporates text data from MLS listings in Atlanta, GA, into a hedonic pricing model. Including the text reduces pricing error by more than 25%. Information from the text enters a linear model through a tokenization approach, which allows the implicit prices of individual words and phrases to be estimated. The estimation focuses on simultaneous variable selection and estimation for linear models with a large number of variables. The LASSO procedure and its variants are shown to outperform least squares in out-of-sample testing.
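The tokenization and LASSO steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the listing remarks, log prices, vocabulary, penalty value, and all function names (`tokenize`, `build_design`, `lasso_cd`) are hypothetical, and the LASSO is fitted with a plain coordinate-descent loop on the objective (1/2n)·||y − b0 − Xb||² + λ·||b||₁. Each token becomes an indicator variable, so its coefficient can be read as an implicit (log) price.

```python
import re

# Hypothetical listing remarks and log sale prices, invented for illustration only.
LISTINGS = [
    "spacious home near park",
    "granite counters",
    "fixer upper",
    "cozy bungalow",
    "granite counters but fixer exterior",
    "cozy home with granite",
    "cozy fixer",
    "cozy fixer with granite island",
]
LOG_PRICES = [12.0, 12.2, 11.7, 12.0, 11.9, 12.2, 11.7, 11.9]
VOCAB = ["granite", "fixer", "cozy"]  # assumed token list, not from the paper


def tokenize(remarks):
    """Lowercase a remark and split it into alphabetic unigram tokens."""
    return re.findall(r"[a-z]+", remarks.lower())


def build_design(listings, vocab):
    """One indicator column per token: 1.0 if the token appears in the remark."""
    rows = []
    for text in listings:
        present = set(tokenize(text))
        rows.append([1.0 if tok in present else 0.0 for tok in vocab])
    return rows


def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; return exactly 0.0 inside [-gamma, gamma]."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)*||y - b0 - X b||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    # Centering the data handles the unpenalized intercept.
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    resid = [yi - y_mean for yi in y]  # residual when all coefficients are zero
    col_ss = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if col_ss[j] == 0.0:
                continue  # token never varies across listings; leave at zero
            # Covariance of column j with the partial residual (j's own fit added back).
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_ss[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_ss[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Xc[i][j] * delta
                beta[j] = new_bj
    intercept = y_mean - sum(b * m for b, m in zip(beta, x_mean))
    return intercept, beta


X = build_design(LISTINGS, VOCAB)
intercept, beta = lasso_cd(X, LOG_PRICES, lam=0.01)
for tok, b in zip(VOCAB, beta):
    print(f"{tok}: {b:.3f}")
```

In this toy data the token "cozy" carries no price information, so the penalty sets its coefficient to exactly zero while shrinking the other two toward zero. That is the simultaneous selection and estimation the abstract refers to. The sketch does not reproduce the paper's out-of-sample comparison with least squares.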

Suggested Citation

  • Adam Nowak & Patrick Smith, 2015. "Textual Analysis in Real Estate," Working Papers 15-34, Department of Economics, West Virginia University.
  • Handle: RePEc:wvu:wpaper:15-34

    Download full text from publisher

    File URL: http://busecon.wvu.edu/phd_economics/pdf/15-34.pdf
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Adam D. Nowak & Bradley S. Price & Patrick S. Smith, 2021. "Real Estate Dictionaries Across Space and Time," The Journal of Real Estate Finance and Economics, Springer, vol. 62(1), pages 139-163, January.
    2. O. Ashton Morgan & Stuart E. Hamilton, 2009. "Disentangling Access and View Amenities in Access-restricted Coastal Residential Communities," Working Papers 09-10, Department of Economics, Appalachian State University.
    3. Eleni Kalamara & Arthur Turrell & Chris Redl & George Kapetanios & Sujit Kapadia, 2022. "Making text count: Economic forecasting using newspaper text," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 37(5), pages 896-919, August.
    4. Lily Shen & Stephen L. Ross, 2019. "Information Value of Property Description: A Machine Learning Approach," Working papers 2019-20, University of Connecticut, Department of Economics, revised Sep 2020.
    5. Daniel Borup & Jorge Wolfgang Hansen & Benjamin Dybro Liengaard & Erik Christian Montes Schütte, 2023. "Quantifying investor narratives and their role during COVID‐19," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 38(4), pages 512-532, June.
    6. Matthew Gentzkow & Bryan T. Kelly & Matt Taddy, 2017. "Text as Data," NBER Working Papers 23276, National Bureau of Economic Research, Inc.
    7. Andres Algaba & David Ardia & Keven Bluteau & Samuel Borms & Kris Boudt, 2020. "Econometrics Meets Sentiment: An Overview Of Methodology And Applications," Journal of Economic Surveys, Wiley Blackwell, vol. 34(3), pages 512-547, July.
    8. Crocker H. Liu & Adam Nowak & Patrick S. Smith, 2018. "Does the Asset Pricing Premium Reflect Asymmetric or Incomplete Information?," Working Papers 18-06, Department of Economics, West Virginia University.
    9. Luiz Renato Lima & Lucas Lúcio Godeiro & Mohammed Mohsin, 2021. "Time-Varying Dictionary and the Predictive Power of FED Minutes," Computational Economics, Springer;Society for Computational Economics, vol. 57(1), pages 149-181, January.
    10. Yencha, Christopher, 2019. "Valuing walkability: New evidence from computer vision methods," Transportation Research Part A: Policy and Practice, Elsevier, vol. 130(C), pages 689-709.
    11. Kim, Hyun Hak & Swanson, Norman R., 2018. "Mining big data using parsimonious factor, machine learning, variable selection and shrinkage methods," International Journal of Forecasting, Elsevier, vol. 34(2), pages 339-354.
    12. Leif Anders Thorsrud, 2016. "Nowcasting using news topics Big Data versus big bank," Working Papers No 6/2016, Centre for Applied Macro- and Petroleum economics (CAMP), BI Norwegian Business School.
    13. Kim, Hyun Hak & Swanson, Norman R., 2014. "Forecasting financial and macroeconomic variables using data reduction methods: New empirical evidence," Journal of Econometrics, Elsevier, vol. 178(P2), pages 352-367.
    14. Massimiliano Caporin & Francesco Poli, 2017. "Building News Measures from Textual Data and an Application to Volatility Forecasting," Econometrics, MDPI, vol. 5(3), pages 1-46, August.
    15. Achim Ahrens & Christian B. Hansen & Mark E. Schaffer, 2020. "lassopack: Model selection and prediction with regularized regression in Stata," Stata Journal, StataCorp LP, vol. 20(1), pages 176-235, March.
    16. Schneider Ulrike & Wagner Martin, 2012. "Catching Growth Determinants with the Adaptive Lasso," German Economic Review, De Gruyter, vol. 13(1), pages 71-85, February.
    17. Ekaterina Chernobai & Michael Reibel & Michael Carney, 2011. "Nonlinear Spatial and Temporal Effects of Highway Construction on House Prices," The Journal of Real Estate Finance and Economics, Springer, vol. 42(3), pages 348-370, April.
    18. Akey, Pat & Grégoire, Vincent & Martineau, Charles, 2022. "Price revelation from insider trading: Evidence from hacked earnings news," Journal of Financial Economics, Elsevier, vol. 143(3), pages 1162-1184.
    19. Luiz Renato Lima & Lucas Lúcio Godeiro, 2023. "Equity‐premium prediction: Attention is all you need," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 38(1), pages 105-122, January.
    20. Anesti, Nikoleta & Kalamara, Eleni & Kapetanios, George, 2021. "Forecasting UK GDP growth with large survey panels," Bank of England working papers 923, Bank of England.

    More about this item

    Keywords

    textual analysis; big data; real estate valuation;

    JEL classification:

    • C01 - Mathematical and Quantitative Methods - - General - - - Econometrics
    • C18 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Methodological Issues: General
    • C51 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Model Construction and Estimation
    • C52 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Model Evaluation, Validation, and Selection
    • C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis
    • C65 - Mathematical and Quantitative Methods - - Mathematical Methods; Programming Models; Mathematical and Simulation Modeling - - - Miscellaneous Mathematical Tools
    • R30 - Urban, Rural, Regional, Real Estate, and Transportation Economics - - Real Estate Markets, Spatial Production Analysis, and Firm Location - - - General
