
Reading the city through its neighbourhoods: Deep text embeddings of Yelp reviews as a basis for determining similarity and change

Author

Listed:
  • Olson, Alex
  • Calderon-Figueroa, Fernando
  • Bidian, Olimpia
  • Silver, Daniel
  • Sanner, Scott

Abstract

This paper develops novel methods for using Yelp reviews as a window into the collective representations of a city and its neighbourhoods. Analysing social media data such as Yelp reviews is challenging because the data are highly sparse, and direct analysis may fail to uncover hidden trends. To address this, we propose a deep autoencoder approach that embeds the language of neighbourhood-based business reviews into a reduced-dimensional space, facilitating comparison of neighbourhoods' similarity and of their change over time. Our model improves the ability to distinguish real from fake neighbourhood descriptions derived from real reviews, raising average accuracy on this task from 0.46 to 0.77. This improvement indicates that this novel application of embedded language analysis permits us to uncover comparative trends in neighbourhood change through the lens of their venues' reviews, providing a computational methodology for reading a city through its neighbourhoods. The resulting toolkit makes it possible to examine a city's current sociological trends in terms of its neighbourhoods' collective identities.
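
The general idea of embedding neighbourhood review language and then comparing neighbourhoods can be illustrated with a toy sketch. This is not the authors' model: it assumes a tiny one-hidden-layer autoencoder written in plain Python, a made-up six-term vocabulary, and invented term counts for hypothetical neighbourhood-year labels (`A_2015`, `A_2019`, `B_2015`, `B_2019`). The function names `encode`, `train_autoencoder`, and `cosine` are likewise illustrative.

```python
import math
import random

def encode(x, encoder):
    """Map a term-frequency vector to its low-dimensional embedding."""
    W1, b1 = encoder
    return [math.tanh(sum(x[j] * W1[j][k] for j in range(len(x))) + b1[k])
            for k in range(len(b1))]

def train_autoencoder(X, hidden_dim=2, epochs=500, lr=0.1, seed=0):
    """Fit a one-hidden-layer autoencoder (tanh encoder, linear decoder) by
    full-batch gradient descent on squared reconstruction error."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    W1 = [[rng.gauss(0, 0.1) for _ in range(hidden_dim)] for _ in range(d)]
    b1 = [0.0] * hidden_dim
    W2 = [[rng.gauss(0, 0.1) for _ in range(d)] for _ in range(hidden_dim)]
    b2 = [0.0] * d
    losses = []
    for _ in range(epochs):
        gW1 = [[0.0] * hidden_dim for _ in range(d)]
        gb1 = [0.0] * hidden_dim
        gW2 = [[0.0] * d for _ in range(hidden_dim)]
        gb2 = [0.0] * d
        loss = 0.0
        for x in X:
            h = encode(x, (W1, b1))
            x_hat = [sum(h[k] * W2[k][j] for k in range(hidden_dim)) + b2[j]
                     for j in range(d)]
            err = [x_hat[j] - x[j] for j in range(d)]
            loss += sum(e * e for e in err) / n
            g_out = [2.0 * e / n for e in err]  # d(loss)/d(x_hat)
            for k in range(hidden_dim):
                # Backpropagate through the decoder, then through tanh.
                g_pre = (sum(g_out[j] * W2[k][j] for j in range(d))
                         * (1.0 - h[k] ** 2))
                gb1[k] += g_pre
                for j in range(d):
                    gW2[k][j] += h[k] * g_out[j]
                    gW1[j][k] += x[j] * g_pre
            for j in range(d):
                gb2[j] += g_out[j]
        losses.append(loss)
        # Gradient descent step on all parameters.
        for k in range(hidden_dim):
            b1[k] -= lr * gb1[k]
            for j in range(d):
                W1[j][k] -= lr * gW1[j][k]
                W2[k][j] -= lr * gW2[k][j]
        for j in range(d):
            b2[j] -= lr * gb2[j]
    return (W1, b1), losses

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)

# Invented counts over a hypothetical vocabulary:
# [brunch, cocktail, dive, cheap, artisan, hardware]
counts = {
    "A_2015": [1, 1, 6, 7, 0, 5],
    "A_2019": [6, 5, 2, 2, 6, 1],
    "B_2015": [7, 6, 1, 1, 7, 0],
    "B_2019": [8, 6, 0, 1, 8, 0],
}
names = list(counts)
# Normalise each row to term frequencies so review volume does not dominate.
X = [[c / sum(row) for c in row] for row in counts.values()]

encoder, losses = train_autoencoder(X)
embeddings = {name: encode(x, encoder) for name, x in zip(names, X)}

# Similarity between two neighbourhoods, and one neighbourhood's change over time.
print("sim(A_2019, B_2015):", cosine(embeddings["A_2019"], embeddings["B_2015"]))
print("change in A:", 1.0 - cosine(embeddings["A_2015"], embeddings["A_2019"]))
```

In this sketch, similarity between neighbourhoods is read off as cosine similarity between embeddings, and change over time as one minus the cosine similarity between the same neighbourhood's embeddings in two periods. Both are illustrative choices rather than measures confirmed by the abstract.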

Suggested Citation

  • Olson, Alex & Calderon-Figueroa, Fernando & Bidian, Olimpia & Silver, Daniel & Sanner, Scott, 2020. "Reading the city through its neighbourhoods: Deep text embeddings of Yelp reviews as a basis for determining similarity and change," SocArXiv 8jbvg, Center for Open Science.
  • Handle: RePEc:osf:socarx:8jbvg
    DOI: 10.31219/osf.io/8jbvg

    Download full text from publisher

    File URL: https://osf.io/download/5fc94e1aaa60b801d9894305/
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Silver, Daniel & Silva, Thiago H, 2021. "Complex Causal Structures of Neighbourhood Change: Evidence From a Functionalist Model and Yelp Data," SocArXiv wprf8, Center for Open Science.
    2. Susan Athey & Michael Luca, 2019. "Economists (and Economics) in Tech Companies," Journal of Economic Perspectives, American Economic Association, vol. 33(1), pages 209-230, Winter.
    3. Dominik Gutt & Philipp Herrmann & Mohammad S. Rahman, 2018. "Crowd-Driven Competitive Intelligence: Understanding the Relationship Between Local Market Competition and Online Rating Distributions," Working Papers Dissertations 41, Paderborn University, Faculty of Business Administration and Economics.
    4. Mohammed Alyakoob & Mohammad S. Rahman, 2022. "Shared Prosperity (or Lack Thereof) in the Sharing Economy," Information Systems Research, INFORMS, vol. 33(2), pages 638-658, June.
    5. Dominik Gutt & Philipp Herrmann & Mohammad S. Rahman, 2019. "Crowd-Driven Competitive Intelligence: Understanding the Relationship Between Local Market Competition and Online Rating Distributions," Information Systems Research, INFORMS, vol. 30(3), pages 980-994, September.
    6. Morgan Ubeda, 2020. "Local Amenities, Commuting Costs and Income Disparities Within Cities," Working Papers halshs-03082448, HAL.
    7. Yong Gao & Yuanyuan Chen & Lan Mu & Shize Gong & Pengcheng Zhang & Yu Liu, 2022. "Measuring urban sentiments from social media data: a dual-polarity metric approach," Journal of Geographical Systems, Springer, vol. 24(2), pages 199-221, April.
    8. Song, Hanqun & Yang, Huijun & Ma, Emily, 2022. "Restaurants’ outdoor signs say more than you think: An enquiry from a linguistic landscape perspective," Journal of Retailing and Consumer Services, Elsevier, vol. 68(C).
    9. Breithaupt, Patrick & Kesler, Reinhold & Niebel, Thomas & Rammer, Christian, 2020. "Intangible capital indicators based on web scraping of social media," ZEW Discussion Papers 20-046, ZEW - Leibniz Centre for European Economic Research.
    10. Oliver Schilke & Sheen S. Levine & Olenka Kacperczyk & Lynne G. Zucker, 2019. "Call for Papers-Special Issue on Experiments in Organizational Theory," Organization Science, INFORMS, vol. 30(1), pages 232-234, February.
    11. Nilsson, Isabelle & Delmelle, Elizabeth, 2018. "Transit investments and neighborhood change: On the likelihood of change," Journal of Transport Geography, Elsevier, vol. 66(C), pages 167-179.
    12. Abdul Shaban & Karima Kourtit & Peter Nijkamp, 2020. "India’s Urban System: Sustainability and Imbalanced Growth of Cities," Sustainability, MDPI, vol. 12(7), pages 1-20, April.
    13. Batabyal, Amitrajeet A. & Beladi, Hamid, 2018. "Artists, engineers, and aspects of economic growth in a creative region," Economic Modelling, Elsevier, vol. 71(C), pages 214-219.
    14. Megan Doherty Bea, 2024. "A Life Course Perspective of Community (Non)Investment: Historical Financial Service Trajectories and Community Outcomes," Journal of Family and Economic Issues, Springer, vol. 45(2), pages 288-307, June.
    15. Obschonka, Martin & Lee, Neil & Rodríguez-Pose, Andrés & Eichstaedt, Johannes Christopher & Ebert, Tobias, 2018. "Big Data, artificial intelligence and the geography of entrepreneurship in the United States," OSF Preprints c62tn, Center for Open Science.
    16. Haoye Sun & Thorsten Teichert, 2024. "Scarcity in today's consumer markets: scoping the research landscape by author keywords," Management Review Quarterly, Springer, vol. 74(1), pages 93-120, February.
    17. Evelyn Ravuri, 2023. "Neighbourhood change in Genesee and Kent Counties, Michigan, 1970–2019," Papers in Regional Science, Wiley Blackwell, vol. 102(1), pages 107-127, February.
    18. Buhr, Helena & Funk, Russell J. & Owen-Smith, Jason, 2021. "The authenticity premium: Balancing conformity and innovation in high technology industries," Research Policy, Elsevier, vol. 50(1).
    19. Al-Kilani, Shaymaa & El Hedhli, Kamel, 2021. "How do restaurant atmospherics influence restaurant authenticity? An integrative framework and empirical evidence," Journal of Retailing and Consumer Services, Elsevier, vol. 63(C).
    20. Joel Alcedo & Alberto Cavallo & Bricklin Dwyer & Prachi Mishra & Antonio Spilimbergo, 2022. "Back to Trend: COVID Effects on E-commerce in 47 Countries," NBER Working Papers 29729, National Bureau of Economic Research, Inc.

