Abstract
The goal of this paper is to investigate OpenStreetMap as a research tool by analysing the advantages and drawbacks this platform offers to linguistics and to GIS disciplines. To reach this goal, the paper analyses how the platform represents places as geographical units and toponyms (i.e. place names) as linguistic units referring to places. The paper presents two previous studies that featured a novel procedure for toponym extraction and its application to OpenStreetMap toponym data. These two studies focused on distinct scales and densities of geographical distribution in multilingual contexts: the city level (Macao) and a mixed regional and national level (Italy). The studies also compared these data with data originating from an authoritative geographic source (e.g. Italian street directories). The present paper extends the analysis and results of these studies by showing that a single extraction algorithm can obtain all the relevant toponyms from overpass-turbo, a platform including OpenStreetMap’s textual information, and from other gazetteers. For each level of analysis, the paper shows that toponyms come in different combinations of multilingual formats: Chinese and Portuguese for Macao; Italian, local dialects (e.g. Genoese), and minority languages (e.g. German) for Italy. From these data, the paper offers an analysis of the language-specific features, methodological challenges, and informational accuracy of each database. The paper proposes that OpenStreetMap may be as reliable as authoritative sources, provided that OpenStreetMap-based data are confirmed through cross-source comparison during data analysis. The paper concludes by discussing the current role of OpenStreetMap as an information database in toponym extraction, its uses in linguistics and GIS disciplines, and the theoretical insights these uses can offer to research in these disciplines.
Suggested Citation
Francesco-Alessio Ursini & Giuseppe Samo, 2025.
"Extracting toponyms from OpenStreetMap and other gazetteers: comparing representational accuracy in multilingual contexts,"
Palgrave Communications, Palgrave Macmillan, vol. 12(1), pages 1-16, December.
Handle:
RePEc:pal:palcom:v:12:y:2025:i:1:d:10.1057_s41599-025-05025-1
DOI: 10.1057/s41599-025-05025-1