Printed from https://ideas.repec.org/p/egu/wpaper/1843.html

A network-based method to harmonize data classifications

Author

Listed:
  • Dario Diodato

Abstract

A frequent problem in research is the harmonization of data to a common classification, whether in terms of industries, commodities, occupations, or geographical areas, to name a few examples. Statistical offices often provide concordance tables to match data through time or across different classifications, but these tables alone are often not sufficient to define a clear methodology for how the matching should be performed. In fact, the concordance tables frequently contain a many-to-many mapping of classifications. The issue is exacerbated when two or more concordance tables are concatenated. In this Jupyter notebook, I discuss a network-based abstraction of this problem and propose, as a general solution, a method that identifies the network components (or the network communities) to make data converge to a new classification. The method simplifies the issue and greatly reduces conversion errors.
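The component idea in the abstract can be illustrated with a minimal sketch. This is not the author's notebook code: the function names `harmonize` and `aggregate`, the toy codes, and the toy values are all hypothetical. The sketch uses a plain union-find to find connected components instead of a network library, and it does not cover the community-detection variant.

```python
from collections import defaultdict


def harmonize(concordance):
    """Group codes linked by a many-to-many concordance into components.

    concordance: iterable of (old_code, new_code) pairs (hypothetical input).
    Returns a dict mapping ("old", code) or ("new", code) to a group id.
    """
    parent = {}

    def find(x):
        # Return the representative of x's component (with path halving).
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Each concordance row is an edge between an old code and a new code.
    for old, new in concordance:
        root_old, root_new = find(("old", old)), find(("new", new))
        if root_old != root_new:
            parent[root_new] = root_old

    # Relabel component roots as consecutive integers.
    group_ids = {}
    mapping = {}
    for node in list(parent):
        mapping[node] = group_ids.setdefault(find(node), len(group_ids))
    return mapping


def aggregate(values, side, mapping):
    """Sum values reported under one classification into the common groups."""
    totals = defaultdict(int)
    for code, value in values.items():
        totals[mapping[(side, code)]] += value
    return dict(totals)


# Toy many-to-many concordance: A1 splits into B1 and B2, A2 also maps to B2.
pairs = [("A1", "B1"), ("A1", "B2"), ("A2", "B2"), ("A3", "B3")]
groups = harmonize(pairs)

old_data = {"A1": 10, "A2": 5, "A3": 7}
new_data = {"B1": 8, "B2": 9, "B3": 6}
old_totals = aggregate(old_data, "old", groups)
new_totals = aggregate(new_data, "new", groups)
```

Because every code that is linked, directly or indirectly, ends up in the same group, data reported under either classification can be summed to the group level without splitting any value across categories. The cost is a coarser classification than either original.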

Suggested Citation

  • Dario Diodato, 2018. "A network-based method to harmonize data classifications," Papers in Evolutionary Economic Geography (PEEG) 1843, Utrecht University, Department of Human Geography and Spatial Planning, Group Economic Geography, revised Dec 2018.
  • Handle: RePEc:egu:wpaper:1843

    Download full text from publisher

    File URL: http://econ.geo.uu.nl/peeg/peeg1843.pdf
    File Function: Version December 2018
    Download Restriction: no


    Citations

    Cited by:

    1. Daniel Straulino & Mattie Landman & Neave O'Clery, 2020. "A bi-directional approach to comparing the modular structure of networks," Papers 2010.06568, arXiv.org.
    2. Diodato, Dario & Hausmann, Ricardo & Neffke, Frank, 2023. "The impact of return migration on employment and wages in Mexican cities," Journal of Urban Economics, Elsevier, vol. 135(C).
    3. Dario Diodato & Ricardo Hausmann & Frank Neffke, 2020. "The impact of return migration from the U.S. on employment and wages in Mexican cities," Papers in Evolutionary Economic Geography (PEEG) 2012, Utrecht University, Department of Human Geography and Spatial Planning, Group Economic Geography, revised Mar 2020.
    4. Lukaszuk, Piotr & Torun, David, 2022. "Harmonizing the Harmonized System," Economics Working Paper Series 2212, University of St. Gallen, School of Economics and Political Science.
    5. Mattie Landman & Sanna Ojanperä & Stephen Kinsella & Neave O’Clery, 2023. "The role of relatedness and strategic linkages between domestic and MNE sectors in regional branching and resilience," The Journal of Technology Transfer, Springer, vol. 48(2), pages 515-559, April.
    6. Ana Grisanti & Douglas Barrios & Eric S. M. Protzer & Jorge Tapia & Nikita Taniparti & Ricardo Hausmann & Rushabh Sanghvi & Semiray Kasoolu & Tim O'Brien, 2021. "Western Australia – Research Findings and Policy Recommendations," CID Working Papers 395, Center for International Development at Harvard University.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Balland, Pierre-Alexandre & Broekel, Tom & Diodato, Dario & Giuliani, Elisa & Hausmann, Ricardo & O'Clery, Neave & Rigby, David, 2022. "Reprint of The new paradigm of economic complexity," Research Policy, Elsevier, vol. 51(8).
    2. Alje van Dam & Andres Gomez‐Lievano & Frank Neffke & Koen Frenken, 2023. "An information‐theoretic approach to the analysis of location and colocation patterns," Journal of Regional Science, Wiley Blackwell, vol. 63(1), pages 173-213, January.
    3. Matias Nehuen Iglesias, 2021. "The Overlooked Insights from Correlation Structures in Economic Geography," Papers in Evolutionary Economic Geography (PEEG) 2105, Utrecht University, Department of Human Geography and Spatial Planning, Group Economic Geography, revised Jan 2021.
    4. Qin, Quande & Yu, Ying & Liu, Yuan & Zhou, Jianqing & Chen, Xiude, 2023. "Industrial agglomeration and energy efficiency: A new perspective from market integration," Energy Policy, Elsevier, vol. 183(C).
    5. Hidalgo, César A., 2023. "The policy implications of economic complexity," Research Policy, Elsevier, vol. 52(9).
    6. Du, Mengfan & Zhang, Yue-Jun, 2023. "The impact of producer services agglomeration on green economic development: Evidence from 278 Chinese cities," Energy Economics, Elsevier, vol. 124(C).
    7. Amezcua, Alejandro & Ratinho, Tiago & Plummer, Lawrence A. & Jayamohan, Parvathi, 2020. "Organizational sponsorship and the economics of place: How regional urbanization and localization shape incubator outcomes," Journal of Business Venturing, Elsevier, vol. 35(4).
    8. Rawaa Laajimi & Julie Le Gallo & Saloua Benammou, 2020. "What Geographical Concentration of Industries in the Tunisian Sahel? Empirical Evidence Using Distance‐Based Measures," Tijdschrift voor Economische en Sociale Geografie, Royal Dutch Geographical Society KNAG, vol. 111(5), pages 738-757, December.
    9. Li, Yue & Sinha Roy, Sutirtha, 2020. "The Employment Effect of Place-Based Policies: Evidence from India," Policy Research Working Paper Series 9477, The World Bank.
    10. Carla Costa & Rui Baptista, 2023. "Knowledge inheritance and performance of spinouts," Eurasian Business Review, Springer;Eurasia Business and Economics Society, vol. 13(1), pages 29-55, March.
    11. O’Clery, Neave & Kinsella, Stephen, 2022. "Modular structure in labour networks reveals skill basins," Research Policy, Elsevier, vol. 51(5).
    12. Mattie Landman & Sanna Ojanperä & Stephen Kinsella & Neave O’Clery, 2023. "The role of relatedness and strategic linkages between domestic and MNE sectors in regional branching and resilience," The Journal of Technology Transfer, Springer, vol. 48(2), pages 515-559, April.
    13. Neave O'Clery & Samuel Heroy & Francois Hulot & Mariano Beguerisse-Díaz, 2019. "Unravelling the forces underlying urban industrial agglomeration," Papers 1903.09279, arXiv.org, revised Jun 2019.
    14. Li, Yang & Neffke, Frank M.H., 2024. "Evaluating the principle of relatedness: Estimation, drivers and implications for policy," Research Policy, Elsevier, vol. 53(3).
    15. Steijn, Mathieu P.A. & Koster, Hans R.A. & Van Oort, Frank G., 2022. "The dynamics of industry agglomeration: Evidence from 44 years of coagglomeration patterns," Journal of Urban Economics, Elsevier, vol. 130(C).
    16. Rosalia Castellano & Gaetano Musella & Gennaro Punzo, 2023. "Does context matter? Exploring the effects of productive structures on the relationship between innovation and workforce skills’ complementarity," Quality & Quantity: International Journal of Methodology, Springer, vol. 57(3), pages 1991-2011, June.
    17. Riccardo Crescenzi & Roberto Ganau & Michael Storper, 2022. "Does foreign investment hurt job creation at home? The geography of outward FDI and employment in the USA," Journal of Economic Geography, Oxford University Press, vol. 22(1), pages 53-79.
    18. Riccardo Cappelli & Ron Boschma & Anet Weterings, 2019. "Labour mobility, skill-relatedness and new plant survival across different development stages of an industry," Environment and Planning A, vol. 51(4), pages 869-890, June.
    19. Sándor Juhász & Tom Broekel & Ron Boschma, 2021. "Explaining the dynamics of relatedness: The role of co‐location and complexity," Papers in Regional Science, Wiley Blackwell, vol. 100(1), pages 3-21, February.
    20. Zhao, Zhong & Zheng, Liang, 2023. "The Births of New Private-Owned Enterprises in an Environment of State-Owned Enterprises," IZA Discussion Papers 16259, Institute of Labor Economics (IZA).

    More about this item

    Keywords

    classification; concordance; harmonization; network; Python; Jupyter;

    JEL classification:

    • C65 - Mathematical and Quantitative Methods - - Mathematical Methods; Programming Models; Mathematical and Simulation Modeling - - - Miscellaneous Mathematical Tools
    • C82 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Methodology for Collecting, Estimating, and Organizing Macroeconomic Data; Data Access
    • C88 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Other Computer Software
