IDEAS home Printed from https://ideas.repec.org/p/osf/osfxxx/9bu5z.html

Using the web to predict regional trade flows: data extraction, modelling, and validation

Author

Listed:
  • Tranos, Emmanouil
  • Incera, Andre Carrascal
  • Willis, George

Abstract

Despite the importance of interregional trade for building effective regional economic policies, there is very little hard data to illustrate such interdependencies. We propose here a novel research framework to predict interregional trade flows by utilising freely available web data and machine learning algorithms. Specifically, we extract hyperlinks between archived websites in the UK and aggregate these data to create an interregional network of hyperlinks between geolocated and commercial webpages over time. We then use some existing interregional trade data to train random forest models and make out-of-sample predictions of interregional trade flows within a rolling-forecasting framework. Our models demonstrate strong predictive capability, with $R^2$ greater than 0.9. We are also able to disaggregate our predictions by industrial sector and at a sub-regional level, for which trade data are not available. Overall, our models provide a proof of concept that the digital traces left behind by physical trade can help us capture such economic activities at a more granular level and, consequently, inform regional policies.
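
The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes hypothetical toy records of the form (year, source region, target region), with invented region codes and counts, and shows only three steps: aggregating page-level hyperlinks into region-pair counts, generating rolling-origin train/test splits by year, and computing $R^2$. The random forest model itself is omitted, and the function names below are illustrative.

```python
from collections import Counter

# Hypothetical toy records: (year, source_page_region, target_page_region).
# Region codes and counts are invented for illustration only.
HYPERLINKS = [
    (2000, "UKC", "UKD"), (2000, "UKC", "UKD"), (2000, "UKD", "UKC"),
    (2001, "UKC", "UKD"), (2001, "UKD", "UKI"),
    (2002, "UKC", "UKD"), (2002, "UKI", "UKC"), (2002, "UKI", "UKC"),
]


def aggregate_links(records):
    """Count hyperlinks per (year, origin region, destination region)."""
    return Counter(records)


def rolling_splits(years, min_train=1):
    """Yield (train_years, test_year): train on past years, test on the next."""
    ordered = sorted(set(years))
    for i in range(min_train, len(ordered)):
        yield ordered[:i], ordered[i]


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


flows = aggregate_links(HYPERLINKS)
splits = list(rolling_splits(year for year, _, _ in HYPERLINKS))
```

In a fuller version, a regression model would be fitted on the region-pair features for each set of training years and scored with `r_squared` on the held-out year, so that every prediction is made strictly out of sample.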

Suggested Citation

  • Tranos, Emmanouil & Incera, Andre Carrascal & Willis, George, 2022. "Using the web to predict regional trade flows: data extraction, modelling, and validation," OSF Preprints 9bu5z, Center for Open Science.
  • Handle: RePEc:osf:osfxxx:9bu5z
    DOI: 10.31219/osf.io/9bu5z

    Download full text from publisher

    File URL: https://osf.io/download/62c552d37ddff526409a6527/
    Download Restriction: no


    References listed on IDEAS

    1. Olga Ivanova & d'Artis Kancs & Dirk Stelder, 2009. "Modelling Inter-Regional Trade Flows: Data and Methodological Issues in Rhomolo," EERI Research Paper Series EERI RP 2009/31, Economics and Econometrics Research Institute (EERI), Brussels.
    2. James Paul Lesage & Wolfgang Polasek, 2008. "Incorporating Transportation Network Structure in Spatial Econometric Models of Commodity Flows," Spatial Economic Analysis, Taylor & Francis Journals, vol. 3(2), pages 225-245.
    3. Liwen Vaughan & Yijun Gao & Margaret Kipp, 2006. "Why are hyperlinks to business Websites created? A content analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 67(2), pages 291-300, May.
    4. Ana L.M. Sargento & Pedro Nogueira Ramos & Geoffrey J.D. Hewings, 2012. "Inter-Regional Trade Flow Estimation Through Non-Survey Models: An Empirical Assessment," Economic Systems Research, Taylor & Francis Journals, vol. 24(2), pages 173-193, March.
    5. Kuhn, Max, 2008. "Building Predictive Models in R Using the caret Package," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 28(i05).
    6. Wen Chen & Bart Los & Philip McCann & Raquel Ortega‐Argilés & Mark Thissen & Frank van Oort, 2018. "The continental divide? Economic exposure to Brexit in regions and countries on both sides of The Channel," Papers in Regional Science, Wiley Blackwell, vol. 97(1), pages 25-54, March.
    7. Antràs, Pol & Chor, Davin, 2017. "On the Measurement of Upstreamness and Downstreamness in Global Value Chains," CEPR Discussion Papers 12549, C.E.P.R. Discussion Papers.
    8. Oshan, Taylor M., 2020. "Potential and pitfalls of big transport data for spatial interaction models of urban mobility," OSF Preprints gwumt, Center for Open Science.
    9. Felipa de Mello-Sampayo, 2017. "Testing competing destinations gravity models – evidence from BRIC International," The Journal of International Trade & Economic Development, Taylor & Francis Journals, vol. 26(3), pages 277-294, April.
    10. David H. Autor & David Dorn & Gordon H. Hanson, 2013. "The China Syndrome: Local Labor Market Effects of Import Competition in the United States," American Economic Review, American Economic Association, vol. 103(6), pages 2121-2168, October.
    11. Felipa de Mello-Sampayo, 2017. "Competing-destinations gravity model applied to trade in intermediate goods," Applied Economics Letters, Taylor & Francis Journals, vol. 24(19), pages 1378-1384, November.
    12. Gervais, Antoine & Jensen, J. Bradford, 2019. "The tradability of services: Geographic concentration and trade costs," Journal of International Economics, Elsevier, vol. 118(C), pages 331-350.
    13. Estrella Gómez-Herrera, 2013. "Comparing alternative methods to estimate gravity models of bilateral trade," Empirical Economics, Springer, vol. 44(3), pages 1087-1111, June.
    14. Evert Meijers & Antoine Peris, 2019. "Using toponym co-occurrences to measure relationships between places: review, application and evaluation," International Journal of Urban Sciences, Taylor & Francis Journals, vol. 23(2), pages 246-268, April.
    15. Kim Holmberg, 2010. "Co-inlinking to a municipal Web space: a webometric and content analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 83(3), pages 851-862, June.
    16. Peter Egger, 2002. "An Econometric View on the Estimation of Gravity Models and the Calculation of Trade Potentials," The World Economy, Wiley Blackwell, vol. 25(2), pages 297-312, February.
    17. James E. Anderson & Eric van Wincoop, 2003. "Gravity with Gravitas: A Solution to the Border Puzzle," American Economic Review, American Economic Association, vol. 93(1), pages 170-192, March.
    18. Liwen Vaughan & Guozhu Wu, 2004. "Links to commercial websites as a source of business information," Scientometrics, Springer;Akadémiai Kiadó, vol. 60(3), pages 487-496, August.
    19. Raf Guns & Ronald Rousseau, 2014. "Recommending research collaborations using link prediction and random forest classifiers," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(2), pages 1461-1473, November.
    20. Filippo Simini & Marta C. González & Amos Maritan & Albert-László Barabási, 2012. "A universal model for mobility and migration patterns," Nature, Nature, vol. 484(7392), pages 96-100, April.
    21. Geoff Riddington & Hervey Gibson & John Anderson, 2006. "Comparison of Gravity Model, Survey and Location Quotient-based Local Area Tables and Multipliers," Regional Studies, Taylor & Francis Journals, vol. 40(9), pages 1069-1081.
    22. Yi Ren & Tong Xia & Yong Li & Xiang Chen, 2019. "Predicting socio-economic levels of urban regions via offline and online indicators," PLOS ONE, Public Library of Science, vol. 14(7), pages 1-15, July.
    23. Bart Los & Philip McCann & John Springford & Mark Thissen, 2017. "The mismatch between local voting and the local economic consequences of Brexit," Regional Studies, Taylor & Francis Journals, vol. 51(5), pages 786-799, May.
    24. Jon Kleinberg & Jens Ludwig & Sendhil Mullainathan & Ziad Obermeyer, 2015. "Prediction Policy Problems," American Economic Review, American Economic Association, vol. 105(5), pages 491-495, May.
    25. Philip McCann & Raquel Ortega-Argilés, 2015. "Smart Specialization, Regional Growth and Applications to European Union Cohesion Policy," Regional Studies, Taylor & Francis Journals, vol. 49(8), pages 1291-1302, August.
    26. Anastasios Kitsos & André Carrascal-Incera & Raquel Ortega-Argilés, 2019. "The Role of Embeddedness on Regional Economic Resilience: Evidence from the UK," Sustainability, MDPI, vol. 11(14), pages 1-19, July.
    27. Susan Athey & Guido W. Imbens, 2019. "Machine Learning Methods That Economists Should Know About," Annual Review of Economics, Annual Reviews, vol. 11(1), pages 685-725, August.
    28. Iñaki Arto & José M. Rueda-Cantuche & Glen P. Peters, 2014. "Comparing The Gtap-Mrio And Wiod Databases For Carbon Footprint Analysis," Economic Systems Research, Taylor & Francis Journals, vol. 26(3), pages 327-353, September.
    29. Evert Meijers & Martijn Burger & Mark Thissen & Thomas Graaff & Frank Oort, 2016. "Competitive network positions in trade and structural economic growth: A geographically weighted regression analysis for European regions," Papers in Regional Science, Wiley Blackwell, vol. 95(1), pages 159-180, March.
    30. Fukunari Kimura & Hyun-Hoon Lee, 2006. "The Gravity Equation in International Trade in Services," Review of World Economics (Weltwirtschaftliches Archiv), Springer;Institut für Weltwirtschaft (Kiel Institute for the World Economy), vol. 142(1), pages 92-121, April.
    31. Matthew A Zook, 2000. "The Web of Production: The Economic Geography of Commercial Internet Content Production in the United States," Environment and Planning A, vol. 32(3), pages 411-426, March.
    32. Krzysztof Janc, 2015. "Geography of Hyperlinks-Spatial Dimensions of Local Government Websites," European Planning Studies, Taylor & Francis Journals, vol. 23(5), pages 1019-1037, May.
    33. Sendhil Mullainathan & Jann Spiess, 2017. "Machine Learning: An Applied Econometric Approach," Journal of Economic Perspectives, American Economic Association, vol. 31(2), pages 87-106, Spring.
    34. Xesús Pereira-López & André Carrascal-Incera & Melchor Fernández-Fernández, 2020. "A bidimensional reformulation of location quotients for generating input–output tables," Spatial Economic Analysis, Taylor & Francis Journals, vol. 15(4), pages 476-493, October.
    35. Bernard Fingleton & Harry Garretsen & Ron Martin, 2012. "Recessionary Shocks And Regional Employment: Evidence On The Resilience Of U.K. Regions," Journal of Regional Science, Wiley Blackwell, vol. 52(1), pages 109-133, February.
    36. Mozolin, M. & Thill, J. -C. & Lynn Usery, E., 2000. "Trip distribution forecasting with multilayer perceptron neural networks: A critical evaluation," Transportation Research Part B: Methodological, Elsevier, vol. 34(1), pages 53-73, January.
    37. Kim Holmberg & Mike Thelwall, 2009. "Local government web sites in Finland: A geographic and webometric analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 79(1), pages 157-169, April.
    38. Krzysztof Janc, 2015. "Visibility and Connections among Cities in Digital Space," Journal of Urban Technology, Taylor & Francis Journals, vol. 22(4), pages 3-21, October.
    39. Liwen Vaughan, 2004. "Exploring website features for business information," Scientometrics, Springer;Akadémiai Kiadó, vol. 61(3), pages 467-477, November.
    40. Anne Owen & Richard Wood & John Barrett & Andrew Evans, 2016. "Explaining value chain differences in MRIO databases through structural path decomposition," Economic Systems Research, Taylor & Francis Journals, vol. 28(2), pages 243-272, June.
    41. Hellmanzik, Christiane & Schmitz, Martin, 2017. "Taking gravity online: The role of virtual proximity in international finance," Journal of International Money and Finance, Elsevier, vol. 77(C), pages 164-179.
    42. Bart Los & Marcel P. Timmer & Gaaitzen J. Vries, 2015. "How Global Are Global Value Chains? A New Approach To Measure International Fragmentation," Journal of Regional Science, Wiley Blackwell, vol. 55(1), pages 66-92, January.
    43. Emmanouil Tranos & Tasos Kitsos & Raquel Ortega-Argilés, 2021. "Digital economy in the UK: regional productivity effects of early adoption," Regional Studies, Taylor & Francis Journals, vol. 55(12), pages 1924-1938, December.
    44. repec:hal:spmain:info:hdl:2441/10144 is not listed on IDEAS
    45. Yan, Xiang & Liu, Xinyu & Zhao, Xilei, 2020. "Using machine learning for direct demand modeling of ridesourcing services in Chicago," Journal of Transport Geography, Elsevier, vol. 83(C).
    46. Bart Los & Marcel P. Timmer & Gaaitzen J. de Vries, 2016. "Tracing Value-Added and Double Counting in Gross Exports: Comment," American Economic Review, American Economic Association, vol. 106(7), pages 1958-1966, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Eduardo A. Haddad & Inácio F. Araújo, 2021. "The internal geography of services value‐added in exports: A Latin American perspective," Papers in Regional Science, Wiley Blackwell, vol. 100(3), pages 713-744, June.
    2. Eduardo Rodrigues Sanguinet & Francisco de Borja García-García, 2023. "Rural-Urban Linkages: Regional Financial Business Services’ Integration into Chilean Agri-Food Value Chains," Sustainability, MDPI, vol. 15(14), pages 1-22, July.
    3. Mark Thissen & Maureen Lankhuizen & Frank (F.G.) van Oort & Bart Los & Dario Diodato, 2018. "EUREGIO: The construction of a global IO DATABASE with regional detail for Europe for 2000-2010," Tinbergen Institute Discussion Papers 18-084/VI, Tinbergen Institute.
    4. Camille Reverdy, 2023. "Estimating the general equilibrium effects of services trade liberalization," Review of International Economics, Wiley Blackwell, vol. 31(2), pages 493-521, May.
    5. Hildegunn K. Nordås & Dorothée Rouzet, 2017. "The Impact of Services Trade Restrictiveness on Trade Flows," The World Economy, Wiley Blackwell, vol. 40(6), pages 1155-1183, June.
    6. Hylke Vandenbussche & William Connell & Wouter Simons, 2022. "Global value chains, trade shocks and jobs: An application to Brexit," The World Economy, Wiley Blackwell, vol. 45(8), pages 2338-2369, August.
    7. Araújo, Inácio Fernandes de & Perobelli, Fernando Salgueiro & Faria, Weslem Rodrigues, 2021. "Regional and global patterns of participation in value chains: Evidence from Brazil," International Economics, Elsevier, vol. 165(C), pages 154-171.
    8. Giammetti, Raffaele, 2019. "Tariffs, Domestic Import Substitution and Trade Diversion in Input-Output Production Networks: how to deal with Brexit," MPRA Paper 93229, University Library of Munich, Germany.
    9. Mark Thissen & Frank van Oort & Philip McCann & Raquel Ortega-Argilés & Trond Husby, 2020. "The Implications of Brexit for UK and EU Regional Competitiveness," Economic Geography, Taylor & Francis Journals, vol. 96(5), pages 397-421, October.
    10. Shahriar Kabir & Ruhul Salim, 2016. "Can A Common Currency Induce Intra-Regional Trade? The Southeast Asian Perspective," Review of Urban & Regional Development Studies, Wiley Blackwell, vol. 28(3), pages 218-234, November.
    11. Filmer,Deon P. & Nahata,Vatsal & Sabarwal,Shwetlena, 2021. "Preparation, Practice, and Beliefs : A Machine Learning Approach to Understanding Teacher Effectiveness," Policy Research Working Paper Series 9847, The World Bank.
    12. Nenci, Silvia & Fusacchia, Ilaria & Giunta, Anna & Montalbano, Pierluigi & Pietrobelli, Carlo, 2022. "Mapping global value chain participation and positioning in agriculture and food: stylised facts, empirical evidence and critical issues," Bio-based and Applied Economics Journal, Italian Association of Agricultural and Applied Economics (AIEAA), vol. 11(2), July.
    13. Shahriar Kabir & Harry Bloch & Ruhul A Salim, 2018. "Global Financial Crisis And Southeast Asian Trade Performance: Empirical Evidence," Review of Urban & Regional Development Studies, Wiley Blackwell, vol. 30(2), pages 114-144, July.
    14. Falco J. Bargagli Stoffi & Kenneth De Beckker & Joana E. Maldonado & Kristof De Witte, 2021. "Assessing Sensitivity of Machine Learning Predictions. A Novel Toolbox with an Application to Financial Literacy," Papers 2102.04382, arXiv.org.
    15. Bar-Ilan, Judit, 2008. "Informetrics at the beginning of the 21st century—A review," Journal of Informetrics, Elsevier, vol. 2(1), pages 1-52.
    16. Das, Satya P. & Sant’Anna, Vinicios P., 2023. "Determinants of bilateral trade in manufacturing and services: A unified approach," Economic Modelling, Elsevier, vol. 123(C).
    17. Dinçer, Gönül, 2014. "Turkey’s Rising Imports from BRICS: A Gravity Model Approach," MPRA Paper 61979, University Library of Munich, Germany.
    18. José-Antonio Ontalba-Ruipérez & Enrique Orduna-Malea & Adolfo Alonso-Arroyo, 2016. "Identifying institutional relationships in a geographically distributed public health system using interlinking and co-authorship methods," Scientometrics, Springer;Akadémiai Kiadó, vol. 106(3), pages 1167-1191, March.
    19. Costas Arkolakis & Federico Huneeus & Yuhei Miyauchi, 2023. "Spatial Production Networks," Working Papers Central Bank of Chile 971, Central Bank of Chile.
    20. Nhan Thanh Thi Hoang & Hoan Quang Truong & Chung Van Dong, 2020. "Determinants of Trade Between Taiwan and ASEAN Countries: A PPML Estimator Approach," SAGE Open, vol. 10(2), pages 21582440209, May.

