Estimating spatial panel models using unbalanced panels
AbstractEconometricians have begun to devote more attention to spatial interactions when carrying out applied econometric studies. In part, this is motivated by an explicit focus on spatial interactions in policy formulation or market behavior, but it may also reflect concern about the role of omitted variables that are or may be spatially correlated. The Stata user-written procedure xsmle has been designed to estimate a wide range of spatial panel models, including spatial autocorrelation, spatial Durbin, and spatial error models using maximum likelihood methods. It relies upon the availability of balanced panel data with no missing observations. This requirement is stringent, but it arises from the fact that in principle, the values of the dependent variable for any panel unit may depend upon the values of the dependent and independent variables for all the other panel units. Thus even a single missing data point may require that all data for a time period, panel unit, or variable be discarded. The presence of missing data is an endemic problem for many types of applied work, often because of the creation or disappearance of panel units. At the macro level, the number and composition of countries in Europe or local government units in the United Kingdom has changed substantially over the last three decades. In longitudinal household surveys, new households are created and old ones disappear all the time. Restricting the analysis to a subset of panel units that have remained stable over time is a form of sample selection whose consequences are uncertain and that may have statistical implications that merit additional investigation. The simplest mechanisms by which missing data may arise underpin the missing-at-random (MAR) assumption. When this is appropriate, it is possible to use two approaches to estimation with missing data. The first is either simple or, preferably, multiple imputation, which involves the replacement of missing data by stochastic imputed values. The Stata procedure mi can be combined with xsmle to implement a variety of estimates that rely upon multiple imputation. While the combination of procedures is relatively simple to estimate, practical experience suggests that the results can be quite sensitive to the specification that is adopted for the imputation phase of the analysis. Hence, this is not a one-size-fits-all method of dealing with unbalanced panels, because the analyst must give serious consideration to the way in which imputed values are generated. The second approach has been developed by Pfaffermayr. It relies upon the spatial interactions in the model, which means that the influence of the missing observations can be inferred from the values taken by nonmissing observations. In effect, the missing observations are treated as latent variables whose distribution can be derived from the values of the nonmissing data. This leads to a likelihood function that can be partitioned between missing and nonmissing data and thus used to estimate the coefficients of the full model. The merit of the approach is that it takes explicit account of the spatial structure of the model. However, the procedure becomes computationally demanding if the proportion of missing observations is too large and, as one would expect, the information provided by the spatial interactions is not sufficient to generate well-defined estimates of the structural coefficients. The missing-at-random assumption is crucial for both of these approaches, but it is not reasonable to rely upon it when dealing with the birth or death of distinct panel units. A third approach, which is based on methods used in the literature on statistical signal processing, relies upon reducing the spatial interactions to immediate neighbors. Intuitively, the basic unit for the analysis becomes a block consisting of a central unit (the dependent variable) and its neighbors (the spatial interactions). Because spatial interactions are restricted to within-block effects, the population of blocks can vary over time and standard nonspatial panel methods can be applied. The presentation will describe and compare the three approaches to estimating spatial panel models as implemented in Stata as extensions to xsmle. It will be illustrated by analyses of i) state data on electricity consumption in the U.S. and ii) gridded historical data on temperature and precipitation to identify the effects of El NiÃ±o (ENSO) and other major weather oscillations.
Download InfoIf you experience problems downloading a file, check if you have the proper application to view it first. In case of further problems read the IDEAS help page. Note that these files are not on the IDEAS site. Please be patient as the files may be large.
Bibliographic InfoPaper provided by Stata Users Group in its series United Kingdom Stata Users' Group Meetings 2013 with number 09.
Date of creation: 16 Sep 2013
Date of revision:
This paper has been announced in the following NEP Reports:
- NEP-ALL-2013-09-26 (All new papers)
- NEP-ECM-2013-09-26 (Econometrics)
- NEP-GEO-2013-09-26 (Economic Geography)
- NEP-URE-2013-09-26 (Urban & Real Estate Economics)
You can help add them by filling out this form.
reading list or among the top items on IDEAS.Access and download statisticsgeneral information about how to correct material in RePEc.
For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: (Christopher F Baum).
If references are entirely missing, you can add them using this form.