Estimating spatial panel models using unbalanced panels


  • Gordon Hughes

    (University of Edinburgh)


Econometricians have begun to devote more attention to spatial interactions when carrying out applied econometric studies. In part, this is motivated by an explicit focus on spatial interactions in policy formulation or market behavior, but it may also reflect concern about the role of omitted variables that are or may be spatially correlated. The Stata user-written procedure xsmle has been designed to estimate a wide range of spatial panel models, including spatial autocorrelation, spatial Durbin, and spatial error models, using maximum likelihood methods. It requires balanced panel data with no missing observations. This requirement is stringent, but it arises from the fact that, in principle, the value of the dependent variable for any panel unit may depend upon the values of the dependent and independent variables for all the other panel units. Thus even a single missing data point may require that all data for a time period, panel unit, or variable be discarded. Missing data are an endemic problem in many types of applied work, often because of the creation or disappearance of panel units. At the macro level, the number and composition of countries in Europe or of local government units in the United Kingdom have changed substantially over the last three decades. In longitudinal household surveys, new households are created and old ones disappear all the time. Restricting the analysis to a subset of panel units that have remained stable over time is a form of sample selection whose statistical consequences are uncertain and merit further investigation. The simplest mechanisms by which missing data may arise underpin the missing-at-random (MAR) assumption. When this assumption is appropriate, two approaches to estimation with missing data are available. The first is either simple or, preferably, multiple imputation, which involves the replacement of missing data by stochastic imputed values.
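
The balanced-panel requirement described above can be illustrated numerically. The following is a minimal Python sketch, not part of xsmle or of the presentation: the five-unit layout, the weights matrix W, the data values, and rho = 0.4 are all assumptions chosen for illustration. It shows that one missing value of the dependent variable makes the spatial lag unavailable for every neighbour of the missing unit, and that the reduced-form multiplier (I - rho W)^{-1} links every unit to every other unit.

```python
import numpy as np

# Hypothetical layout: five units on a line, each linked to its adjacent units.
n = 5
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
W = W / W.sum(axis=1, keepdims=True)  # row-normalise so each row sums to one

# One period of the dependent variable; unit 2 is missing (made-up values).
y = np.array([1.0, 2.0, np.nan, 4.0, 5.0])


def spatial_lag(W, y):
    """Weighted average of neighbours' y, using only the nonzero weights."""
    lag = np.empty(len(y))
    for i in range(len(y)):
        nb = W[i] > 0            # neighbours of unit i
        lag[i] = W[i, nb] @ y[nb]  # NaN if any neighbour is missing
    return lag


lag = spatial_lag(W, y)
unusable = np.isnan(y) | np.isnan(lag)  # rows lacking y or its spatial lag
print(unusable)

# Reduced-form multiplier (I - rho W)^{-1}; rho = 0.4 is an arbitrary choice.
rho = 0.4
multiplier = np.linalg.inv(np.eye(n) - rho * W)
print((multiplier > 0).all())  # every unit responds to every other unit
```

In this toy example a single missing value removes three of the five rows from a spatial-lag regression for that period: the missing unit itself and its two neighbours.
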
The Stata procedure mi can be combined with xsmle to implement a variety of estimates that rely upon multiple imputation. While the combination of procedures is relatively simple to apply, practical experience suggests that the results can be quite sensitive to the specification adopted for the imputation phase of the analysis. Hence, this is not a one-size-fits-all method of dealing with unbalanced panels, because the analyst must give serious consideration to the way in which imputed values are generated. The second approach has been developed by Pfaffermayr. It relies upon the spatial interactions in the model, which mean that the influence of the missing observations can be inferred from the values taken by nonmissing observations. In effect, the missing observations are treated as latent variables whose distribution can be derived from the values of the nonmissing data. This leads to a likelihood function that can be partitioned between missing and nonmissing data and thus used to estimate the coefficients of the full model. The merit of the approach is that it takes explicit account of the spatial structure of the model. However, the procedure becomes computationally demanding when the proportion of missing observations is large, and in that case, as one would expect, the information provided by the spatial interactions is not sufficient to generate well-defined estimates of the structural coefficients. The missing-at-random assumption is crucial for both of these approaches, but it is not reasonable to rely upon it when dealing with the birth or death of distinct panel units. A third approach, which is based on methods used in the literature on statistical signal processing, relies upon reducing the spatial interactions to immediate neighbors. Intuitively, the basic unit for the analysis becomes a block consisting of a central unit (the dependent variable) and its neighbors (the spatial interactions).
Because spatial interactions are restricted to within-block effects, the population of blocks can vary over time, and standard nonspatial panel methods can be applied. The presentation will describe and compare the three approaches to estimating spatial panel models, as implemented in Stata as extensions to xsmle. The approaches will be illustrated by analyses of (i) state data on electricity consumption in the U.S. and (ii) gridded historical data on temperature and precipitation, used to identify the effects of El Niño (ENSO) and other major weather oscillations.
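
The block idea behind the third approach can be sketched in a few lines of Python. This is an illustrative simplification under stated assumptions, not the author's Stata implementation: the neighbour lists, the data values, and the helper names build_blocks and within_ols are hypothetical, and the neighbour effect is represented by the average of the neighbours' explanatory variable rather than by a full spatial model. Each observed unit-period forms a block with whichever neighbours are present in that period, so units may enter or leave the panel, and an ordinary fixed-effects (within) regression is then run on the blocks.

```python
import numpy as np

# Hypothetical neighbour lists for four units on a line.
neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}

# Unbalanced panel: (unit, period) -> (y, x). Made-up values.
# Unit D enters in period 2; unit A leaves after period 2.
panel = {
    ("A", 1): (1.0, 0.5), ("B", 1): (2.0, 1.0), ("C", 1): (1.5, 0.8),
    ("A", 2): (1.2, 0.6), ("B", 2): (2.3, 1.1), ("C", 2): (1.9, 0.9),
    ("D", 2): (3.0, 1.6),
    ("B", 3): (2.1, 1.0), ("C", 3): (2.2, 1.2), ("D", 3): (3.4, 1.9),
}


def build_blocks(panel, neighbours):
    """One row per observed (unit, period): y, own x, mean x of observed neighbours."""
    rows = []
    for (unit, period), (y, x) in sorted(panel.items()):
        present = [panel[(nb, period)][1]
                   for nb in neighbours[unit] if (nb, period) in panel]
        if not present:
            continue  # no observed neighbour, so the block has no spatial term
        rows.append((unit, period, y, x, sum(present) / len(present)))
    return rows


def within_ols(rows):
    """Fixed-effects (within) estimator: demean by unit, then least squares."""
    data = np.array([[r[2], r[3], r[4]] for r in rows])
    labels = np.array([r[0] for r in rows])
    for u in sorted(set(labels)):
        mask = labels == u
        data[mask] -= data[mask].mean(axis=0)  # remove unit-specific means
    coef, *_ = np.linalg.lstsq(data[:, 1:], data[:, 0], rcond=None)
    return coef  # [own-x coefficient, neighbour-x coefficient]


blocks = build_blocks(panel, neighbours)
coef = within_ols(blocks)
print(len(blocks), coef)
```

Because each block is assembled period by period from the units actually observed, no observation has to be discarded merely because some other unit is absent in that period.
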

Suggested Citation

  • Gordon Hughes, 2013. "Estimating spatial panel models using unbalanced panels," United Kingdom Stata Users' Group Meetings 2013 09, Stata Users Group.
  • Handle: RePEc:boc:usug13:09
