Researchers who want to estimate tens or hundreds of thousands of fixed effects for two or more groups (workers and firms; pupils, teachers and schools; etc.) in datasets with large numbers of observations are often limited by the amount of computer memory available. Such a model is commonly estimated by sweeping out one of the effects with the fixed-effects transformation (time-demeaning) and including the remaining effects as dummy variables. If K is the number of fixed effects included as dummy variables and N is the number of observations, the design matrix is of dimension N x K (neglecting any remaining right-hand-side regressors). The time-demeaned dummies have to be stored as floating-point variables, which at double precision consume 8 bytes per cell in Stata. For example, with 2 million observations (N) and 10 thousand fixed effects (K), the memory requirement would be 160 gigabytes (2,000,000 x 10,000 x 8 bytes). This paper describes how the memory requirement can be reduced to that of storing only a K x K matrix, which in the given example is below 1 gigabyte. The paper also describes the Stata program felsdvreg.ado, which implements the method in Mata. Besides implementing the memory-saving estimation method, the program checks the identification of the effects and provides useful summary statistics.
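The memory figures in the abstract can be reproduced with a few lines of arithmetic. The Python sketch below does that and then uses a made-up toy panel to illustrate the general idea of building a K x K cross-product one person at a time instead of holding the full N x K matrix. It is not the decomposition implemented in felsdvreg.ado; the helper names (`matrix_gigabytes`, `demeaned_dummies`, `add_cross_product`) and the toy data are assumptions introduced only for illustration, and gigabytes are taken to be decimal (10^9 bytes).

```python
# Memory arithmetic from the abstract, plus a toy check that a K x K
# cross-product can be accumulated person by person.
# Illustrative only: this is NOT the felsdvreg algorithm.

BYTES_PER_CELL = 8  # 8 bytes per cell, as stated in the abstract


def matrix_gigabytes(rows, cols, bytes_per_cell=BYTES_PER_CELL):
    """Storage for a dense rows x cols matrix, in decimal gigabytes."""
    return rows * cols * bytes_per_cell / 1e9


N = 2_000_000  # observations (example from the abstract)
K = 10_000     # fixed effects kept as dummies (example from the abstract)

full_design_gb = matrix_gigabytes(N, K)    # N x K design matrix
cross_product_gb = matrix_gigabytes(K, K)  # K x K matrix


def demeaned_dummies(firm_ids, k):
    """Time-demeaned firm dummies for one person's observations (hypothetical helper)."""
    t = len(firm_ids)
    rows = [[1.0 if f == j else 0.0 for j in range(k)] for f in firm_ids]
    means = [sum(r[j] for r in rows) / t for j in range(k)]
    return [[r[j] - means[j] for j in range(k)] for r in rows]


def add_cross_product(acc, rows):
    """Add the cross-product rows' * rows to the k x k accumulator in place."""
    k = len(acc)
    for r in rows:
        for a in range(k):
            for b in range(k):
                acc[a][b] += r[a] * r[b]


# Made-up toy data: firm index of each observation, grouped by person.
toy_panel = [[0, 0, 1], [1, 2], [2, 2, 0, 1]]
toy_k = 3

# Route 1: accumulate person by person, holding only a k x k matrix.
accumulated = [[0.0] * toy_k for _ in range(toy_k)]
for person in toy_panel:
    add_cross_product(accumulated, demeaned_dummies(person, toy_k))

# Route 2: stack the full N x k matrix first, then form the cross-product.
stacked = [row for person in toy_panel for row in demeaned_dummies(person, toy_k)]
direct = [
    [sum(r[a] * r[b] for r in stacked) for b in range(toy_k)]
    for a in range(toy_k)
]

print(f"N x K design matrix: {full_design_gb:.1f} GB")
print(f"K x K matrix:        {cross_product_gb:.1f} GB")
```

Run as a script, this reports 160.0 GB for the N x K design matrix and 0.8 GB for the K x K matrix, matching the figures quoted above. On the toy panel both routes give the same cross-product, which is why the stacked matrix never needs to be held in memory for that step.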