We study semiparametric efficiency bounds and efficient estimation of parameters defined through general nonlinear, possibly non-smooth and over-identified moment restrictions, where the sampling information consists of a primary sample and an auxiliary sample. The variables of interest in the moment conditions are not directly observable in the primary data set, but the primary data set contains proxy variables which are correlated with the variables of interest. The auxiliary data set contains information about the conditional distribution of the variables of interest given the proxy variables. Identification is achieved by the assumption that this conditional distribution is the same in both the primary and auxiliary data sets. We provide semiparametric efficiency bounds for both the "verify-out-of-sample" case, where the two samples are independent, and the "verify-in-sample" case, where the auxiliary sample is a subset of the primary sample; and the bounds are derived when the propensity score is unknown, or known, or belongs to a correctly specified parametric family. These efficiency variance bounds indicate that the propensity score is ancillary for the "verify-in-sample" case, but is not ancillary for the "verify-out-of-sample" case. We show that sieve conditional expectation projection based GMM estimators achieve the semiparametric efficiency bounds for all the above mentioned cases, and establish their asymptotic efficiency under mild regularity conditions. Although inverse probability weighting based GMM estimators are also shown to be semiparametrically efficient, they need stronger regularity conditions and clever combinations of nonparametric and parametric estimates of the propensity score to achieve the efficiency bounds for various cases. Our results contribute to the literature on non-classical measurement error models, missing data and treatment effects.
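As an illustration of the conditional expectation projection idea described above, the following minimal sketch treats the simplest exactly identified moment, m(Y; beta) = Y - beta, in the "verify-out-of-sample" setting. It is not the paper's estimator: the polynomial sieve, the simulated data-generating process, and the function name `cep_mean_estimate` are all assumptions made for this example. The only ingredient taken from the abstract is the identifying assumption that the conditional distribution of Y given the proxy X is the same in both samples.

```python
import numpy as np


def cep_mean_estimate(x_primary, x_aux, y_aux, degree=3):
    """Estimate E[Y] in the primary population when Y is seen only in the auxiliary sample.

    Hypothetical helper for illustration: fits a polynomial sieve for E[Y | X]
    on the auxiliary sample, then averages the fitted values over the primary sample.
    """
    # Polynomial sieve basis (1, x, ..., x^degree) on the auxiliary sample.
    basis_aux = np.vander(x_aux, degree + 1, increasing=True)
    coef, *_ = np.linalg.lstsq(basis_aux, y_aux, rcond=None)

    # Project onto the primary sample's proxies and average (solves the sample moment).
    basis_primary = np.vander(x_primary, degree + 1, increasing=True)
    return float(np.mean(basis_primary @ coef))


# Simulated data (assumed design): the proxy distribution differs across samples,
# but E[Y | X] = 2 + X + 0.5 X^2 holds in both.
rng = np.random.default_rng(0)
n_primary, n_aux = 5000, 5000
x_primary = rng.normal(1.0, 1.0, n_primary)  # primary sample: proxy only
x_aux = rng.normal(0.0, 1.0, n_aux)          # auxiliary sample: proxy and Y
y_aux = 2.0 + x_aux + 0.5 * x_aux**2 + rng.normal(0.0, 1.0, n_aux)

estimate = cep_mean_estimate(x_primary, x_aux, y_aux)
naive = float(np.mean(y_aux))  # ignores the shift in the proxy distribution

print(f"projection estimate: {estimate:.3f}")
print(f"naive auxiliary mean: {naive:.3f}")
```

Under this simulated design the primary-population mean of Y is 2 + 1 + 0.5(1 + 1) = 4, while the auxiliary-population mean is 2.5, so the naive auxiliary average targets the wrong quantity and the projection step corrects for the difference in proxy distributions. Over-identified or non-smooth moments, and the efficiency results themselves, are outside the scope of this sketch.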
Length: 50 pages
Date of creation: Mar 2008
Publication status: Published in Annals of Statistics (2008), 36: 808-843
Handle: RePEc:cwl:cwldpp:1644
JEL classification:
C1 - Mathematical and Quantitative Methods - Econometric and Statistical Methods: General
C3 - Mathematical and Quantitative Methods - Multiple or Simultaneous Equation Models; Multiple Variables