Empirical researchers commonly clean data using rules of thumb that appear sensible. Measurement error is often the justification for removing (trimming) or recoding (winsorizing) observations whose values lie outside a specified range. This paper considers identification in a linear model when the dependent variable is mismeasured, and examines the common practice of trimming and winsorizing to address the resulting identification failure. In contrast to the physical and laboratory sciences, measurement error in social science data is likely to be more complex than simple additive white noise. We consider a general measurement error process that nests many processes, including the additive white noise process and a contaminated sampling process. Analytic results are tractable only under strong distributional assumptions, but they demonstrate that winsorizing and trimming are solutions only for a particular class of measurement error processes. Indeed, trimming and winsorizing may induce or exacerbate bias. We term this source of bias 'iatrogenic' (or econometrician-induced) error. The identification results for the general error process highlight other approaches that are more robust to distributional assumptions. Monte Carlo simulations demonstrate the fragility of trimming and winsorizing as solutions to measurement error in the dependent variable.
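The claim that trimming and winsorizing can themselves induce bias can be illustrated with a small simulation. The sketch below is not the paper's Monte Carlo design: the data-generating process (standard normal regressor and error, true slope of 1, no measurement error at all), the 5th and 95th percentile cutoffs, and all function names are assumptions chosen for illustration. It cleans a correctly measured dependent variable and compares the resulting OLS slopes.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def cutoffs(y, lower, upper):
    """Empirical quantile cutoffs of y (simple order-statistic rule)."""
    s = sorted(y)
    n = len(s)
    return s[int(lower * (n - 1))], s[int(upper * (n - 1))]


def trim(x, y, lower=0.05, upper=0.95):
    """Drop observations whose y lies outside the cutoffs."""
    lo, hi = cutoffs(y, lower, upper)
    kept = [(a, b) for a, b in zip(x, y) if lo <= b <= hi]
    return [a for a, _ in kept], [b for _, b in kept]


def winsorize(y, lower=0.05, upper=0.95):
    """Recode y values outside the cutoffs to the nearest cutoff."""
    lo, hi = cutoffs(y, lower, upper)
    return [min(max(b, lo), hi) for b in y]


def simulate(n=20000, beta=1.0, seed=0):
    """Assumed DGP: y = beta * x + u, with x and u standard normal.

    y is measured without error, so any gap between the cleaned
    estimates and the raw estimate is induced by the cleaning rule.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [beta * a + rng.gauss(0.0, 1.0) for a in x]
    x_trim, y_trim = trim(x, y)
    return {
        "raw": ols_slope(x, y),
        "trimmed": ols_slope(x_trim, y_trim),
        "winsorized": ols_slope(x, winsorize(y)),
    }


if __name__ == "__main__":
    for name, slope in simulate().items():
        print(f"{name:>10}: {slope:.3f}")
```

Under these assumptions both cleaned slopes should fall below the raw estimate, because selecting or censoring on the dependent variable attenuates the coefficient. This is only the simplest case of econometrician-induced bias; the paper's general error process covers settings where the dependent variable really is mismeasured.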
Publisher Info
Paper provided by National Bureau of Economic Research, Inc in its series NBER Technical Working Papers with number 0289.
Date of creation: Mar 2003
Handle: RePEc:nbr:nberte:0289
Note: TWP
Contact details of provider:
Postal: National Bureau of Economic Research, 1050 Massachusetts Avenue Cambridge, MA 02138, U.S.A.
Phone: 617-868-3900
Web page: http://www.nber.org
Related research
JEL classification:
C2 - Mathematical and Quantitative Methods - Single Equation Models; Single Variables
C8 - Mathematical and Quantitative Methods - Data Collection and Data Estimation Methodology; Computer Programs