Kenneth I. MacDonald (Nuffield College, University of Oxford)
Abstract
The design decisions made by Stata in handling missing data in relational and logical expressions have, for the user, complex, pernicious, and poorly understood consequences. This presentation intends to substantiate that claim and to present two possible resolutions to the problem. As is well documented and reasonably well known, Stata considers p & q (and p | q) to be true when both p and q are indeterminate (missing). This interpretation is counterintuitive and at odds with the formal-logic definition of these operators. To assert two unknowns is not to assert truth. Nevertheless, introductions to Stata characteristically present this as merely a “feature” and suggest that the obligation imposed on users (us) to explicitly test for missing data is straightforwardly implementable. Simple cases are indeed simple but, it will be argued, do not readily scale up to complex, real-life instances. For example, the one-line Stata command that expresses the intention, "generate v = p|q", becomes "generate v = p|q if !mi(p,q)|(p&!mi(p))|(q&!mi(q))", and so forth. Such coding is a problem, not a feature, so solutions should be sought. One solution (really a work-around) introduces my command, validly, which allows expressions such as "validly generate v = p|q" and correctly, without fuss, interprets the logical or relational operators (here returning true if p is true but q indeterminate, and indeterminate if p is false but q indeterminate). More generally, validly serves as a “wrapper” for any standard conditional command. So, for example, "validly reg a b c if p|q" is handled correctly. But validly (its code deploys nested calls to cond()) is computationally expensive. The better resolution would be for Stata, in its next release, to redesign its core code so that logical and relational operators would (as algebraic operators currently do) handle missing data appropriately. (Objections to this strategy are examined and deemed to lack force.)
I would like to enlist the informed and active judgment of the participants of the 14th Users Group meeting to help bring this about.
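The behaviour described in the abstract can be reproduced with a small toy dataset. The following is a minimal sketch, not taken from the presentation: the data values and the variable names v_naive, v_guard, and v_cond are illustrative assumptions, and the cond() expression is only one plausible way to write a three-valued "or"; it is not the code that validly uses.

```stata
* Toy data (hypothetical): every combination of true, false, and missing that matters here.
clear
input p q
1 1
1 0
0 0
1 .
0 .
. .
end

* Naive form: Stata treats missing as nonzero, hence as true,
* so every row containing a missing value yields 1.
generate v_naive = p | q

* Guarded form quoted in the abstract: v_guard stays missing
* whenever the result is genuinely indeterminate.
generate v_guard = p | q if !mi(p,q) | (p & !mi(p)) | (q & !mi(q))

* Illustrative cond() version (an assumption, not validly's source):
* true if either operand is known to be true, missing if any operand
* is missing and neither is known to be true, false otherwise.
generate v_cond = cond((p & !mi(p)) | (q & !mi(q)), 1, cond(mi(p,q), ., 0))

list, clean noobs
```

In the listing, the rows (p = 0, q = .) and (p = ., q = .) show v_naive equal to 1, while v_guard and v_cond are missing; the row (p = 1, q = .) is 1 under all three definitions, matching the interpretation the abstract argues for.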