Distinct observations are those different with respect to one or more variables, considered either individually or jointly. Distinctness is thus a key aspect of the similarity or difference of observations. It is sometimes confounded with uniqueness. Counting the number of distinct observations may be required at any point from initial data cleaning or checking to subsequent statistical analysis. We review how far existing commands in official Stata offer solutions to this issue, and we show how to answer questions about distinct observations from first principles by using the by prefix and the egen command. The new distinct command is offered as a convenience tool.
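As a minimal sketch of the first-principles approach mentioned above, the following Stata lines count distinct values with the by prefix and with egen. The dataset (auto.dta, shipped with Stata) and the variables rep78 and foreign are chosen here purely for illustration and do not come from the article; the new variable names are likewise hypothetical.

```stata
* Sketch only: auto.dta, rep78, and foreign are illustrative stand-ins
sysuse auto, clear

* by prefix: flag the first observation within each group of rep78
* (missing values of rep78 form a group of their own here)
bysort rep78: generate byte first = (_n == 1)
count if first

* egen: tag() marks one observation in each distinct group
* (observations with missing rep78 are not tagged unless the
*  missing option is specified)
egen byte tagged = tag(rep78)
count if tagged

* jointly distinct: one flag per combination of two variables
bysort foreign rep78: generate byte firstpair = (_n == 1)
count if firstpair

* distinct is not the same as unique: a value is unique only if it
* occurs exactly once, that is, if its group has size 1
bysort rep78: generate long groupsize = _N
count if first & groupsize == 1
```

The two counts for rep78 can differ because the by-based flag treats missing as a value, whereas tag() skips missing values by default; which behavior is wanted depends on the question being asked.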
Publisher Info
Article provided by StataCorp LP in its journal Stata Journal.
Volume (Year): 8 (2008) Issue (Month): 4 (December) Pages: 557-568