Automating the production of large reports from Stata
We have been undertaking a systematic review of the literature on diet and cancer, which included all study types reporting on any dietary exposure. The data were presented as a mixture of categorical results, mean differences, and regression coefficients, which we analyzed in Stata to produce dose–response estimates and other statistics for all results. The resulting tables were large (more than 3000 results). To rapidly produce formatted tables, we wrote the xtable command, which arranges data for exporting with formatting tags. These tags are then recognized by an Excel macro, which creates headings, merges across cells, and performs other formatting actions as required. In this way the data are neatly organized and compact, because study-level information is merged across cells to reduce duplication. Users can arrange the data as they wish, have the command sort the data according to other variables, or combine both approaches. The data are exported in text format; there is one intermediate step as they are imported to Excel, and then a single key press formats the table. Complex tables can thus be produced with duplicate information merged across cells at more than one level and with multiple levels of headings. After the initial specification of the xtable command, it is simple to rerun the procedure, which makes updates and modifications to the analysis easy. After developing these techniques, we wrote a program to form simple sentences based on our data, e.g.: “The Iowa Women’s Health study, a prospective cohort, reported an unadjusted OR of 1.09 (95% CI 0.98, 1.21) per cup per day increase of coffee.” A program was then created that produced a series of short texts for each exposure in a log file, consisting of a title, subtitles, a small frequency table, and a sentence summarizing each result.
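The sentence-forming step can be illustrated with a minimal sketch. It is written in Python rather than Stata and is not the authors' program: the function name, the field names, and the template are hypothetical, and the interval is assumed to be a 95% confidence interval.

```python
def article_for(word):
    """Return 'an' before a vowel letter, otherwise 'a' (a rough heuristic)."""
    return "an" if word[0].lower() in "aeiou" else "a"


def format_result_sentence(study, design, adjustment, measure,
                           estimate, lower, upper, unit, exposure):
    """Build one summary sentence from a single result record.

    All parameter names are hypothetical; they stand in for whatever
    variables hold the study-level and result-level information.
    """
    return (
        f"The {study}, {article_for(design)} {design}, reported "
        f"{article_for(adjustment)} {adjustment} {measure} of "
        f"{estimate:.2f} (95% CI {lower:.2f}, {upper:.2f}) "
        f"per {unit} increase of {exposure}."
    )


# The example values are taken from the sentence quoted in the abstract.
sentence = format_result_sentence(
    study="Iowa Women's Health study",
    design="prospective cohort",
    adjustment="unadjusted",
    measure="OR",
    estimate=1.09,
    lower=0.98,
    upper=1.21,
    unit="cup per day",
    exposure="coffee",
)
print(sentence)
```

Looping such a function over every result for an exposure would give the per-result sentences described above; the titles, subtitles, and frequency tables would need their own templates.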
The log file was then opened in Word, and tags were used to format the document as before, creating titles and aligning the frequency tables. This proved a massive labor-saving device, as much of the report was repetitive; it had the added benefit of giving the report a structure and preventing typing errors and accidental omission of results. The code for this method is too specific to produce a general command, but the techniques will be discussed.
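The tagging idea behind both the tables and the report can be sketched as follows. This is a Python illustration under stated assumptions, not the xtable command or the Excel macro: the tag names `<h1>` and `<merge>`, the function name, and the column names are hypothetical, and the rows contain placeholder values only.

```python
def tag_rows(rows, group_key, heading_key):
    """Return tab-delimited lines carrying hypothetical formatting tags.

    rows must already be sorted by heading_key, then by group_key.
    A heading line is emitted whenever heading_key changes, and a
    repeated group value is replaced by a merge tag so that a later
    macro could merge that cell with the one above it.
    """
    lines = []
    last_heading = None
    last_group = None
    for row in rows:
        if row[heading_key] != last_heading:
            lines.append(f"<h1>{row[heading_key]}")
            last_heading = row[heading_key]
            last_group = None  # restart merging under each new heading
        if row[group_key] == last_group:
            group_cell = "<merge>"
        else:
            group_cell = row[group_key]
            last_group = row[group_key]
        lines.append("\t".join([group_cell, row["outcome"], row["estimate"]]))
    return lines


# Placeholder rows only; these are not results from the review.
rows = [
    {"exposure": "Coffee", "study": "Study A", "outcome": "Outcome 1", "estimate": "est_1"},
    {"exposure": "Coffee", "study": "Study A", "outcome": "Outcome 2", "estimate": "est_2"},
    {"exposure": "Coffee", "study": "Study B", "outcome": "Outcome 1", "estimate": "est_3"},
    {"exposure": "Tea", "study": "Study B", "outcome": "Outcome 1", "estimate": "est_4"},
]

tagged = tag_rows(rows, group_key="study", heading_key="exposure")
print("\n".join(tagged))
```

Keeping the tags in plain text means the export step stays simple, and all formatting decisions are deferred to the macro that reads the tags.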
When requesting a correction, please mention this item's handle: RePEc:boc:usug06:03. See general information about how to correct material in RePEc.