Automating the production of large reports from Stata
We have been undertaking a systematic review of the literature on diet and cancer, which included all study types reporting on any dietary exposure. The data were presented as a mixture of categorical results, mean differences, and regression coefficients, which we analyzed in Stata to produce dose–response estimates and other statistics for all results. The resulting tables were large (more than 3000 results). To produce formatted tables rapidly, we wrote the xtable command, which arranges data for export with formatting tags. These tags are then recognized by an Excel macro, which creates headings, merges across cells, and performs other formatting actions as required. In this way the data are neatly organized and compact, because study-level information is merged across cells to reduce duplication. Users can arrange the data as they wish, have the command sort the data according to other variables, or use a mix of both. The data are exported as text, imported to Excel in one intermediate step, and then formatted with a single key press. Complex tables can thus be produced with duplicate information merged across cells at more than one level and with multiple levels of headings. After the initial specification of the xtable command, it is simple to rerun the procedure, which makes updates and modifications to the analysis straightforward. After developing these techniques, we wrote a program to form simple sentences based on our data, e.g.: “The Iowa Women’s Health study, a prospective cohort, reported an unadjusted OR of 1.09 (95% CI 0.98, 1.21) per cup per day increase of coffee.” A program was then created that produced a series of short texts for each exposure in a log file, consisting of a title, subtitles, a small frequency table, and a sentence summarizing each result.
The log file was then opened in Word, where tags were used, as before, to format the document by creating titles and aligning the frequency tables. This proved a massive labor-saving device, as much of the report was rather repetitious. It had the added benefits of giving the report a structure and of preventing typing errors and accidental omission of results. The code for this method is too specific to produce a general command, but the techniques will be discussed.
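The sentence-forming step can be illustrated with a minimal sketch. It is written in Python rather than Stata purely for illustration and is not the authors' code. The field names, the `[TITLE]` tag, and the "95% CI" label on the interval are all assumptions made for this example; the abstract does not specify the tags or data layout actually used.

```python
def article(word):
    """Return 'an' before a vowel letter, otherwise 'a' (a simple spelling-based rule)."""
    return "an" if word[0].lower() in "aeiou" else "a"


def summary_sentence(r):
    """Build one summary sentence from a dict describing a single result.

    The keys used here (study, design, adjustment, ...) are hypothetical.
    """
    return (
        f"The {r['study']}, {article(r['design'])} {r['design']}, "
        f"reported {article(r['adjustment'])} {r['adjustment']} {r['measure']} "
        f"of {r['estimate']:.2f} "
        f"(95% CI {r['lower']:.2f}, {r['upper']:.2f}) "
        f"per {r['unit']} increase of {r['exposure']}."
    )


def report_lines(exposure, results):
    """Return a tagged title line followed by one sentence per result.

    '[TITLE]' is a made-up tag standing in for whatever marker a
    word-processor macro would later recognize and format.
    """
    lines = [f"[TITLE] {exposure.capitalize()}"]
    lines += [summary_sentence(r) for r in results]
    return lines


# Example record, using the values quoted in the abstract.
example = {
    "study": "Iowa Women's Health study",
    "design": "prospective cohort",
    "adjustment": "unadjusted",
    "measure": "OR",
    "estimate": 1.09,
    "lower": 0.98,
    "upper": 1.21,
    "unit": "cup per day",
    "exposure": "coffee",
}

print("\n".join(report_lines("coffee", [example])))
```

Because each sentence is generated from the stored results, rerunning the script after an updated analysis regenerates the text without retyping, which is the source of the error prevention described above.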
RePEc handle: RePEc:boc:usug06:03