
Using Microsoft Excel to improve efficiency in working with large datasets in Stata

Author

  • Ahmad Khanijahani (Duquesne University)

Abstract

The availability of data and the number of variables in large datasets, such as medical claim files or national surveys, continue to grow. Stata supports various descriptive, exploratory, and analytical approaches for working with these data to identify and study topics such as public and clinical health outcomes and issues. Given the high volume of data generated daily, cross-platform approaches to managing and manipulating data can improve the efficiency of data-science professionals and academic researchers. The aim of this presentation is to use Microsoft Excel jointly with Stata to facilitate data governance and manipulation in large-scale datasets.

Method: This presentation will focus on three ways that Excel can be used as a supporting tool to facilitate and expedite data manipulation, analysis, interpretation, and reporting in Stata, with a focus on large datasets with many variables. First, Excel will be used as an interactive data dictionary to select and keep track of the variables included at various stages of analysis. Second, Excel commands and features will be used to generate batch commands that perform repeated variable transformations and conditional data manipulation or analysis in Stata. Finally, Stata output tables will be imported into Excel for further customization and reporting. Each of these three categories of tasks will be supported by at least one example from a dataset with many variables.

Conclusion: Using Microsoft Excel features and commands jointly with Stata can benefit data scientists and researchers by saving time and providing a comprehensive picture of a dataset, thereby improving efficiency and productivity.
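
The abstract gives no concrete example of generating batch commands from a data dictionary, so the following is only a minimal sketch of the idea and not the author's method. Python stands in here for the spreadsheet's string concatenation. The dictionary rows, the variable names (age, dx1, filler3), and the helper build_stata_commands are all hypothetical. The generated lines use two standard Stata commands, keep and label variable, and could be pasted into a do-file.

```python
# Hypothetical data dictionary: one row per variable, laid out the way
# it might appear in a spreadsheet (name, label, keep flag).
DATA_DICTIONARY = [
    {"name": "age", "label": "Age in years", "keep": True},
    {"name": "dx1", "label": "Primary diagnosis code", "keep": True},
    {"name": "filler3", "label": "Unused filler field", "keep": False},
]


def build_stata_commands(rows):
    """Return Stata command lines built from data-dictionary rows."""
    kept = [row for row in rows if row["keep"]]
    if not kept:
        return []
    # One `keep` command listing every variable flagged in the dictionary.
    commands = ["keep " + " ".join(row["name"] for row in kept)]
    for row in kept:
        # Stata syntax: label variable varname "label"
        commands.append(f'label variable {row["name"]} "{row["label"]}"')
    return commands


if __name__ == "__main__":
    print("\n".join(build_stata_commands(DATA_DICTIONARY)))
```

Keeping the keep flag next to each variable's label means the selection of variables and the commands that act on them come from the same table, which is the role the abstract assigns to the interactive data dictionary.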

Suggested Citation

  • Ahmad Khanijahani, 2020. "Using Microsoft Excel to improve efficiency in working with large datasets in Stata," 2020 Stata Conference 21, Stata Users Group.
  • Handle: RePEc:boc:scon20:21

    Download full text from publisher

    File URL: http://fmwww.bc.edu/repec/scon2020/us20_Khanijahani.pdf
    Download Restriction: no