
From datasets to metadatasets in Stata

Author

Listed:
  • Roger Newson

    (Department of Primary Care and Public Health, Imperial College London)

Abstract

Metadatasets are Stata datasets, in files or in frames, which may have one observation per file, per dataset, per variable, or per variable value. Metadatasets can be used to modify a Stata database, or to make a Stata database self-documenting, especially if converted to non-Stata formats, such as HTML or even Microsoft Excel. We present some user-written packages, updated for Stata version 16, for creating and using metadatasets. The xdir package creates a resultsset with one observation per file in a folder conforming to a user-specified pattern. The descgen package inputs an xdir resultsset, and generates a new variable indicating whether each file is a Stata dataset, and other new variables containing dataset attributes, such as the dataset label and characteristics, the sort key of variables, and the numbers of observations and variables. The vallabdef package inputs a dataset with one observation per label name per value per value label, and generates Stata value labels. The vallabsave package loads and saves value labels from and to label-only datasets, and transfers value labels between data frames. The descsave package creates a metadataset with one observation per variable in a dataset, and data on variable attributes (including characteristics). The invdesc package modifies the variable attributes of the dataset in the current frame, inputting a descsave resultsset in a second data frame to set the variable attributes, and inputting value labels from a dataset in a third data frame. The datasets containing the variable attributes and value labels may be produced as resultssets by Stata packages, or produced manually in a spreadsheet using LibreOffice Calc or Microsoft Excel, and input into Stata datasets using import delimited or import excel.
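
As a rough illustration of a metadataset with one observation per variable, the sketch below uses only official Stata commands (describe with the replace option, plus Stata 16 frames), not the descsave package presented in the talk. The frame name meta and the output file name auto_meta.csv are hypothetical names chosen for this example.

```stata
* Minimal sketch, official Stata 16+ commands only (not descsave).
* Load the shipped example dataset into the default frame.
sysuse auto, clear

* Copy the data to a second frame; "meta" is a hypothetical frame name.
frame copy default meta

* In frame meta, replace the data with one observation per variable.
* describe, replace creates the variables position, name, type,
* isnumeric, format, vallab, and varlab.
frame meta: describe, replace clear

* Inspect the first few rows of the metadataset.
frame meta: list name type format vallab varlab in 1/5, abbreviate(12)

* Export the metadataset to a non-Stata format as documentation.
* "auto_meta.csv" is a hypothetical output file name.
frame meta: export delimited using "auto_meta.csv", replace
```

The original data stay untouched in the default frame, while the metadataset lives in its own frame, which mirrors the files-or-frames distinction made in the abstract.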

Suggested Citation

  • Roger Newson, 2020. "From datasets to metadatasets in Stata," London Stata Conference 2020 01, Stata Users Group.
  • Handle: RePEc:boc:usug20:01

    Download full text from publisher

    File URL: http://repec.org/usug2020/Newson_u20.pdf
    File Function: presentation materials
    Download Restriction: no

    File URL: http://repec.org/usug2020/Newson_u20.zip
    File Function: sample materials
    Download Restriction: no

    File URL: http://repec.org/usug2020/Newson_example1.do
    File Function: sample do-file
    Download Restriction: no

    References listed on IDEAS

    1. Roger Newson, 2006. "Resultssets, resultsspreadsheets, and resultsplots in Stata," German Stata Users' Group Meetings 2006 01, Stata Users Group.
    2. Roger Newson, 2004. "From datasets to resultssets in Stata," United Kingdom Stata Users' Group Meetings 2004 16, Stata Users Group.

    Cited by:

    1. Roger Newson, 2022. "Resultssets in resultsframes in Stata 16-plus," London Stata Conference 2022 01, Stata Users Group.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Roger Newson, 2022. "Resultssets in resultsframes in Stata 16-plus," London Stata Conference 2022 01, Stata Users Group.
    2. Roger Newson, 2013. "Creating factor variables in resultssets and other datasets," United Kingdom Stata Users' Group Meetings 2013 01, Stata Users Group.
