
A Greedy Algorithm for Representative Sampling: repsample in Stata

Author

Listed:
  • Kontopantelis, Evangelos

Abstract

Quantitative empirical analyses of a population of interest usually aim to estimate the causal effect of one or more independent variables on a dependent variable. However, only in rare instances is the whole population available for analysis. Researchers tend to estimate causal effects on a selected sample and generalize their conclusions to the whole population. The validity of this approach rests on the assumption that the sample is representative of the population on certain key characteristics. A study using a non-representative sample is lacking in external validity by failing to minimize population choice bias. When the sample is large and non-response bias is not an issue, a random selection process is adequate to ensure external validity. If that is not the case, however, researchers could follow a more deterministic approach to ensure representativeness on the selected characteristics, provided these are known, or can be estimated, in the parent population. Although such approaches exist for matched sampling designs, research on representative sampling and the similarity between the sample and the parent population seems to be lacking. In this article we propose a greedy algorithm for obtaining a representative sample and quantifying representativeness in Stata.
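The abstract does not spell out the steps of the proposed algorithm, so the following Python sketch is only a generic illustration of what a greedy, deterministic selection of a representative sample could look like. It is not the repsample procedure. The function names, the choice of matching on variable means, and the range-based scaling are all assumptions made for this example.

```python
from statistics import mean


def greedy_representative_sample(population, size):
    """Return indices of `size` units whose variable means track the population means.

    Hypothetical helper for illustration only; not the repsample algorithm.
    `population` is a list of equal-length tuples of numeric characteristics.
    """
    if not 0 < size <= len(population):
        raise ValueError("size must be between 1 and the population size")
    n_vars = len(population[0])
    columns = [[unit[j] for unit in population] for j in range(n_vars)]
    target = [mean(col) for col in columns]
    # Assumption: scale each variable by its range so no single variable dominates.
    scale = [(max(col) - min(col)) or 1.0 for col in columns]

    chosen = []
    sums = [0.0] * n_vars
    remaining = list(range(len(population)))
    for step in range(1, size + 1):
        def gap(i):
            # Scaled distance between the population means and the sample
            # means that would result from adding unit i at this step.
            return sum(
                abs((sums[j] + population[i][j]) / step - target[j]) / scale[j]
                for j in range(n_vars)
            )
        best = min(remaining, key=gap)  # greedy step; ties go to the lowest index
        chosen.append(best)
        remaining.remove(best)
        for j in range(n_vars):
            sums[j] += population[best][j]
    return chosen


def mean_gap(population, indices):
    """Summed absolute difference between sample and population means (unscaled)."""
    n_vars = len(population[0])
    return sum(
        abs(mean(population[i][j] for i in indices) - mean(u[j] for u in population))
        for j in range(n_vars)
    )
```

For a population with a single characteristic taking the values 1 to 10, requesting two units selects the values 5 and 6, whose mean equals the population mean of 5.5. A greedy choice like this is locally optimal at each step and need not give the best possible sample overall.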

Suggested Citation

  • Kontopantelis, Evangelos, 2013. "A Greedy Algorithm for Representative Sampling: repsample in Stata," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 55(c01).
  • Handle: RePEc:jss:jstsof:v:055:c01
    DOI: http://hdl.handle.net/10.18637/jss.v055.c01

    Download full text from publisher

    File URL: https://www.jstatsoft.org/index.php/jss/article/view/v055c01/v55c01.pdf
    Download Restriction: no

    File URL: https://www.jstatsoft.org/index.php/jss/article/downloadSuppFile/v055c01/repsample_1.1.zip
    Download Restriction: no


    References listed on IDEAS

    1. Matthew Blackwell & Stefano Iacus & Gary King & Giuseppe Porro, 2009. "cem: Coarsened exact matching in Stata," Stata Journal, StataCorp LP, vol. 9(4), pages 524-546, December.
    2. Rosenbaum, Paul R. & Ross, Richard N. & Silber, Jeffrey H., 2007. "Minimum Distance Matched Sampling With Fine Balance in an Observational Study of Treatment for Ovarian Cancer," Journal of the American Statistical Association, American Statistical Association, vol. 102, pages 75-83, March.

    Citations



    Cited by:

    1. Vicente Núñez-Antón & Juan Manuel Pérez-Salamero González & Marta Regúlez-Castillo & Carlos Vidal-Meliá, 2020. "Improving the Representativeness of a Simple Random Sample: An Optimization Model and Its Application to the Continuous Sample of Working Lives," Mathematics, MDPI, vol. 8(8), pages 1-27, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lucas A. Mariani & Jose Renato Haas Ornelas & Bernardo Ricca, 2023. "Banks’ Physical Footprint and Financial Technology Adoption," Working Papers Series 576, Central Bank of Brazil, Research Department.
    2. Nicholas Longford & Ioana C. Salagean, 2013. "A study of the labour market trajectories in the Grand Duchy of Luxembourg," Economics Working Papers 1396, Department of Economics and Business, Universitat Pompeu Fabra.
    3. Sergio Afcha & Jose García-Quevedo, 2016. "The impact of R&D subsidies on R&D employment composition," Industrial and Corporate Change, Oxford University Press and the Associazione ICC, vol. 25(6), pages 955-975.
    4. Merino, José & Borja, Victor Hugo & Lopez, Oliva & Ochoa, José Alfredo & Clark, Eduardo & Petersen, Lila & Caballero, Saul, 2021. "Ivermectin and the odds of hospitalization due to COVID-19: evidence from a quasi-experimental analysis based on a public intervention in Mexico City," SocArXiv r93g4, Center for Open Science.
    5. Jing Wang & Gen Li & Kai-Lung Hui, 2022. "Monetary Incentives and Knowledge Spillover: Evidence from a Natural Experiment," Management Science, INFORMS, vol. 68(5), pages 3549-3572, May.
    6. Wheeler, P. Barrett, 2019. "Loan loss accounting and procyclical bank lending: The role of direct regulatory actions," Journal of Accounting and Economics, Elsevier, vol. 67(2), pages 463-495.
    7. Leduc, Elisabeth & Tojerow, Ilan, 2020. "Subsidizing Domestic Services as a Tool to Fight Unemployment: Effectiveness and Hidden Costs," IZA Discussion Papers 13544, Institute of Labor Economics (IZA).
    8. Patricio Aroca & Juan Gabriel Brida & Juan Sebastián Pereyra & Serena Volo, 2014. "Tourism statistics: correcting data inadequacy using coarsened exact matching," BEMPS - Bozen Economics & Management Paper Series BEMPS22, Faculty of Economics and Management at the Free University of Bozen.
    9. Guignet, Dennis & Jenkins, Robin R. & Belke, James & Mason, Henry, 2023. "The property value impacts of industrial chemical accidents," Journal of Environmental Economics and Management, Elsevier, vol. 120(C).
    10. Philipp vom Berge & Achim Schmillen, 2023. "Effects of mass layoffs on local employment—evidence from geo-referenced data," Journal of Economic Geography, Oxford University Press, vol. 23(3), pages 509-539.
    11. Matteo Aquilina & Giulio Cornelli & Marina Sanchez del Villar, 2024. "Regulation, information asymmetries and the funding of new ventures," BIS Working Papers 1162, Bank for International Settlements.
    12. repec:zbw:rwirep:0170 is not listed on IDEAS
    13. Sun, Xiaojie & Liu, Xiaoyun & Sun, Qiang & Yip, Winnie & Wagstaff, Adam & Meng, Qingyue, 2014. "The impact of a pay-for-performance scheme on prescription quality in rural China : an impact evaluation," Policy Research Working Paper Series 6892, The World Bank.
    14. Heejung Byun & Joseph Raffiee & Martin Ganco, 2019. "Discontinuities in the Value of Relational Capital: The Effects on Employee Entrepreneurship and Mobility," Organization Science, INFORMS, vol. 30(6), pages 1368-1393, November.
    15. Mossberger, Karen & LaCombe, Scott & Tolbert, Caroline J., 2022. "A new measure of digital economic activity and its impact on local opportunity," Telecommunications Policy, Elsevier, vol. 46(1).
    16. Lauren Lanahan & Daniel Armanios, 2018. "Does More Certification Always Benefit a Venture?," Organization Science, INFORMS, vol. 29(5), pages 931-947, October.
    17. Giuliano Masiero & Michael Santarossa, 2020. "Earthquakes, grants, and public expenditure: How municipalities respond to natural disasters," Journal of Regional Science, Wiley Blackwell, vol. 60(3), pages 481-516, June.
    18. Sara Pavone & Elena Ragazzi & Lisa Sella, 2015. "Sostenere le imprese agro-industriali in Piemonte: un'analisi controfattuale," SCIENZE REGIONALI, FrancoAngeli Editore, vol. 2015(3 Suppl.), pages 129-143.
    19. Heß, Moritz & Scheve, Christian von & Schupp, Jürgen & Wagner, Aiko & Wagner, Gert G., 2018. "Are Political Representatives More Risk-Loving Than the Electorate? Evidence from German Federal and State Parliaments," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 4, pages 1-7.
    20. Florian Gunsilius & Yuliang Xu, 2021. "Matching for causal effects via multimarginal unbalanced optimal transport," Papers 2112.04398, arXiv.org, revised Jul 2022.
    21. John Beshears & James J. Choi & David Laibson & Brigitte C. Madrian & William L. Skimmyhorn, 2022. "Borrowing to Save? The Impact of Automatic Enrollment on Debt," Journal of Finance, American Finance Association, vol. 77(1), pages 403-447, February.

