
A General Algorithm for Univariate Stratification

Author

Listed:
  • Sophie Baillargeon
  • Louis‐Paul Rivest

Abstract

This paper presents a general algorithm for constructing strata in a population using X, a univariate stratification variable known for all the units in the population. Stratum h consists of all the units with an X value in the interval [b_{h−1}, b_h). The stratum boundaries {b_h} are obtained by minimizing the anticipated sample size for estimating the population total of a survey variable Y with a given level of precision. The stratification criterion allows the presence of a take‐none stratum and of a take‐all stratum. The sample is allocated to the strata using a general rule that features proportional allocation, Neyman allocation, and power allocation as special cases. The optimization can take into account a stratum‐specific anticipated non‐response and a model for the relationship between the stratification variable X and the survey variable Y. A loglinear model with stratum‐specific mortality for Y given X is presented in detail. Two numerical algorithms for determining the optimal stratum boundaries, attributable to Sethi and Kozak, are compared in a numerical study. Several examples illustrate the stratified designs that can be constructed with the proposed methodology. All the calculations presented in this paper were carried out with stratification, an R package that will be available on CRAN (Comprehensive R Archive Network).
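
The criterion described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' stratification R package and does not reproduce its interface. The function names, the exponent parameterization (q1, q2, q3) of the allocation rule, and the use of X itself as a stand-in for Y are all assumptions made here for illustration. The sketch evaluates the anticipated sample size for a fixed set of boundaries under the standard variance formula for stratified simple random sampling without replacement. It leaves out take‐none and take‐all strata, non‐response, the model for Y given X, and the boundary search itself.

```python
from bisect import bisect_right
from statistics import mean, variance


def assign_strata(x, boundaries):
    """Return a 0-based stratum index for each value of x.

    `boundaries` holds the interior boundaries b_1 < ... < b_{H-1};
    bisect_right makes each interval closed on the left and open on
    the right, matching [b_{h-1}, b_h).
    """
    return [bisect_right(boundaries, v) for v in x]


def allocation_weights(N, ybar, S, q1, q2, q3):
    """Allocation shares a_h proportional to N_h^(2 q1) * ybar_h^(2 q2) * S_h^(2 q3).

    Assumed parameterization (hypothetical, for illustration only):
    (0.5, 0, 0) gives proportional allocation, (0.5, 0, 0.5) gives Neyman.
    """
    raw = [n ** (2 * q1) * m ** (2 * q2) * s ** (2 * q3)
           for n, m, s in zip(N, ybar, S)]
    total = sum(raw)
    return [r / total for r in raw]


def anticipated_sample_size(x, boundaries, target_cv, q=(0.5, 0.0, 0.5)):
    """Sample size n such that the stratified estimator of the total of x
    reaches coefficient of variation `target_cv`, with n_h = n * a_h.

    Assumes positive x, at least two units per stratum, and a nonzero
    within-stratum standard deviation whenever q3 > 0.
    """
    labels = assign_strata(x, boundaries)
    H = len(boundaries) + 1
    groups = [[v for v, lab in zip(x, labels) if lab == h] for h in range(H)]
    N = [len(g) for g in groups]
    ybar = [mean(g) for g in groups]
    S = [variance(g) ** 0.5 for g in groups]
    a = allocation_weights(N, ybar, S, *q)
    total = sum(x)
    # Var = sum N_h^2 S_h^2 / (n a_h) - sum N_h S_h^2, set equal to (cv * total)^2
    numerator = sum(n ** 2 * s ** 2 / w for n, s, w in zip(N, S, a))
    denominator = (target_cv * total) ** 2 + sum(n * s ** 2 for n, s in zip(N, S))
    return numerator / denominator


if __name__ == "__main__":
    x = [1, 2, 3, 4, 10, 20, 30, 40]
    print(anticipated_sample_size(x, [5], 0.10))                   # Neyman
    print(anticipated_sample_size(x, [5], 0.10, (0.5, 0.0, 0.0)))  # proportional
```

On this toy population with a single boundary at 5 and a 10% target CV, the Neyman shares give an anticipated sample size of about 4.06 and the proportional shares about 6.78. A boundary search in the spirit of the abstract would repeat this evaluation over candidate boundaries and keep the set with the smallest value.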

Suggested Citation

  • Sophie Baillargeon & Louis‐Paul Rivest, 2009. "A General Algorithm for Univariate Stratification," International Statistical Review, International Statistical Institute, vol. 77(3), pages 331-344, December.
  • Handle: RePEc:bla:istatr:v:77:y:2009:i:3:p:331-344
    DOI: 10.1111/j.1751-5823.2009.00093.x

    Download full text from publisher

    File URL: https://doi.org/10.1111/j.1751-5823.2009.00093.x
    Download Restriction: no


    References listed on IDEAS

    1. Hidiroglou, M A & Srinath, K P, 1993. "Problems Associated with Designing Subannual Business Surveys," Journal of Business & Economic Statistics, American Statistical Association, vol. 11(4), pages 397-405, October.

    Citations



    Cited by:

    1. R. Benedetti & M. S. Andreano & F. Piersimoni, 2019. "Sample selection when a multivariate set of size measures is available," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 28(1), pages 1-25, March.
    2. de Gruijter, J.J. & Wheeler, I. & Malone, B.P., 2019. "Using model predictions of soil carbon in farm-scale auditing - A software tool," Agricultural Systems, Elsevier, vol. 169(C), pages 24-30.
    3. Vicente Núñez-Antón & Juan Manuel Pérez-Salamero González & Marta Regúlez-Castillo & Carlos Vidal-Meliá, 2020. "Improving the Representativeness of a Simple Random Sample: An Optimization Model and Its Application to the Continuous Sample of Working Lives," Mathematics, MDPI, vol. 8(8), pages 1-27, July.
    4. Polanec Sašo & Bavdaž Mojca & Smith Paul A., 2022. "Determination of the Threshold in Cutoff Sampling Using Response Burden with an Application to Intrastat," Journal of Official Statistics, Sciendo, vol. 38(4), pages 1205-1234, December.
    5. Roberto Benedetti & Federica Piersimoni & Paolo Postiglione, 2017. "Spatially Balanced Sampling: A Review and A Reappraisal," International Statistical Review, International Statistical Institute, vol. 85(3), pages 439-454, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. R. Benedetti & M. S. Andreano & F. Piersimoni, 2019. "Sample selection when a multivariate set of size measures is available," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 28(1), pages 1-25, March.
    2. Katharina Thill & Barbara Covarrubias Venegas & Sabine Groblschegg, 2014. "HR Roles and activities. Empirical results from the DACH-Region and implications for a future development of the HR profession," Proceedings of International Academic Conferences 0802015, International Institute of Social and Economic Sciences.
