IDEAS home Printed from https://ideas.repec.org/a/spr/stmapp/v24y2015i1p97-119.html

Tests for statistical significance of a treatment effect in the presence of hidden sub-populations

Author

Listed:
  • B. Karmakar
  • K. Dhara
  • K. Dey
  • A. Basu
  • A. Ghosh

Abstract

To test the statistical significance of a treatment effect, we often compare two parts of a population: one exposed to the treatment and the other not. Standard parametric or nonparametric two-sample tests are commonly used for this comparison. But direct application of these tests can yield misleading results, especially when the population contains hidden sub-populations and the effect of the sub-population difference on the response dominates the treatment effect. The problem becomes more evident when these sub-populations are represented in widely different proportions in the samples obtained from the two parts. In this article, we propose some simple methods to overcome these limitations. The proposed methods first use a suitable clustering algorithm to find the hidden sub-populations, and then eliminate the sub-population effect through a suitable transformation of the data. Standard two-sample tests, when applied to the transformed data, usually yield better results. We analyze some simulated and real data sets to demonstrate the utility of the proposed methods. Copyright Springer-Verlag Berlin Heidelberg 2015
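
The abstract names neither the clustering algorithm nor the transformation, so the following is only a rough sketch of the general idea, not the authors' procedure. It assumes a one-dimensional response, two hidden sub-populations, a plain 2-means clustering of the pooled responses, and within-cluster mean centering as the transformation. All function names (two_means_1d, center_within_clusters, welch_test) and the simulated data are hypothetical, and the p-value uses a large-sample normal approximation.

```python
import math
import random
import statistics


def two_means_1d(values, iters=100):
    """Plain 1-D 2-means (an assumed choice); returns a 0/1 label per value."""
    centers = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
                  for v in values]
        new_centers = []
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            new_centers.append(statistics.fmean(members) if members else centers[k])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def center_within_clusters(values, labels):
    """Subtract each cluster's pooled mean (an assumed transformation)."""
    means = {}
    for k in set(labels):
        means[k] = statistics.fmean([v for v, lab in zip(values, labels) if lab == k])
    return [v - means[lab] for v, lab in zip(values, labels)]


def welch_test(x, y):
    """Welch t statistic with a two-sided normal-approximation p-value."""
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    t = (statistics.fmean(x) - statistics.fmean(y)) / se
    p = 2 * (1 - statistics.NormalDist().cdf(abs(t)))
    return t, p


# Simulated data: sub-population means 0 and 10, true treatment effect +1.
# The treated sample is mostly sub-population A, the control mostly B.
rng = random.Random(0)


def draw(n_a, n_b, shift):
    return ([rng.gauss(0 + shift, 1) for _ in range(n_a)]
            + [rng.gauss(10 + shift, 1) for _ in range(n_b)])


treated = draw(160, 40, 1.0)
control = draw(40, 160, 0.0)

pooled = treated + control
labels = two_means_1d(pooled)
adjusted = center_within_clusters(pooled, labels)
adj_treated = adjusted[:len(treated)]
adj_control = adjusted[len(treated):]

raw_t, raw_p = welch_test(treated, control)
adj_t, adj_p = welch_test(adj_treated, adj_control)
print(f"raw:      t = {raw_t:.2f}, p = {raw_p:.3g}")
print(f"adjusted: t = {adj_t:.2f}, p = {adj_p:.3g}")
```

In this setup the raw comparison points in the wrong direction because the sub-population gap swamps the treatment effect, while the comparison on the centered data recovers the correct sign. Centering on pooled cluster means shrinks the estimated effect when the groups are unbalanced within clusters, so this sketch illustrates the idea rather than giving an unbiased estimator.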

Suggested Citation

  • B. Karmakar & K. Dhara & K. Dey & A. Basu & A. Ghosh, 2015. "Tests for statistical significance of a treatment effect in the presence of hidden sub-populations," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 24(1), pages 97-119, March.
  • Handle: RePEc:spr:stmapp:v:24:y:2015:i:1:p:97-119
    DOI: 10.1007/s10260-014-0271-x

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1007/s10260-014-0271-x
    Download Restriction: Access to full text is restricted to subscribers.

    File URL: https://libkey.io/10.1007/s10260-014-0271-x?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Conor Dolan & Han Maas, 1998. "Fitting multivariate normal finite mixtures subject to structural equation modeling," Psychometrika, Springer;The Psychometric Society, vol. 63(3), pages 227-253, September.
    2. Mukhopadhyay, Subhadeep & Ghosh, Anil K., 2011. "Bayesian multiscale smoothing in supervised and semi-supervised kernel discriminant analysis," Computational Statistics & Data Analysis, Elsevier, vol. 55(7), pages 2344-2353, July.
    3. Robert Tibshirani & Guenther Walther & Trevor Hastie, 2001. "Estimating the number of clusters in a data set via the gap statistic," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 63(2), pages 411-423.
    4. Jörnsten, Rebecka, 2004. "Clustering and classification based on the L1 data depth," Journal of Multivariate Analysis, Elsevier, vol. 90(1), pages 67-89, July.
    5. Tenenhaus, Michel & Vinzi, Vincenzo Esposito & Chatelin, Yves-Marie & Lauro, Carlo, 2005. "PLS path modeling," Computational Statistics & Data Analysis, Elsevier, vol. 48(1), pages 159-205, January.
    6. Hoeting, Jennifer & Raftery, Adrian E. & Madigan, David, 1996. "A method for simultaneous variable selection and outlier identification in linear regression," Computational Statistics & Data Analysis, Elsevier, vol. 22(3), pages 251-270, July.
    7. Claeskens,Gerda & Hjort,Nils Lid, 2008. "Model Selection and Model Averaging," Cambridge Books, Cambridge University Press, number 9780521852258, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Julian Rossbroich & Jeffrey Durieux & Tom F. Wilderjans, 2022. "Model Selection Strategies for Determining the Optimal Number of Overlapping Clusters in Additive Overlapping Partitional Clustering," Journal of Classification, Springer;The Classification Society, vol. 39(2), pages 264-301, July.
    2. Andrea Cappozzo & Luis Angel García Escudero & Francesca Greselin & Agustín Mayo-Iscar, 2021. "Parameter Choice, Stability and Validity for Robust Cluster Weighted Modeling," Stats, MDPI, vol. 4(3), pages 1-14, July.
    3. Annie Tubadji & Peter Nijkamp, 2015. "Cultural impact on regional development: application of a PLS-PM model to Greece," The Annals of Regional Science, Springer;Western Regional Science Association, vol. 54(3), pages 687-720, May.
    4. Kitagawa, Toru & Muris, Chris, 2016. "Model averaging in semiparametric estimation of treatment effects," Journal of Econometrics, Elsevier, vol. 193(1), pages 271-289.
    5. Thiemo Fetzer & Samuel Marden, 2017. "Take What You Can: Property Rights, Contestability and Conflict," Economic Journal, Royal Economic Society, vol. 0(601), pages 757-783, May.
    6. Debora Bettiga & Lucio Lamberti & Emanuele Lettieri, 2020. "Individuals’ adoption of smart technologies for preventive health care: a structural equation modeling approach," Health Care Management Science, Springer, vol. 23(2), pages 203-214, June.
    7. Vittadini, Giorgio & Minotti, Simona C. & Fattore, Marco & Lovaglio, Pietro G., 2007. "On the relationships among latent variables and residuals in PLS path modeling: The formative-reflective scheme," Computational Statistics & Data Analysis, Elsevier, vol. 51(12), pages 5828-5846, August.
    8. Adam Malešević & Dušan Barać & Dragan Soleša & Ema Aleksić & Marijana Despotović-Zrakić, 2021. "Adopting xRM in Higher Education: E-Services Outside the Classroom," Sustainability, MDPI, vol. 13(14), pages 1-20, July.
    9. Daniel Agness & Travis Baseler & Sylvain Chassang & Pascaline Dupas & Erik Snowberg, 2022. "Valuing the Time of the Self-Employed," Working Papers 2022-2, Princeton University, Economics Department.
    10. Batool, Fatima & Hennig, Christian, 2021. "Clustering with the Average Silhouette Width," Computational Statistics & Data Analysis, Elsevier, vol. 158(C).
    11. Oubrich, Mourad & Hakmaoui, Abdelati & Benhayoun, Lamiae & Solberg Söilen, Klaus & Abdulkader, Bisan, 2021. "Impacts of leadership style, organizational design and HRM practices on knowledge hiding: The indirect roles of organizational justice and competitive work environment," Journal of Business Research, Elsevier, vol. 137(C), pages 488-499.
    12. Claudio Vitari & Elisabetta Raguseo, 2016. "Big data value and financial performance: an empirical investigation [Digital data, dynamic capability and financial performance: an empirical investigation in the era of Big Data]," Post-Print halshs-01923271, HAL.
    13. Martins, José & Costa, Catarina & Oliveira, Tiago & Gonçalves, Ramiro & Branco, Frederico, 2019. "How smartphone advertising influences consumers' purchase intention," Journal of Business Research, Elsevier, vol. 94(C), pages 378-387.
    14. Nicoleta Serban & Huijing Jiang, 2012. "Multilevel Functional Clustering Analysis," Biometrics, The International Biometric Society, vol. 68(3), pages 805-814, September.
    15. Philippe Goulet Coulombe & Maxime Leroux & Dalibor Stevanovic & Stéphane Surprenant, 2022. "How is machine learning useful for macroeconomic forecasting?," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 37(5), pages 920-964, August.
    16. Amir Louizi & Radhouane Kammoun, 2016. "Evaluation of corporate governance systems by credit rating agencies," Journal of Management & Governance, Springer;Accademia Italiana di Economia Aziendale (AIDEA), vol. 20(2), pages 363-385, June.
    17. Sengazhani Murugesan Vadivel & Aloysius Henry Sequeira & Robert Rajkumar Sakkariyas & Kirubaharan Boobalan, 2022. "Impact of lean service, workplace environment, and social practices on the operational performance of India post service industry," Annals of Operations Research, Springer, vol. 315(2), pages 2219-2244, August.
    18. Gupta, Prashant & Seetharaman, A. & Raj, John Rudolph, 2013. "The usage and adoption of cloud computing by small and medium businesses," International Journal of Information Management, Elsevier, vol. 33(5), pages 861-874.
    19. Fernandez, Carmen & Ley, Eduardo & Steel, Mark F. J., 2001. "Benchmark priors for Bayesian model averaging," Journal of Econometrics, Elsevier, vol. 100(2), pages 381-427, February.
    20. Asif Khan & Chih-Cheng Chen & Kwanrat Suanpong & Athapol Ruangkanjanases & Santhaya Kittikowit & Shih-Chih Chen, 2021. "The Impact of CSR on Sustainable Innovation Ambidexterity: The Mediating Role of Sustainable Supply Chain Management and Second-Order Social Capital," Sustainability, MDPI, vol. 13(21), pages 1-25, November.
