
Determining Dimensionality with Dichotomous Variables: A Monte Carlo Simulation Study and Applications to Missing Data in Longitudinal Research

Author

Listed:
  • Ting Dai

    (Department of Educational Psychology, University of Illinois Chicago, Chicago, IL 60607, USA)

  • Adam Davey

    (Department of Behavioral Health and Nutrition, University of Delaware, Newark, DE 19716, USA)

Abstract

Dichotomous data correspond with various types of commonly encountered data, e.g., positive/negative, case/control, missing/observed, in many fields, including medicine, health, and social sciences. Despite their ubiquity, criteria for determining dimensionality from dichotomous variables are not yet established. We conducted a large-scale simulation (Study 1) to evaluate four criteria—Kaiser, empirical Kaiser, parallel analysis, and profile likelihood—to determine the dimensionality of dichotomous data across combinations of correlation matrices (Pearson r or tetrachoric ρ) and analysis methods (principal component analysis or exploratory factor analysis), and combinations of study characteristics: sample sizes (100, 250, and 1000), variable splits (10%/90%, 25%/75%, and 50%/50%), dimensions (1, 3, 5, and 10), and items per dimension (3, 5, and 10) with 1000 replications per condition. Parallel analysis performed best, recovering dimensionality in 87.9% of replications when using principal component analysis with Pearson correlations. Guidance for selecting criteria is provided. In Study 2, we applied this dimensionality reduction approach to two different longitudinal data sets where missing data posed difficulty for multivariate data analysis. The applications of this approach to longitudinal data suggest that the exploration of resulting missing data meta-patterns is useful in practice.
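The best-performing combination reported above (parallel analysis on principal components of Pearson correlations) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the loading of 0.8, the 95th-percentile threshold, and the number of random data sets are assumptions chosen for illustration. Only the sample size (1000), the 50%/50% split, and the 3 dimensions with 5 items each are taken from the conditions listed in the abstract.

```python
import numpy as np


def sorted_eigenvalues(data):
    """Eigenvalues of the Pearson correlation matrix, largest first."""
    corr = np.corrcoef(data, rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]


def parallel_analysis(data, n_sims=100, quantile=0.95, seed=0):
    """Horn-style parallel analysis for 0/1 data (illustrative sketch).

    Random comparison data are independent Bernoulli draws that keep each
    item's observed proportion of ones. Assumes no item is constant.
    """
    rng = np.random.default_rng(seed)
    n, p = data.shape
    observed = sorted_eigenvalues(data)
    proportions = data.mean(axis=0)

    simulated = np.empty((n_sims, p))
    for s in range(n_sims):
        random_items = (rng.random((n, p)) < proportions).astype(float)
        simulated[s] = sorted_eigenvalues(random_items)
    threshold = np.quantile(simulated, quantile, axis=0)

    # Retain leading components until one fails to exceed its threshold.
    retained = 0
    for obs, thr in zip(observed, threshold):
        if obs <= thr:
            break
        retained += 1
    return retained, observed, threshold


def simulate_dichotomous(n=1000, dims=3, items_per_dim=5, loading=0.8, seed=1):
    """Hypothetical generator: continuous factor model cut at 0 (50%/50% split)."""
    rng = np.random.default_rng(seed)
    factors = rng.standard_normal((n, dims))
    columns = []
    for d in range(dims):
        noise = rng.standard_normal((n, items_per_dim))
        latent = loading * factors[:, [d]] + np.sqrt(1 - loading**2) * noise
        columns.append((latent > 0).astype(float))
    return np.hstack(columns)


X = simulate_dichotomous()
k, observed, threshold = parallel_analysis(X)
kaiser = int((observed > 1.0).sum())  # Kaiser criterion: eigenvalues above 1
print("parallel analysis:", k, "| Kaiser:", kaiser)
```

Under these favorable conditions (large sample, balanced split, strong loadings) both criteria should agree with the generating structure. The abstract's conditions with small samples or 10%/90% splits are where the criteria are expected to diverge, and this sketch does not attempt to reproduce those results.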

Suggested Citation

  • Ting Dai & Adam Davey, 2023. "Determining Dimensionality with Dichotomous Variables: A Monte Carlo Simulation Study and Applications to Missing Data in Longitudinal Research," Mathematics, MDPI, vol. 11(6), pages 1-25, March.
  • Handle: RePEc:gam:jmathe:v:11:y:2023:i:6:p:1411-:d:1097477

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/11/6/1411/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/11/6/1411/
    Download Restriction: no

    References listed on IDEAS

    1. D. Divgi, 1979. "Calculation of the tetrachoric correlation coefficient," Psychometrika, Springer;The Psychometric Society, vol. 44(2), pages 169-172, June.
    2. John Horn, 1965. "A rationale and test for the number of factors in factor analysis," Psychometrika, Springer;The Psychometric Society, vol. 30(2), pages 179-185, June.
    3. Shu Xu & Shelley A. Blozis, 2011. "Sensitivity Analysis of Mixed Models for Incomplete Longitudinal Data," Journal of Educational and Behavioral Statistics, vol. 36(2), pages 237-256, April.
    4. F. Thomas Juster & Richard Suzman, 1995. "An Overview of the Health and Retirement Study," Journal of Human Resources, University of Wisconsin Press, vol. 30, pages 7-56.
    5. Bengt Muthén & Charles Hofacker, 1988. "Testing the assumptions underlying tetrachoric correlations," Psychometrika, Springer;The Psychometric Society, vol. 53(4), pages 563-577, December.
    6. Louis Guttman, 1954. "Some necessary conditions for common-factor analysis," Psychometrika, Springer;The Psychometric Society, vol. 19(2), pages 149-161, June.
    7. Brian L. Egleston & Daniel O. Scharfstein & Ellen MacKenzie, 2009. "On Estimation of the Survivor Average Causal Effect in Observational Studies When Important Confounders Are Missing Due to Death," Biometrics, The International Biometric Society, vol. 65(2), pages 497-504, June.
    8. Li, Baibing & Martin, Elaine B. & Morris, A. Julian, 2002. "On principal component analysis in L1," Computational Statistics & Data Analysis, Elsevier, vol. 40(3), pages 471-474, September.
    9. Zhu, Mu & Ghodsi, Ali, 2006. "Automatic dimensionality selection from the scree plot via the use of profile likelihood," Computational Statistics & Data Analysis, Elsevier, vol. 51(2), pages 918-930, November.
    10. Adam Davey & Michael J. Shanahan & Joseph L. Schafer, 2001. "Correcting for Selective Nonresponse in the National Longitudinal Survey of Youth Using Multiple Imputation," Journal of Human Resources, University of Wisconsin Press, vol. 36(3), pages 500-519.
    11. Adam Davey & Charles F. Halverson & Alan B. Zonderman & Paul T. Costa, 2004. "Change in Depressive Symptoms in the Baltimore Longitudinal Study of Aging," The Journals of Gerontology: Series B, The Gerontological Society of America, vol. 59(6), pages 270-277.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Peres-Neto, Pedro R. & Jackson, Donald A. & Somers, Keith M., 2005. "How many principal components? stopping rules for determining the number of non-trivial axes revisited," Computational Statistics & Data Analysis, Elsevier, vol. 49(4), pages 974-997, June.
    2. Yoo, Sun-Young & Vonk, M. Elizabeth, 2012. "The development and initial validation of the Immigrant Parental Stress Inventory (IPSI) in a sample of Korean immigrant parents," Children and Youth Services Review, Elsevier, vol. 34(5), pages 989-998.
    3. Oscar Claveria & Enric Monte & Salvador Torra, 2017. "A new approach for the quantification of qualitative measures of economic expectations," Quality & Quantity: International Journal of Methodology, Springer, vol. 51(6), pages 2685-2706, November.
    4. De Caigny, Arno & Coussement, Kristof & De Bock, Koen W. & Lessmann, Stefan, 2020. "Incorporating textual information in customer churn prediction models based on a convolutional neural network," International Journal of Forecasting, Elsevier, vol. 36(4), pages 1563-1578.
    5. Marconi, Gabriele, 2014. "European higher education policies and the problem of estimating a complex model with a small cross-section," MPRA Paper 87600, University Library of Munich, Germany.
    6. Hutchison, Paul D. & Daigle, Ronald J. & George, Benjamin, 2018. "Application of latent semantic analysis in AIS academic research," International Journal of Accounting Information Systems, Elsevier, vol. 31(C), pages 83-96.
    7. James J. Heckman & Rodrigo Pinto, 2015. "Econometric Mediation Analyses: Identifying the Sources of Treatment Effects from Experimentally Estimated Production Technologies with Unmeasured and Mismeasured Inputs," Econometric Reviews, Taylor & Francis Journals, vol. 34(1-2), pages 6-31, February.
    8. Edoardo Saccenti & Marieke E. Timmerman, 2017. "Considering Horn’s Parallel Analysis from a Random Matrix Theory Point of View," Psychometrika, Springer;The Psychometric Society, vol. 82(1), pages 186-209, March.
    9. James Heckman & Rodrigo Pinto & Peter Savelyev, 2013. "Understanding the Mechanisms through Which an Influential Early Childhood Program Boosted Adult Outcomes," American Economic Review, American Economic Association, vol. 103(6), pages 2052-2086, October.
    10. Godfred O Boateng & Shalean M Collins & Patrick Mbullo & Pauline Wekesa & Maricianah Onono & Torsten B Neilands & Sera L Young, 2018. "A novel household water insecurity scale: Procedures and psychometric analysis among postpartum women in western Kenya," PLOS ONE, Public Library of Science, vol. 13(6), pages 1-28, June.
    11. Shen, Cencheng & Sun, Ming & Tang, Minh & Priebe, Carey E., 2014. "Generalized canonical correlation analysis for classification," Journal of Multivariate Analysis, Elsevier, vol. 130(C), pages 310-322.
    12. Nan Wei & Changjun Li & Jiehao Duan & Jinyuan Liu & Fanhua Zeng, 2019. "Daily Natural Gas Load Forecasting Based on a Hybrid Deep Learning Model," Energies, MDPI, vol. 12(2), pages 1-15, January.
    13. Mohieddine Rahmouni, 2014. "Perception des obstacles aux activités d'innovation dans les entreprises tunisiennes," Revue d’économie du développement, De Boeck Université, vol. 22(3), pages 69-98.
    14. Stefan Schulenberg & Amanda Melton, 2010. "A Confirmatory Factor-Analytic Evaluation of the Purpose in Life Test: Preliminary Psychometric Support for a Replicable Two-Factor Model," Journal of Happiness Studies, Springer, vol. 11(1), pages 95-111, March.
    15. Andrea Zammitti & Isabella Valbusa & Sara Santilli & Maria Cristina Ginevra & Salvatore Soresi & Laura Nota, 2023. "Development and Validation of the Decent Work for Inclusive and Sustainable Future Construction Scale in Italy," Sustainability, MDPI, vol. 15(15), pages 1-19, July.
    16. Rajalaxmi Kamath & Abhi Dattasharma, 2017. "Women and Household Cash Management: Evidence from Financial Diaries in India," The European Journal of Development Research, Palgrave Macmillan;European Association of Development Research and Training Institutes (EADI), vol. 29(1), pages 73-92, January.
    17. Hudson F Golino & Sacha Epskamp, 2017. "Exploratory graph analysis: A new approach for estimating the number of dimensions in psychological research," PLOS ONE, Public Library of Science, vol. 12(6), pages 1-26, June.
    18. Agyeman, Stephen & Cheng, Lin, 2020. "Analysis of barriers to perceived service quality in Ghana: Students’ perspectives on bus mobility attributes," Transport Policy, Elsevier, vol. 99(C), pages 63-85.
    19. Iacobucci, Dawn & Ruvio, Ayalla & Román, Sergio & Moon, Sangkil & Herr, Paul M., 2022. "How many factors in factor analysis? New insights about parallel analysis with confidence intervals," Journal of Business Research, Elsevier, vol. 139(C), pages 1026-1043.
    20. Paola Zuccolotto, 2012. "Principal component analysis with interval imputed missing values," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 96(1), pages 1-23, January.
