
Modeling Unobserved Heterogeneity in Susceptibility to Ambient Benzo[a]pyrene Concentration among Children with Allergic Asthma Using an Unsupervised Learning Algorithm

Author

Listed:
  • Daniel Fernández

    (Research and Development Unit, Parc Sanitari Sant Joan de Déu, Fundació Sant Joan de Déu, CIBERSAM, Dr. Antoni Pujadas, 42, Sant Boi de Llobregat, 08830 Barcelona, Spain;
    School of Mathematics and Statistics, Victoria University of Wellington, Wellington 6140, New Zealand)

  • Radim J. Sram

    (Department of Genetic Ecotoxicology, Institute of Experimental Medicine, Academy of Sciences of the Czech Republic, v.v.i., Vídeňská 1083, 142 20 Prague 4, Czech Republic)

  • Miroslav Dostal

    (Department of Genetic Ecotoxicology, Institute of Experimental Medicine, Academy of Sciences of the Czech Republic, v.v.i., Vídeňská 1083, 142 20 Prague 4, Czech Republic)

  • Anna Pastorkova

    (Department of Genetic Ecotoxicology, Institute of Experimental Medicine, Academy of Sciences of the Czech Republic, v.v.i., Vídeňská 1083, 142 20 Prague 4, Czech Republic)

  • Hans Gmuender

    (Genedata AG, Margarethenstrasse 38, CH-4053 Basel, Switzerland)

  • Hyunok Choi

    (Departments of Environmental Health Sciences, Epidemiology, and Biostatistics, State University of New York at Albany School of Public Health, Rensselaer, NY 12144, USA)

Abstract

Current studies of gene × air pollution interaction typically seek to identify the unknown heritability of common complex illnesses arising from variability in the host’s susceptibility to environmental pollutants of interest. Accordingly, single-component generalized linear models are often used to model the risk posed by an environmental exposure variable of interest in relation to a priori determined DNA variants. However, reducing the phenotypic heterogeneity may further optimize such an approach, which is primarily represented by the modeled DNA variants. Here, we reduce the phenotypic heterogeneity of asthma severity and identify single nucleotide polymorphisms (SNPs) associated with phenotype subgroups. Specifically, we first apply an unsupervised learning algorithm and a non-parametric regression to find a biclustering structure of children according to their allergy and asthma severity. We then identify the set of SNPs most closely correlated with each subgroup. We subsequently fit a logistic regression model for each group against the healthy controls, using benzo[a]pyrene (B[a]P) as a representative airborne carcinogen. Applying this approach to a case-control data set shows that SNP clustering may help to partly explain heterogeneity in children’s asthma susceptibility in relation to ambient B[a]P concentration with greater efficiency.
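
The final step described in the abstract, one logistic regression per phenotype subgroup against the healthy controls, can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the subgroup labels are taken as given (standing in for the output of the clustering step), the exposure is a hypothetical standardized B[a]P measure, and the function names `fit_logistic`, `per_group_fits`, and `simulate` are invented for illustration. The fit uses plain Newton-Raphson in NumPy.

```python
import numpy as np


def fit_logistic(x, y, n_iter=25, ridge=1e-8):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson; returns [b0, b1]."""
    X = np.column_stack([np.ones_like(x, dtype=float), x])
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = np.clip(X @ beta, -30.0, 30.0)  # guard against overflow in exp
        p = 1.0 / (1.0 + np.exp(-eta))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)
        # Tiny ridge term keeps the Hessian invertible; an illustrative choice.
        hess = X.T @ (X * w[:, None]) + ridge * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def per_group_fits(exposure, group):
    """Fit one model per case subgroup versus controls.

    Assumption: group == 0 marks controls, 1..K mark case subgroups
    (hypothetical labels standing in for the clustering output).
    """
    fits = {}
    controls = group == 0
    for g in np.unique(group[group != 0]):
        mask = controls | (group == g)
        y = (group[mask] == g).astype(float)  # 1 = case in subgroup g
        fits[int(g)] = fit_logistic(exposure[mask], y)
    return fits


def simulate(seed=0, n_controls=1000, n_per_group=500):
    """Synthetic data only: subgroup 1 has higher exposure, subgroup 2 does not."""
    rng = np.random.default_rng(seed)
    exposure = np.concatenate([
        rng.normal(0.0, 1.0, n_controls),   # controls
        rng.normal(1.0, 1.0, n_per_group),  # subgroup 1: exposure-related
        rng.normal(0.0, 1.0, n_per_group),  # subgroup 2: exposure-unrelated
    ])
    group = np.concatenate([
        np.zeros(n_controls, dtype=int),
        np.ones(n_per_group, dtype=int),
        np.full(n_per_group, 2, dtype=int),
    ])
    return exposure, group


if __name__ == "__main__":
    exposure, group = simulate()
    for g, (b0, b1) in per_group_fits(exposure, group).items():
        print(f"subgroup {g}: slope={b1:.2f}, odds ratio per unit={np.exp(b1):.2f}")
```

Fitting each subgroup separately lets the exposure slope differ between subgroups, which is the kind of heterogeneity that a single pooled case-versus-control model would average away. A real analysis would also include covariates and report standard errors, both omitted here.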

Suggested Citation

  • Daniel Fernández & Radim J. Sram & Miroslav Dostal & Anna Pastorkova & Hans Gmuender & Hyunok Choi, 2018. "Modeling Unobserved Heterogeneity in Susceptibility to Ambient Benzo[a]pyrene Concentration among Children with Allergic Asthma Using an Unsupervised Learning Algorithm," IJERPH, MDPI, vol. 15(1), pages 1-18, January.
  • Handle: RePEc:gam:jijerp:v:15:y:2018:i:1:p:106-:d:126259

    Download full text from publisher

    File URL: https://www.mdpi.com/1660-4601/15/1/106/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1660-4601/15/1/106/
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Fernández, D. & Arnold, R. & Pledger, S., 2016. "Mixture-based clustering for the ordered stereotype model," Computational Statistics & Data Analysis, Elsevier, vol. 93(C), pages 46-75.
    2. Álvarez de Toledo, Pablo & Núñez, Fernando & Usabiaga, Carlos, 2018. "Matching and clustering in square contingency tables. Who matches with whom in the Spanish labour market," Computational Statistics & Data Analysis, Elsevier, vol. 127(C), pages 135-159.
    3. Daniel Fernández & Richard Arnold & Shirley Pledger & Ivy Liu & Roy Costilla, 2019. "Finite mixture biclustering of discrete type multivariate data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(1), pages 117-143, March.
    4. Tatjana Miljkovic & Daniel Fernández, 2018. "On Two Mixture-Based Clustering Approaches Used in Modeling an Insurance Portfolio," Risks, MDPI, vol. 6(2), pages 1-18, May.
    5. Roy Costilla & Ivy Liu & Richard Arnold & Daniel Fernández, 2019. "Bayesian model-based clustering for longitudinal ordinal data," Computational Statistics, Springer, vol. 34(3), pages 1015-1038, September.
    6. Eleni Matechou & Ivy Liu & Daniel Fernández & Miguel Farias & Bergljot Gjelsvik, 2016. "Biclustering Models for Two-Mode Ordinal Data," Psychometrika, Springer;The Psychometric Society, vol. 81(3), pages 611-624, September.
    7. Christian Carmona & Luis Nieto-Barajas & Antonio Canale, 2019. "Model-based approach for household clustering with mixed scale variables," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(2), pages 559-583, June.
    8. Jacques, Julien & Biernacki, Christophe, 2018. "Model-based co-clustering for ordinal data," Computational Statistics & Data Analysis, Elsevier, vol. 123(C), pages 101-115.
    9. Pledger, Shirley & Arnold, Richard, 2014. "Multivariate methods using mixtures: Correspondence analysis, scaling and pattern-detection," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 241-261.
    10. Claudia Quinteros-Cartaya & Guillermo Solorio-Magaña & Francisco Javier Núñez-Cornú & Felipe de Jesús Escalona-Alcázar & Diana Núñez, 2023. "Microearthquakes in the Guadalajara Metropolitan Zone, Mexico: evidence from buried active faults in Tesistán Valley, Zapopan," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 116(3), pages 2797-2818, April.
    11. Katarzyna Hampel & Paulina Ucieklak-Jez & Agnieszka Bem, 2021. "Health System Responsiveness in the Light of the Euro Health Consumer Index," European Research Studies Journal, European Research Studies Journal, vol. 0(4B), pages 659-667.
    12. Kim, Junyung & Shah, Asad Ullah Amin & Kang, Hyun Gook, 2020. "Dynamic risk assessment with bayesian network and clustering analysis," Reliability Engineering and System Safety, Elsevier, vol. 201(C).
    13. Paul S. F. Yip & Hua-Zhen Lin & Liqun Xi, 2005. "A Semiparametric Method for Estimating Population Size for Capture–Recapture Experiments with Random Covariates in Continuous Time," Biometrics, The International Biometric Society, vol. 61(4), pages 1085-1092, December.
    14. Roberts, Leigh, 2014. "Consistent estimation of breakpoints in time series, with application to wavelet analysis of Citigroup returns," Working Paper Series 18815, Victoria University of Wellington, School of Economics and Finance.
    15. Chang Xuan Mao & Na You, 2009. "On Comparison of Mixture Models for Closed Population Capture–Recapture Studies," Biometrics, The International Biometric Society, vol. 65(2), pages 547-553, June.
    16. David G Mets & Michael S Brainard, 2018. "An automated approach to the quantitation of vocalizations and vocal learning in the songbird," PLOS Computational Biology, Public Library of Science, vol. 14(8), pages 1-29, August.
    17. Michael Brusco & J Dennis Cradit & Douglas Steinley, 2021. "A comparison of 71 binary similarity coefficients: The effect of base rates," PLOS ONE, Public Library of Science, vol. 16(4), pages 1-19, April.
    18. Noah E. Friedkin, 1984. "Structural Cohesion and Equivalence Explanations of Social Homogeneity," Sociological Methods & Research, vol. 12(3), pages 235-261, February.
    19. Ben C. Stevenson & Rachel M. Fewster & Koustubh Sharma, 2022. "Spatial correlation structures for detections of individuals in spatial capture–recapture models," Biometrics, The International Biometric Society, vol. 78(3), pages 963-973, September.
    20. David Matesanz Gomez & Guillermo J. Ortega & Benno Torgler, 2011. "Measuring globalization: A hierarchical network approach," CREMA Working Paper Series 2011-11, Center for Research in Economics, Management and the Arts (CREMA).
