
Clustering non-linear interactions in factor analysis

Authors

  • Erick da Conceição Amorim (Universidade Federal de Minas Gerais)
  • Vinícius Diniz Mayrink (Universidade Federal de Minas Gerais)

Abstract

Factor analysis is a powerful tool for dimensionality reduction in multivariate studies. This study extends the factor model with non-linear interactions. The main contribution of our work is to present two approaches for clustering the non-linear interactions, yielding new models that are not restricted to the extreme scenarios in which all non-null interactions are different or all are the same. The first strategy for handling the clusters involves a finite mixture of degenerate components. The second is specified via the Dirichlet process. A comprehensive simulation study explores the performance of the proposals. A sensitivity analysis evaluates the advantages of estimating a smoothness parameter in the covariance function of the Gaussian process that establishes the non-linearity of the interactions. As an application, the methodology is illustrated with an analysis of gene expression levels from four breast cancer data sets. Genes belonging to disjoint genome regions with copy number alteration are connected to the main factors, and their non-linear interactions are estimated and clustered. A joint investigation and comparison of these four breast cancer data sets is rarely found in the literature.
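
As a rough illustration of how a Dirichlet process prior can group items without fixing the number of clusters in advance, the sketch below draws a random partition from the Chinese restaurant process, a standard representation of the Dirichlet process. It is not the authors' algorithm: the function name `crp_partition`, the concentration parameter `alpha`, and the idea of treating each item as one interaction term are assumptions made only for this example. Very small values of `alpha` tend to place all items in one cluster and very large values tend to give each item its own cluster, which loosely mirrors the two extreme scenarios mentioned in the abstract.

```python
import random


def crp_partition(n_items, alpha, seed=0):
    """Draw cluster labels for n_items from a Chinese restaurant process.

    alpha > 0 is the concentration parameter (hypothetical name used here):
    larger values favour opening more clusters.
    """
    if n_items < 0 or alpha <= 0:
        raise ValueError("n_items must be >= 0 and alpha must be > 0")
    rng = random.Random(seed)
    labels = []
    counts = []  # counts[k] = number of items currently assigned to cluster k
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1  # join an existing cluster
        labels.append(k)
    return labels


# Example: partition 12 hypothetical interaction terms.
labels = crp_partition(n_items=12, alpha=1.0, seed=42)
print(labels)
```

Each label identifies the cluster of one item; the number of distinct labels is random and is not set by the user, in contrast to a finite mixture, where the number of components is fixed beforehand.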

Suggested Citation

  • Erick da Conceição Amorim & Vinícius Diniz Mayrink, 2020. "Clustering non-linear interactions in factor analysis," METRON, Springer;Sapienza Università di Roma, vol. 78(3), pages 329-352, December.
  • Handle: RePEc:spr:metron:v:78:y:2020:i:3:d:10.1007_s40300-020-00186-2
    DOI: 10.1007/s40300-020-00186-2



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Wilson J. Wright & Peter N. Neitlich & Alyssa E. Shiel & Mevin B. Hooten, 2022. "Mechanistic spatial models for heavy metal pollution," Environmetrics, John Wiley & Sons, Ltd., vol. 33(8), December.
    2. Rinku Sharma & Garima Singh & Sudeepto Bhattacharya & Ashutosh Singh, 2018. "Comparative transcriptome meta-analysis of Arabidopsis thaliana under drought and cold stress," PLOS ONE, Public Library of Science, vol. 13(9), pages 1-18, September.
    3. Bachoc, François & Genton, Mark G. & Nordhausen, Klaus & Ruiz-Gazen, Anne & Virta, Joni, 2019. "Spatial Blind Source Separation," TSE Working Papers 19-998, Toulouse School of Economics (TSE).
    4. Jin-Xing Liu & Yong Xu & Chun-Hou Zheng & Yi Wang & Jing-Yu Yang, 2012. "Characteristic Gene Selection via Weighting Principal Components by Singular Values," PLOS ONE, Public Library of Science, vol. 7(7), pages 1-10, July.
    5. Nan Li & Matthew N. McCall & Zhijin Wu, 2017. "Establishing Informative Prior for Gene Expression Variance from Public Databases," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 9(1), pages 160-177, June.
    6. James Joseph Balamuta & Steven Andrew Culpepper, 2022. "Exploratory Restricted Latent Class Models with Monotonicity Requirements under PÒLYA–GAMMA Data Augmentation," Psychometrika, Springer;The Psychometric Society, vol. 87(3), pages 903-945, September.
    7. Athanasios C. Micheas & Jiaxun Chen, 2018. "sppmix: Poisson point process modeling using normal mixture models," Computational Statistics, Springer, vol. 33(4), pages 1767-1798, December.
    8. Simon Beyeler & Sylvia Kaufmann, 2021. "Reduced‐form factor augmented VAR—Exploiting sparsity to include meaningful factors," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 36(7), pages 989-1012, November.
    9. Sigrun Helga Lund & Daniel Fannar Gudbjartsson & Thorunn Rafnar & Asgeir Sigurdsson & Sigurjon Axel Gudjonsson & Julius Gudmundsson & Kari Stefansson & Gunnar Stefansson, 2014. "A Method for Detecting Long Non-Coding RNAs with Tiled RNA Expression Microarrays," PLOS ONE, Public Library of Science, vol. 9(6), pages 1-9, June.
    10. Krishanpal Anamika & Àkos Gyenis & Laetitia Poidevin & Olivier Poch & Làszlò Tora, 2012. "RNA Polymerase II Pausing Downstream of Core Histone Genes Is Different from Genes Producing Polyadenylated Transcripts," PLOS ONE, Public Library of Science, vol. 7(6), pages 1-14, June.
    11. Lei Zhang & Linlin Wang & Pu Tian & Suyan Tian, 2016. "Identification of Genes Discriminating Multiple Sclerosis Patients from Controls by Adapting a Pathway Analysis Method," PLOS ONE, Public Library of Science, vol. 11(11), pages 1-13, November.
    12. Buddhavarapu, Prasad & Scott, James G. & Prozzi, Jorge A., 2016. "Modeling unobserved heterogeneity using finite mixture random parameters for spatially correlated discrete count data," Transportation Research Part B: Methodological, Elsevier, vol. 91(C), pages 492-510.
    13. Upton Graham J. G. & Harrison Andrew P, 2010. "The Detection of Blur in Affymetrix GeneChips," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 9(1), pages 1-19, October.
    14. Battauz, Michela & Vidoni, Paolo, 2022. "A likelihood-based boosting algorithm for factor analysis models with binary data," Computational Statistics & Data Analysis, Elsevier, vol. 168(C).
    15. Ryan Abo & Gregory D Jenkins & Liewei Wang & Brooke L Fridley, 2012. "Identifying the Genetic Variation of Gene Expression Using Gene Sets: Application of Novel Gene Set eQTL Approach to PharmGKB and KEGG," PLOS ONE, Public Library of Science, vol. 7(8), pages 1-11, August.
    16. Jeremiah J Faith & Boris Hayete & Joshua T Thaden & Ilaria Mogno & Jamey Wierzbowski & Guillaume Cottarel & Simon Kasif & James J Collins & Timothy S Gardner, 2007. "Large-Scale Mapping and Validation of Escherichia coli Transcriptional Regulation from a Compendium of Expression Profiles," PLOS Biology, Public Library of Science, vol. 5(1), pages 1-13, January.
    17. Dimitris Korobilis & Kenichi Shimizu, 2022. "Bayesian Approaches to Shrinkage and Sparse Estimation," Foundations and Trends(R) in Econometrics, now publishers, vol. 11(4), pages 230-354, June.
    18. Tsay, Ruey S. & Ando, Tomohiro, 2012. "Bayesian panel data analysis for exploring the impact of subprime financial crisis on the US stock market," Computational Statistics & Data Analysis, Elsevier, vol. 56(11), pages 3345-3365.
    19. Francis J. DiTraglia, 2011. "Using Invalid Instruments on Purpose: Focused Moment Selection and Averaging for GMM, Second Version," PIER Working Paper Archive 14-045, Penn Institute for Economic Research, Department of Economics, University of Pennsylvania, revised 09 Dec 2014.
    20. Chalise, Prabhakar & Fridley, Brooke L., 2012. "Comparison of penalty functions for sparse canonical correlation analysis," Computational Statistics & Data Analysis, Elsevier, vol. 56(2), pages 245-254.
