Characteristic Gene Selection via Weighting Principal Components by Singular Values

Author

Listed:
  • Jin-Xing Liu
  • Yong Xu
  • Chun-Hou Zheng
  • Yi Wang
  • Jing-Yu Yang

Abstract

Conventional gene selection methods based on principal component analysis (PCA) use only the first principal component (PC) of PCA or sparse PCA to select characteristic genes. These methods implicitly assume that the first PC plays a dominant role in gene selection. However, this assumption does not hold in a number of cases, and the conventional PCA-based methods then often provide poor selection results. To improve the performance of PCA-based gene selection, we propose a gene selection method that weights PCs by their singular values (WPCS). Because different PCs differ in importance, the singular values are used as weights that represent the influence of each PC on gene selection. ROC curves and AUC statistics on artificial data show that our method outperforms state-of-the-art methods. Moreover, experimental results on real gene expression data sets show that our method extracts more characteristic genes responding to abiotic stresses than conventional gene selection methods do.
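
The weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact scoring rule, so everything specific here is an assumption made for illustration: the matrix layout (genes as rows, samples as columns), the per-gene centering, the score s_i = sum_k sigma_k * |u_ik| over the first few components, and the function names `wpcs_scores` and `select_genes`. It should not be read as the authors' implementation.

```python
import numpy as np


def wpcs_scores(X, n_components=3):
    """Score each gene by its PC loadings weighted by singular values.

    X is a genes x samples expression matrix (assumed layout).
    Assumed score, not taken from the paper:
        s_i = sum_k sigma_k * |u_ik|  for k < n_components
    """
    X = np.asarray(X, dtype=float)
    # Center each gene (row) across samples before the decomposition.
    Xc = X - X.mean(axis=1, keepdims=True)
    # Thin SVD: columns of U hold one loading per gene, s the singular values.
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    k = min(n_components, s.size)
    # Weight the absolute loadings of each component by its singular value.
    return np.abs(U[:, :k]) @ s[:k]


def select_genes(X, n_genes, n_components=3):
    """Return indices of the n_genes highest-scoring genes."""
    scores = wpcs_scores(X, n_components)
    return np.argsort(scores)[::-1][:n_genes]


if __name__ == "__main__":
    # Synthetic data: 50 genes, 20 samples, two orthogonal sample patterns.
    rng = np.random.default_rng(0)
    X = rng.normal(0.0, 0.5, size=(50, 20))
    pattern_a = np.repeat([1.0, -1.0], 10)  # first half vs second half
    pattern_b = np.tile([1.0, -1.0], 10)    # alternating samples
    X[0:3] += 5.0 * pattern_a               # genes 0-2 follow pattern a
    X[3:6] += 4.0 * pattern_b               # genes 3-5 follow pattern b
    print(sorted(select_genes(X, n_genes=6).tolist()))
```

In this toy setup, genes 3-5 load mainly on the second component, so a score built from the first PC alone (`n_components=1`) has little reason to rank them highly, whereas the weighted sum over several PCs takes them into account.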

Suggested Citation

  • Jin-Xing Liu & Yong Xu & Chun-Hou Zheng & Yi Wang & Jing-Yu Yang, 2012. "Characteristic Gene Selection via Weighting Principal Components by Singular Values," PLOS ONE, Public Library of Science, vol. 7(7), pages 1-10, July.
  • Handle: RePEc:plo:pone00:0038873
    DOI: 10.1371/journal.pone.0038873

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0038873
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0038873&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Zhijin Wu & Rafael A. Irizarry & Robert Gentleman & Francisco Martinez-Murillo & Forrest Spencer, 2004. "A Model-Based Background Adjustment for Oligonucleotide Expression Arrays," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 909-917, December.
    2. JOURNEE, Michel & NESTEROV, Yurii & RICHTARIK, Peter & SEPULCHRE, Rodolphe, 2010. "Generalized power method for sparse principal component analysis," LIDAM Reprints CORE 2232, Université catholique de Louvain, Center for Operations Research and Econometrics (CORE).
    3. Carl Eckart & Gale Young, 1936. "The approximation of one matrix by another of lower rank," Psychometrika, Springer;The Psychometric Society, vol. 1(3), pages 211-218, September.
    4. Shen, Haipeng & Huang, Jianhua Z., 2008. "Sparse principal component analysis via regularized low rank matrix approximation," Journal of Multivariate Analysis, Elsevier, vol. 99(6), pages 1015-1034, July.
    5. Zhijin Wu & Rafael Irizarry & Robert Gentleman & Francisco Martinez Murillo & Forrest Spencer, 2004. "A Model Based Background Adjustment for Oligonucleotide Expression Arrays," Johns Hopkins University Dept. of Biostatistics Working Paper Series 1001, Berkeley Electronic Press.
    6. Dayle L Sampson & Tony J Parker & Zee Upton & Cameron P Hurst, 2011. "A Comparison of Methods for Classifying Clinical Samples Based on Proteomics Data: A Case Study for Statistical and Machine Learning Approaches," PLOS ONE, Public Library of Science, vol. 6(9), pages 1-11, September.
    7. Fumiaki Sato & Soken Tsuchiya & Kazuya Terasawa & Gozoh Tsujimoto, 2009. "Intra-Platform Repeatability and Inter-Platform Comparability of MicroRNA Microarray Technology," PLOS ONE, Public Library of Science, vol. 4(5), pages 1-12, May.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Merola, Giovanni Maria & Chen, Gemai, 2019. "Projection sparse principal component analysis: An efficient least squares method," Journal of Multivariate Analysis, Elsevier, vol. 173(C), pages 366-382.
    2. Nerea González-García & Ana Belén Nieto-Librero & Purificación Galindo-Villardón, 2023. "CenetBiplot: a new proposal of sparse and orthogonal biplots methods by means of elastic net CSVD," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 17(1), pages 5-19, March.
    3. Rosember Guerra-Urzola & Katrijn Van Deun & Juan C. Vera & Klaas Sijtsma, 2021. "A Guide for Sparse PCA: Model Comparison and Applications," Psychometrika, Springer;The Psychometric Society, vol. 86(4), pages 893-919, December.
    4. Michael Greenacre & Patrick J. F Groenen & Trevor Hastie & Alfonso Iodice d’Enza & Angelos Markos & Elena Tuzhilina, 2023. "Principal component analysis," Economics Working Papers 1856, Department of Economics and Business, Universitat Pompeu Fabra.
    5. Guerra Urzola, Rosember & Van Deun, Katrijn & Vera, J. C. & Sijtsma, K., 2021. "A guide for sparse PCA : Model comparison and applications," Other publications TiSEM 4d35b931-7f49-444b-b92f-a, Tilburg University, School of Economics and Management.
    6. Kohei Adachi & Nickolay T. Trendafilov, 2016. "Sparse principal component analysis subject to prespecified cardinality of loadings," Computational Statistics, Springer, vol. 31(4), pages 1403-1427, December.
    7. Yixuan Qiu & Jing Lei & Kathryn Roeder, 2023. "Gradient-based sparse principal component analysis with extensions to online learning," Biometrika, Biometrika Trust, vol. 110(2), pages 339-360.
    8. Jushan Bai & Serena Ng, 2020. "Simpler Proofs for Approximate Factor Models of Large Dimensions," Papers 2008.00254, arXiv.org.
    9. Rinku Sharma & Garima Singh & Sudeepto Bhattacharya & Ashutosh Singh, 2018. "Comparative transcriptome meta-analysis of Arabidopsis thaliana under drought and cold stress," PLOS ONE, Public Library of Science, vol. 13(9), pages 1-18, September.
    10. Nan Li & Matthew N. McCall & Zhijin Wu, 2017. "Establishing Informative Prior for Gene Expression Variance from Public Databases," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 9(1), pages 160-177, June.
    11. Sigrun Helga Lund & Daniel Fannar Gudbjartsson & Thorunn Rafnar & Asgeir Sigurdsson & Sigurjon Axel Gudjonsson & Julius Gudmundsson & Kari Stefansson & Gunnar Stefansson, 2014. "A Method for Detecting Long Non-Coding RNAs with Tiled RNA Expression Microarrays," PLOS ONE, Public Library of Science, vol. 9(6), pages 1-9, June.
    12. Mihee Lee & Haipeng Shen & Jianhua Z. Huang & J. S. Marron, 2010. "Biclustering via Sparse Singular Value Decomposition," Biometrics, The International Biometric Society, vol. 66(4), pages 1087-1095, December.
    13. Krishanpal Anamika & Àkos Gyenis & Laetitia Poidevin & Olivier Poch & Làszlò Tora, 2012. "RNA Polymerase II Pausing Downstream of Core Histone Genes Is Different from Genes Producing Polyadenylated Transcripts," PLOS ONE, Public Library of Science, vol. 7(6), pages 1-14, June.
    14. Lei Zhang & Linlin Wang & Pu Tian & Suyan Tian, 2016. "Identification of Genes Discriminating Multiple Sclerosis Patients from Controls by Adapting a Pathway Analysis Method," PLOS ONE, Public Library of Science, vol. 11(11), pages 1-13, November.
    15. Upton Graham J. G. & Harrison Andrew P, 2010. "The Detection of Blur in Affymetrix GeneChips," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 9(1), pages 1-19, October.
    16. Ryan Abo & Gregory D Jenkins & Liewei Wang & Brooke L Fridley, 2012. "Identifying the Genetic Variation of Gene Expression Using Gene Sets: Application of Novel Gene Set eQTL Approach to PharmGKB and KEGG," PLOS ONE, Public Library of Science, vol. 7(8), pages 1-11, August.
    17. Jeremiah J Faith & Boris Hayete & Joshua T Thaden & Ilaria Mogno & Jamey Wierzbowski & Guillaume Cottarel & Simon Kasif & James J Collins & Timothy S Gardner, 2007. "Large-Scale Mapping and Validation of Escherichia coli Transcriptional Regulation from a Compendium of Expression Profiles," PLOS Biology, Public Library of Science, vol. 5(1), pages 1-13, January.
    18. Amir Beck & Yakov Vaisbourd, 2016. "The Sparse Principal Component Analysis Problem: Optimality Conditions and Algorithms," Journal of Optimization Theory and Applications, Springer, vol. 170(1), pages 119-143, July.
    19. Shen, Dan & Shen, Haipeng & Marron, J.S., 2013. "Consistency of sparse PCA in High Dimension, Low Sample Size contexts," Journal of Multivariate Analysis, Elsevier, vol. 115(C), pages 317-333.
    20. Chalise, Prabhakar & Fridley, Brooke L., 2012. "Comparison of penalty functions for sparse canonical correlation analysis," Computational Statistics & Data Analysis, Elsevier, vol. 56(2), pages 245-254.
