
Clustering Proteomics Data Using Bayesian Principal Component Analysis

In: Data Mining in Biomedicine

Authors

  • Halima Bensmail (University of Tennessee)
  • O. John Semmes (Eastern Virginia Medical School)
  • Abdelali Haoudi (Eastern Virginia Medical School)

Abstract

Bioinformatics clustering tools are useful at all levels of proteomic data analysis. Proteomics studies can provide a wealth of information and rapidly generate large quantities of data from the analysis of biological specimens from healthy and diseased individuals. The high dimensionality of the data generated by these studies requires improved bioinformatics tools for efficient and accurate analysis. Proteome profiling of a particular system or organism requires specialized software tools. However, there have not been significant advances in the informatics and software tools needed to support the analysis and management of the massive amounts of data generated in the process. Clustering algorithms based on probabilistic and Bayesian models provide an alternative to heuristic algorithms. Choosing the number of diseased and non-diseased groups (the number of clusters) reduces to choosing the number of components in a mixture of underlying probability distributions. The Bayesian approach is a tool for incorporating information from the data into the analysis, and it provides estimates of the uncertainty in the data and in the parameters involved. We present novel algorithms that cluster and derive meaningful patterns of expression from large-scale proteomics experiments. We processed the raw data using principal component analysis to reduce the number of peaks. A Bayesian model-based clustering algorithm was then applied to the transformed data. The Bayesian model-based approach showed superior performance, consistently selecting the correct model and number of clusters, thus providing a novel approach for accurate diagnosis of the disease.
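
The abstract describes a two-stage pipeline: reduce the peak data with principal component analysis, then choose the number of clusters as the number of mixture components. The chapter's actual algorithm is not reproduced on this page, so the following is only a minimal sketch under stated assumptions. It swaps in Gaussian mixtures fitted by EM and compared with BIC (a common large-sample approximation to Bayesian model comparison) in place of a full Bayesian procedure, and it keeps only the first principal component. All function names, parameters, and the toy data are hypothetical.

```python
import math
import random

def make_toy_spectra(n_per_group=30, n_peaks=6, shift=5.0, seed=42):
    """Hypothetical toy data: two groups that differ in the first three peaks."""
    rng = random.Random(seed)
    rows, truth = [], []
    for group in (0, 1):
        for _ in range(n_per_group):
            row = [10.0 + (shift * group if j < 3 else 0.0) + rng.gauss(0.0, 1.0)
                   for j in range(n_peaks)]
            rows.append(row)
            truth.append(group)
    return rows, truth

def first_pc_scores(rows, iters=200):
    """Scores on the first principal component, found by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    rng = random.Random(0)
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]  # random start vector
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return [sum(c[j] * v[j] for j in range(d)) for c in centered]

def normal_pdf(x, mu, var):
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def fit_gmm_1d(x, k, iters=100):
    """EM fit of a k-component 1-D Gaussian mixture; returns BIC and hard labels."""
    n = len(x)
    xs = sorted(x)
    mean_all = sum(x) / n
    var_all = sum((v - mean_all) ** 2 for v in x) / n
    mu = [xs[int((i + 0.5) * n / k)] for i in range(k)]  # start means at quantiles
    var = [var_all] * k
    w = [1.0 / k] * k
    floor = 1e-3 * var_all  # variance floor, so no component collapses onto a point

    def e_step():
        dens = [[w[j] * normal_pdf(v, mu[j], var[j]) for j in range(k)] for v in x]
        totals = [max(sum(row), 1e-300) for row in dens]  # guard against underflow
        resp = [[p / t for p in row] for row, t in zip(dens, totals)]
        return resp, totals

    for _ in range(iters):
        resp, _totals = e_step()
        for j in range(k):  # M-step: update weight, mean, variance of component j
            nj = sum(r[j] for r in resp)
            if nj < 1e-9:
                continue  # leave an empty component unchanged
            w[j] = nj / n
            mu[j] = sum(r[j] * v for r, v in zip(resp, x)) / nj
            var[j] = max(sum(r[j] * (v - mu[j]) ** 2 for r, v in zip(resp, x)) / nj,
                         floor)

    resp, totals = e_step()
    loglik = sum(math.log(t) for t in totals)
    n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
    bic = -2.0 * loglik + n_params * math.log(n)
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return {"k": k, "bic": bic, "labels": labels}

rows, truth = make_toy_spectra()
scores = first_pc_scores(rows)
fits = {k: fit_gmm_1d(scores, k) for k in (1, 2, 3)}
best_k = min(fits, key=lambda k: fits[k]["bic"])  # lower BIC is preferred
print("BIC by k:", {k: round(f["bic"], 1) for k, f in fits.items()})
print("selected k:", best_k)
```

On this well-separated toy data the two-component fit should score a markedly lower BIC than the single-component fit, which illustrates how the number of groups becomes a model-selection question. A real analysis would likely retain several principal components and use multivariate mixtures with proper priors.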

Suggested Citation

  • Halima Bensmail & O. John Semmes & Abdelali Haoudi, 2007. "Clustering Proteomics Data Using Bayesian Principal Component Analysis," Springer Optimization and Its Applications, in: Panos M. Pardalos & Vladimir L. Boginski & Alkis Vazacopoulos (ed.), Data Mining in Biomedicine, pages 339-362, Springer.
  • Handle: RePEc:spr:spochp:978-0-387-69319-4_19
    DOI: 10.1007/978-0-387-69319-4_19

    Citations

    Cited by:

    1. Panos Pardalos & Vera Tomaino & Petros Xanthopoulos, 2009. "Optimization and data mining in medicine," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 17(2), pages 215-236, December.
