Debashis Ghosh (University of Michigan); Arul Chinnaiyan (University of Michigan, Pathology and Urology)
Abstract
High-throughput gene expression technologies such as microarrays have been used in a variety of scientific applications. Most of the work has focused either on assessing univariate associations between gene expression and clinical outcome (variable selection) or on developing classification procedures with gene expression data (supervised learning). We consider a hybrid variable selection/classification approach based on linear combinations of the gene expression profiles that maximize an accuracy measure summarized by the receiver operating characteristic curve. Under a specific probability model, this leads to consideration of linear discriminant functions. We incorporate an automated variable selection approach using LASSO. An equivalence between LASSO estimation and support vector machines allows for model fitting using standard software. We apply the proposed method to simulated data as well as to data from a recently published prostate cancer study.
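As a rough illustration of the idea in the abstract, the sketch below scores samples with a linear combination of features and summarizes its accuracy by the area under the ROC curve. It is not the authors' implementation: the simulated data, the function names `empirical_auc` and `lda_direction`, and the choice of a pooled-covariance linear discriminant are all assumptions made for this example, and the LASSO variable selection step is not shown.

```python
import numpy as np


def empirical_auc(scores_pos, scores_neg):
    """Mann-Whitney estimate of P(score_pos > score_neg); ties count as 1/2."""
    diff = scores_pos[:, None] - scores_neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def lda_direction(x_pos, x_neg):
    """Linear discriminant direction using a pooled covariance estimate."""
    mean_diff = x_pos.mean(axis=0) - x_neg.mean(axis=0)
    n_pos, n_neg = len(x_pos), len(x_neg)
    pooled = (
        (n_pos - 1) * np.cov(x_pos, rowvar=False)
        + (n_neg - 1) * np.cov(x_neg, rowvar=False)
    ) / (n_pos + n_neg - 2)
    # Solve pooled @ w = mean_diff rather than inverting the matrix explicitly.
    return np.linalg.solve(pooled, mean_diff)


# Hypothetical simulated "expression" data: 5 features, only the first two
# differ in mean between the two groups (an assumption for illustration).
rng = np.random.default_rng(0)
n, p = 200, 5
shift = np.array([1.0, 0.5, 0.0, 0.0, 0.0])
x_neg = rng.normal(size=(n, p))
x_pos = rng.normal(size=(n, p)) + shift

w = lda_direction(x_pos, x_neg)
auc_combined = empirical_auc(x_pos @ w, x_neg @ w)
auc_single = empirical_auc(x_pos[:, 0], x_neg[:, 0])

print("AUC of linear combination:", round(auc_combined, 3))
print("AUC of first feature alone:", round(auc_single, 3))
```

The AUC values here are computed on the same data used to estimate the direction, so they are optimistic; a held-out sample would give a fairer comparison.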
Date of creation: 11 Jul 2004
Handle: RePEc:bep:mchbio:1041
Note: oai:bepress.com:umichbiostat-1041
Provider web page: http://www.bepress.com