Question: classification method applied to microarrays (CMA package)
10.1 years ago, Juan C Oliveros Collazos wrote:
Dear all,

I am starting to use the CMA package for classification of microarray samples.

In particular, I want to know which genes are mainly responsible for separating about 60 lists of expression values into 2 groups that are already known. I understand that SVM is a good method for finding the hyperplane that best separates the two groups, but what I need are the genes, not the hyperplane parameters.

My questions are:

To get a list of genes, should I use SVMs (or another classification method) in some manner, or do I simply need to identify the "informative" genes by using the GeneSelection function of the CMA package?

If so, are the learning sets needed? Why?

Any recommendation for choosing a gene selection method?

Thanks in advance.

Best,

Juan Carlos Oliveros
CNB-CSIC, Madrid, Spain
classification cma • 490 views
Answer: classification method applied to microarrays (CMA package)
10.1 years ago, Stephen Henderson wrote:
The SVM is a reasonable classifier that performs OK on microarray data and usually requires little or no parameter tuning, although the same is true of many other classifiers.

To understand the GeneSelection method you need to understand cross-validation (which takes place within the classification function). Cross-validation estimates the classification error by splitting the data into many training and test set combinations. The model (here, your SVM) is built on the training set and then tested against the test set to see how many classification errors are made.

If you choose GeneSelection (which you probably should), the data are reduced to a subset of features/genes based on a simple statistic. However, not just one set of genes is selected: genes are selected separately for every training set in the cross-validation. Otherwise the test samples would influence which genes are chosen, and the estimated SVM misclassification error would be optimistically biased (too low).

So when you use the toplist function on your GeneSelection object, you will find a number of feature lists, none exactly the same. The 'informative' genes are those that occur most frequently in the toplists.

You can examine the GeneSelection toplist before you run the classification function, but obviously you will want to run the classification function to check that the features are indeed 'informative'. You can use the GeneSelection method that gives the lowest cross-validation error. I'd start with limma, but if there is a reasonable separation of classes then the methods should perform similarly.

Jeez, I hope that is clear....

Stephen Henderson
UCL

On 27 Oct 2009, at 11:21, Juan Carlos Oliveros Collazos wrote:

> Dear all,
>
> I am starting to use the CMA package for classification of microarray
> samples.
>
> In particular, I want to know which genes are mainly responsible
> for separating about 60 lists of expression values into 2 groups
> that are already known. I understand that SVM is a good method for
> finding the hyperplane that best separates the two groups, but what I
> need are the genes, not the hyperplane parameters.
>
> My questions are:
>
> To get a list of genes, should I use SVMs (or another classification
> method) in some manner, or do I simply need to identify the
> "informative" genes by using the GeneSelection function of the CMA package?
>
> If so, are the learning sets needed? Why?
>
> Any recommendation for choosing a gene selection method?
>
> Thanks in advance.
>
> Best,
>
> Juan Carlos Oliveros
> CNB-CSIC, Madrid, Spain
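For readers who want to see the idea of per-training-set gene selection in isolation, here is a minimal sketch in plain Python. It is not the CMA package and does not call any CMA function: the helper names (`welch_t`, `top_genes`, `selection_frequency`, `make_toy_data`), the Welch t-statistic as the "simple statistic", the 5 folds, the top-5 cutoff and the simulated data are all assumptions made for illustration. It ranks genes using the training samples only in each fold, then counts how often each gene appears across the per-fold top lists.

```python
import random
import statistics
from collections import Counter


def welch_t(a, b):
    """Absolute Welch t-statistic; assumes each group has at least 2 values."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    se = (var_a / len(a) + var_b / len(b)) ** 0.5
    return abs(statistics.mean(a) - statistics.mean(b)) / se if se > 0 else 0.0


def top_genes(X, y, train_idx, k):
    """Rank genes on the training samples only; return the top k gene indices."""
    scores = []
    for g in range(len(X[0])):
        a = [X[i][g] for i in train_idx if y[i] == 0]
        b = [X[i][g] for i in train_idx if y[i] == 1]
        scores.append((welch_t(a, b), g))
    scores.sort(reverse=True)
    return [g for _, g in scores[:k]]


def selection_frequency(X, y, n_folds=5, k=5, seed=0):
    """Count how often each gene is in the top k across the training sets."""
    rng = random.Random(seed)
    idx = list(range(len(y)))
    rng.shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    counts = Counter()
    for test in folds:
        held_out = set(test)
        # Held-out samples never influence the ranking for this fold.
        train = [i for i in idx if i not in held_out]
        counts.update(top_genes(X, y, train, k))
    return counts


def make_toy_data(n_samples=60, n_genes=50, n_informative=3, shift=3.0, seed=1):
    """Simulated data (hypothetical): the first few genes differ between classes."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n_samples)]
    X = [
        [
            rng.gauss(0.0, 1.0) + (shift if y[i] == 1 and g < n_informative else 0.0)
            for g in range(n_genes)
        ]
        for i in range(n_samples)
    ]
    return X, y


X, y = make_toy_data()
counts = selection_frequency(X, y)
for gene, n in counts.most_common(5):
    print(f"gene {gene}: selected in {n} of 5 training sets")
```

In this toy setting the three strongly shifted genes should appear in every fold's top list, while the remaining slots are filled by different noise genes from fold to fold, which is the behaviour described above for the toplists.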