Question: CMA package: variable selection with limma
gravatar for Renaud Gaujoux
10.4 years ago by
Renaud Gaujoux170 wrote:
Hi, I used the CMA package (v1.0.0) to evaluate the performance of a kNN classification after a feature selection step using limma, including the whole procedure in the cross-validation. Before finding the CMA package I had implemented myself the same procedure (cross-validation of feature selection) and I couldn't get the same result as with CMA. Looking to the CMA code I found that the feature were selected by their rank based on the statistic computed by the function classifyTestsF from limma (see the CMA code below). My understanding is that this function computes a F-statistic to test the joint significance of all the coefficients in the regression model (in this case 2: the intercept and the contrast of interest, i.e. the difference between the two groups of samples). In my procedure I had ranked the genes based on the p-values of the single coefficient of interest (not the intercept), as suggested by the limma tutorials. Indeed the selected genes in the first step are different. Is there a reason for also looking at the joint significance (intercept + coeff)? I guess the fact intercept is significantly different from zero gives us the information that the gene is actually expressed at a certain level, making us more confident that what is observed is not noise? In standard regression problem (more samples than variables) I would indeed test first the significance of the model as a whole (F-test), but in the context of microarray I had never seen this practice before. Am I missing something? Which practice is recommended in this context? Thanks ------------------ limmatest <- function(X, y, learnind, ...){ require(limma, quietly=TRUE) yprime <- y[learnind] Xprime <- X[learnind,,drop=FALSE] yprime <- as.factor(yprime) des <- model.matrix(~yprime) limo <- lmFit(t(Xprime), des) outp <- eBayes(limo, ...) stat <- classifyTestsF(object = outp, fstat.only = TRUE) attr(stat, "df1") <- NULL attr(stat, "df2") <- NULL new("varseloutput", varsel=stat) }
ADD COMMENTlink written 10.4 years ago by Renaud Gaujoux170
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 148 users visited in the last hour