Question: How to use t.test to select best features out of multiple features?
2.8 years ago by
India
babumanish83710 wrote:

Dear all,

I am working on microarray data with 49 samples and 22273 genes. I want to apply t.test to select the top-ranked genes that best differentiate the samples into two groups. I know I can do that with the limma package, but I have to use t.test to select the genes. I know how to use t.test for two features, but I cannot figure out how to use t.test for multiple features.

modified 2.8 years ago by svlachavas610 • written 2.8 years ago by babumanish83710
2.8 years ago by
svlachavas610
Greece/Athens/National Hellenic Research Foundation
svlachavas610 wrote:

Dear Babumanish837,

What do you mean when you say you know how to use t.test for two features?

Let's say you have the two groups you mentioned.

e <- exprs(eset) # your expression set

test <- do.call("rbind", lapply(rownames(e), function(x) unlist(t.test(e[x,Index2], e[x,Index1])[c("estimate","statistic","p.value")])))
rownames(test) <- rownames(e) # label each row with its probeset

Here Index2 and Index1 are the column indices of the samples belonging to each of your two groups (optionally add paired=TRUE to t.test if you want a paired analysis). This returns, for each probeset, the two group means, the t-statistic and the p-value.
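The per-probeset t.test idea above can be tried end to end on simulated data. This is only a minimal sketch: the matrix `e`, the group layout in `Index1`/`Index2`, the spiked-in genes and the cutoff of 10 are all made up for illustration, and the Benjamini-Hochberg adjustment is an addition that the original answer does not mention.

```r
# Simulated stand-in for exprs(eset): 100 genes x 12 samples (hypothetical data)
set.seed(1)
e <- matrix(rnorm(100 * 12), nrow = 100,
            dimnames = list(paste0("gene", 1:100), paste0("s", 1:12)))
Index1 <- 1:6    # columns of group 1 (assumed layout)
Index2 <- 7:12   # columns of group 2 (assumed layout)
e[1:5, Index2] <- e[1:5, Index2] + 3  # shift 5 genes so the groups differ

# One Welch t-test per row; keep the t-statistic and the p-value
res <- t(apply(e, 1, function(x) {
  tt <- t.test(x[Index2], x[Index1])
  c(statistic = unname(tt$statistic), p.value = tt$p.value)
}))
res <- as.data.frame(res)

# Adjust for testing many genes at once (assumption: BH/FDR is acceptable)
res$adj.p <- p.adjust(res$p.value, method = "BH")

# Rank by absolute t-statistic and keep the 10 top-ranked genes
top <- res[order(abs(res$statistic), decreasing = TRUE), ][1:10, ]
```

With real data you would replace the simulated `e` by `exprs(eset)` and set `Index1`/`Index2` from your sample annotation; with 22273 genes the loop takes a little while but is still feasible.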

That said, you should really perform a limma analysis. You can then use topTable to get your differentially expressed (DE) probesets according to your criteria, and since topTable returns a data.frame, you can order and subset the results:

For example, say the factor indicating your groups is called study:

library(limma)

study <- factor(rep(c("A","B"),each=6))

design <- model.matrix(~study)

fit <- lmFit(eset, design)

fit2 <- eBayes(fit)

selected <- topTable(fit2, coef=2, number=nrow(fit2), adjust.method="fdr", sort.by="none")

Then keep only the columns you want, for example:

selected_2 <- subset(selected, select=c(t, logFC, adj.P.Val)) # topTable names the adjusted p-value column adj.P.Val

and finally order, for instance, by the moderated t-statistic:

ordered <- selected_2[order(abs(selected_2$t), decreasing=TRUE),][1:200,] # keep the 200 probesets with the largest absolute moderated t-statistic

I hope this helps!

The genefilter package implements rowttests:

> library(genefilter)
> library(airway); data(airway)
> m = assay(airway)
> m[] = as.numeric(m) # rowttests wants a ‘numeric’ matrix
> head(rowttests(m, airway$dex))
                 statistic      dm   p.value
ENSG00000000003 -1.3886215 -246.25 0.2143027
ENSG00000000005        NaN    0.00       NaN
ENSG00000000419  0.2306398   23.75 0.8252577
ENSG00000000460 -0.9463499  -10.25 0.3805062
ENSG00000000938 -1.5666989   -0.75 0.1682275
ENSG00000000457 -0.4599108  -16.50 0.6617746

Dear @svlachavas,

Could you please explain what group is in the statement

design <- model.matrix(~group)

Dear svlachavas,

Now I understand that group is simply study. That solved my problem. But I have one more question: what is the significance of ~ in design <- model.matrix(~group)?

Thank you very much for your help.

Dear Babumanish,

I just now saw your replies. By mistake I first used the name group; it should be study. I'm going to correct it immediately.
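On the ~ question above: in R, ~ introduces a model formula, and with nothing on its left-hand side it simply tells model.matrix which variables to build the design columns from. A small sketch with the same illustrative study factor (six samples per group is only the example layout used earlier, not the real 49-sample design):

```r
# Hypothetical two-group factor, as in the example above
study <- factor(rep(c("A", "B"), each = 6))

# "~study" reads as: model the expression as an intercept plus a group effect
design <- model.matrix(~study)

colnames(design)    # "(Intercept)" "studyB"
design[, "studyB"]  # 0 for samples in group A, 1 for samples in group B
```

The second column is the B versus A difference, which is why coef=2 is passed to topTable.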