Question

top table or anova?

0

Entering edit mode

Konika Chawla ▴ 20

@konika-chawla-6650

Last seen 7.6 years ago

Hi I have a microarray data comprised of 3 treatment conditions (N,M,I) and 2 time points (4dpi and 5dpi). I used limma - ebayes fit and toptable with coefficient for a specific contrast to get the differentially expressed genes and adjusted P value. I also did a ANOVA test, and I want to know which one is better to identify differentially expressed genes. CODE: for eBAYES and Toptable design <- model.matrix(~ 0+factor(c(1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6))) colnames(design) <- c("N_4dpi", "M_4dpi", "I_4dpi","N_5dpi", "M_5dpi", "I_5dpi") fit <- lmFit(eset, design) contrast.matrix <- makeContrasts("I_4dpi-M_4dpi","I_5dpi-M_5dpi","I_4dpi-N_4dpi","I_5dpi- N_5dpi","M_4dpi-N_4dpi","M_5dpi-N_5dpi", levels=design) fit2 <- contrasts.fit(fit, contrast.matrix) fit2 <- eBayes(fit2) 2) #If we do one comparison at a time I/M 4dpi tab_all<-topTable(fit2, coef=1,adjust="BH",number=100,p.value=1) head(tab_all) write.table(tab_all,"I-M-4dpi.txt",sep="\t",quote=FALSE) This gives zero significant genes. ############################################################ CODE for ANOVA targets <- readTargets("target_data.txt") data <- ReadAffy(filenames=targets$filename) experiment <- targets$experiment time <- targets$time treatment <- targets$treatment ##normalization eset <- rma(data) exprs.eset <- exprs(eset) exprs.eset.df <- data.frame(exprs.eset) aof <- function(x) { m<-data.frame(time, treatment, x); anova(aov(x ~ time + treatment + time * treatment, m)) } anovaresults <- apply(exprs.eset, 1, aof) This gives over a thousand genes with treatment_Pr.F <0.05 , Does this need to be corrected for multiple testing, why is the huge difference? Appreciate your help. -- With Regards, Konika Chawla NTNU, Norway Phone +4772821344 [[alternative HTML version deleted]]

Microarray limma Microarray limma • 1.2k views

ADD COMMENT • link updated 9.8 years ago by James W. MacDonald 65k • written 9.8 years ago by Konika Chawla ▴ 20

score 0 · Answer 1 · 2014-07-16

Hi Konika Chawla, Gordon Smyth has published multiple papers that outline why limma is superior to individual tests for low-replication microarray studies. You can find those references in the limma User's Guide (you can type limmaUsersGuide() at an R prompt after loading the limma package). And as you note, this isn't a fair comparison. You don't say how many measurements you are making, but assuming somewhere around 25K genes, you expect 1250 significant results (unadjusted p-value < 0.05) under the null hypothesis. So having that many 'significant' genes isn't saying much. You could use p.adjust() on the vector of p-values to get BH adjusted p-values if you want to make the comparisons more reasonable. Best, Jim On 7/16/2014 1:30 PM, Konika Chawla wrote: > Hi > I have a microarray data comprised of 3 treatment conditions (N,M,I) and > 2 time points (4dpi and 5dpi). > I used limma - ebayes fit and toptable with coefficient for a specific > contrast to get the differentially expressed genes and adjusted P value. > I also did a ANOVA test, and I want to know which one is better to > identify differentially expressed genes. > > CODE: for eBAYES and Toptable > > design <- model.matrix(~ 0+factor(c(1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6))) > colnames(design) <- c("N_4dpi", "M_4dpi", "I_4dpi","N_5dpi", "M_5dpi", > "I_5dpi") > fit <- lmFit(eset, design) > contrast.matrix <- > makeContrasts("I_4dpi-M_4dpi","I_5dpi-M_5dpi","I_4dpi-N_4dpi ","I_5dpi-N_5dpi","M_4dpi-N_4dpi","M_5dpi-N_5dpi", > levels=design) > fit2 <- contrasts.fit(fit, contrast.matrix) > fit2 <- eBayes(fit2) > 2) #If we do one comparison at a time I/M 4dpi > tab_all<-topTable(fit2, coef=1,adjust="BH",number=100,p.value=1) > head(tab_all) > write.table(tab_all,"I-M-4dpi.txt",sep="\t",quote=FALSE) > > This gives zero significant genes. > ############################################################ > > CODE for ANOVA > > > targets <- readTargets("target_data.txt") > data <- ReadAffy(filenames=targets$filename) > experiment <- targets$experiment > time <- targets$time > treatment <- targets$treatment > > > ##normalization > > eset <- rma(data) > exprs.eset <- exprs(eset) > exprs.eset.df <- data.frame(exprs.eset) > aof <- function(x) { > m<-data.frame(time, treatment, x); > anova(aov(x ~ time + treatment + time * treatment, m)) > } > > anovaresults <- apply(exprs.eset, 1, aof) > > This gives over a thousand genes with treatment_Pr.F <0.05 , Does this > need to be corrected for multiple testing, why is the huge difference? > Appreciate your help. > > -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099