Question

2fold sigchanges from limma

0

Entering edit mode

Ivan Baxter ▴ 60

@ivan-baxter-1399

Last seen 11.4 years ago

Greetings- I am using limma to analyze a fairly simple multiple treatment experiment and I would like to pull out all the genes which are significantly expressed and change at least two fold. I have figured out a way to do this, but it is a little complicated and I am running into problems downstream when I try to format output tables using annaffy (described below). Is there a simple way to filter the output of topTable to only get the genes which are significant and change more than a give fold-change cutoff? The way that I am going about this is.... cont.matrix <- makeContrasts( EGvsC = EG.C.C.C - C.C.C.C, ECvsC = C.EC.C.C - C.C.C.C, EGECvsC = EG.EC.C.C - C.C.C.C, GTvsC = C.C.G.C - C.C.C.C, GT_APvsC = C.C.G.A - C.C.C.C, GT_APvsGT = C.C.G.A - C.C.G.C, levels=design) fit2 <- contrasts.fit(fit, cont.matrix) fit2 <- eBayes(fit2) results <- decideTests(fit2) idx <- abs(fit2$coefficients) > 1 #make a dataframe like results where 1/-1 indicates sigchange greater than 2 fold comb <- data.frame(gene = rownames(results)) for(i in 1:length(rownames(results))){ for(j in 1:length(colnames(results))){ if(results[i,j] != 0 & idx[i,j] == "TRUE"){ comb[i,j+1] <- results[i,j] } else{comb[i,j+1] <-0} } } #but then I want to make an output table with gene names, gene annotations, fold change, pvalue and the expression values #across the arrays...... cax1_genes <- unlist(as.list(comb$gene[comb$GTAPvsC != 0])) syms <- unlist(mget(cax1_genes, hgu133a2GENENAME)) test <- match(cax1_genes, geneNames(human.eps2)) anncols <- aaf.handler()[c(1:4,7)] anntable <- aafTableAnn(geneNames(human.eps2)[test], "hgu133a2", anncols) #now I need to get the fold- change and adj.p.value for each gene, and here is where I run into trouble. #I tried pulling out all the significant changes using topTable and then pulling out the list of genes that met my criteria, but... contp <- 5 # this is the contrast which is being tested caxsig <- length(which(p.adjust(fit2$p.value[,contp], method = "BH") < 0.05)) cax1sig <- topTable(fit2, coef =contp, number =caxsig, adjust.method = "none", sort.by = "p", resort.by = "M") testtable <- aafTable("log2 change" = format(cax1sig$M[cax1sig$ID == cax1_genes], digits = 2), "pval" = format(cax1sig$adj.P.Val[cax1sig$ID == cax1_genes], digits = 2)) anntablep <- merge(anntable, testtable) # doesn't work, I get the following error: > testtable <- aafTable("log2 change" = format(cax1sig$M[cax1sig$ID == cax1_genes], digits = 2), "pval" = format(cax1sig$adj.P.Val[cax1sig$ID == cax1_genes], digits = 2)) Warning messages: 1: longer object length is not a multiple of shorter object length in: cax1sig$ID == cax1_genes 2: longer object length is not a multiple of shorter object length in: cax1sig$ID == cax1_genes If this is actually a good way to pull out the two fold genes, could anyone tell me what I am doing wrong with this last step? thanks in advance. Ivan > sessionInfo() Version 2.3.0 (2006-04-24) i386-pc-mingw32 attached base packages: [1] "grid" "splines" "tools" "methods" "stats" "graphics" "grDevices" "utils" "datasets" "base" other attached packages: annaffy KEGG GO hgu133a2 RColorBrewer geneplotter annotate hexbin colorspace lattice genefilter "1.4.0" "1.12.0" "1.12.0" "1.12.0" "0.2-3" "1.10.0" "1.10.0" "1.6.0" "0.9" "0.13-8" "1.10.1" survival limma gcrma matchprobes affy affyio Biobase "2.24" "2.7.3" "2.4.1" "1.4.0" "1.10.0" "1.0.0" "1.10.0" -- ************************************************************** Ivan Baxter Research Scientist Bindley Bioscience Center Purdue University 765-543-7288 ibaxter at purdue.edu

GO Survival hgu133a2 annotate hexbin limma gcrma matchprobes annaffy GO Survival hexbin • 1.4k views

ADD COMMENT • link updated 19.1 years ago by Gordon Smyth 53k • written 19.2 years ago by Ivan Baxter ▴ 60

score 0 · Answer 1 · 2006-12-02

Hi Ivan, Take a look at limma2annaffy in the affycoretools package. I think this will do what you want in a one-line function. Best, Jim Ivan Baxter wrote: > Greetings- > > I am using limma to analyze a fairly simple multiple treatment > experiment and I would like to pull out all the genes which are > significantly expressed and change at least two fold. I have figured out > a way to do this, but it is a little complicated and I am running into > problems downstream when I try to format output tables using annaffy > (described below). Is there a simple way to filter the output of > topTable to only get the genes which are significant and change more > than a give fold-change cutoff? > > > The way that I am going about this is.... > > cont.matrix <- makeContrasts( > EGvsC = EG.C.C.C - C.C.C.C, > ECvsC = C.EC.C.C - C.C.C.C, > EGECvsC = EG.EC.C.C - C.C.C.C, > GTvsC = C.C.G.C - C.C.C.C, > GT_APvsC = C.C.G.A - C.C.C.C, > GT_APvsGT = C.C.G.A - C.C.G.C, > levels=design) > fit2 <- contrasts.fit(fit, cont.matrix) > fit2 <- eBayes(fit2) > results <- decideTests(fit2) > idx <- abs(fit2$coefficients) > 1 > #make a dataframe like results where 1/-1 indicates sigchange greater > than 2 fold > comb <- data.frame(gene = rownames(results)) > for(i in 1:length(rownames(results))){ > for(j in 1:length(colnames(results))){ > > if(results[i,j] != 0 & idx[i,j] == "TRUE"){ > comb[i,j+1] <- results[i,j] > } > else{comb[i,j+1] <-0} > } > } > > #but then I want to make an output table with gene names, gene > annotations, fold change, pvalue and the expression values #across the > arrays...... > > > cax1_genes <- unlist(as.list(comb$gene[comb$GTAPvsC != 0])) > syms <- unlist(mget(cax1_genes, hgu133a2GENENAME)) > test <- match(cax1_genes, geneNames(human.eps2)) > anncols <- aaf.handler()[c(1:4,7)] > anntable <- aafTableAnn(geneNames(human.eps2)[test], "hgu133a2", anncols) > > #now I need to get the fold- change and adj.p.value for each gene, and > here is where I run into trouble. > #I tried pulling out all the significant changes using topTable and then > pulling out the list of genes that met my criteria, but... > contp <- 5 # this is the contrast which is being tested > caxsig <- length(which(p.adjust(fit2$p.value[,contp], method = "BH") < > 0.05)) > cax1sig <- topTable(fit2, coef =contp, number =caxsig, adjust.method = > "none", sort.by = "p", resort.by = "M") > testtable <- aafTable("log2 change" = format(cax1sig$M[cax1sig$ID == > cax1_genes], digits = 2), > "pval" = > format(cax1sig$adj.P.Val[cax1sig$ID == cax1_genes], digits = 2)) > anntablep <- merge(anntable, testtable) > > > # doesn't work, I get the following error: > > testtable <- aafTable("log2 change" = format(cax1sig$M[cax1sig$ID == > cax1_genes], digits = 2), "pval" = format(cax1sig$adj.P.Val[cax1sig$ID > == cax1_genes], digits = 2)) > Warning messages: > 1: longer object length > is not a multiple of shorter object length in: cax1sig$ID == > cax1_genes > 2: longer object length > is not a multiple of shorter object length in: cax1sig$ID == > cax1_genes > > If this is actually a good way to pull out the two fold genes, could > anyone tell me what I am doing wrong with this last step? > > thanks in advance. > > Ivan > > > > > sessionInfo() > Version 2.3.0 (2006-04-24) > i386-pc-mingw32 > > attached base packages: > [1] "grid" "splines" "tools" "methods" "stats" > "graphics" "grDevices" "utils" "datasets" "base" > > other attached packages: > annaffy KEGG GO hgu133a2 RColorBrewer > geneplotter annotate hexbin colorspace lattice > genefilter > "1.4.0" "1.12.0" "1.12.0" "1.12.0" "0.2-3" > "1.10.0" "1.10.0" "1.6.0" "0.9" "0.13-8" "1.10.1" > survival limma gcrma matchprobes affy > affyio Biobase > "2.24" "2.7.3" "2.4.1" "1.4.0" "1.10.0" > "1.0.0" "1.10.0" > > > -- James W. MacDonald University of Michigan Affymetrix and cDNA Microarray Core 1500 E Medical Center Drive Ann Arbor MI 48109 734-647-5623 ********************************************************** Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues.

score 0 · Answer 2 · 2006-12-04

Dear Ivan, Also, you can use the lfc (log-fold-change) argument to decideTests(). Best wishes Gordon >Date: Sat, 02 Dec 2006 09:50:31 -0500 >From: "James W. MacDonald" <jmacdon at="" med.umich.edu=""> >Subject: Re: [BioC] 2fold sigchanges from limma >To: Ivan Baxter <ibaxter at="" purdue.edu=""> >Cc: bioconductor at stat.math.ethz.ch > >Hi Ivan, > >Take a look at limma2annaffy in the affycoretools package. I think this >will do what you want in a one-line function. > >Best, > >Jim > >Ivan Baxter wrote: > > Greetings- > > > > I am using limma to analyze a fairly simple multiple treatment > > experiment and I would like to pull out all the genes which are > > significantly expressed and change at least two fold. I have figured out > > a way to do this, but it is a little complicated and I am running into > > problems downstream when I try to format output tables using annaffy > > (described below). Is there a simple way to filter the output of > > topTable to only get the genes which are significant and change more > > than a give fold-change cutoff? > > > > > > The way that I am going about this is.... > > > > cont.matrix <- makeContrasts( > > EGvsC = EG.C.C.C - C.C.C.C, > > ECvsC = C.EC.C.C - C.C.C.C, > > EGECvsC = EG.EC.C.C - C.C.C.C, > > GTvsC = C.C.G.C - C.C.C.C, > > GT_APvsC = C.C.G.A - C.C.C.C, > > GT_APvsGT = C.C.G.A - C.C.G.C, > > levels=design) > > fit2 <- contrasts.fit(fit, cont.matrix) > > fit2 <- eBayes(fit2) > > results <- decideTests(fit2) > > idx <- abs(fit2$coefficients) > 1 > > #make a dataframe like results where 1/-1 indicates sigchange greater > > than 2 fold > > comb <- data.frame(gene = rownames(results)) > > for(i in 1:length(rownames(results))){ > > for(j in 1:length(colnames(results))){ > > > > if(results[i,j] != 0 & idx[i,j] == "TRUE"){ > > comb[i,j+1] <- results[i,j] > > } > > else{comb[i,j+1] <-0} > > } > > } > > > > #but then I want to make an output table with gene names, gene > > annotations, fold change, pvalue and the expression values #across the > > arrays...... > > > > > > cax1_genes <- unlist(as.list(comb$gene[comb$GTAPvsC != 0])) > > syms <- unlist(mget(cax1_genes, hgu133a2GENENAME)) > > test <- match(cax1_genes, geneNames(human.eps2)) > > anncols <- aaf.handler()[c(1:4,7)] > > anntable <- aafTableAnn(geneNames(human.eps2)[test], "hgu133a2", anncols) > > > > #now I need to get the fold- change and adj.p.value for each gene, and > > here is where I run into trouble. > > #I tried pulling out all the significant changes using topTable and then > > pulling out the list of genes that met my criteria, but... > > contp <- 5 # this is the contrast which is being tested > > caxsig <- length(which(p.adjust(fit2$p.value[,contp], method = "BH") < > > 0.05)) > > cax1sig <- topTable(fit2, coef =contp, number =caxsig, adjust.method = > > "none", sort.by = "p", resort.by = "M") > > testtable <- aafTable("log2 change" = format(cax1sig$M[cax1sig$ID == > > cax1_genes], digits = 2), > > "pval" = > > format(cax1sig$adj.P.Val[cax1sig$ID == cax1_genes], digits = 2)) > > anntablep <- merge(anntable, testtable) > > > > > > # doesn't work, I get the following error: > > > testtable <- aafTable("log2 change" = format(cax1sig$M[cax1sig$ID == > > cax1_genes], digits = 2), "pval" = format(cax1sig$adj.P.Val[cax1sig$ID > > == cax1_genes], digits = 2)) > > Warning messages: > > 1: longer object length > > is not a multiple of shorter object length in: cax1sig$ID == > > cax1_genes > > 2: longer object length > > is not a multiple of shorter object length in: cax1sig$ID == > > cax1_genes > > > > If this is actually a good way to pull out the two fold genes, could > > anyone tell me what I am doing wrong with this last step? > > > > thanks in advance. > > > > Ivan > > > > > > > > > sessionInfo() > > Version 2.3.0 (2006-04-24) > > i386-pc-mingw32 > > > > attached base packages: > > [1] "grid" "splines" "tools" "methods" "stats" > > "graphics" "grDevices" "utils" "datasets" "base" > > > > other attached packages: > > annaffy KEGG GO hgu133a2 RColorBrewer > > geneplotter annotate hexbin colorspace lattice > > genefilter > > "1.4.0" "1.12.0" "1.12.0" "1.12.0" "0.2-3" > > "1.10.0" "1.10.0" "1.6.0" "0.9" "0.13-8" "1.10.1" > > survival limma gcrma matchprobes affy > > affyio Biobase > > "2.24" "2.7.3" "2.4.1" "1.4.0" "1.10.0" > > "1.0.0" "1.10.0" > > > > > > > > >-- >James W. MacDonald >University of Michigan >Affymetrix and cDNA Microarray Core >1500 E Medical Center Drive >Ann Arbor MI 48109 >734-647-5623