Question

different results from limma using the same data set and two different methods

0

Entering edit mode

xiaocui zhu ▴ 70

@xiaocui-zhu-801

Last seen 9.6 years ago

Hello all, I have a cDNA data set with 16 different conditions, each with 3 pairs of dyeSwap measurements (so 6 repeats in total). I ranked differentially expressed genes in each condition using two different methods in Limma, and the two methods yielded different B values and gene ranks. I do not know if this is normal or I did something wrong. Here is how I did it: METHOD 1: Generate a MAList for each condition, and fit the linear model with individual MAList separately. The code looks like the following: M.condition1<-read.delim("condition1-log2Ratio.txt", header=TRUE) A.condition1<-read.delim("condition1-amplitude.txt", header=TRUE) MA.condition1<-new("MAList", list(M=M.condition1, A=A.condition1)) MA.condition1$genes<- read.delim("Genes.txt", header=TRUE) fit.condition1<-lmFit(MA.condition1, design=c(1,-1,1,-1,1,-1)) efit.condition1<-eBayes(fit.condition1) output<-topTable(efit.condition1, number=16200, adjust="fdr") write.table(output, file="condition1-ebayes-sep.txt", sep="\t") METHOD 2: Generate a MAList for the entire data set (96 arrays in total), and fit the linear model with the MAList. The code looks like the following: M.all<-read.delim("all-log2Ratio.txt", header=TRUE) A.all<-read.delim("all-Amplitude.txt", header=TRUE) MA.all<-new("MAList", list(M=M.all, A=A.all)) MA.all$genes<-read.delim("Genes.txt", header=TRUE) treatments <- factor(c(1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,4,4, 5,5,5,5,5,5,6,6,6,6,6,6,7,7,7,7,7,7, 8,8,8,8,8,8,9,9,9,9,9,9,10,10,10,10,10,10, 11,11,11,11,11,11,12,12,12,12,12,12,13,13,13,13,13,13, 14,14,14,14,14,14,15,15,15,15,15,15,16,16,16,16,16,16)) design <- model.matrix(~ 0+treatments) design<-edit(design) # change the table to reflect the dyeSwap design colnames(design) <- c("condtion1", " condtion2"," condtion3", "condtion4", " condtion5"," condtion6", "condtion7", " condtion8"," condtion9", "condtion10", " condtion11"," condtion12", "condtion13", " condtion14"," condtion15", "condtion16") fit <- lmFit(MA.lab, design) efit<-eBayes(fit) # To get the gene rank of condition1: output<-topTable(efit, coef=1, number=16200,adjust="fdr") write.table(output, file="condition1-ebayes-batch.txt", sep="\t") The rank and the B value of the same gene in the two output files are different. I may have done something incorrectly in METHOD 2, because some genes in its output file have a different "A" value when compared to the average of the six replicates. On the other hand, all genes in the output file of METHOD 1 have the same "M" and "A" value when compared to the average M and average A values of the six replicates. Any help would be greatly appreciated. Eventually I want to compare across different conditions using METHOD 2. Since I am not sure if my code in METHOD 2 is correct, I have to wait until I figure out what's going on. Thanks a lot! Xiaocui

limma limma • 878 views

ADD COMMENT • link 19.8 years ago xiaocui zhu ▴ 70