block design and how it handles missing values in Limma
0
0
Entering edit mode
xiaocui zhu ▴ 70
@xiaocui-zhu-801
Last seen 9.6 years ago
Hello all, I have a cDNA array data sets collected from a time-course experiment. The experiment design was similar to the following: 1)Treat cells with ligands A, ligand B, or a vector control 2)Harvest cells at 1h and 2h 3)Measure expression changes in treated cells relative to time-matched-controls using 2-color cDNA arrays with a dyeSwap design (each treated and time-matched-control sample pairs were hybridized onto two arrays with a dyeSwap). Step 1) to 3) were repeated three times, so that for each treatment condition, we have three biological and two technical repeats. I used Limma to identify differentially expressed genes in response to each ligand, and genes differentially expressed in response to ligand A vs. to ligand B at each time point. Since each time the experiment was repeated, the cell preparation and other experimental conditions might vary slightly, I thought that the data collected from one experiment can be considered a block to account for the batch variance. Parts of the codes taking into account the dyeSwap design and block factor are as the following: #Identify differentially expressed genes at each time point to each ligand treatments <- factor(c(1,1,1,1,1,1, 2,2,2,2,2,2, 3,3,3,3,3,3, 4,4,4,4,4,4)) vector<- c(1,-1,1,-1,1,-1, 1,-1,1,-1,1,-1, 1,-1,1,-1,1,-1, 1,-1,1,-1,1,-1) design <- model.matrix(~ 0+treatments) colnames(design) <- c("A.1h","A.2h","B.1h","B.2h") fit<- lmFit(MA, design, block=c(rep(c(1,1,2,2,3,3),4))) efit <-eBayes(fit) for (i in 1:length(colnames(design))){ output<-topTable(efit, coef = i, number=16200, adj="fdr") write.table(output, file = paste(colnames(design)[i], ".txt", sep=""), sep="\t") } When I examined the output files from the above codes, I was concerned that the M value for some of the array features did not equal to the average of the replicates, even though it's supposed to. This is only seen with features if a pair of its dyeSwap measurements had a "NA" value in only one of the arrays. If both arrays of a dyeSwap pair gave a "NA" value for the feature, the M value would still be equal to the average of replicates as it's supposed to. This problem seemed to be caused by including the block factor in the lmFit statement, because no such inconsistency was found in the output if I removed the block factor. I don't know whether this inconsistency is due to some errors in my codes, or whether block design somehow handles missing value in a dyeSwap pair differently. My questions are: 1) Is it appropriate to use block design in my case? 2) How does block design handles missing values of a dyeSwap pair? Why do I see that in a block design, if a pair of dyeSwap measurements has only one missing value for a feature, the M value of that feature does not equal to the average of the replicates? Any help to this matter will be greatly appreciated! Xiaocui
limma limma • 908 views
ADD COMMENT

Login before adding your answer.

Traffic: 760 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6