Question: Limma design : What is the "best" method to infer biological replicates during differentially expression analysis

0

Santy Marques-Ladeira •

**20**wrote:Dear Limma users,
I'm currently working on Affymetrix microarray (Hu-gene 2.1) with only
one
channel. I got 8 experimental stimuli conducted on 7 donors. During
the
experimental process each stimulus was conducted at the same time for
all
the donors so I assume there are no technical replicates, but I need
to
take in account the "donor effect" (or biological replicates).
In the "Limma" documentation there are 2 ways of including this
"effect"
but it is still not clear to me what is the best choice, and the 2
methods
gave me 86% of correlation for a pairwise comparison test.
*1/The first method I used was including the donors in my design model
:*
# I build a matrix with 2 columns (Stimulus + Donor)
# Each row is an array
sample = matrix(0, nrow=55, ncol=2)
rownames(sample) <- colnames("normalised data I use to perform the
analysis")
colnames(sample) <- c("Stimulus", "Donor")
# In the column Stimulus there is directly the name of the stimulus
tested
sample[1:7,1] = "Stim1"
sample[8:14,1] = "Stim2"
sample[15:20,1] = "Stim3"
sample[21:27,1] = "Stim4"
sample[28:34,1] = "Stim5"
sample[35:41,1] = "Stim6"
sample[42:48,1] = "Stim7"
sample[49:55,1] = "Stim8"
# In the column Donor there is the number of the donor (from 2 to 8)
sample[,2] =
c(rep(c(2,3,4,5,6,7,8),2),3,4,5,6,7,8,rep(c(2,3,4,5,6,7,8),5))
targets = as.data.frame(sample)
Stimulus = factor(targets$Stimulus, levels = c("Stim1", "Stim2",
"Stim3",
"Stim4", "Stim5", "Stim6", "Stim7", "Stim8"))
Donor = factor(targets$Donor, levels = c("2", "3", "4", "5", "6", "7",
"8"))
# I use the model.matrix() function to infer the design based on my
matrix
design <- model.matrix(~-1+Stimulus+Donor)
rownames(design) <- colnames("normalised data I use to perform the
analysis")
colnames(design)[1:8] <- c("Stim1", "Stim2", "Stim3", "Stim4",
"Stim5",
"Stim6", "Stim7", "Stim8")
#This gives me :
# Stim1 Stim2 Stim3 ... ... ...
Donor3
Donor4 ... ... Donor8
#experience1 1 0 0
0 0 0
#experience2 1 0 0
1 0 0
#experience3 1 0 0
0 1 0
#...
#...
#...
First thing is that because I remove for a stimulus the donor2, this
donor
is removed from the design model (maybe because of dimension for
regression
?)
Then I do the usual commands :
fit <- lmFit("normalised data I use to perform the analysis", design)
cont.matrix<-makeContrasts(Stim2vsStim4=Stim2-Stim4, levels=design)
fit2<-contrasts.fit(fit, cont.matrix)
fit2 <- eBayes(fit2)
anadiff.limma.Stim2<-topTable(fit2, number=1000, adjust="BH")
*2/The second method I used is by treating donors as replicates
separately:*
I used the exact same "targets" data.frame, except there is no more
the
Donor column but only the Stimulus column
[...]
design <- model.matrix(~-1+Stimulus)
[...]
# List of donors saved as a variable for blocks
biolrep =
c(rep(c(2,3,4,5,6,7,8),2),3,4,5,6,7,8,rep(c(2,3,4,5,6,7,8),5))
# I used the duplicateCorrelation() function to take in account the
link
between donors
corfit<-duplicateCorrelation(c031, design, ndups=1, block=biolrep)
Then I do the usual commands (incorporating the donor information) :
fit<-lmFit(c031, design, block=biolrep, cor=corfit$consensus)
cont.matrix<-makeContrasts(Stim2vsStim4=Stim2-Stim4, levels=design)
fit2<-contrasts.fit(fit, cont.matrix)
fit2<-eBayes(fit2)
anadiff.limma.Stim2<-topTable(fit2, number=1000, adjust="BH")
When I check the results and compare the list of the 1000 genes given
by
each method, there are 857 genes in common. The fact that I found the
use
of these two methods on this site doesn't helps me much about choosing
one.
Does someone got an idea about what method I should use for biological
replicates ? (And by the way if someone got an answer more precise
about
why Donor2 is excluded, that should be nice)
Best regards,
Santy
[[alternative HTML version deleted]]

ADD COMMENT
• link
•
modified 4.4 years ago
by
Gordon Smyth ♦

**34k**• written 4.4 years ago by Santy Marques-Ladeira •**20**