Hello,

I am testing what condition Limma can not be used and do some simulation (as below). I find that no matter one-color or two-color microarray, if only have **one replicate**, Limma can be used to get the result.

But if **no any replicate**, then Limma refuses analysis in the step : fit2 <- eBayes(fit2)

My question is:

If in all arrays, just **one replicate**, may I use Limma to get **reasonal result** for Differentially Expressed Genes (DEG) **even for several group**, since Limma does can run?

If one replicate is not enough, how many replicates are needed to get reasonal result?

Thanks,

Kevin

##################################

# one-color simulation

##################################

library(limma)

# good Targets

# simulate Targets frame for one-color

gr <- c("c", "t1", "t3", "t2", "c") ################ just "c" is replicate

# gr <- c("c", "t1", "t2") # wrong, no any replicate

Targets <- data.frame(group = gr)

rownames(Targets) <- paste0("a", 1:length(gr))

Targets

# simulate expression matrix according to Targets frame

expMat <- matrix(log2(runif(500 * length(gr))), nrow = 500, ncol = length(gr))

colnames(expMat) <- paste0("a", 1:length(gr))

expMat[1:3, ]

# extract groups

gf <- factor(Targets$group)

gf

# design matrix

design <- model.matrix(~ 0 + gf)

design

colnames(design) <- levels(gf)

design

# first linear model

fit <- lmFit(expMat, design)

# make contrast matrix

treat <- setdiff(levels(gf), "c")

ct <- paste0(treat, "-", "c")

cm <- makeContrasts(contrasts = ct, levels = design)

cm

# fit contrast matrix

fit2 <- contrasts.fit(fit, cm)

fit2

##################################

# it gives wrong info if any

##################################

# final model

fit2 <- eBayes(fit2)

##################################

# extract Differentially Expressed Genes (DEG)

for (i in 1:length(ct)) {

print(topTable(fit2, coef = i))

}

#######################################

# two-color simulation

#######################################

library(limma)

# good Targets

# simulate Targets frame for two-color

# just c("c", "t1") and c("t1", c) is one replicate with dye-swap

cy3cy5 <- rbind(

c("c", "t1"),

c("c", "t2"),

c("t1", "c"),

c("t3", "c")

)

# wrong, no any replicate

# cy3cy5 <- rbind(

# c("c", "t1"),

# c("c", "t2"),

# c("t3", "c"),

# c("t4", "c")

# )

colnames(cy3cy5) <- c("cy3", "cy5")

numArray <- nrow(cy3cy5)

Targets <- data.frame(Cy3 = cy3cy5[, "cy3"], Cy5 = cy3cy5[, "cy5"])

rownames(Targets) <- paste0("a", 1:numArray)

Targets

# simulate expression matrix according to Targets frame

expMat <- matrix(log2(runif(500 * numArray)), nrow = 500, ncol = numArray)

colnames(expMat) <- paste0("a", 1:numArray)

expMat[1:3, ]

# design matrix

design <- modelMatrix(Targets, ref = "c")

design

# first linear model

fit <- lmFit(expMat, design)

# make contrast matrix

treat <- setdiff(as.vector(cy3cy5), "c")

ct <- paste0(treat)

cm <- makeContrasts(contrasts = ct, levels = design)

cm

# fit contrast matrix

fit2 <- contrasts.fit(fit, cm)

fit2

##################################

# it gives wrong info if any

##################################

# final model

fit2 <- eBayes(fit2)

##################################

# extract Differentially Expressed Genes (DEG)

for (i in 1:length(ct)) {

print(topTable(fit2, coef = i))

}

Thanks Gordon. I modified the statements "...Limma refuses analysis in the step" according to you. Yes. I am trying to learning other people's work. They did analysis like this: If microarray is two-color: they identified DEG just using 2-fold change, no matter if there exists replicates. If microarray is one-color: if there are at least two replicates in each group, then they used SAM (significance analysis of microarray), otherwise, they just used 2-fold change to obtain DEG.

I would like to know: if it is better to using the

combined 2-fold change and p-valuesthanjust 2-fold changewhen existing replicates,even just one replicate?Thanks,

Kevin

I almost always use p-value, not combined with fold change. Please read the warning about combining fold change and p-value on the topTable help page: ?topTable.

Regarding the "other people's work", I suspect that their decisions relate more to the limitations of SAM (it can't analyse two colour arrays unless all common reference and it needs a minimum number of arrays) rather than what would one ideally want to do.

Thanks Gordon. Will follow your suggestions.