Question

about limma (paired design) : code verification

0

Entering edit mode

Bogdan ▴ 670

@bogdan-2367

Last seen 17 months ago

Palo Alto, CA, USA

Dear all, and Aaron,

I am using the following R code for the paired design analysis of a 6*2 RNA-seq samples : please could you let me know if there are any changes/modifications you would suggest . thanks !

library("limma")
library("edgeR")

### reading the expression dataset

eset <- read.delim("mm10.strand.both.count.exons.condense.genes.NOADJ",row.names="Symbol")

### reading the library sizes

libsize <- c(28182007, 28599704, 26386209, 28069058, 26361374, 27108450, 27097918, 12155292, 27124214, 26774125, 26241068, 26296757)

### setting up the groups and the subjects

group <- factor(c("csc","csc","csc","csc","csc","csc","non","non","non","non","non","non"))
subject <- factor(c(1,2,3,4,5,6,1,2,3,4,5,6))

### setting up the design and the contrast matrix

design <- model.matrix(~0+group+subject)
contrast.matrix <- makeContrasts(groupcsc-groupnon, levels=design)

### filtering the genes based on CPM :

y <- DGEList(counts=eset,group=group)

keep <- rowSums(cpm(y, lib.size=libsize)>1) >= 3
y <- y[keep,]

y$samples$lib.size <- colSums(y$counts)

### computing the normalization factors :

y <- calcNormFactors(y)

### using the VOOM transformation :

v <- voom(y,design,plot=TRUE)

### doing the LINEAR FIT in LIMMA :

fit <- lmFit(v, design)

fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2)

### obtaining and writing the results :

results <- topTable(fit2, coef=1, adjust="fdr", number=24073)

write.table(results, file="results", sep="\t", eol="\n", row.names=TRUE, col.names=TRUE)

limma edgeR • 1.2k views

ADD COMMENT • link 8.8 years ago Bogdan ▴ 670

0

Entering edit mode

Can I ask why you are inputting the libsizes separately? Are these not the same as the column sums of eset?

ADD REPLY • link 8.8 years ago Gordon Smyth 52k

0

Entering edit mode

Hi Gordon, thanks for your email. The reason for inputting the libsizes separately is the following :

the list of genes for DGE analysis contain only protein-coding and long non-coding RNAs as annotated in RefSeq, but not the eRNAs (enhancer RNAs) that could be also captured by RNA-seq.

So I though to keep the total library sizes, instead of using the library sizes only based on protein-coding and long non-coding RNAs. Although of course, the second option is a good choice too.

Shall you have any comments about how to improve or modify the code, please let me know. Thanks ;) !

ADD REPLY • link updated 8.8 years ago by Gordon Smyth 52k • written 8.8 years ago by Bogdan ▴ 670

score 1 · Answer 1 · 2016-06-13

This is quite a standard analysis and I guess you already know that your code is basically correct.

Considering that the library sizes are almost equal, I would personally run a slightly simpler, cleaner analysis using limma-trend:

library("edgeR")
eset <- read.delim("mm10.strand.both.count.exons.condense.genes.NOADJ",row.names="Symbol")
group <- factor(c("csc","csc","csc","csc","csc","csc","non","non","non","non","non","non"))
group <- relevel(group, ref="non")
subject <- factor(c(1,2,3,4,5,6,1,2,3,4,5,6))
design <- model.matrix(~group+subject)
y <- DGEList(counts=eset, group=group)
keep <- rowSums(cpm(y) > 0.5) >= 6
y <- y[keep,, keep.lib.sizes=FALSE]
y <- calcNormFactors(y)
logCPM <- cpm(y, log=TRUE, prior.count=3)
fit <- lmFit(logCPM, design)
fit <- eBayes(fit, trend=TRUE, robust=TRUE)
plotSA(fit)
results <- topTable(fit, coef=2, number=Inf)

score 0 · Answer 2 · 2016-06-13

0

Entering edit mode

Bogdan ▴ 670

@bogdan-2367

Last seen 17 months ago

Palo Alto, CA, USA

Thanks a lot Gordon for confirmation. A question though, if I may add : although I have looked through several webpages,

still not very sure about the difference between :

design <- model.matrix(~group+subject)

and

design <- model.matrix(~0+group+subject)

in terms of the practical output, I have not noticed any difference, I am asking just for my knowledge. Thanks again !

ADD COMMENT • link 8.8 years ago Bogdan ▴ 670

0

Entering edit mode

In the limma-trend pipeline, the results will be the same. In the voom pipeline, the first is slightly better.

But anyway, why go to extra trouble to add in extra steps?

ADD REPLY • link 8.8 years ago Gordon Smyth 52k

0

Entering edit mode

thanks Gordon for the quick reply. I have been just reading the updated documentation on limma and edgeR now (after 2-3 years), and I was just learning (I would say ;)

ADD REPLY • link 8.8 years ago Bogdan ▴ 670