Question

Using R package 'limma' on pre-normalized counts and the effect of non-equivalent reads/sample on limma linear model

2

Entering edit mode

tealfurn ▴ 20

@tealfurn-11619

Last seen 7.6 years ago

Greetings!

I am doing RNA-seq on a microbiome sample. I therefore get many (~70% of all reads) multi-mapped reads due to species similarities (even using a clustered reference database).

These are good reads and real hits, but this causes the # of reads/sample to be largely inflated and would not be correct to use for normalization. I think if I normalize to the "real" # of reads I can overcome this problem, but I would have to feed into limma-voom pipeline (below).

Q1: where in the limma-voom pipeline (below) should I modify to prevent it from trying to normalize?

Q2: if there are an unequal # of reads that limma "sees", or if I remove "voom" when it tries to do the model, does that ruin the modeling itself?

__basic limma-voom pipeline__

1. make DESeq object

y = DGEList(counts = RPKMs, genes = info.matrix, remove.zeros = TRUE)

2. voom

y = calcNormFactors(y, method = "TMM", doWeighting=TRUE)

v=voom(y,design.matrix,plot = TRUE, normalize="quantile")

3. limma

fit = lmFit(v,design.matrix)

fit2 = contrasts.fit(fit, contrast.matrix)

fit2 = eBayes(fit2)

summary(decideTests(fit2, adjust.method = "BH"))

limma rna-seq multi-mapped reads normalization microbiome • 3.2k views

ADD COMMENT • link updated 7.6 years ago by Steve Lianoglou ★ 13k • written 7.6 years ago by tealfurn ▴ 20

score 4 · Answer 1 · 2016-10-06

If all you've got is RPKMs, then don't go the DGEList -> voom route, but rather log2(rpkms + 1) your data and use the limma-trend pipeline.

Going off of the code you provided, that might look something like this:

rpkms.all <- log2(RPKMs + 1)
## remove lowly expressed genes, the threshold I'm using is just an example
rpkms <- rpkms.all[rowMeans(rpkms.all) >= 1), ]
fit <- lmFit(rpkms, design)
fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2, trend=TRUE)
...

Searching this site for "voom" or "llimma trend" will provide more references to dig further if this isn't sufficient