Question: How to filter tximport output for edgeR?
Peter (United Kingdom) wrote, 2.7 years ago:

The tximport vignette describes the following steps for using the data with edgeR:

https://github.com/mikelove/tximport/blob/master/vignettes/tximport.md

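library(edgeR)
cts <- txi$counts
normMat <- txi$length
# scale each gene's lengths by their geometric mean across samples
normMat <- normMat/exp(rowMeans(log(normMat)))
# log effective library sizes computed from the length-scaled counts
o <- log(calcNormFactors(cts/normMat)) + log(colSums(cts/normMat))
y <- DGEList(cts)
# per-observation offsets: log length factors plus log effective library sizes
y$offset <- t(t(log(normMat)) + o)
# y is now ready for the dispersion estimation functions; see the edgeR User's Guide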

How should I modify the data to filter non-expressed genes before calculating normalization factors? (e.g. cpm>2 in at least 3 samples)

 

Answer: How to filter tximport output for edgeR?
Michael Love (United States) wrote, 2.7 years ago:

You should be able to adapt the cpm filtering code from the edgeR User Guide, no? y is a DGEList, with normalization factors already calculated. Can you be more specific about your question?


Following the guide, it would look like this:

...
y <- DGEList(cts)
keep <- rowSums(cpm(y) > 2) >= 3
y <- y[keep, , keep.lib.sizes=FALSE]

But then I guess I need to recalculate the normalization factors, and I am also not sure about the offset calculation:

y$offset <- t(t(log(normMat)) + o)

This is not that important, as I can use countsFromAbundance="lengthScaledTPM" or "scaledTPM" and then use the counts directly, but I wanted to compare results from the two approaches.

written 2.7 years ago by Peter
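One way to address the concern about recalculating normalization factors and offsets is to filter the count and length matrices first and only then run the vignette steps, so that everything is computed on the retained genes. The following is a minimal sketch under stated assumptions, not a confirmed recommendation from this thread: `txi` here is a simulated stand-in for real tximport output, and the cpm > 2 in at least 3 samples threshold is the one mentioned in the question.

```r
library(edgeR)

# Simulated stand-in for tximport output (hypothetical data, for illustration only)
set.seed(1)
n_genes <- 200
n_samples <- 6
nm <- list(paste0("gene", seq_len(n_genes)), paste0("s", seq_len(n_samples)))
cts_all <- matrix(rnbinom(n_genes * n_samples, mu = 50, size = 2),
                  n_genes, n_samples, dimnames = nm)
cts_all[1:20, ] <- 0  # a block of non-expressed genes
len_all <- matrix(runif(n_genes * n_samples, 500, 3000),
                  n_genes, n_samples, dimnames = nm)
txi <- list(counts = cts_all, length = len_all)

# Filter on the raw count matrix: cpm > 2 in at least 3 samples
keep <- rowSums(cpm(txi$counts) > 2) >= 3

# Apply the same row filter to counts and lengths before normalizing
cts <- txi$counts[keep, ]
normMat <- txi$length[keep, ]

# Vignette steps, now run on the filtered matrices only
normMat <- normMat / exp(rowMeans(log(normMat)))
o <- log(calcNormFactors(cts / normMat)) + log(colSums(cts / normMat))
y <- DGEList(cts)
y$offset <- t(t(log(normMat)) + o)
```

With this ordering the offset matrix has the same dimensions as the filtered counts, and the normalization factors never see the removed genes. Whether this ordering or filtering the finished DGEList is preferable is not settled in the thread.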

Maybe one of the edgeR authors can say more on this, but you could just do keep.lib.sizes=TRUE for comparison with the countsFromAbundance approach.

written 2.7 years ago by Michael Love
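To make the keep.lib.sizes suggestion concrete, here is a small sketch of what the argument changes when subsetting a DGEList. The counts are simulated (an assumption for illustration, not data from this thread), with a few rows constructed so that they fail the cpm filter.

```r
library(edgeR)

# Simulated counts (hypothetical data, for illustration only)
set.seed(2)
cts <- matrix(rnbinom(600, mu = 20, size = 2), nrow = 100, ncol = 6)
# Make the first 10 genes expressed in only 2 samples so the filter drops them
cts[1:10, 3:6] <- 0
cts[1:10, 1:2] <- 5

y <- DGEList(cts)
keep <- rowSums(cpm(y) > 2) >= 3

# keep.lib.sizes = TRUE: library sizes stay as computed from all genes
y_keep <- y[keep, , keep.lib.sizes = TRUE]

# keep.lib.sizes = FALSE: library sizes are recomputed from the retained genes
y_recalc <- y[keep, , keep.lib.sizes = FALSE]

# Compare the two sets of library sizes side by side
cbind(original = y$samples$lib.size,
      kept = y_keep$samples$lib.size,
      recalculated = y_recalc$samples$lib.size)
```

Keeping the original library sizes means the filtered object still reflects the full sequencing depth, which may make a comparison with the countsFromAbundance approach easier to interpret.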

Hello, excuse me, I am a bit new to R. I used the command y$offset <- t(t(log(normMat)) + o) to make a box plot. What should the vertical and horizontal axis titles of the box plot be? May I also make a clustered heatmap from it? Thanks in advance.

written 4 months ago by lkianmehr

This isn’t a tximport question. Maybe consult related papers and workflows for background. edgeR has workflows you can consult.

written 4 months ago by Michael Love