The tximport vignette (https://github.com/mikelove/tximport/blob/master/vignettes/tximport.md) describes the following steps for using the data with edgeR:
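library(edgeR)
# Estimated counts and average transcript lengths from tximport
cts <- txi$counts
normMat <- txi$length
# Scale each gene's lengths by their geometric mean across samples
normMat <- normMat/exp(rowMeans(log(normMat)))
# Per-sample offset: log normalization factor plus log library size of the length-scaled counts
o <- log(calcNormFactors(cts/normMat)) + log(colSums(cts/normMat))
y <- DGEList(cts)
# Combine the per-gene length offsets with the per-sample offsets
y$offset <- t(t(log(normMat)) + o)
# y is now ready for the dispersion estimation functions; see the edgeR User's Guide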
How should I modify the data to filter non-expressed genes before calculating normalization factors? (e.g. cpm>2 in at least 3 samples)
Using the guide then it would look like this:
But then I guess I need to recalculate the normalization factors, and I am also not sure about the offset calculation.
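For reference, one way the filter-then-normalize order could look is sketched below. It is only a sketch under assumptions: the `txi` object here is a simulated toy stand-in (a real one comes from `tximport()`), the CPM filter is applied to the raw estimated counts rather than the length-scaled ones, and the names `keep`, `normCts`, `mu` and `dn` are introduced just for illustration. Because `normMat` is scaled row by row, subsetting it after the geometric-mean step gives the same values as subsetting before it.

```r
library(edgeR)

# Hypothetical toy stand-in for a tximport result (simulated, not real data).
set.seed(1)
n_genes <- 2000
n_samples <- 6
mu <- rep(c(0.5, 500), each = n_genes / 2)  # half lowly, half highly expressed
dn <- list(paste0("gene", seq_len(n_genes)), paste0("s", seq_len(n_samples)))
txi <- list(
  counts = matrix(rnbinom(n_genes * n_samples, mu = mu, size = 2),
                  nrow = n_genes, dimnames = dn),
  length = matrix(runif(n_genes * n_samples, 500, 3000),
                  nrow = n_genes, dimnames = dn)
)

cts <- txi$counts
normMat <- txi$length
normMat <- normMat / exp(rowMeans(log(normMat)))

# Filter first: keep genes with CPM > 2 in at least 3 samples.
# Assumption: CPM is computed on the raw estimated counts.
keep <- rowSums(cpm(cts) > 2) >= 3
cts <- cts[keep, ]
normMat <- normMat[keep, ]

# Recompute normalization factors and offsets using the filtered genes only.
normCts <- cts / normMat
o <- log(calcNormFactors(normCts)) + log(colSums(normCts))
y <- DGEList(cts)
y$offset <- t(t(log(normMat)) + o)
```

Since both the normalization factors and the column sums are computed after subsetting, the offsets reflect only the retained genes, which is the recalculation the question asks about.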
Not that important as I can use countsFromAbundance="lengthScaledTPM" or "scaledTPM" and then use counts, but wanted to compare results from the two approaches.
Maybe one of the edgeR authors can say more on this, but you could just do keep.lib.sizes=TRUE for comparison with the countsFromAbundance approach.
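As a rough illustration of what keep.lib.sizes changes, the sketch below subsets a DGEList both ways. This is one reading of the suggestion above, not a confirmed recipe: the count matrix is simulated toy data standing in for counts obtained with countsFromAbundance, and the names `y_keep`, `y_recal` and `lib_before` are hypothetical.

```r
library(edgeR)

# Hypothetical toy counts (simulated), standing in for txi$counts.
set.seed(2)
n_genes <- 1000
n_samples <- 6
mu <- rep(c(0.5, 500), each = n_genes / 2)
cts <- matrix(rnbinom(n_genes * n_samples, mu = mu, size = 2),
              nrow = n_genes,
              dimnames = list(paste0("gene", seq_len(n_genes)),
                              paste0("s", seq_len(n_samples))))

y <- DGEList(cts)
lib_before <- y$samples$lib.size

# Same filter as in the question: CPM > 2 in at least 3 samples.
keep <- rowSums(cpm(y) > 2) >= 3

# keep.lib.sizes = TRUE retains the original library sizes after filtering;
# keep.lib.sizes = FALSE recomputes them from the remaining genes.
y_keep  <- y[keep, , keep.lib.sizes = TRUE]
y_recal <- y[keep, , keep.lib.sizes = FALSE]

# Normalization factors are then computed on the filtered object.
y_keep <- calcNormFactors(y_keep)
```

With keep.lib.sizes=TRUE the library sizes match those of the unfiltered data, so only the set of genes differs between the filtered and unfiltered objects.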
Hello, excuse me, I am a bit new to R. I used the command y$offset <- t(t(log(normMat)) + o) to make a box plot. What should the vertical and horizontal axis titles of the box plot be? May I also make a clustered heatmap from it? Thanks in advance.
This isn’t a tximport question. Maybe consult related papers and workflows for background. edgeR has workflows you can consult.