Question: Correct use of tximport in combination with edgeR cpm()
10 days ago by ATpoint, Germany
ATpoint wrote:

I imported a set of salmon quantifications into R using tximport with default settings and used exactly the code from the tximport manual page to prepare the data for use with edgeR. The result is a DGEList with the offsets for the downstream DGE analysis.

Issue: The DGEList (y$samples) does not contain usable normalization factors (norm.factors are all 1) for obtaining TMM-normalized counts via cpm(y, log=F). Therefore, the question is how to get normalization factors into y$samples$norm.factors while still using the information from tximport. One can of course run calcNormFactors(y) manually, but then the length offsets from tximport are lost. Is there a recommended approach?

Tags: edger, tximport

This is beyond my knowledge of edgeR. I checked that chunk of tximport vignette code with Aaron at some point, to make sure we were doing it properly.
(written 10 days ago by Michael Love)

Oops. Spotted something. Will open an issue. Edit: actually, ignore that, it's fine - phew.
(written 10 days ago by Aaron Lun)

Maybe add a short comment to the tximport vignette referencing the suggestions from Aaron below. Using the corrected counts for things like clustering is standard, so I was actually surprised that no one had asked this before (to the best of my knowledge; maybe I missed the respective threads).
(written 10 days ago by ATpoint)

Answer: Correct use of tximport in combination with edgeR cpm()
10 days ago by Aaron Lun, Cambridge, United Kingdom
Aaron Lun wrote:

Have a look at csaw::calculateCPM(), which does exactly what you request (see usage here). You'll need to convert your data into a SummarizedExperiment, though, as the function doesn't take DGEList objects. Alternatively, you can use csaw::normFactors() instead of calcNormFactors() to keep everything in SummarizedExperiment form. (Note the difference in the weighted default, though, as this was built for ChIP-seq data.)

Thanks Aaron, I think that should do it. If you move your comment to answer I can accept it.
For completeness, here is the code I used:

    ## convert DGEList to SummarizedExperiment, given a DGEList "y" from the code in the toplevel question
    library(csaw)
    se <- SummarizedExperiment(assays = y$counts)
    names(assays(se))[1] <- "counts"
    se$totals <- y$samples$lib.size
    assay(se, "offset") <- y$offset
    se.cpm <- calculateCPM(se, use.norm.factors = F, use.offsets = T, log = F)


Hi Aaron and AT,

I'll add this to the tximport vignette if Aaron gives the ok. I'm less knowledgeable about the internals, so I want to make sure I don't promulgate something inaccurate.

Looks fine to me. You needn't set use.norm.factors if you have use.offsets=TRUE; the latter overrides the former. The only other comment is to avoid T and F, but I know that Mike would never put those in a vignette anyway.
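As a rough sketch of what the snippet might look like with both suggestions applied (use.norm.factors dropped, TRUE/FALSE spelled out), here is a self-contained version. The toy counts and offsets are made up purely so the code can run on its own; in the real workflow "y" is the DGEList built from the tximport output, and this is not necessarily the text that ends up in the vignette.

```r
library(edgeR)  # DGEList(), scaleOffset()
library(csaw)   # calculateCPM(); also attaches SummarizedExperiment

# Hypothetical stand-in for the DGEList "y" from the question.
set.seed(1)
counts <- matrix(rnbinom(40, mu = 100, size = 10), nrow = 10,
                 dimnames = list(paste0("gene", 1:10), paste0("s", 1:4)))
# Made-up per-observation log offsets; in the real workflow these are derived
# from the tximport length matrix and the effective library sizes.
off <- log(matrix(runif(40, 0.5, 2), nrow = 10, dimnames = dimnames(counts)))
y <- scaleOffset(DGEList(counts), off)

# Name the assays up front instead of renaming them afterwards.
se <- SummarizedExperiment(assays = list(counts = y$counts, offset = y$offset))
se$totals <- y$samples$lib.size

# use.offsets = TRUE overrides use.norm.factors, so the latter is omitted.
# TRUE/FALSE are reserved words, whereas T and F are ordinary variables
# that can be reassigned.
se.cpm <- calculateCPM(se, use.offsets = TRUE, log = FALSE)
```

The result is an ordinary genes-by-samples matrix that can be passed to clustering or plotting code.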

Mike, if you open a PR on the vignette, I can put in some comments to explain what and why, especially around the offset calculation part. Otherwise I'll have to re-remember everything the next time this pops up.


I added the following to the vignette in the devel branch of tximport:

https://github.com/mikelove/tximport/commit/225953efef09f2a925c99242034abfa4d933a0f7

Let me know if that looks ok.

Thanks AT

Thank you both for the outstanding responsiveness to questions and issues. Can you move Aaron's comment to answer so it is the toplevel answer?

Answer: Correct use of tximport in combination with edgeR cpm()
10 days ago by James W. MacDonald, United States
James W. MacDonald wrote:

If you have an offset matrix in your DGEList, then the norm.factors won't be used anyway, so it doesn't matter whether you do something with them or not. Put a different way, the offsets are supposed to be better than simple normalization factors, and glmFit uses them in preference.
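To illustrate why a per-observation offset carries more information than one factor per sample, here is a minimal base-R sketch. All numbers are invented, the TMM step is replaced by plain column sums, and edgeR's scaleOffset() recentring is left out, so this shows the idea only and is not what edgeR or csaw compute internally.

```r
# Toy data: 3 genes x 2 samples (hypothetical numbers, for illustration only).
counts <- matrix(c(100, 200, 300,
                   150, 150, 600), nrow = 3,
                 dimnames = list(paste0("gene", 1:3), c("s1", "s2")))
# Hypothetical average transcript lengths per gene and sample.
len <- matrix(c(1000, 2000, 1500,
                2000, 2000, 1500), nrow = 3, dimnames = dimnames(counts))

# Divide each gene's lengths by their geometric mean, so that only
# between-sample length differences remain.
len.scaled <- len / exp(rowMeans(log(len)))

# Library sizes computed from the length-adjusted counts (no TMM here).
lib <- colSums(counts / len.scaled)

# Per-observation offsets on the log scale: log(length factor * library size).
offset <- log(sweep(len.scaled, 2, lib, "*"))

# Offset-adjusted CPM: every count is divided by its own exp(offset).
cpm.offset <- counts / exp(offset) * 1e6

# Library-size-only CPM for comparison: one scaling factor per sample.
cpm.plain <- sweep(counts, 2, colSums(counts), "/") * 1e6
```

Only gene1 changes length between samples in this toy input, so it is the only gene whose offset includes a length adjustment on top of the per-sample library size.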