WEHI, Melbourne, Australia
If you are using tximport to input data to edgeR, just follow the advice in the tximport vignette about how to do that. Once you create the DGEList object for edgeR, you can proceed with a standard edgeR analysis.
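For orientation, here is a minimal sketch of that import step, following the edgeR recipe in the tximport vignette as I understand it; check the current vignette before relying on it. The matrices `cts` and `len` are simulated stand-ins (my assumption) for the `txi$counts` and `txi$length` components that `tximport()` would return, so that the example runs without any quantification files.

```r
library(edgeR)

# Simulated stand-ins for txi$counts and txi$length (assumption: in a real
# analysis these come from tximport(), not from random numbers).
set.seed(1)
ngenes <- 200
nsamples <- 6
cts <- matrix(rnbinom(ngenes * nsamples, mu = 100, size = 10), ngenes, nsamples)
len <- matrix(runif(ngenes * nsamples, min = 500, max = 3000), ngenes, nsamples)
rownames(cts) <- rownames(len) <- paste0("gene", seq_len(ngenes))
colnames(cts) <- colnames(len) <- paste0("sample", seq_len(nsamples))

# Scale the average transcript lengths so that each gene has geometric mean 1.
# This keeps the length correction from changing the magnitude of the counts.
normMat <- len / exp(rowMeans(log(len)))
normCts <- cts / normMat

# Effective library sizes computed from the length-scaled counts,
# to account for composition biases between samples.
eff.lib <- calcNormFactors(normCts) * colSums(normCts)

# Combine the effective library sizes with the length factors and take logs
# to obtain offsets for a log-link GLM.
normMat <- sweep(normMat, 2, eff.lib, "*")
normMat <- log(normMat)

# Create the DGEList and attach the offset matrix.
y <- DGEList(cts)
y <- scaleOffset(y, normMat)
```

After this, `y$offset` holds one offset per gene and sample, and the usual filtering, dispersion estimation and testing steps follow.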
1) How big a difference do these make if we are also doing alternative splicing and per-transcript analysis?
Gene-level differential expression, transcript-level differential expression and testing for alternative splicing are quite different things and need three different approaches to quantification and normalization. The tximport import protocol and offset matrix are intended only for gene-level differential expression.
2) Is it true that calculated offsets are used instead of internal normalization?
Yes. The offsets are the normalization: they encode observation-specific effective library sizes. There would be no point in supplying an offset matrix to edgeR if edgeR then overwrote it.
Is there an explanation somewhere of how the y$offset variable is handled in each function?
Every function has a help page. Basically, the offsets are used whenever edgeR fits a glm, and hence they become part of any downstream analysis such as dispersion estimation or testing.
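As a small illustration of that behaviour, the sketch below uses `getOffset()` to show which offsets edgeR will use for a DGEList, before and after an offset matrix is attached. The counts and the offset matrix are simulated purely for illustration (my assumption), and the design is a hypothetical two-group comparison.

```r
library(edgeR)

# Simulated counts for a hypothetical two-group experiment.
set.seed(1)
ngenes <- 200
nsamples <- 6
cts <- matrix(rnbinom(ngenes * nsamples, mu = 50, size = 5), ngenes, nsamples)
group <- factor(rep(c("A", "B"), each = 3))
design <- model.matrix(~group)

y <- DGEList(cts)
y <- calcNormFactors(y)

# Without an offset matrix, the offsets are a vector with one value per
# sample: the log of lib.size * norm.factors.
off_default <- getOffset(y)

# A made-up observation-specific offset matrix, standing in for one produced
# by tximport, cqn or EDASeq.
offmat <- matrix(log(colSums(cts)), ngenes, nsamples, byrow = TRUE) +
  matrix(rnorm(ngenes * nsamples, sd = 0.1), ngenes, nsamples)
y <- scaleOffset(y, offmat)

# With y$offset set, getOffset() returns that matrix instead, so the
# library sizes and norm factors are no longer what the GLM sees.
off_matrix <- getOffset(y)

# Dispersion estimation and model fitting then use the matrix offsets.
y <- estimateDisp(y, design)
fit <- glmQLFit(y, design)
```

Comparing `off_default` (a vector of length 6) with `off_matrix` (a 200 by 6 matrix) makes it clear that a supplied offset matrix replaces, rather than supplements, the library-size normalization.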
3) If we want to normalize on multiple criteria, can we add all those offsets and is that recommended?
edgeR accepts offset matrices from external normalization packages such as EDASeq, cqn or tximport but does not create observation-specific offset matrices itself. If you want to create your own offset matrix according to your own criteria, then making sure the offset matrix is sensible is your responsibility. I would not recommend just adding up separate offset matrices. If you are worried about GC content, you could use Salmon, which already adjusts for GC content as part of the transcript quantification.