normOffsets in csaw
ATpoint ★ 4.6k

Hello Aaron,

I'm looking for some clarification:

Using normOffsets() in csaw produces offsets based on a loess fit, to correct for non-linear bias. Are these ready to be used with calculateCPM()? I would think not, and that one would first need to apply scaleOffset(); is that correct? Also, once the normOffsets() output is stored in the DGEList (or SummarizedExperiment), can one feed that object directly into the default edgeR workflow (estimateDisp etc.) without any further steps? And in any case, do the offsets take full care of both the non-linear bias and the differences in composition/depth between libraries? I would guess so.
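
For concreteness, here is a minimal sketch of the workflow I have in mind (filtered.data and design are placeholders, and I am assuming a csaw version where normOffsets() accepts se.out=):

    library(csaw)
    library(edgeR)

    # Loess-based offsets to correct for the non-linear bias;
    # se.out=TRUE stores them as an "offset" assay of the output object.
    filtered.data <- normOffsets(filtered.data, se.out=TRUE)

    # Hand over to the standard edgeR workflow.
    y <- asDGEList(filtered.data)
    y <- estimateDisp(y, design)
    fit <- glmQLFit(y, design, robust=TRUE)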

I'm asking because the recommended code for using offsets from tximport with edgeR (https://bioconductor.org/packages/release/bioc/vignettes/tximport/inst/doc/tximport.html#edger) involves quite a few steps, and I want to be sure I am not messing things up here. Thank you!

Aaron Lun ★ 28k

Are these ready to be used with calculateCPM()?

If you set use.offsets=TRUE in calculateCPM(), it should handle all of that for you, including the scaleOffset() step.
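
Something along these lines should do it (assuming filtered.data already carries the "offset" assay produced by normOffsets(..., se.out=TRUE); the object name is just a placeholder):

    # Offset-aware log-CPM values; use.offsets=TRUE tells calculateCPM()
    # to use the "offset" assay rather than just the library sizes.
    adjc <- calculateCPM(filtered.data, use.offsets=TRUE, log=TRUE)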

Also, once the normOffsets() output is stored in the DGEList (or SummarizedExperiment), can one feed that object directly into the default edgeR workflow (estimateDisp etc.) without any further steps?

If you have it in your SE, you can run asDGEList() to create a DGEList with the offsets already attached; no further work is required. If you just have a raw matrix of offsets, then you'll have to use scaleOffset() to add it to your DGEList.
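
For example (a sketch, where filtered.data is your SE and offs is a raw offset matrix):

    # Case 1: the SE already contains the "offset" assay from normOffsets().
    y <- asDGEList(filtered.data)   # offsets are carried over automatically

    # Case 2: you only have a raw matrix of offsets.
    y <- DGEList(counts=assay(filtered.data), lib.size=filtered.data$totals)
    y <- scaleOffset(y, offs)       # rescales and attaches the offsets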

And in any case, do the offsets take full care of both the non-linear bias and the differences in composition/depth between libraries?

Yes.

I'm asking because the recommended code for using offsets from tximport with edgeR (https://bioconductor.org/packages/release/bioc/vignettes/tximport/inst/doc/tximport.html#edger) involves quite a few steps, and I want to be sure I am not messing things up here. Thank you!

That example is particularly complicated because the starting material for the offsets is the transcript length, which then needs to be combined with the biases from differences in library size (and also in composition).
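
For reference, that workflow looks roughly like this (paraphrasing the vignette linked above; see that page for the authoritative version):

    # txi comes from tximport(..., type="salmon", tx2gene=tx2gene).
    cts <- txi$counts
    normMat <- txi$length

    # Length factors, scaled so they do not change the magnitude of the counts.
    normMat <- normMat / exp(rowMeans(log(normMat)))
    normCts <- cts / normMat

    # Effective library sizes from the length-corrected counts,
    # which also absorb composition biases via TMM.
    eff.lib <- calcNormFactors(normCts) * colSums(normCts)

    # Combine the length and library size factors into offsets for a log-link GLM.
    normMat <- sweep(normMat, 2, eff.lib, "*")
    normMat <- log(normMat)

    y <- DGEList(cts)
    y <- scaleOffset(y, normMat)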

catchSalmon() could probably be extended to handle that particular bit with less writing on the user's part.
