Question

DESeq2 library composition adjustment in size factors

twfs • 0

@twfs-24066

Last seen 3.3 years ago

United Kingdom

Hi folks,

The size factor computations that DESeq2 uses (including for vst and rlog) account for differences in library composition, among other things. Library composition differences between sample groups are one of the reasons for not using TPMs. However, we are often presented with TPM values only. I wondered whether there are existing methods for quantifying a library composition problem from a set of TPM values and, if not, what might be a good way to quantify the risk of using the TPMs (for visualisation or other purposes). For example, would it do the trick to start by computing a matrix of pairwise Kolmogorov-Smirnov tests on the samples' count distributions and looking for significant differences?

Many thanks

Tim

deseq2 normalization • 1.8k views

ADD COMMENT • link updated 4.4 years ago by Michael Love 43k • written 4.4 years ago by twfs • 0

score 0 · Answer 1 · 2020-10-12

Michael Love 43k

@mikelove

Last seen 8 hours ago

United States

Do you only have TPM and no indication of the sequencing depth?

In tximport, one of the options is to scale up TPMs to account for the sequencing depth (e.g. the number of reads aligning to the transcriptome). You can consult Soneson et al. 2015 for the details; it performs as well as what we have in the DESeq2 vignette for tximport use, which is counts + offset.
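To make the "scale up TPMs to the sequencing depth" idea concrete, here is a minimal sketch in plain Python. It is not tximport's code: the function name `scaled_tpm`, the toy matrix, and the library sizes are all made up for illustration, and it assumes you do have a per-sample read total available.

```python
def scaled_tpm(tpm, library_sizes):
    """Rescale each sample's TPM column so that it sums to that sample's library size.

    tpm: list of gene rows, each a list of per-sample TPM values.
    library_sizes: per-sample read totals (hypothetical values here,
                   e.g. reads aligned to the transcriptome).
    """
    n_samples = len(library_sizes)
    # Column sums of the TPM matrix (1e6 per sample if the full matrix is present)
    col_sums = [sum(row[j] for row in tpm) for j in range(n_samples)]
    return [
        [row[j] * library_sizes[j] / col_sums[j] for j in range(n_samples)]
        for row in tpm
    ]


# Toy data: 3 genes x 2 samples, each column summing to 1e6
tpm = [
    [500000.0, 250000.0],
    [300000.0, 250000.0],
    [200000.0, 500000.0],
]
library_sizes = [2_000_000, 4_000_000]  # assumed read totals per sample

scaled = scaled_tpm(tpm, library_sizes)
for row in scaled:
    print(row)
```

The result is on a count-like scale, which is the form that downstream count-based methods expect as input. The rescaling by itself only restores depth information; it does not correct for library composition.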

ADD COMMENT • link 4.4 years ago Michael Love 43k

Thanks Mike.

Yes, this would be for cases where you have only TPMs available and want to check whether library composition is a problem or not.

But wrt tximport - are you saying that tximport->TPM alone will account for a library composition effect in a way equivalent to the median of ratios? So if you know your TPMs were calculated that way then you're on safer ground?

ADD REPLY • link 4.4 years ago twfs • 0

No. That is just how the data is passed to the methods which then do their own normalization/offset approach. If you do TPM + library size information -> tximport scaledTPM approach -> VST or rlog, then the last step will perform appropriate median ratio scaling and transformation that stabilizes variance.
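For readers unfamiliar with median ratio scaling, the following is a small stdlib-only Python sketch of the idea (per-sample median of each gene's ratio to its geometric mean across samples). It is an illustration, not DESeq2's implementation; the function name `median_ratio_size_factors` and the toy counts are assumptions made for this example.

```python
import math
from statistics import median


def median_ratio_size_factors(counts):
    """Median-of-ratios size factors for a genes x samples matrix.

    counts: list of gene rows, each a list of per-sample values.
    """
    n_samples = len(counts[0])
    # Keep only genes that are positive in every sample, so that the
    # log geometric mean is finite
    usable = [row for row in counts if all(v > 0 for v in row)]
    if not usable:
        raise ValueError("no gene is positive in every sample")
    # Log of the per-gene geometric mean across samples
    log_geo_means = [sum(math.log(v) for v in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - lgm for row, lgm in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Toy data: sample 2 is sequenced twice as deeply as sample 1, and the
# last gene is additionally 10x up-regulated in sample 2
counts = [
    [10, 20],
    [20, 40],
    [30, 60],
    [40, 80],
    [100, 2000],
]
size_factors = median_ratio_size_factors(counts)
totals = [sum(row[j] for row in counts) for j in range(2)]
print("size factor ratio:", size_factors[1] / size_factors[0])
print("total count ratio:", totals[1] / totals[0])
```

In this toy data the size factor ratio recovers the twofold depth difference, whereas the ratio of total counts (2200 vs 200) is dominated by the single highly expressed gene. That gap is the library composition effect that scaling by totals, as TPM does, cannot remove.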

ADD REPLY • link 4.4 years ago Michael Love 43k

Ah OK, that makes sense. Normally I wouldn't have library size information, only TPMs, and so I want to try to quantify likely issues. For example, I have an application that already uses TPMs and want to add a quick test that quantifies the risk by detecting a potential library composition effect. Perhaps the K-S test will suffice? Thanks for helping.
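As a rough sketch of what the proposed pairwise K-S matrix could look like, the snippet below computes the two-sample Kolmogorov-Smirnov statistic (the largest gap between two empirical CDFs) for every pair of samples, using only the Python standard library. The names `ks_statistic` and `pairwise_ks` and the toy TPM values are invented for illustration, and no p-values are computed. Whether this statistic is a suitable detector of library composition effects is not settled in this thread.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample K-S statistic: max absolute difference between empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    # The largest gap between two step functions occurs at an observed value
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def pairwise_ks(samples):
    """samples: dict mapping sample name -> list of per-gene TPM values."""
    names = list(samples)
    return {(p, q): ks_statistic(samples[p], samples[q]) for p in names for q in names}


# Toy data: s1 and s2 share a distribution, s3 is shifted upwards
samples = {
    "s1": [1.0, 2.0, 3.0, 4.0],
    "s2": [4.0, 3.0, 2.0, 1.0],
    "s3": [3.0, 4.0, 5.0, 6.0],
}
ks = pairwise_ks(samples)
for pair, d in ks.items():
    print(pair, d)
```

Note that the statistic depends only on each sample's pooled distribution of values and ignores which gene each value belongs to: s1 and s2 above assign different values to the same genes yet score 0. A shift confined to a handful of very highly expressed genes may therefore move it only slightly.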