Question: Using duplication rate as a covariate
7 days ago, rbutler wrote:

I am working with a workflow that uses fastp -> Salmon -> DESeq2.

Is it generally considered good practice to control for fastp's read duplication rate and/or Salmon's percent mapped (from meta_info.json) in a DESeq2 DE analysis? I have noticed a fair amount of variability across a set of samples from the same prep batch and sequencing run (duplication rate, 22-52%; percent mapped, 82-92%). Duplication rate in particular seems relevant: I did not expect it to be that variable, and previous workflows I ran with STAR had me remove duplicate reads altogether.
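
For reference, both metrics can be collected into a per-sample row before any modelling. The following is a minimal Python sketch, not part of the original workflow. It assumes that fastp's JSON report stores the duplication rate as a fraction under `duplication` -> `rate` and that Salmon's meta_info.json stores `percent_mapped` as a percentage. The function name `qc_covariates` and the JSON values are hypothetical.

```python
import json


def qc_covariates(fastp_report: dict, salmon_meta: dict) -> dict:
    """Return duplication rate and mapping rate for one sample, both in percent."""
    # Assumption: fastp reports the duplication rate as a fraction in [0, 1].
    dup_pct = 100.0 * fastp_report["duplication"]["rate"]
    # Assumption: Salmon already reports percent_mapped as a percentage.
    map_pct = salmon_meta["percent_mapped"]
    return {"read_dups": round(dup_pct, 2), "map_rate": round(map_pct, 2)}


# Hypothetical JSON content standing in for the two files read from disk.
fastp_json = '{"duplication": {"rate": 0.3125}}'
salmon_json = '{"percent_mapped": 88.5, "num_processed": 1000, "num_mapped": 885}'

row = qc_covariates(json.loads(fastp_json), json.loads(salmon_json))
print(row)
```

The keys `read_dups` and `map_rate` match the covariate names used in the design formulas in this post, so the rows could be written out as columns of the sample table.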

I mean, it would be easy enough to use ~ read_dups + trt or ~ read_dups + map_rate + trt as the design, but are there arguments against doing this (e.g., overfitting or removing true variation)?
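
One way to gauge the "removing true variation" risk is to check whether the covariate is associated with treatment: if duplication rate differs systematically between groups, adjusting for it may absorb part of the treatment effect. Here is a minimal Python sketch of that check using a plain Pearson correlation against a 0/1 treatment indicator. The values are made up for illustration, and the names `read_dups` and `trt` simply mirror the design formula above.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical duplication rates (percent) for 4 control and 4 treated samples.
read_dups = [22.0, 30.0, 41.0, 52.0, 25.0, 33.0, 44.0, 49.0]
trt = [0, 0, 0, 0, 1, 1, 1, 1]  # 0 = control, 1 = treated

r = pearson(read_dups, trt)
print(f"correlation between read_dups and trt: {r:.3f}")
```

A correlation near zero suggests the covariate is not tracking treatment in these samples; a large absolute value would be a reason for caution before adding it to the design. This is only a rough screen, not a formal test.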

Tags: deseq2, salmon, fastp
modified 6 days ago by Michael Love • written 7 days ago by rbutler
Answer: Using duplication rate as a covariate
6 days ago, Michael Love (United States) wrote:

I don't typically add in things like RIN or TIN or duplication or mapping rates.

My preferred approach to controlling for technical variation is either Salmon's bias terms (GC, positional, etc.), or otherwise RUV or SVA, providing these packages with the condition.

The vignettes have examples that use 2 SVs. Do you ever use more than 2? Using svaseq to estimate the number of factors gets me a very high number. I tried sequentially plotting SV1, SV1+SV2, SV1+SV2+SV3, etc. using cleaned matrices, but I don't know what I am looking for other than the lowest number of SVs at which the batch effect disappears.

Reply written 6 days ago by rbutler

In that example there are three known batches, and so I know a priori to look for 2 SVs.

I would reach out to the SVA developers for advice on the number of SVs. Maybe write a new post and tag it with the sva package.

Reply written 6 days ago by Michael Love