Question: Using duplication rate as a covariate
9 weeks ago by
rbutler0 wrote:

I am working with a workflow that uses fastp -> Salmon -> DESeq2.

Is it generally considered good practice to control for fastp's read duplication rate and/or Salmon's percent mapped (from meta_info.json) when doing a DESeq2 DE analysis? I have noticed a fair amount of variability across a set of samples from the same prep batch and sequencing run (duplication rate, 22-52%; percent mapped, 82-92%). Duplication rate in particular seems relevant: I did not expect it to vary that much, and previous workflows I ran with STAR had me remove duplicate reads altogether.

It would be easy enough to use ~ read_dups + trt or ~ read_dups + map_rate + trt, but are there arguments against doing this (e.g., overfitting or removing true variation)?
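One way to make the "removing true variation" concern concrete is to check how strongly a technical covariate tracks the treatment before putting it in the design. The sketch below is only an illustration in plain Python, not part of the workflow above: the helper names and the six sample values are hypothetical (made up within the 22-52% range mentioned), and a high correlation is a warning sign, not a formal test.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return sxy / sqrt(sxx * syy)


def confounding_check(covariate, treatment, treated_label):
    """Correlate a technical covariate with a 0/1 treatment indicator."""
    indicator = [1.0 if t == treated_label else 0.0 for t in treatment]
    return pearson(covariate, indicator)


# Hypothetical values for illustration only (not from the poster's data).
read_dups = [22.0, 30.5, 27.1, 45.0, 52.0, 48.3]
trt = ["ctrl", "ctrl", "ctrl", "treated", "treated", "treated"]

r = confounding_check(read_dups, trt, "treated")
print(f"correlation of duplication rate with treatment: {r:.2f}")
```

If the covariate is nearly collinear with trt, as in this made-up case, a design such as ~ read_dups + trt would leave little treatment signal to estimate, which is one argument for caution.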

deseq2 salmon fastp
Answer: Using duplication rate as a covariate
9 weeks ago by
Michael Love26k wrote:

I don't typically add in covariates like RIN, TIN, duplication rate, or mapping rate.

My preferred approach to controlling for technical variation is either Salmon's bias terms (GC, positional, etc.) or otherwise RUV or SVA, providing these packages with the condition.
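For the Salmon side of this, the bias terms are switched on at quantification time. Here is a minimal sketch that assembles such a command line in Python; the helper name, index name, and file paths are hypothetical placeholders, and the flags shown (--seqBias, --gcBias, --posBias) are the bias-correction options as I understand them from the Salmon documentation, so check them against your installed version.

```python
def salmon_quant_cmd(index, reads_1, reads_2, out_dir):
    """Build a `salmon quant` argument list with bias correction enabled.

    All paths are supplied by the caller; nothing is executed here.
    """
    return [
        "salmon", "quant",
        "-i", index,          # path to a prebuilt Salmon index
        "-l", "A",            # let Salmon infer the library type
        "-1", reads_1,        # first mates
        "-2", reads_2,        # second mates
        "--seqBias",          # sequence-specific bias correction
        "--gcBias",           # fragment GC bias correction
        "--posBias",          # positional bias correction
        "-o", out_dir,        # output directory for this sample
    ]


# Hypothetical file names, for illustration only.
cmd = salmon_quant_cmd(
    "salmon_index", "sample_R1.fastq.gz", "sample_R2.fastq.gz", "quant/sample"
)
print(" ".join(cmd))
```

The list can be handed to subprocess.run(cmd, check=True) once the paths point at real files.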


The vignettes have examples that use 2 SVs. Do you ever use more than 2? Using svaseq to estimate the number of factors gives me a very high number. I tried sequentially plotting SV1, SV1+SV2, SV1+SV2+SV3, etc., using cleaned matrices, but I don't know what I am looking for other than the lowest number of SVs at which the batch effect disappears.

written 9 weeks ago by rbutler0

In that example there are three known batches, and so I know a priori to look for 2 SVs.

I would reach out to the SVA developers for advice on the number of SVs. Maybe write a new post and tag the sva package.
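One way to read the "three known batches, so 2 SVs" remark: a batch factor with three levels needs only two indicator columns next to the intercept, so two factors are enough to span it. The sketch below shows that count in plain Python; the function name and batch labels are hypothetical, and this is my interpretation of the reasoning, not a rule for choosing the number of SVs in general.

```python
def dummy_columns(batches):
    """Indicator columns for a batch factor, with the first level as reference."""
    levels = sorted(set(batches))
    # The reference level (levels[0]) is absorbed by the intercept.
    return {lvl: [1 if b == lvl else 0 for b in batches] for lvl in levels[1:]}


# Hypothetical batch labels for six samples.
batches = ["A", "A", "B", "B", "C", "C"]
cols = dummy_columns(batches)
print(len(cols), "columns besides the intercept:", sorted(cols))
```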

written 9 weeks ago by Michael Love26k