Question

Independent filtering very different across different contrasts

danieljcavanaugh • 0

@danieljcavanaugh-15999


Hi, I am analyzing an RNAseq experiment that had two different treatments (treated and control) and two time points per treatment (week 2 and week 3), with 3 biological replicates for each treatment/time point. I am interested in determining which genes are differentially expressed between the two treatments at each time point (in other words, comparing week 2 treated samples with week 2 control samples and week 3 treated samples to week 3 control samples).

I used Galaxy to run DESeq2 separately for each contrast, with independent filtering left on for both comparisons. For the Week 2 data, no genes were removed by independent filtering, whereas for Week 3, all genes with a mean count below ~45 across the 6 Week 3 samples were filtered out, so Week 2 ends up with thousands more tests than Week 3. Week 2 does not appear to have a substantial number of low-count genes among those with low p-values, so I'm not sure why more genes aren't being filtered out. I do understand that independent filtering is meant to maximize the number of genes with an adjusted p-value below a given FDR cutoff, so presumably further filtering does not increase that number for Week 2.

Ideally, I'd like to apply a single filtering criterion across all samples so that I can check whether similar genes are differentially expressed at the two time points. I know it has been recommended in the past to find the minimum filtering threshold across all contrasts and apply that to everything, but in this case I worry that would mean having no filtering at all, which would negatively affect my results for Week 3.

Please advise on whether it would be acceptable to apply a more stringent filtering criterion by either going with the Week 3 filtering threshold for all samples, or using a more arbitrary threshold like a minimum of 10 reads for at least 3 of the samples.

Thanks,
Dan

deseq2 independent filtering • 724 views

updated 5.9 years ago by Michael Love • written 5.9 years ago by danieljcavanaugh

score 0 · Answer 1 · 2018-06-04

I'd suggest just using a reasonable threshold, such as a minimum of 10 reads in at least 3 samples. The independent filtering (IF) procedure just looks for an optimal threshold, which happened to be ~45 for that one contrast, but it's perfectly reasonable to use a common threshold for all contrasts. The way to do this is to subset the DESeqDataSet at the start, then run DESeq() and results(dds, independentFiltering=FALSE).
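
For anyone who wants to try this in R rather than Galaxy, here is a minimal sketch of that workflow. It is not the poster's actual data: the counts come from DESeq2's makeExampleDESeqDataSet(), and the `group` column and its level names (control_w2, treated_w2, control_w3, treated_w3) are made up for illustration. It also assumes all 12 samples are in one DESeqDataSet, so the "at least 3 samples" rule is applied across the whole experiment.

```r
library(DESeq2)

set.seed(1)

# Simulated stand-in for the real experiment: 2000 genes, 12 samples.
dds <- makeExampleDESeqDataSet(n = 2000, m = 12)

# Hypothetical grouping: 2 treatments x 2 time points, 3 replicates each.
dds$group <- factor(rep(c("control_w2", "treated_w2",
                          "control_w3", "treated_w3"), each = 3))
design(dds) <- ~ group

# Common pre-filter: keep genes with >= 10 reads in at least 3 samples.
keep <- rowSums(counts(dds) >= 10) >= 3
dds <- dds[keep, ]

# Fit once, then pull each contrast with independent filtering switched off,
# so both contrasts are tested on exactly the same set of genes.
dds <- DESeq(dds)
res_w2 <- results(dds, contrast = c("group", "treated_w2", "control_w2"),
                  independentFiltering = FALSE)
res_w3 <- results(dds, contrast = c("group", "treated_w3", "control_w3"),
                  independentFiltering = FALSE)
```

With independent filtering off, the multiple-testing adjustment in each contrast covers the same pre-filtered genes, which makes the two gene lists easier to compare. The 10-read and 3-sample cutoffs are just the example values from the answer above and can be adjusted.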