Filtering after DESeq
@3f9f9566
Germany

We have performed RNA-seq on a large number of samples of individual flies (~20 per condition). After running DESeq and then checking various contrasts (with ihw=TRUE), I find myself with quite a few genes that are detected as differentially expressed between 2 conditions. But when I plot their counts, I notice that for many of them, most individuals have 0 counts while 1 or 2 have high counts, and those few individuals end up being responsible for the significance of the Wald test.

I have read that pre-filtering is recommended, something along the lines of the following, which keeps genes with at least x counts in at least y samples:

keep <- rowSums(counts(dds) >= x) >= y

However, consider the example of one gene that I am attaching to this post: if I filtered out this gene because it has more than 0 counts in only 10 individuals, I would risk removing a gene whose 10 non-zero individuals all belong to a single condition. That would be half of the individuals of that condition, which I would consider meaningful. There are a few such genes in my dataset too.
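To make the concern concrete, here is a minimal sketch in base R on a small simulated count matrix. The data, the names `counts_mat`, `condition`, `min_count` and `min_fraction`, and the thresholds are all assumptions chosen for illustration; they are not taken from the dataset in this post. The sketch compares a label-free filter, where y is tied to the smallest group size, with a group-aware filter that keeps a gene if it is expressed in at least half of the samples of any one condition.

```r
# Hypothetical counts: 4 genes x 8 samples, 4 samples per condition.
counts_mat <- matrix(
  c( 0,  0,  0,  0,   0,  0,  0, 250,   # gene1: one high outlier only
     0,  0,  0,  0,  12, 15,  0,  20,   # gene2: expressed in 3 of 4 samples of B
    30, 25, 40, 35,  28, 33, 31,  29,   # gene3: expressed in every sample
     0,  0,  0,  0,   0,  0,  0,   0),  # gene4: never detected
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("s", 1:8))
)
condition <- factor(rep(c("A", "B"), each = 4))

min_count    <- 5    # plays the role of x (assumed threshold)
min_fraction <- 0.5  # fraction of a group that must reach min_count (assumed)

group_size <- as.vector(table(condition)[levels(condition)])

# Label-free filter: y is half of the smallest group size, so a gene
# expressed in half of one condition can still pass.
min_samples <- ceiling(min_fraction * min(group_size))
keep_global <- rowSums(counts_mat >= min_count) >= min_samples

# Group-aware filter: count expressing samples separately per condition,
# then keep genes that reach min_fraction in at least one condition.
n_expressed <- sapply(levels(condition), function(lv) {
  rowSums(counts_mat[, condition == lv, drop = FALSE] >= min_count)
})
frac_expressed <- sweep(n_expressed, 2, group_size, "/")
keep_by_group <- apply(frac_expressed >= min_fraction, 1, any)

print(cbind(keep_global, keep_by_group))
```

In this toy example both filters drop the single-outlier gene and the undetected gene while keeping the gene expressed in most of condition B. With a real DESeqDataSet, the same logic would presumably be applied to `counts(dds)` and the design column (assumed here to be `dds$condition`), followed by `dds <- dds[keep_by_group, ]`. Note that the group-aware variant uses the condition labels, so it is not strictly independent of the test; the label-free variant avoids this.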

What could I do?

Thank you!

DESeq2 IHW • 1.3k views
@4c708919
Australia

The user discusses filtering RNA-seq data post-DESeq analysis, noting that genes with counts in only a few individuals may still be biologically significant. They seek advice on appropriate filtering strategies.
