Is Background Correction necessary before DESeq analysis?
vivekr • 0
@vivekr-14904

Hi, I am working on small RNA-seq data and want to identify differentially expressed miRNAs. I know how to do this with DESeq2, and I found 8 differentially expressed miRNAs. However, I have seen that some authors also apply background correction (BC) to the raw count data before running DESeq2. Here, background correction means filtering out those miRNAs whose sum of counts is less than a specified threshold (let's say 5). This reduces the variance in the data and removes miRNAs with very low expression. When I apply background correction, I get different results: some miRNAs in the new result are shared with the DESeq2 result without BC, and some are different. My question is which result I should trust and use for downstream analysis, and whether BC is really necessary before DESeq2 analysis of small RNA-seq data.
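
For concreteness, the filter described above can be sketched roughly as follows. This is plain Python on a made-up count table, not DESeq2 code: the miRNA names, the counts, and the helper filter_low_counts are all hypothetical, and it assumes "sum of counts" means the total across samples for each miRNA.

```python
# Hypothetical raw counts: miRNA -> counts across four samples (made-up numbers).
counts = {
    "miR-a": [120, 98, 143, 110],
    "miR-b": [0, 1, 0, 2],
    "miR-c": [3, 0, 1, 1],
    "miR-d": [0, 0, 0, 0],
}


def filter_low_counts(table, min_total=5):
    """Keep features whose total count across samples is at least min_total."""
    return {name: row for name, row in table.items() if sum(row) >= min_total}


filtered = filter_low_counts(counts)
print(sorted(filtered))
```

Only the total per miRNA is used here, so a miRNA with a sum exactly equal to the threshold is kept and anything below it is dropped.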

Thanks.

deseq2 cancer • 299 views
@mikelove

I don't have a great answer for you. Obviously you may have removed some of the miRNA with lower counts. Why don't you look at these by eye using plotCounts to see if you think it's better to have removed these or not?

Thanks Michael for your response. As we know, small RNA-seq count data is very heterogeneous, so it is quite possible to have some miRNAs with very low counts and some with very high counts. If the majority of the count data were sparse or had low counts, that would be a different situation. In my case, out of 980 miRNAs, around 700 are left after BC. Ideally, the differentially expressed miRNAs should be the same irrespective of the dispersion present in the count matrix, or of whether miRNAs with low counts are present or removed. So why do the results change after BC? Shouldn't they be the same as without BC?

The results will change when you add or remove features because of all the parameters that are estimated using all features. So this isn’t surprising or a sign of something wrong. I’d recommend looking at some of the counts like I suggested above.
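
One piece of this that is easy to show in isolation is the multiple-testing adjustment. The sketch below is a toy illustration in plain Python, not what DESeq2 runs internally: the p-values are made up, and bh_adjust is a hypothetical helper implementing the Benjamini-Hochberg adjustment. It shows that the same raw p-values get different adjusted values once some features are removed, because the number of tests changes. The adjustment is only one of the steps that depend on the full feature set; size factors and dispersion estimates are others.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values; the last three stand in for low-count features.
pvals_all = [0.001, 0.01, 0.02, 0.8, 0.9, 0.95]
pvals_kept = pvals_all[:3]  # same features after the low-count ones are dropped

adj_all = bh_adjust(pvals_all)
adj_kept = bh_adjust(pvals_kept)

# Columns: raw p-value, adjusted with all features, adjusted after filtering.
for p, a, b in zip(pvals_kept, adj_all, adj_kept):
    print(p, round(a, 4), round(b, 4))
```

With a cutoff that falls between the two adjusted values, a feature would be called significant in one run and not in the other, even though its raw p-value is identical.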