DESeq2 bias when genes are missing from the annotation?
corend • 0
@corend-14293

As this concerns bioinformatics in general, I also posted here.

I am working on RNA-seq data. I built my count table with kallisto and imported it with tximport to work with DESeq2.

My reference is a set of cDNAs (supposed to correspond to all the genes of my species), but the annotation is quite poor: when I align against these cDNAs, about 60% of reads map, compared with 95% against the whole genome.

I have 2 conditions: (A and B) and 3 replicates in each condition.

My concern is this: if a gene is over-expressed in A, not expressed in B, and missing from my cDNA list, I expect fewer mapped reads in A than in B. Could DESeq2's normalization then introduce a bias?

Example (each number is one read, labeled by the gene it comes from):

A: 1 1 1 1 2 2 2 2 3 3

B: 1 1 1 1 2 3 3 3 3 3

Gene 3 is not annotated, so after normalization by DESeq2 I would expect:

A: 1 1 1 1 1 2 2 2 2 2

B: 1 1 1 1 1 1 1 1 2 2

Gene 1 then appears over-expressed in B, which is not true.

How can I deal with this kind of problem?

Should I add an "unmapped reads" row to my table to get a better normalization?

rnaseq deseq2 • 1.6k views

Do you expect or observe that the proportion of unmapped reads is different across groups or samples? 


Yes indeed, I map 65% of my reads in condition B and 55% in condition A.


And how about at the genomic level? 


90% condition B

93% condition A

 

@mikelove

If I understand your question correctly, you are assuming that DESeq2 uses total count normalization, but it does not. DESeq2 (like every other method in Bioconductor I can think of) uses a robust method to estimate the scaling factor for each sample. You can read about the scaling method ("median ratio" normalization) in the DESeq2 paper.
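To make the difference concrete, here is a rough sketch of median ratio normalization next to total count normalization. It is plain Python written for illustration, not DESeq2's actual code (in R you would call `estimateSizeFactors`); the function names and the toy counts are made up, and it ignores details such as the transcript-length offsets used with tximport.

```python
import math
from statistics import median


def median_ratio_size_factors(counts):
    """counts: list of genes, each a list of per-sample counts (toy input)."""
    n_samples = len(counts[0])
    # Only genes with a nonzero count in every sample have a finite log geometric mean.
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Per-gene log geometric mean across samples (a pseudo-reference sample).
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        # Log ratio of each gene in sample j to the pseudo-reference.
        log_ratios = [math.log(row[j]) - lgm for row, lgm in zip(usable, log_geo_means)]
        # The median ignores a minority of genes with extreme ratios.
        factors.append(math.exp(median(log_ratios)))
    return factors


def total_count_factors(counts):
    """Scale each sample by its library total relative to the mean total."""
    totals = [sum(row[j] for row in counts) for j in range(len(counts[0]))]
    mean_total = sum(totals) / len(totals)
    return [t / mean_total for t in totals]


# Hypothetical toy table: columns are samples A and B.
# Four genes are unchanged; the last gene is strongly expressed in A only.
toy_counts = [
    [100, 100],
    [200, 200],
    [300, 300],
    [400, 400],
    [5000, 10],
]

print([round(f, 2) for f in median_ratio_size_factors(toy_counts)])  # [1.0, 1.0]
print([round(f, 2) for f in total_count_factors(toy_counts)])        # [1.71, 0.29]
```

In this toy table the one strongly changed gene drags the total count factors far from 1, while the median ratio factors stay at 1 because most genes are unchanged. The factors are computed only from the genes in the count table, so reads from an unannotated gene never enter the calculation.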
