salmon output for DESeq2 analysis
@capricygcapricyg-17892

Hi Michael,

I read your DESeq2 vignette: http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html

and found that a DESeqDataSet can be derived either from salmon output (transcript abundance) or from a count matrix.

I wonder if you have ever compared the results of these two processes (salmon->tximport->DESeq2 versus counts->DESeq2) for differential gene calls from the same sequencing dataset?

Thanks.

C

DESeq2 • 32k views
@mikelove

Yes, such comparisons were made in the tximport publication.

Michael,

Thank you very much for your quick response!

To make sure my understanding is correct, I found the following paper:

https://www.ncbi.nlm.nih.gov/pubmed/26925227

And your conclusion is that "salmon->tximport->DESeq2" is better than "counts->DESeq2"?

Kind regards,

C.

Yes, the advantages are that it protects against estimation bias from DTU (differential transcript usage), enables certain fragment-level biases to be estimated, and preserves multimapping reads.
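
The DTU point can be illustrated with a toy calculation. The sketch below is not taken from the tximport paper; the isoform lengths, copy numbers, and sampling rate are all hypothetical, and it assumes the simplified model that the expected number of fragments from a transcript is proportional to its copy number times its length. A gene keeps the same total number of transcript copies in both conditions but switches from mostly the short isoform to mostly the long one. The raw gene-level fragment count changes almost threefold, while dividing by the abundance-weighted average transcript length (the kind of per-gene length offset that tximport passes on to DESeq2) recovers a ratio of 1.

```python
# Toy model (all numbers hypothetical): one gene with two isoforms.
# Assumption: expected fragments ~ copies * length / BASES_PER_FRAGMENT.

ISOFORM_LENGTHS = {"short": 1000, "long": 4000}  # hypothetical lengths in bases
BASES_PER_FRAGMENT = 100                         # hypothetical sampling rate

# Same total number of transcript copies (100), different isoform usage.
copies_a = {"short": 90, "long": 10}
copies_b = {"short": 10, "long": 90}


def gene_fragments(copies):
    """Expected gene-level fragment count, summed over isoforms."""
    total_bases = sum(n * ISOFORM_LENGTHS[iso] for iso, n in copies.items())
    return total_bases / BASES_PER_FRAGMENT


def average_transcript_length(copies):
    """Average isoform length, weighted by isoform abundance."""
    total_copies = sum(copies.values())
    total_bases = sum(n * ISOFORM_LENGTHS[iso] for iso, n in copies.items())
    return total_bases / total_copies


frags_a = gene_fragments(copies_a)
frags_b = gene_fragments(copies_b)

# Naive comparison of gene-level counts: looks like a large expression change.
raw_ratio = frags_b / frags_a

# Length-corrected comparison: divide each count by the average transcript length.
corrected_a = frags_a / average_transcript_length(copies_a)
corrected_b = frags_b / average_transcript_length(copies_b)
corrected_ratio = corrected_b / corrected_a

print(f"raw count ratio (B/A):        {raw_ratio:.2f}")
print(f"length-corrected ratio (B/A): {corrected_ratio:.2f}")
```

In this toy setting the apparent change in the raw counts comes entirely from the shift in isoform usage, not from a change in the gene's total expression.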

I have different concerns, actually:

Count data usually come from genome alignment, whereas salmon data come from transcriptome alignment. I found that the tximport-converted counts did not really match the genome alignment-based counts...

What would be the point of tximport if you got the same thing as the genome-based alignment? Put another way, both alignment to the genome with subsequent counting and alignment to the transcriptome and then collapsing to the gene level are attempts to get at the same thing - the relative amount of transcript in a given sample for each gene. But we don't know how much transcript there is!

The fact that two different methods of estimating some underlying (unobserved) quantity don't necessarily agree doesn't invalidate either of them, because we don't know what the base truth is. If you want to believe that aligning to the genome and then generating counts is 'the right way to do things', then you should do that. If you are persuaded by Mike's paper that you get better results aligning to the transcriptome and then summarizing using tximport, then you should do that instead. But comparing the two and noting they are different doesn't tell you anything because the only reason for having a different method is because it's different than what came before.

Hi James,

As you mentioned, we don't know what the base truth is, so whenever the outputs differ, I would just like to know whether anyone has ever tested which one makes more sense...

C.

Yes. Mike did, in the tximport paper that he already mentioned. Have you read it?

Good points. Just want to point out that Charlotte Soneson is the first author.
