Transcript-level differential expression using DESeq2
1
1
Entering edit mode
Rain Yi ▴ 10
@rain-yi-24046
Last seen 17 months ago

Dear DESeq2 community,

I am just wondering whether I can use DESeq2 to perform transcript-level differential expression with Salmon quantification data? And if I can, is that essential to include bootstrap when using Salmon? I saw a tutorial said DESeq2 will not work if one want to do transcript-level differential expression, just curious whether it is true or not.

Thanks.

deseq2 transcript-level differential expression • 1.3k views
ADD COMMENT
1
Entering edit mode
@mikelove
Last seen 8 hours ago
United States

We have two relevant papers from my lab here:

This is a benchmark that explores gene level tools used to perform DTE (the benchmark follows the workflow section on DTU):

https://doi.org/10.12688/f1000research.15398.3

And this is a new Bioconductor method/package we developed (based on ideas from SAMseq) for performing DTE. The method is called Swish and the package is fishpond.

https://doi.org/10.1093/nar/gkz622

I really like this method, it’s fast, easy to use, has very good error control and sensitivity across a range of sample size, and the vignette shows how to make results tables and plots just like in DESeq2 (counts plots and MA plots).

https://bioconductor.org/packages/release/bioc/vignettes/fishpond/inst/doc/swish.html

ADD COMMENT
0
Entering edit mode

Thanks Michael. So what you suggested is using fishpond to do transcript-level differential expression instead of DESeq2?

ADD REPLY
1
Entering edit mode

Yes, it is exactly designed for this. Give it a shot and let me know if you have any questions.

ADD REPLY
0
Entering edit mode

Thanks Michael! I'll try it out. One more question, can I use DESeq2 for transcript-level differential expression if the read quantification data were generated by StringTie since this is an alignment-based aligner? Or I still have to use fishpond?

ADD REPLY
1
Entering edit mode

With StringTie they don’t have a measure of quantification uncertainty so you can’t use Swish. You can use DESeq2 then.

ADD REPLY
0
Entering edit mode

Hi Michael!

1)I have done quantification using Kallisto. Can fishpond be used with txi.kallisto$counts as an input to perform transcript-level differential expression.

2)In the vignette, https://bioconductor.org/packages/devel/bioc/vignettes/tximport/inst/doc/tximport.html#kallisto, abundance.h5 is used for reading transcript-level information. If I do not use tx2gene and follow the DEseq2 protocol, will I get transcript-level differential expression?

thanks ekta

ADD REPLY
0
Entering edit mode

You can use tximeta to import kallisto bootstraps to run Swish as well. You will get transcript level counts for performing DTE.

ADD REPLY
0
Entering edit mode

Hi Michael .. I used kallisto bootstraps to run Swish and performed the Differential transcript expression.

sum(mcols(y)$qvalue < .05) gives 1317

But surprisingly the qvalue for all these 1317 differentially expressed transcripts is 7.59E-06.

And after this, 220 transcripts have qvalue 0.916510084, 39337 transcripts have qvalue of 0.9442869.

What could be the possible reasons for it?

Ekta

ADD REPLY
0
Entering edit mode

This is not an issue, see the vignette which talks about this phenomenon.

ADD REPLY
0
Entering edit mode

Hi Michael, thanks for your reply.

One more question. If the qvalue (and pvalue) for all DTEs are the same, the Volcano plot will look like a thick line as the value of y-axis (here qvalue) is identical. How to make a plot of DTEs in such cases?

Thanks

ekta

ADD REPLY
0
Entering edit mode
ADD REPLY
0
Entering edit mode

Yes, i tried that. I used salmon bootstraps also for the same RNAseq data to run Swish and performed the Differential transcript expression. in this case I got the qvalue of 8.826125e-06 for all my DETs.

here is the code and plot

yres <- mcols(y)[mcols(y)$keep,]
min(yres$qvalue, na.rm=TRUE)
# [1] 8.826125e-06


tiff(filename="Volcano_DETs.tiff", width = 6, height = 6, units = 'in', res=600)
with(yres, plot(log2FC, -log10(qvalue),
            xlim=c(-6,6), ylim=c(0,6)))
abline(v=0, col="red")

dev.off()

https://imgur.com/pmB8O77

thanks

ekta

ADD REPLY
1
Entering edit mode

So the way the p-values are generated (non-parametric) you won't get a ranking among these that are highly significant. You can use logFC. Do you need a volcano plot for some reason?

ADD REPLY
0
Entering edit mode

No, the Volcano plot is not essential. I just wanted to show DETs with the help of a plot. Could you suggest a way for it? What are the other good alternatives to show the DETs (publication purpose)?

Thanks

ekta

ADD REPLY
1
Entering edit mode

To show some extent of the DTE you can plot the LFC of the significant set and their mean count, we have plotMASwish to do this.

Another interesting plot is the number of DTE per gene, e.g. do you recover more than one transcript per gene.

ADD REPLY
0
Entering edit mode

Ok.. thanks for your valuable suggestion and quick replies. I have generated an MA plot using plotMASwish.

thanks

ekta

ADD REPLY

Login before adding your answer.

Traffic: 229 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6