Different results using DESeq2
bekah ▴ 40
@bekah-12633

Hiya,

I am finding that if I put all of my data into one count matrix (four treatments, each with 5 sample replicates) and compare two of the treatments using the `contrast` argument in DESeq2, I get different differential expression results from those I get when I run the same comparison through the automated Trinity DE analysis pipeline, which also uses DESeq2. I assume this is because, when I run R manually, all 20 samples are taken into account when calculating dds (even though only 10 are specified for the p-value calculation), whereas Trinity pulls out each pairwise comparison and builds a new input matrix from just those 10 samples. Is that right?

Best wishes,

Rebekah

@mikelove

Yes, that would result in different parameter estimates and therefore different inference. I don't know how Trinity calls DESeq2.


Trinity seems to make a new matrix containing just the data from the two treatments being compared and passes that to DESeq2. Is it better to use this method, or to use the entire dataset and specify the comparison with `contrast` in the script?


See the DESeq2 FAQ in the vignette for my answer to this question.


Cheers - sorry I had missed that entirely when I read the vignette the first time!


I'm a little bit confused. Is there a benefit to having a single dispersion value per gene across all samples? That is, is it better to be consistent (and so have a consistent sensitivity for selecting significant DE genes) if I eventually want to look at the log counts across all samples? Or is the benefit just that it takes less time than running each pairwise comparison separately?


One advantage of having a single dispersion parameter is that you have more samples with which to estimate it, so the estimator has less variance. If the dispersion is similar across groups, then estimating a single parameter gives you improved estimation. However, if the dispersion is very different between groups, the single-parameter approach tends to be too conservative, because you overestimate dispersion for some groups due to high dispersion in other groups. Note that dispersion of counts is not the same as variance of counts; dispersion can be thought of as approximately the square of the coefficient of variation.
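
As a rough numerical illustration of both points, here is a minimal sketch in plain Python. It is not DESeq2 code and not how DESeq2 estimates dispersion (DESeq2 uses a likelihood-based estimate with shrinkage). It assumes a negative binomial model with variance `mu + alpha * mu**2`, simulates counts as a gamma-Poisson mixture, and uses a simple moments estimator averaged over groups. All function names, the mean of 100, the dispersion of 0.1, and the number of repetitions are illustrative assumptions.

```python
import math
import random
import statistics


def nb_cv_squared(mu, alpha):
    """Squared coefficient of variation of a negative binomial count.

    With variance mu + alpha * mu**2, CV^2 = 1/mu + alpha, which
    approaches the dispersion alpha as the mean grows.
    """
    variance = mu + alpha * mu ** 2
    return variance / mu ** 2


def poisson(lam, rng):
    """Draw one Poisson(lam) value with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def nb_count(mu, alpha, rng):
    """Draw one negative binomial count as a gamma-Poisson mixture."""
    # Gamma with shape 1/alpha and scale mu*alpha has mean mu and CV^2 alpha.
    rate = rng.gammavariate(1.0 / alpha, mu * alpha)
    return poisson(rate, rng)


def moments_dispersion(groups):
    """Average the per-group moments estimate (s^2 - mean) / mean^2."""
    estimates = []
    for counts in groups:
        mean = statistics.mean(counts)
        var = statistics.variance(counts)
        estimates.append((var - mean) / mean ** 2)
    return statistics.mean(estimates)


def estimator_spread(n_groups, alpha=0.1, mu=100, n_rep=5, reps=1000, seed=1):
    """Standard deviation of the dispersion estimate over simulated genes."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        groups = [[nb_count(mu, alpha, rng) for _ in range(n_rep)]
                  for _ in range(n_groups)]
        estimates.append(moments_dispersion(groups))
    return statistics.stdev(estimates)


# Dispersion versus variance: CV^2 is close to alpha when the mean is large.
print("CV^2 at mu=1000, alpha=0.1:", nb_cv_squared(1000, 0.1))

# Same true dispersion in every group: 4 groups x 5 replicates (all 20
# samples in one fit) versus 2 groups x 5 replicates (pairwise subset).
spread_all_four = estimator_spread(n_groups=4)
spread_pairwise = estimator_spread(n_groups=2)
print("spread using all 20 samples:", spread_all_four)
print("spread using 10 samples:   ", spread_pairwise)
```

When the true dispersion really is shared across groups, the estimate based on all 20 samples should vary less from gene to gene than the one based on 10. The sketch does not cover the opposite case, where dispersions differ strongly between groups and a single shared value would be too high for the less variable groups.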


Ah thanks that makes sense!
