Question: Unusual DESeq2 results
gravatar for taamslab
9 months ago by
taamslab0 wrote:


Im currently re performing some analyses of RNA-seq data via DESeq2, however since the first analysis (about a year ago) Ive got some very strange genes turning up as significantly expressed. These genes have exceedingly high p-values and fold changes (>2^15 !!) while also being lowly expressed ( Basemean of <100) and often driven by a single sample.
the experiment is as follows: an RNA seq analysis (Unstranded ) of Cytokine expressing cells from 3 separate patients, comparing 2 cell types (CD4/CD8) expressing 1 of 2 cytokines (pop1/pop2).

any advice would be appreciated. This is the code Ive been using (Cant give the data sets for reproduction due to confidentiality)


dds <- DESeqDataSetFromMatrix(countData = SFpops.454[,3:11], colData = samples.454, design = ~ EXPERIMENT.ID + CELL.CYTOKINE)

vsd <- vst(dds, blind=FALSE)

rld <- rlogTransformation(dds)

res <- results(dds2, contrast = (c("CELL.CYTOKINE", "CD8-pop1", "CD8-pop2")))

and a selection of output results below

genename baseMean log2FoldChange lfcSE stat pvalue padj significant
TAAR7P 1.175253 25.03149 3.981587 6.286812 3.24E-10 2.22E-07
PRODH 0.815967 23.86939 3.989826 5.982564 2.20E-09 1.16E-06
FLJ46284 6.843331 23.5942 3.779361 6.242907 4.30E-10 2.79E-07
KRT81 0.998216 23.42814 4.022196 5.824714 5.72E-09 2.61E-06
AC007655.1 0.696205 23.32993 3.994779 5.840105 5.22E-09 2.40E-06
AC022916.4 0.698751 23.09876 4.023616 5.740795 9.42E-09 4.06E-06
SMIM25 3.920455 22.88505 2.405183 9.51489 1.82E-21 2.40E-17

Is there something altered in the package which makes these really lowly abundant genes now below the critical threshold?

rnaseq deseq2 • 130 views
ADD COMMENTlink modified 9 months ago by Michael Love25k • written 9 months ago by taamslab0
Answer: Unusual DESeq2 results
gravatar for Michael Love
9 months ago by
Michael Love25k
United States
Michael Love25k wrote:

Lowly expressed is not a barrier to having evidence of DE. A simple example: [10,10,10] vs [100,100,100]. Surely this is a DE gene for an experiment with low dispersion. The p-value will certainly be low enough for this gene to enter, e.g. a 1% FDR bounded set. The logFC will be very large, and undefined if the first group has all 0s, but if you use lfcShrink() you can obtain finite LFC even for such cases. I recommend to use lfcShrink() with type="apeglm" which provides reliable LFC estimates as opposed to the MLE which is printed in the table above. If you say the genes are driven by a single sample, usually that would be detected by the outlier filtering, but it can be missed sometimes, because our outlier filtering approach is not a perfect solution. You can also remove these cases via lfcShrink(). They will very likely have a reduced LFC when you apply shrinkage, and you can filter out such genes either using lfcThrehsold (which generates new posterior probabilities) or a post-hoc threshold on the shrunken LFC.

ADD COMMENTlink written 9 months ago by Michael Love25k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 88 users visited in the last hour