Hello,
Im currently re performing some analyses of RNA-seq data via DESeq2, however since the first analysis (about a year ago) Ive got some very strange genes turning up as significantly expressed. These genes have exceedingly high p-values and fold changes (>2^15 !!) while also being lowly expressed ( Basemean of <100) and often driven by a single sample.
the experiment is as follows: an RNA seq analysis (Unstranded ) of Cytokine expressing cells from 3 separate patients, comparing 2 cell types (CD4/CD8) expressing 1 of 2 cytokines (pop1/pop2).
any advice would be appreciated. This is the code Ive been using (Cant give the data sets for reproduction due to confidentiality)
dds <- DESeqDataSetFromMatrix(countData = SFpops.454[,3:11], colData = samples.454, design = ~ EXPERIMENT.ID + CELL.CYTOKINE)
vsd <- vst(dds, blind=FALSE)
rld <- rlogTransformation(dds)
res <- results(dds2, contrast = (c("CELL.CYTOKINE", "CD8-pop1", "CD8-pop2")))
and a selection of output results below
genename | baseMean | log2FoldChange | lfcSE | stat | pvalue | padj | significant |
TAAR7P | 1.175253 | 25.03149 | 3.981587 | 6.286812 | 3.24E-10 | 2.22E-07 |
PRODH | 0.815967 | 23.86939 | 3.989826 | 5.982564 | 2.20E-09 | 1.16E-06 |
FLJ46284 | 6.843331 | 23.5942 | 3.779361 | 6.242907 | 4.30E-10 | 2.79E-07 |
KRT81 | 0.998216 | 23.42814 | 4.022196 | 5.824714 | 5.72E-09 | 2.61E-06 |
AC007655.1 | 0.696205 | 23.32993 | 3.994779 | 5.840105 | 5.22E-09 | 2.40E-06 |
AC022916.4 | 0.698751 | 23.09876 | 4.023616 | 5.740795 | 9.42E-09 | 4.06E-06 |
SMIM25 | 3.920455 | 22.88505 | 2.405183 | 9.51489 | 1.82E-21 | 2.40E-17 |
Is there something altered in the package which makes these really lowly abundant genes now below the critical threshold?