DESeq2 shows comparison nonsignificant, unclear why
1
0
Entering edit mode
daniella • 0
@daniella-24096
Last seen 22 days ago
United States

Hi,

I am performing differential expression on gene expression between different time points- time point 0 vs. time points 0.5, 1 and 3.

For a specific gene, I have the following expression profile (this is normalized reads): Normalized reads +- SE

It seems in the plot like there is a significant difference between time points 0 and 1. Actually, this is a gene which I know is induced in this paradigm from many past experiments. However, DESeq does not find the comparison significant:

              baseMean log2FoldChange    lfcSE     stat    pvalue      padj
 GeneX 1230.466       1.277551 1.164463 1.097116 0.2725906 0.8389563

The individual values of time point 0 are:

tp0.1 tp0.2 tp0.3

1622.129 1260.146 1831.165

and for 1:

tp1.1 tp1.2 tp1.3

2878.453 4137.590 4406.023

If I perform a simple t-test on the two time points, I get p-value = 0.0304.

My question is- what might be the reason that this comparison is coming up as insignificant in the DESeq analysis? Thanks a lot!

DESeq2 RNASeq DifferentialExpression • 131 views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 2 days ago
United States

Provide code, including your colData and design.

ADD COMMENT
0
Entering edit mode

Hey Michael, thanks so much for your reply. I've uploaded the raw reads and colData to:

https://drive.google.com/file/d/1ryg73rZbsvVjIwp4drhF2iONVTHWDeAY/view?usp=sharing

https://drive.google.com/file/d/1JsnOpEP2D7JkTBsN7kcAKUkyq-q1Hth_/view?usp=sharing

The code is:

dds.dd = DESeqDataSetFromMatrix(countData = raw.reads.dd,
                                   colData = samp.data.dd,
                                   design = ~ group) 
rld.dd = rlogTransformation(dds.dd)


design(dds.dd)  # the design is according to the group, which is a combination of the cell type and time point
dds.dd = DESeq(dds.dd)


#------- Compare Cell Type B time point1  vs. 0 timepoint, look at result of GeneX

res.1.dd = results(dds.dd, contrast=c("group","CellTypeB.1","CellTypeB.0"))
res.1.dd.df = as.data.frame(res.1.dd)
res.1.dd.df[rownames(res.1.dd.df)=="GeneX",]  #turns out nonsignificant
ADD REPLY
0
Entering edit mode

Two recommendations, the two cell types are very different, for a gene like GeneX, it's not even expressed in the other cell type. So one is to split these within cell type comparisons by making two dds objects. The second is to remove thousands of unexpressed genes once you split into the two datasets (either way I would recommend to do this for speed).

You're better off modeling like this:

> dds2 <- dds[,grep("CellTypeB", dds$group)]
> dds2$group <- droplevels(dds2$group)
> keep <- rowSums(counts(dds2) >= 10) >= 3
> table(keep)
keep
FALSE  TRUE
39828 15507
> dds2 <- dds2[keep,]
> dds2 <- DESeq(dds2)
> res <- results(dds2, contrast=c("group","CellTypeB.1","CellTypeB.0"))
> res["GeneX",]
log2 fold change (MLE): group CellTypeB.1 vs CellTypeB.0
Wald test p-value: group CellTypeB.1 vs CellTypeB.0
DataFrame with 1 row and 6 columns
       baseMean log2FoldChange     lfcSE      stat      pvalue        padj
      <numeric>      <numeric> <numeric> <numeric>   <numeric>   <numeric>
GeneX   2439.44        1.27855   0.23307   5.48566 4.11919e-08 7.33678e-05
ADD REPLY

Login before adding your answer.

Traffic: 189 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6