Seeking help in multifactor differential gene expression analysis using DESeq2. Can not get significance difference while comparing 3 time factor (0,2 and 4h) with 3 group (1 control, 2 treated)
1
0
Entering edit mode
@9581bfe0
Last seen 23 days ago
Malaysia

Dear Experts, I have RNAseq data from 6 different plant samples (2 control, 2 Sensitive, and 2 tolerance), and different location of one species. I am trying to see the effect of the different groups at different time points, but after going through all the posts and vignettes I am confused.

dds <- DESeqDataSetFromMatrix(htseq_data, sampleTable, design = ~ group + time + group:time )

And this is my sample table:

             time    group
MR219_0h_1   0h      CT
MR219_0h_2   0h      CT
MR219_0h_3   0h      CT
MR219_2h_1   2h      CT
MR219_2h_2   2h      CT
MR219_2h_3   2h      CT
MR219_4h_1   4h      CT
MR219_4h_2   4h      CT
MR219_4h_3   4h      CT
MU005_0h_1   0h      HS
MU005_0h_2   0h      HS
MU005_0h_3   0h      HS
MU005_2h_1   2h      HS
MU005_2h_2   2h      HS
MU005_2h_3   2h      HS
MU005_4h_1   4h      HS
MU005_4h_2   4h      HS
MU005_4h_3   4h      HS
MU201_0h_1   0h      CT
MU201_0h_2   0h      CT
MU201_0h_3   0h      CT
MU201_2h_1   2h      CT
MU201_2h_2   2h      CT
MU201_2h_3   2h      CT
MU201_4h_1   4h      CT
MU201_4h_2   4h      CT
MU201_4h_3   4h      CT
MU235_0h_1   0h      HT
MU235_0h_2   0h      HT
MU235_0h_3   0h      HT
MU235_2h_1   2h      HT
MU235_2h_2   2h      HT
MU235_2h_3   2h      HT
MU235_4h_1   4h      HT
MU235_4h_2   4h      HT
MU235_4h_3   4h      HT
MU244_0h_1   0h      HT
MU244_0h_2   0h      HT
MU244_0h_3   0h      HT
MU244_2h_1   2h      HT
MU244_2h_2   2h      HT
MU244_2h_3   2h      HT
MU244_4h_1   4h      HT
MU244_4h_2   4h      HT
MU244_4h_3   4h      HT
MU251_0h_1   0h      HS
MU251_0h_2   0h      HS
MU251_0h_3   0h      HS
MU251_2h_1   2h      HS
MU251_2h_2   2h      HS
MU251_2h_3   2h      HS
MU251_4h_1   4h      HS
MU251_4h_2   4h      HS
MU251_4h_3   4h      HS

DESeq2 Script ass follwing

DDS <- DESeqDataSetFromMatrix(countData = CountData,
                             colData = colData,
                             design= ~ group + time + group:time )

keep <- rowSums(counts(DDS)) >= 10
DDS <- DDS[keep,]
DDS

DDS$time <- relevel(DDS$time, ref = "0h")
DDS$group <- relevel(DDS$group, ref = "CT")

DDS = DESeq(DDS)

resHS_0h=results(DDS,contrast=list("group_HS_vs_CT"))
head(resHS_0h)
summary(resHS_0h)
res_HS_0h = as.data.frame(resHS_0h)
Hs_0h = res_HS_0h[(res_HS_0h$padj <0.05) & (res_HS_0h$baseMean >50) & (abs(res_HS_0h$log2FoldChange)>1),]

resHS_2h=results(DDS,contrast=list(c("time_2h_vs_0h","groupHS.time2h")))
head(resHS_2h)
summary(resHS_2h)
res_HS_2h = as.data.frame(resHS_2h)
Hs_2h = res_HS_2h[(res_HS_2h$padj <0.05) & (res_HS_2h$baseMean >50) & (abs(res_HS_2h$log2FoldChange)>1),]

resHS_4h=results(DDS,contrast=list(c("time_4h_vs_0h","groupHS.time4h")))
head(resHS_4h)
summary(resHS_4h)
res_HS_4h = as.data.frame(resHS_4h)
Hs_4h = res_HS_4h[(res_HS_4h$padj <0.05) & (res_HS_4h$baseMean >50) & (abs(res_HS_4h$log2FoldChange)>1),]

resHT_0h=results(DDS,contrast=list("group_HT_vs_CT"))
head(resHT_0h)
summary(resHT_0h)
res_HT_0h = as.data.frame(resHT_0h)
Ht_0h = res_HT_0h[(res_HT_0h$padj <0.05) & (res_HT_0h$baseMean >50) & (abs(res_HT_0h$log2FoldChange)>1),]

resHT_2h=results(DDS,contrast=list(c("time_2h_vs_0h","groupHT.time2h")))
head(resHT_2h)
summary(resHT_2h)
res_HT_2h = as.data.frame(resHS_2h)
Ht_2h = res_HT_2h[(res_HT_2h$padj <0.05) & (res_HT_2h$baseMean >50) & (abs(res_HT_2h$log2FoldChange)>1),]

resHT_4h=results(DDS,contrast=list(c("time_4h_vs_0h","groupHT.time4h")))
head(resHT_4h)
summary(resHT_4h)
res_HT_4h = as.data.frame(resHT_4h)
Ht_4h = res_HT_4h[(res_HT_4h$padj <0.05) & (res_HT_4h$baseMean >50) & (abs(res_HT_4h$log2FoldChange)>1),]

But while I extract genes from two groups as an example for 4h up and down-regulated genes. I suspect there was no such difference. Like same type of genes up-regulated in HS and HT

Example table bellow

# HS genes with basemean, lfc and pvalue
Gene_ID                 basemean           lfc                 pvalue                        padj
LOC_Os01g04360.1    35927.7676492306    13.269454923308     5.96868498666644E-99      3.94304235484021E-96
LOC_Os01g04370.1    139637.319907929    13.4582530102767       1.38429446664833E-82   6.1520562996882E-80
LOC_Os01g04380.1    32939.0908584791    16.927448436553    2.74264559598584E-68   8.37981078783523E-66

# HT genes with basemean, lfc and pvalue
Gene_ID                 basemean           lfc                 pvalue                        padj
LOC_Os01g04360.1    35927.7676492306    10.7995523770545       1.1096536703974E-79     7.14303984495556E-77
LOC_Os01g04370.1    139637.319907929    11.3647008801624       1.02512039214524E-63    4.02119491325098E-61
LOC_Os01g04380.1    32939.0908584791    8.3083189329929    1.08555613708649E-61    3.78512316966061E-59

There was no significant difference between them.

Can you please help me in this regard? Any help much appreciated.

DifferentialExpression GeneExpression R enrichment DEGseq • 1.1k views
ADD COMMENT
0
Entering edit mode
swbarnes2 ★ 1.4k
@swbarnes2-14086
Last seen 59 minutes ago
San Diego

Those look like different results to me. It's not at all clear why you are so sure it's wrong. Have you eyeballed the expression of those genes?

When simply comparing one subset of samples to another, it's simpler to use the method outlined here:

http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#interactions

ADD COMMENT
0
Entering edit mode

Thanks for your kind reply. Those genes belong to HSP20, Suppose to have high expression under HT but if we look into the table its has high lfc under HS. Is any other way I can play with it to get a good result? I am following the DESeq2 manuall to get the best support.

ADD REPLY

Login before adding your answer.

Traffic: 716 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6