Deseq2 time series question,
Entering edit mode
Last seen 4.6 years ago

Dear all, I would very much appreciate any assistance in the analysis of our RNA-SEQ data, i feel that i'm making a mistake somewhere but cant quite lay my finger on where this is happening. 

Study design: 8 wasp allergic patients were stung by a wasp, peripheral blood samples were drawn before the sting (T0) and at 5 set time points afterwards (T5, T11, T15, T30 and T60 minutes). 3 patients developed a systemic reaction (WelReactie), 5 patients dit not develop a systemic reaction (GeenReactie).  

Study question:  Which genes are deferentially expressed at any time-point between the group "WelReactie" and the group "GeenReactie"


directory = 'C:/Users/Default.COLTON/Documents/R/Longitudinal'
sampleFiles <- grep(".htseq", list.files(directory),value=T)
sampleCondition <- c('GeenReactie','GeenReactie','GeenReactie','GeenReactie','GeenReactie','GeenReactie',
time <- factor(c('T0', 'T11', 'T15', 'T30', 'T5', 'T60',
'T0', 'T11', 'T15', 'T30', 'T5', 'T60',
'T0', 'T11', 'T15', 'T30', 'T5', 'T60',
'T0', 'T11', 'T15', 'T30', 'T5', 'T60',
'T0', 'T11', 'T15', 'T30', 'T5', 'T60',
'T0', 'T11', 'T15', 'T30', 'T5', 'T60',
'T0', 'T11', 'T15', 'T30', 'T5', 'T60',
'T0', 'T11', 'T15', 'T30', 'T5', 'T60'))
sampleTable<-data.frame(sampleName=sampleFiles, fileName=sampleFiles, condition=sampleCondition, time=time)
ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable=sampleTable, directory=directory, design=~condition+time+time:condition)
ddsHTSeq$condition<-factor(ddsHTSeq$condition, levels=c("GeenReactie","WelReactie"))
ddsHTSeq$time <- factor(ddsHTSeq$time, levels=c("T0", "T5", "T11", "T15", "T30", "T60"))
dds = DESeq(ddsHTSeq, test="LRT", reduced=~condition + time)
resTC <- results(dds)
resTC$symbol <- mcols(ddsTC)$symbol
write.csv(res, file="DATE-Longitudinal.csv")


  baseMean log2FoldChange lfcSE stat pvalue padj
ENSG00000124882 11.87947 3.418625 0.899554 25.94876 9.13E-05 1
ENSG00000198918 34.99821 2.464948 0.583217 21.557 0.000635 1
ENSG00000170632 49.10306 1.171724 0.285648 20.24493 0.001124 1
ENSG00000085274 90.14553 0.950642 0.231124 19.91252 0.001298 1

Outcome:  To my surprise, all adjusted P values were exactly 1, whereas just a standard crossectional analysis for each time point sepperatly does indicate DEG.

Bioconductor question: are there no DEG or, far more likely, what am i doing wrong?



rnaseq deseq2 timecourse • 1.1k views
Entering edit mode
Last seen 1 hour ago
United States

"whereas just a standard cross-sectional analysis for each time point separately does indicate DEG"

Do you mean for the top gene?

One detail is: the test you have performed is for differences after T0. It will not find a gene significant if there are differences at each time point which are equal in magnitude to the difference at T0.

Can you post a plotCounts() of the top gene? See the time series section in the workflow.


Entering edit mode

Dear Micheal, 

To clarify what I meant: deseq analysis of all 8 samples taken at baseline indicate 24 DEG (adjp<0.05), between the groups with and without a systemic reaction.

In addition, deseq analysis of all 8 samples taken at T60 minutes indicates 300 DEG (adjp<0.05), between the groups with and without a systemic reaction. To me, this is an indication of some time dependent differential expression of genes.

Attached is the plotCounts() of the top genes based on not adjusted p value, as all adjusted P values are 1.

Entering edit mode

It looks like there is DE at T60, but the other time points are not so clear, and the within-condition variability is very high, which leads to not-so-small p-values. So it could be the case that you have more power to detect the differences at T60 than when you do the LRT. 

Given that there is a lot of within-condition variability here, I'm wondering if you would have improved power by reducing the number of terms you test in the LRT. Usually, I don't recommend to use time as a linear covariate, but it could potentially help here to recover a few genes which differ in the slope of log counts over time.

What if you try:

dds$t <- as.numeric(sub("T(.*)","\\1",as.character(dds$time)))
design(dds) <- ~ condition + t + condition:t
dds <- DESeq(dds, test="LRT", reduced=~condition + t)

Login before adding your answer.

Traffic: 563 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6