Time-series Analysis using DESeq2 and LRT test
2
0
Entering edit mode
Assa Yeroslaviz ★ 1.5k
@assa-yeroslaviz-1597
Last seen 6 days ago
Germany

Hi all

I have a data set of 27 sample from 9 different time-points with triplicates of control and knockouts.

sampleNames    condition    treatment    TP    batch
A    CTRL0    CTRL    0    1
B    CTRL0    CTRL    0    1
C    CTRL0    CTRL    0    1
D    KO0    KO    0    1
E    KO0    KO    0    1
F    KO0    KO    0    1
G    CTRL3    CTRL    3    1
H    CTRL3    CTRL    3    1
I    CTRL3    CTRL    3    1
J    KO3    KO    3    1
K    KO3    KO    3    1
L    KO3    KO    3    1
M    CTRL6    CTRL    6    1
N    CTRL6    CTRL    6    1
O    CTRL6    CTRL    6    1
P    KO6    KO    6    1
P2    KO6    KO    6    2
Q    KO6    KO    6    1
Q2    KO6    KO    6    2
R    KO6    KO    6    1
R2    KO6    KO    6    2
S    CTRL9    CTRL    9    1
T    CTRL9    CTRL    9    1
U    CTRL9    CTRL    9    1
V    KO9    KO    9    1
W    KO9    KO    9    1
X    KO9    KO    9    1

I have already done a pair0wise analysis within each time-point, comparing the KO against the control. I would like now to test for genes differentially expressed in over time.

from previous experiment I know I can use the LRT test for that. This time it is different though, as I have control samples for each of the time points. in the previous analyses I have had a 0 time-point and have therefore used the reduced=~1 reduced model.

In the case above I would think the correct model will be

full_model = formula(~ treatment + TP + treatment:TP)

the reduced model I would like to use would be in this case

reduced_model = formula(~ treatment + TP)

And than run :

dds = DESeq(dds, test="LRT", full=full_model, reduced=reduced_model)

Am I correct in my assumption, that this will give me the genes differentially changed in a condition-specific manner over time?

For that I have two more questions -

1. should I use the condition column rather than the treatment column, where the each control and treated samples are time-point specific?

2. What should be changed, fig i would also like to include possible batch effects? Some samples were sequenced separately, due to technical difficulties.

would it be enough just add the batch effect to the two models like that?

full_model = formula(~ treatment + TP + batch + treatment:TP)

the reduced model I would like to use would be in this case

reduced_model = formula(~ treatment + TP + batch)

thanks

Assa

deseq2 LRT timecourse • 3.3k views
0
Entering edit mode

I see 4 time points not 9. Can you explain?

You only have 3 samples from batch 2 correct? Do these samples separate in the PCA plot?

0
Entering edit mode

1. sorry, yes these are only four time-points. I mixed it with the time-point 9.

2. Yes. the sample from the different batches fit perfectly well together in a PCA

BTW, does it matter what I put in the design parameter, when creating the dds object?. I was thinking of doing it like that:

dds<-DESeqDataSetFromMatrix(countData=countTable,
colData=phenotype,
design= ~ treatment + TP + batch + treatment:TP
)
1
Entering edit mode
@mikelove
Last seen 1 day ago
United States

Don't use the 'condition' column above. You just want the 'pure' variables treatment (CTRL and KO) and time (0,3,6,9), not the two pasted together. And make sure time is a factor:

class(dds$time) Same for batch: class(dds$batch)

Then you can use the designs you have above with batch, which should help if, as you said the batch 2 samples separate in a PCA.

0
Entering edit mode

0
Entering edit mode

Hey! I have similar design, could you please explain how I can specify KO and ctrl specifically in my model?

0
Entering edit mode

Check the vignette and workflow on how to get started with DESeq2, and then for choosing an appropriate design, I'd recommend to consult with a statistician who can help guide the design and interpretation.

0
Entering edit mode
Assa Yeroslaviz ★ 1.5k
@assa-yeroslaviz-1597
Last seen 6 days ago
Germany

Hi Michael,

I am not sure, I have the correct parameters. I have tried as discussed above and run DESeq2 with the following parameters

dds<-DESeqDataSetFromMatrix(countData=countTable,
colData=phenotype,
design= ~ treatment + TP + treatment:TP)
full_model = formula(~ treatment + TP + treatment:TP)
reduced_model = formula(~ treatment + TP)
dds = DESeq(dds, test="LRT", full= full_model, reduced=reduced_model, parallel = TRUE)
res<-results(object = dds, parallel = TRUE, cooksCutoff = FALSE)

But after checking the plotCounts(), it doesn't look like it is really changing over time. I added the two highest significant genes according to the res object.

there is clear time or condition-specific behaviour, unless the subtle changes I can detect are enough for DESeq2 to set these genes as condition-specific changes over time. is this the case?

thanks

assa

0
Entering edit mode

If I look at the data I get the feeling, that the changes are over time, but not condition-specific. Is this the correct approach to identify genes that are significantly differentially expressed between the Knock-out and the control over all time points?

0
Entering edit mode
The first one looks plausible. It would easier if you colored by WT vs KO. Use the intgroup argument.
0
Entering edit mode

What is happening when I am running the results command with any parameters?

these are the resultNames(dds) I have

> resultsNames(dds)
[3] "TP_3_vs_0"                "TP_6_vs_0"
[7] "treatmentmad2KO.TP6"      "treatmentmad2KO.TP9" 

As far as I understand it, I always get per default the last parameter in the list. So here i get  "treatmentmad2KO.TP9". But is this what i want for the differences between the mad2KO and CTRL over time?Wouldn't this one just give me the differences between TP9 and TP0?

Won't the contrast "treatment_mad2KO_vs_CTRL" be better for what i am looking for?

thanks again

Assa

0
Entering edit mode

This is discussed in the man page for ?results, about what results() gives you when you've performed an LRT.