Question

RNA-seq time course analysis with only one experimental condition

0

Entering edit mode

mah19 • 0

@mah19-21388

Last seen 5.0 years ago

Hello, We are conducting an experiment that involves performing RNA-seq at 3 time points(t1,t2,t3). So far we have RNA-seq data from 3 biological replicates for the cancer model at the three time points, but we are yet to generate those for the control. I have used DEseq2 to perform pair-wise differential gene expression analysis for each pair of time points (t1,t2,t3), on the assumption that time point t1 is as close to wild type ( cancer free) as possible. However, I was wondering if there was a better way to compare gene expression between time points in the same condition ( i.e. cancer) using DEseq2 time course approach. I have tried to follow Michael Love's tutorial (http://master.bioconductor.org/packages/release/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html#time-course-experiments) section 9 but DEseq2 will not accept a design formula where there is only one experimental condition:

dds <- DESeqDataSetFromMatrix(countData = countData, colData = colData, design = ~condition + time_point)
Error in DESeqDataSet(se, design = design, ignoreRank) : 
  design contains one or more variables with all samples having the same value,
  remove these variables from the design

I understand the limitation of this approach, but at the moment, this is the available data. Any insights on how to better examine the available data is highly appreciated :) Thank you

deseq2 cancer rna-seq time-series • 2.0k views

ADD COMMENT • link 5.0 years ago mah19 • 0

score 0 · Answer 1 · 2019-07-18

0

Entering edit mode

swbarnes2 ★ 1.4k

@swbarnes2-14086

Last seen 18 hours ago

San Diego

If all the samples have the same condition value, it shouldn't be in the design.

ADD COMMENT • link 5.0 years ago swbarnes2 ★ 1.4k

0

Entering edit mode

Dear,

I have a question related to this. Apologies if this question has already been addressed somewhere, but I couldn't find it so quickly. Your help would be very much appreciated! In my experiment, I have a group of patients (20) who all underwent a similar treatment at approximately the same time point (week0). We have 5 time points, and at week 0 they received the treatment. Now, I want to check if something happens over time at all, while correcting for the baseline expression of each patient (as this is highly variable between patients). In the vignette it is indicated that DESeq2 can be used to analyze paired samples by writing the formula ~subject+condition. In my case, the 'condition' would then simply be 'Time', right? When I use the formula ~PatientID+Time and I look at the resultsNames(diagdds), I get the following output:

> resultsNames(diagdds)


[1] "Intercept"                    "PatientID_RH1.057_vs_RH1.002"
 [3] "PatientID_RH1.119_vs_RH1.002" "PatientID_RH1.121_vs_RH1.002"
 [5] "PatientID_RH1.153_vs_RH1.002" "PatientID_RH1.292_vs_RH1.002"
 [7] "PatientID_RH1.453_vs_RH1.002" "PatientID_RH1.463_vs_RH1.002"
 [9] "PatientID_RH1.472_vs_RH1.002" "PatientID_RH1.656_vs_RH1.002"
[11] "PatientID_RH1.657_vs_RH1.002" "PatientID_RH1.663_vs_RH1.002"
[13] "PatientID_RH1.675_vs_RH1.002" "PatientID_RH1.700_vs_RH1.002"
[15] "PatientID_RH1.710_vs_RH1.002" "PatientID_RH1.749_vs_RH1.002"
[17] "PatientID_RH1.791_vs_RH1.002" "PatientID_RH1.820_vs_RH1.002"
[19] "PatientID_RH1.960_vs_RH1.002" "PatientID_RH1.983_vs_RH1.002"
[21] "Timepoint_week08_vs_week00"

I have checked that PatientID is a factor and have indicated that the contrast should be Timepoint, week 0 versus 8, see:

res=results(diagdds, contrast=c("Timepoint","week00","week08"),cooksCutoff = FALSE)

When I select this contrast, it should provide the results of the paired analysis right, where week 00 is the baseline of all patients, and it is checked whether at week 08 changes occurred as compared to week 00 for each patient. Then the ultimate result will provide whether expression of specific genes changed over all patients on week 08 as compared to week 00, while correcting for the paired-design?

Best and thanks a lot in advance, Quinten

ADD REPLY • link 5.0 years ago Quinten • 0

0

Entering edit mode

Yes that's right. And you can use plotCounts to take a look at the top genes to see if these match your understanding of the design.

ADD REPLY • link 5.0 years ago Michael Love 42k

0

Entering edit mode

Hi Michael,

Thanks a lot for the quick reply and I will definitely check out the plotCounts function!. One more question: If I want to directly have a look at the total longitudinal study (so include all the time points), I have to use the LRT as indicated in the vignette, chapter 9 (http://master.bioconductor.org/packages/release/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html#time-course-experiments). In my case, the full model would be ~PatientID + Timepoint, while the reduced model would be ~PatientID right? This will then test whether there is a consistent (?) change in specific genes (microbes in my case) over time across all patients, with week 00 as baseline? If I then want to check at which time point a specific gene (microbe) is changed, I can do this using the Wald test by paired-sample testing. For my understanding, this is to some degree comparable to a simple ANOVA where post-hoc testing is required to find the specific differences? Lastly, I have some problems understanding the underlying test. I am familiar with LRT in linear-mixed models, where goodness of fit is tested between the 2 models. Is that the same case here, or should I interpret the LRT somewhat differently? Perhaps this has been asked before, but at least I couldn't find it...

Thanks a lot for your help,

Quinten

ADD REPLY • link 5.0 years ago Quinten • 0

0

Entering edit mode

Yes, and see the stageR package in Bioconductor for appropriate error control on the post-hoc test.

Please see the LRT section of the vignette which discusses how to interpret the test result.

ADD REPLY • link 5.0 years ago Michael Love 42k