Question

DESeq Design for Time-Series Analysis

0

Entering edit mode

WS • 0

@b7b0e0f2

Last seen 22 months ago

United States

Hello,

I am analyzing an RNA-seq dataset generated by a previous lab member and I have a question regarding DESeq design for a time-series analysis. The data was generated from a drug treatment performed in vitro (same cell line for all samples) and I have the following samples to work with:

0h - No treatment, a single set of duplicate samples collected 2 days - duplicates collected for drug treatment and DMSO control 10 days - duplicates collected for drug treatment and DMSO control 42 days - duplicates collected for drug treatment and DMSO control

My goal with the analysis is two-fold. First, I want to identify genes that are differentially expressed between drug treatment vs. control at each time point. For this, I plan to analyze only the 2d, 10d, and 42d time points using a design of ~ condition + time + condition:time and performing the analysis similar to res <- results(dds, name="condX.time10h", test="Wald") for each time point (DESeq2 analysis for RNAseq data at different time points and different conditions).

Second, I want to identify genes that exhibit a different expression pattern over time for the two treatment groups (ie. starting at the reference level of 0h, gene A's expression increases over time in the drug treatment condition while it remains constant in the DMSO condition).

My question is with regards to how the second analysis should be performed. It seems it would have been ideal to collect separate duplicates at the 0h time point, one for the DMSO condition and one for the drug treatment condition and then a design of ~ condition + time + condition:time would be used for the full model and a design of ~ condition + time would be used for the reduced model and a LRT would be performed. However, I was hoping to still include the 0h time point as a reference level if at all possible. As it stands now, in my sampleTable I have a condition column with three conditions: NoTreatment (assigned only to the 0h duplicates), DMSO, and drug and I have a time column with four time points: 0h, 48h, 240h, 1008h.

When I attempted to construct the DESeqDataSet with the following design:

dds_time_course <- DESeqDataSetFromTximport(txi, colData = sampleTable, design = ~ condition + time + condition:time)

I received the following error:

Error in checkFullRank(modelMatrix) :
the model matrix is not full rank, so the model cannot be fit as specified.
Levels or combinations of levels without any samples have resulted in
column(s) of zeros in the model matrix.

Please read the vignette section 'Model matrix not full rank':

vignette('DESeq2')

I read through the vignette but was not sure how to appropriately design the DESeq analysis given that I only have one set of duplicates for the 0h time point. I apologize if a similar question has been answered elsewhere and I greatly appreciate any help!

RNASeq DESeq2 timecourse • 808 views

ADD COMMENT • link 22 months ago WS • 0

0

Entering edit mode

I just wanted to provide an update that for the analysis towards my first goal, I modified the approach to use a design of condition + time only and then generated DE results for each time point in the following way:

dds$group <- factor(paste0(dds$time, dds$condition))
design(dds) <- ~ group
dds <- DESeq(dds)


res_48h <- results(dds, contrast=c("group", "48DMSO", "48Drug"))
res_240h <- results(dds, contrast=c("group", "240DMSO", "240Drug"))
res_1008h <- results(dds, contrast=c("group", "1008DMSO", "1008Drug"))

I believe the original approach I described using ~ condition + time + condition:time and res <- results(dds, name="condX.time10h", test="Wald") is helpful to control for differences that existed for treatment vs. control at time 0h, but I only have one set of duplicates at time 0h. I hope this makes sense and if this does not seem like the proper way to achieve my first analysis goal, I really appreciate suggestions!

ADD REPLY • link 22 months ago WS • 0