Hi everyone!
I have a question about DESeq2. Can it be used to analyze a 3-way interaction?
In my experiment, we have 3 variables, Diet + Date + Time, that need to be analyzed together.
Briefly, we had samples from subjects that followed 2 different diets for a period of 21 days. After that period, the samples were used to feed an in vitro fermentation system. We collected samples at several time points during fermentation: 0, 5, 10, 24, and 48 hr. We want to know if there are changes in the microbiota structure, taking into account the interaction of Diet, Date (day 0, day 21), and Time (the 5 fermentation time points).
I have been trying to use DESeq2 to analyze this interaction, but for some reason it is not working. I am wondering if my model is too complicated for this type of analysis.
These are the steps I have been following:
Documents:
Metadata: metadata_RS_no_pool_DESeq2
Count data: Taxonomy_Count_Data_RS_transpose_no_pool_DESeq2_2
Commands:
coldata <- metadata_RS_no_pool_DESeq2
mat <- Taxonomy_Count_Data_RS_transpose_no_pool_DESeq2_2
dds <- DESeqDataSetFromMatrix(countData = as.matrix(mat), colData = DataFrame(coldata), design = formula(~Diet + Date + Time))
mm1<-model.matrix(~Diet + Date + Time + Diet:Date + Date:Time + Diet:Time + Diet:Date:Time + Diet:SubjectID, coldata) #Full model.
idx<-which(colSums(mm1==0)==nrow(mm1))
mm1<-mm1[,-idx]
mm0<-model.matrix(~Diet + Date + Time + Diet:Date + Date:Time + Diet:Time + Diet:SubjectID, coldata) #Reduced model.
idx<-which(colSums(mm0==0)==nrow(mm0))
mm0<-mm0[,-idx]
ddsOTU <- DESeq(dds, betaPrior=FALSE, full=mm1)
But this is what I get:
using supplied model matrix
estimating size factors
estimating dispersions
gene-wise dispersion estimates
Error in solve.default(qr.R(qrx)) : 'a' (1 x 0) must be square
I would appreciate any feedback on this issue.
Thanks in advance!
Ida
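A note on the commands above: one possible source of a zero-column matrix like the one in the error message (not confirmed in this thread) is the `mm1[,-idx]` step. In R, if `which()` finds no all-zero columns, `idx` is empty, and indexing with `-idx` then selects no columns at all rather than all of them. The sketch below reproduces this with base R only; the `coldata` here is made-up toy metadata, and `broken`, `keep`, and `fixed` are illustrative names.

```r
# Toy metadata (hypothetical; not the poster's real coldata).
coldata <- expand.grid(
  Diet = factor(c("HAB", "MRE")),
  Date = factor(c("d0", "d21")),
  Time = factor(c("0h", "5h", "10h", "24h", "48h")),
  Rep  = c("A", "B", "C")
)
mm <- model.matrix(~ Diet * Date * Time, coldata)

# No column is all zero in this toy design, so idx is empty.
idx <- which(colSums(mm == 0) == nrow(mm))
length(idx)   # 0

# Pitfall: -integer(0) is still integer(0), which selects NO columns.
broken <- mm[, -idx]
dim(broken)   # 60 0

# Safer: a logical index keeps every column with a non-zero entry,
# and behaves correctly whether or not any column needs dropping.
keep  <- colSums(mm != 0) > 0
fixed <- mm[, keep, drop = FALSE]
dim(fixed)    # 60 20
```

If `dim(mm1)` reports 0 columns just before the `DESeq()` call, this indexing step would be the first thing to check.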
hi Ida,
I don't follow the difference between date and time. How many samples do you have at day 0? Is there a time series at day 0 or only at day 21? How many biological replicates do you have? Maybe it's easiest to just print the colData?
Hi Michael,
Thanks for the quick response. Here is the information from the coldata. As it is a lot of information, I will divide the table into 2 posts:
The whole experiment is divided into two parts. The first part was done with volunteers randomized to one of two diets (MRE vs. HAB, where HAB is the habitual diet). This part lasted 21 days. We had samples from before the diet started, defined as day 0, and samples from after the diet finished, defined as day 21. This is what I call "Date" in my metadata. To feed the in vitro fermentation system, we pooled volunteers for each diet: five volunteer samples from the MRE diet were combined into one pool, and five volunteer samples from the HAB diet into another. These are fecal samples.
We used those pools to feed the in vitro fermentation system (samples from MRE and HAB at day 0 and day 21). "Time" in my metadata refers to the 5 time points at which we collected samples during the fermentation [0hr (inoculation), 5hr, 10hr, 24hr, and 48hr]. At each time point we had 3 replicates defined by A,B,C,D.
At day 0 we have 30 samples and at day 21, 30 samples. As you can see, the design is pretty complicated. We have the effect of the diet over 21 days, and the effect of the fermentation hours.
Let me know if anything is unclear.
Thanks!
PS. Sorry for dividing it but the limit is 5000 characters.
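For readers following along, the design described above can be checked in base R before anything is passed to DESeq2. The sketch below is only an illustration under stated assumptions: the metadata is rebuilt by hand as 2 diets x 2 dates x 5 times with 3 replicates per cell (which matches the 30 samples per day), the level labels are invented, and no subject term is included. It tabulates the cells, confirms the full model matrix has full column rank, and lists the three-way interaction coefficients.

```r
# Hypothetical metadata: 2 diets x 2 dates x 5 times x 3 replicates = 60 samples.
times <- c("0h", "5h", "10h", "24h", "48h")
coldata <- expand.grid(
  Rep  = c("A", "B", "C"),
  Time = factor(times, levels = times),
  Date = factor(c("d0", "d21")),
  Diet = factor(c("HAB", "MRE"))
)

# Every Diet x Date x Time cell should contain the same number of samples.
cell_counts <- with(coldata, table(Diet, Date, Time))

# Full model with the three-way term, and the reduced model without it.
mm_full <- model.matrix(~ Diet * Date * Time, coldata)
mm_red  <- model.matrix(~ (Diet + Date + Time)^2, coldata)

# All coefficients are estimable only if the matrix has full column rank.
full_rank <- qr(mm_full)$rank == ncol(mm_full)

# Columns present only in the full model are the three-way coefficients.
three_way <- setdiff(colnames(mm_full), colnames(mm_red))

# With DESeq2 loaded and a DESeqDataSet `dds` built from the real data,
# a likelihood ratio test on the three-way term could then be requested as:
#   dds <- DESeq(dds, test = "LRT", full = mm_full, reduced = mm_red)
```

Under these assumptions the full model has 20 coefficients, 4 of them for the three-way interaction, so 60 samples are enough to fit it. Adding a subject term to pooled samples may change that and would need its own rank check.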