DESeq2 for a 3-way interaction analysis?
idapantoja • 0
@idapantoja-17586

Hi everyone!

I have a question about DESeq2 analysis. Can it be used to analyze a 3-way interaction?

In my experiment, we have 3 variables (Diet, Date, and Time) that need to be analyzed together.

Briefly, we had samples from subjects who followed 2 different diets for a period of 21 days. After that period, the samples were used to feed an in vitro fermentation system. We collected samples at several time points of the fermentation: 0, 5, 10, 24, and 48 hr. We want to know whether the microbiota structure changes, taking into account the interaction of Diet, Date (day 0, day 21), and Time (the 5 fermentation time points).

I have been trying to use DESeq2 to analyze this interaction, but for some reason it is not working. I am wondering whether my model is too complicated for this type of analysis.

These are the steps I have been following:

Documents:

Metadata: metadata_RS_no_pool_DESeq2

Count data: Taxonomy_Count_Data_RS_transpose_no_pool_DESeq2_2

Commands:

coldata <- metadata_RS_no_pool_DESeq2

mat <- Taxonomy_Count_Data_RS_transpose_no_pool_DESeq2_2

dds <- DESeqDataSetFromMatrix(countData = as.matrix(mat), colData = DataFrame(coldata), design = formula(~Diet + Date + Time))

mm1<-model.matrix(~Diet + Date + Time + Diet:Date + Date:Time + Diet:Time + Diet:Date:Time + Diet:SubjectID, coldata) #Full model.

idx<-which(colSums(mm1==0)==nrow(mm1))                                                                                                  

mm1<-mm1[,-idx]

mm0<-model.matrix(~Diet + Date + Time + Diet:Date + Date:Time + Diet:Time + Diet:SubjectID, coldata) #Reduced model.

idx<-which(colSums(mm0==0)==nrow(mm0))

mm0<-mm0[,-idx]

ddsOTU <- DESeq(dds, betaPrior=FALSE, full=mm1)

But this is what I get:

using supplied model matrix

estimating size factors

estimating dispersions

gene-wise dispersion estimates

Error in solve.default(qr.R(qrx)) : 'a' (1 x 0) must be square

I would appreciate any feedback on this issue.

Thanks in advance!

Ida         
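
A note on the column-removal step in the commands above, offered as something to rule out rather than as a confirmed diagnosis: in base R, if `which()` finds no all-zero columns it returns an empty index, and negating an empty index still selects nothing, so `mm1[,-idx]` would return a matrix with zero columns. The toy matrix below is invented for illustration and is not the poster's data.

```r
# Toy design matrix in which no column is entirely zero (illustrative only).
mm <- cbind(intercept = 1, dietMRE = c(0, 0, 1, 1), time = c(0, 5, 0, 5))

# Pattern from the post: indices of all-zero columns. Here there are none.
idx <- which(colSums(mm == 0) == nrow(mm))
length(idx)   # 0

# Negating an empty index is still an empty index, so every column is dropped.
broken <- mm[, -idx]
ncol(broken)  # 0

# Safer: keep columns with a logical mask; drop = FALSE keeps the matrix shape.
keep <- colSums(mm != 0) > 0
fixed <- mm[, keep, drop = FALSE]
ncol(fixed)   # 3
```

The logical mask behaves the same whether or not any column is all zero, which makes it a safer default than negative indexing.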


hi Ida,

I don't follow the difference between date and time. How many samples do you have at day 0? Is there a time series at day 0 or only at day 21? How many biological replicates do you have? Maybe it's easiest to just print the colData?


Hi Michael,

Thanks for the quick response. Here is the colData. Because the table is long, I have split it across two posts:

index SubjectID Time Diet Date
Natick.240 B 0 MRE 0
Natick.241 B 5 MRE 0
Natick.242 B 10 MRE 0
Natick.243 B 24 MRE 0
Natick.244 B 48 MRE 0
Natick.247 C 0 MRE 0
Natick.248 C 5 MRE 0
Natick.249 C 10 MRE 0
Natick.250 C 24 MRE 0
Natick.251 C 48 MRE 0
Natick.252 D 0 MRE 21
Natick.253 D 5 MRE 21
Natick.254 D 10 MRE 21
Natick.255 D 24 MRE 21
Natick.256 D 48 MRE 21
Natick.257 E 0 MRE 21
Natick.258 E 5 MRE 21
Natick.259 E 10 MRE 21
Natick.260 E 24 MRE 21
Natick.261 E 48 MRE 21
Natick.262 F 0 MRE 21
Natick.263 F 5 MRE 21
Natick.264 F 10 MRE 21
Natick.265 F 24 MRE 21
Natick.266 F 48 MRE 21
Natick.269 A 0 HAB 0
Natick.270 A 5 HAB 0
Natick.271 A 10 HAB 0
Natick.272 A 24 HAB 0
Natick.273 A 48 HAB 0
Natick.274 B 0 HAB 0
Natick.275 B 5 HAB 0
Natick.276 B 10 HAB 0
Natick.277 B 24 HAB 0
Natick.278 B 48 HAB 0
Natick.279 C 0 HAB 0
Natick.280 C 5 HAB 0
Natick.281 C 10 HAB 0
Natick.282 C 24 HAB 0
Natick.283 C 48 HAB 0
Natick.284 D 0 HAB 21
Natick.285 D 5 HAB 21
Natick.286 D 10 HAB 21
Natick.287 D 24 HAB 21
Natick.288 D 48 HAB 21
Natick.289 E 0 HAB 21
Natick.290 E 5 HAB 21
Natick.291 E 10 HAB 21
Natick.292 E 24 HAB 21
Natick.293 E 48 HAB 21
Natick.294 F 0 HAB 21
Natick.295 F 5 HAB 21
Natick.296 F 10 HAB 21
Natick.297 F 24 HAB 21
Natick.298 F 48 HAB 21
Natick.300 A 0 MRE 0
Natick.301 A 5 MRE 0
Natick.302 A 10 MRE 0
Natick.303 A 24 MRE 0
Natick.304 A 48 MRE 0

The whole experiment is divided into two parts. The first part was done with volunteers randomized to one of two diets (MRE vs. HAB, where HAB is the habitual diet) for a period of 21 days. We took samples before the diet started, defined as day 0, and after it finished, defined as day 21. This is what I call "Date" in my metadata. To feed the in vitro fermentation system, we pooled volunteers for each diet: five volunteer samples from the MRE diet were combined into one pool, and five from the HAB diet into another. These are fecal samples.

We used those pools to feed the in vitro fermentation system (samples from MRE and HAB at day 0 and day 21). "Time" in my metadata refers to the 5 time points at which we collected samples during fermentation: 0 hr (inoculation), 5 hr, 10 hr, 24 hr, and 48 hr. At each time point we had 3 replicates, identified by the letters in the SubjectID column (A, B, C at day 0 and D, E, F at day 21).

At day 0 we have 30 samples, and at day 21 another 30. As you can see, the design is pretty complicated: we have the effect of the diet over 21 days and the effect of the fermentation hours.

Let me know if anything is unclear.

Thanks!

PS. Sorry for dividing it but the limit is 5000 characters.

@mikelove

I would approach this dataset by combining diet and date into one factor called group. Then you can use a design of ~group + group:sample + group:time. Because the groups have different numbers of samples, you'll have to use the approach you have above to remove the columns of the model matrix that are all 0's.
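
The suggested design could be sketched in base R roughly as follows. This is only an illustration under stated assumptions: the sample table is rebuilt synthetically to mimic the layout posted in the thread, `rep_in_group` is a hypothetical column name standing in for "sample" (replicates renumbered within each group), and fermentation time is treated as a categorical factor, which the thread does not specify.

```r
# Synthetic sample table mimicking the posted layout (60 rows, illustrative).
coldata <- expand.grid(
  Time = c(0, 5, 10, 24, 48),
  Rep  = 1:3,
  Date = c(0, 21),
  Diet = c("HAB", "MRE")
)

# Combine Diet and Date into a single four-level factor.
coldata$group <- factor(paste(coldata$Diet, coldata$Date, sep = "_"))

# Assumption: replicates are nested within group, numbered 1..3 per group
# (hypothetical column name `rep_in_group`).
coldata$rep_in_group <- factor(coldata$Rep)

# Assumption: fermentation time is modeled as a categorical variable.
coldata$time <- factor(coldata$Time)

mm <- model.matrix(~ group + group:rep_in_group + group:time, coldata)

# Drop all-zero columns with a logical mask (safe even when none are zero).
mm <- mm[, colSums(mm != 0) > 0, drop = FALSE]

# Check that the matrix has full column rank before handing it to DESeq2.
full_rank <- qr(mm)$rank == ncol(mm)

# With DESeq2 loaded and `dds` built from the real counts, the matrix could
# then be supplied as in the original post:
# ddsOTU <- DESeq(dds, full = mm, betaPrior = FALSE)
```

In this balanced toy table no column is all zero, so the mask removes nothing; it matters when groups have unequal numbers of replicates.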
