Help with missing results levels in a multi-factor time-series data analysis using DESeq2
1
0
Entering edit mode
giltu1 • 0
@giltu1-10797
Last seen 8.6 years ago

Hello,

I have a complex time-series RNA-seq expression dataset which I'm currently trying to analyse using DESeq2.

I have two plant species, grown under two nitrate levels along six days of a heat wave experiment (a day before the heat exposure, three days of heat exposure, and two days of recovery). I also have the data from the last day, of plants (both species, with either nutrient levels) that were not exposed to heat at all (as a second control), so I tried analysing the data with and without these second control samples.   

Here is my coldata table (I'm presenting only the samples of species "At" to reduce the text, but it's the same for species "Ah"):

sampleID species temperature nitrate time_point
At_1C_1 At low high 1
At_2C_1 At low high 1
At_3C_1 At low high 1
At_1LN_1 At low low 1
At_2LN_1 At low low 1
At_3LN_1 At low low 1
At_1HS_2 At high high 2
At_2HS_2 At high high 2
At_3HS_2 At high high 2
At_1CS_2 At high low 2
At_2CS_2 At high low 2
At_3CS_2 At high low 2
At_1HS_3 At high high 3
At_2HS_3 At high high 3
At_3HS_3 At high high 3
At_1CS_3 At high low 3
At_2CS_3 At high low 3
At_3CS_3 At high low 3
At_1HS_4 At high high 4
At_2HS_4 At high high 4
At_3HS_4 At high high 4
At_1CS_4 At high low 4
At_2CS_4 At high low 4
At_3CS_4 At high low 4
At_1HS_5 At high high 5
At_2HS_5 At high high 5
At_3HS_5 At high high 5
At_1CS_5 At high low 5
At_2CS_5 At high low 5
At_3CS_5 At high low 5
At_1C_6 At low high 6
At_2C_6 At low high 6
At_3C_6 At low high 6
At_1HS_6 At high high 6
At_2HS_6 At high high 6
At_3HS_6 At high high 6
At_1LN_6 At low low 6
At_2LN_6 At low low 6
At_3LN_6 At low low 6
At_1CS_6 At high low 6
At_2CS_6 At high low 6
At_3CS_6 At high low 6

 

As one species is more tolerant to the treatment than the other, my main biological questions are

1. Which genes expressed differently between the two species under different nutrient treatments (species:nitrate interaction).

2. Which genes are differently regulated during the different stages (control, heat and recovery), hence the species:nitrate:time interaction with different contrasts.

However, when I'm running the test, it seems that some of the resultsNames levels are missing. 

Here is my code, when I'm ignoring the last day control samples (excluded from both counts and coldata tables):

dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata, design = ~species + nitrate + time_point + species:time_point + nitrate:time_point + species:nitrate + species:nitrate:time_point)

dds <- DESeq(dds)

## As a result I get:

resultsNames(dds)

[1] "Intercept"                         "species_At_vs_Ah"                 
 [3] "nitrate_low_vs_high"               "time_point_2_vs_1"                
 [5] "time_point_3_vs_2"                 "time_point_4_vs_1"                
 [7] "time_point_5_vs_2"                 "time_point_6_vs_1"               
 [9] "speciesAt.time_point2"             "speciesAt.time_point3"            
[11] "speciesAt.time_point4"             "speciesAt.time_point5"            
[13] "speciesAt.time_point6"            "nitratelow.time_point2"           
[15] "nitratelow.time_point3"            "nitratelow.time_point4"           
[17] "nitratelow.time_point5"            "nitratelow.time_point6"          
[19] "speciesAt.nitratelow.time_point1"  "speciesAt.nitratelow.time_point2" 
[21] "speciesAt.nitratelow.time_point3"  "speciesAt.nitratelow.time_point4" 
[23] "speciesAt.nitratelow.time_point5"  "speciesAt.nitratelow.time_point6"

I was expecting to get also interaction results for the "Ah" species and high nitrate levels, e.g. speciesAh.nitratelow.time_point2  and speciesAt.nitratehigh.time_point2, to contrast the two species. Am I missing something here?

I also tried to combine the species and nitrate factors to one species_nitrate factor, and running the test with the following design:

design(dds)
~species_nitrate + time_point + species_nitrate:time_point

## and then I got:

resultsNames(dds)

[1] "Intercept"                           "species_nitrate_Ah.low_vs_Ah.high"  
 [3] "species_nitrate_At.high_vs_Ah.high"  "species_nitrate_At.low_vs_Ah.high"  
 [5] "time_point_5_vs_2"                   "time_point_7_vs_2"                  
 [7] "time_point_8_vs_2"                   "time_point_9_vs_2"                  
 [9] "time_point_10_vs_2"                  "species_nitrateAh.low.time_point5"  
[11] "species_nitrateAt.high.time_point5"  "species_nitrateAt.low.time_point5"  
[13] "species_nitrateAh.low.time_point7"   "species_nitrateAt.high.time_point7" 
[15] "species_nitrateAt.low.time_point7"   "species_nitrateAh.low.time_point8"  
[17] "species_nitrateAt.high.time_point8"  "species_nitrateAt.low.time_point8"  
[19] "species_nitrateAh.low.time_point9"   "species_nitrateAt.high.time_point9" 
[21] "species_nitrateAt.low.time_point9"   "species_nitrateAh.low.time_point10" 
[23] "species_nitrateAt.high.time_point10" "species_nitrateAt.low.time_point10"

Where here, I'm missing the species_nitrate_At.low_vs_At.high and species_nitrate_At.low_vs_Ah.low results, as well as the results for the Ah.high interaction with time.

Can you please help me sort it out?

best,

Gil

 

R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.5 (Yosemite)
DESeq2_1.12.4
deseq2 • 850 views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 7 days ago
United States

hi,

The levels are not missing. When you have interaction terms in a formula and main effects, you will not have interaction terms for the reference level of the factors (e.g. species=Ah, nitrate=high, time point=2). The interaction terms are *additional* effects for the non-reference levels beyond the effect seen for the reference level. This is diagramed in the DESeq2 vignette section on interactions.

Given that you have a fairly complex experimental design with potential interactions between three factors, I'd recommend that you collaborate with a statistician to help with a design formula and how to extract the comparisons you'd like to make. The choice of design, and so the terms that show up in the model, is not one-to-one with the layout of the sample table, but involves some discussion and assumptions.

ADD COMMENT
0
Entering edit mode

Hi Michael,

Thanks for clearing that up.

Best,

Gil

ADD REPLY

Login before adding your answer.

Traffic: 621 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6