Search
Question: Multi-factor analysis Deseq2
0
gravatar for Ir5sE
22 days ago by
Ir5sE0
Ir5sE0 wrote:

 

Hi Mike and community,

I’m using theDESeq2 package for my RNA-seq analysis. I would like to get feedback on the choice of my models and applied contrasts. If there is a better model to answer our questions, I would appreciate any suggestions. 

I use a 3-level factor analysis:

Cultivar factor with 2 levels (Jonathan, Prima)

Time factor with 3 levels (T1, T2, T3)

Treatment factor with 2 levels (Infected, Control) with 3 biological replicates each.

Experiment: two cultivars have been inoculated witha pathogenic fungus, and sampleswere taken for sequencing at day 5 (time point1), 15 (time point 2), and 30 (time point3).  Control samples have been treated in the same manner, but inoculation has been done with water. One of the purpose of control samples was to remove genes that were expressed due to physical cut of the plant.

Aim:

1) Effect of treatment on cultivar

2) Effect of time on cultivar 

Complication: In order to compare the effect of treatment within and across cultivars, we have to reduce infected samples with control samples before making any comparison for each time point and cultivar. Therefore, I have nested cultivar with time point for my first question, and run full model using Wald test where I have used only interaction terms to compare effect of treatment within and between time points for each cultivar.

I have been wondering if using main effect + interaction term could give me the same control over treatment within each cultivar and time point without nesting?

1) Effect of treatment on cultivar:

My reference levels are in alphabetical order. Jonathan_t1, Control. 

full~Cultivar + Cultivar:Treatment

#> resultsNames(dds)

#[1] "Intercept"                          "Cultivar_Jonathan_t2_vs_Jonathan_t1" "Cultivar_Jonathan_t3_vs_Jonathan_t1"

#[4] "Cultivar_Prima_t1_vs_Jonathan_t1"   "Cultivar_Prima_t2_vs_Jonathan_t1"   "Cultivar_Prima_t3_vs_Jonathan_t1"  

#[7] "CultivarJonathan_t1.TreatmentInf"   "CultivarJonathan_t2.TreatmentInf"   "CultivarJonathan_t3.TreatmentInf"  

#[10] "CultivarPrima_t1.TreatmentInf"      "CultivarPrima_t2.TreatmentInf"      "CultivarPrima_t3.TreatmentInf"

 

In order to compare across cultivars while controlling for specific time points, I applied this contrast:

contrast = list(c("CultivarJonathan_t1.TreatmentInf","CultivarPrima_t1.TreatmentInf")), test = 'Wald', alpha = 0.01)

name = "CultivarJonathan_t1.TreatmentInf", test = 'Wald', alpha = 0.01

2) Effect of time on cultivar:

I want to detect genes that are differentially expressed due to time within cultivar and across cultivars, I didn’t nest any variables in the model, I have followed this post: DESeq2 time series contrasts between timepoints.  

full ~Cultivar + Treatment + Time + Cultivar:Time

reduced ~Cultivar + Treatment + Time, test="LRT"

> resultsNames(dds)

[1] "Intercept"                 "Cultivar_Prima_vs_Jonathan" "Treatment_Inf_vs_Cont"     "Time_T2_vs_T1"             "Time_T3_vs_T1"            

[6] "CultivarPrima.TimeT2"       "CultivarPrima.TimeT3"

> coldata
   SampleName Treatment Time    Cultivar
1    Jonathan       Inf   T1 Jonathan_t1
2    Jonathan       Inf   T1 Jonathan_t1
3    Jonathan       Inf   T1 Jonathan_t1
4    Jonathan      Cont   T1 Jonathan_t1
5    Jonathan      Cont   T1 Jonathan_t1
6    Jonathan      Cont   T1 Jonathan_t1
7       Prima       Inf   T1    Prima_t1
8       Prima       Inf   T1    Prima_t1
9       Prima       Inf   T1    Prima_t1
10      Prima      Cont   T1    Prima_t1
11      Prima      Cont   T1    Prima_t1
12      Prima      Cont   T1    Prima_t1
13   Jonathan       Inf   T2 Jonathan_t2
14   Jonathan       Inf   T2 Jonathan_t2
15   Jonathan       Inf   T2 Jonathan_t2
16   Jonathan      Cont   T2 Jonathan_t2
17   Jonathan      Cont   T2 Jonathan_t2
18   Jonathan      Cont   T2 Jonathan_t2
19      Prima       Inf   T2    Prima_t2
20      Prima       Inf   T2    Prima_t2
21      Prima       Inf   T2    Prima_t2
22      Prima      Cont   T2    Prima_t2
23      Prima      Cont   T2    Prima_t2
24      Prima      Cont   T2    Prima_t2
25   Jonathan       Inf   T3 Jonathan_t3
26   Jonathan       Inf   T3 Jonathan_t3
27   Jonathan       Inf   T3 Jonathan_t3
28   Jonathan      Cont   T3 Jonathan_t3
29   Jonathan      Cont   T3 Jonathan_t3
30   Jonathan      Cont   T3 Jonathan_t3
31      Prima       Inf   T3    Prima_t3
32      Prima       Inf   T3    Prima_t3
33      Prima       Inf   T3    Prima_t3
34      Prima      Cont   T3    Prima_t3
35      Prima      Cont   T3    Prima_t3
36      Prima      Cont   T3    Prima_t3

 

Thank you!

Iryna

ADD COMMENTlink modified 20 days ago by Michael Love20k • written 22 days ago by Ir5sE0
1
gravatar for Michael Love
20 days ago by
Michael Love20k
United States
Michael Love20k wrote:

hi Iryna

I think you can model everything in one go, and extract contrasts afterward. I tried to mock up your experiment (minus the replication) with this code:

cond <- gl(2,3,12) # control vs treated
cltv <- gl(2,6) # cultivar 1 or 2
time <- gl(3,1,12) # three time points

Then the following design, ~cltv + cltv:time + cond:cltv:time, gives you a cultivar difference at time point, differences for each cultivar at each time point, and a potential effect of treatment for each cultivar and time point. This sounds like the correct design for your experiment. You can test individual effects in this design using a Wald test, or if you want to test multiple coefficients, you can remove the columns from the design matrix and use a LRT. Are you familiar with how to build the model matrix:

mm <- model.matrix(~cltv + cltv:time + cond:cltv:time, data=colData(dds))

If you wanted to do the LRT, you can provide mm to DESeq() as the full argument, and then remove specific columns from mm and provide this reduced matrix to the reduced argument.

ADD COMMENTlink written 20 days ago by Michael Love20k

Thank you so much Michael for your answer!

I just want to make sure I'm correct with building and interpreting my contrasts.

1) Cultivar difference at a time point

reference level is cltv1: difference between cultivars at time point 3

results(dds, contrast = list(c("cltv2","cltv2.time3")), test = 'Wald') 

2) Differences for each cultivar at each time point.

reference level cultivar 2:  Gives the difference between time3 over reference level cultivar 

results(dds, name = "cltv1.time3")

3) Effect of treatment for each cultivar and time point. Here I'm a bit confused, as the most important comparison to make is contrast between cltv1 vs cltv2. My question is weather I can use only interaction terms to compare treatment effect across cultivars and specific time point (I) ?

base-level is cult2:`Will this contrast give the effect of treatment in cultivar 2 at time 3 vs2 controlling for individual baseline?

(I)results(dds, contrast = list(c("cltv2.time3.cond2", "cltv2.time2.cond2")), test = 'Wald')

Or shall I use main effect + interaction term(II)?`

difference between cult1 vs cult2 at time point 3 including the effect of treatment in the model?

(II) results(dds, contrast = list(c("cltv2","cltv2.time3.cond2")), test = 'Wald')

In case I need to use main effect and interaction term, how could I compare between cultivars? For instance, if I have more then 2 cultivars, shall I change base level to get in resultsName the needed main effect (cultivar 3 vs cultivar5), because only main effect Im getting is between cultivar and baseline cultivar. Would appreciate any help!

Thanks in advance,

Iryna

ADD REPLYlink modified 12 days ago • written 12 days ago by Ir5sE0
1

hi Iryna,

The design I suggested is not the easiest for extracting e.g. differences between various groups at every time point. (1) isn't correct as you have it, but rather than risk not getting it right, you can just combine all three factors into one and use ~group for those contrasts. You can take the 'dds' output from a first run using the design I have in my first answer, reassign the design ~group and then just run nbinomWaldTest() without re-running dispersion estimation to do this.

Yes, the (I) line of code you have is defining "effect of treatment in cultivar 2 at time 3 vs 2". You don't add the main effect because it would cancel out.

For your last question, there's not "controlling for treatment" in the interaction model. You have different effects for the two treatments.

ADD REPLYlink written 11 days ago by Michael Love20k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 462 users visited in the last hour