Deseq 2 multiple comparisons (3 groups, 24 samples)
1
0
Entering edit mode
@matthewbaldwin-17095
Last seen 2.3 years ago
United Kingdom

Dear all, 

I am hoping you can help the correct test for my experimental design which is as follows: I tested the cells of 3 different donors (rotator cuff tears), on 8 different nano fibre scaffolds (different anisotropy; random vs aligned and different fibre diameters; 300nm,1000nm,2000 and 4000). 

I wish to compare the effects of each nano scaffold against all other scaffolds e.g. aligned300 vs aligned 1000 or aligned300 vs random4000, creating 56 possible comparisons. 

Do do this I have constructed the following data frame:

samplenames<-meta_data$SampleID 
samples <- data.frame((samplenames),donor=as.factor(c(rep("1",4),rep("2",4),rep("3",4))), anisotrophy=as.factor(c(rep("aligned",12), rep("random",12))), diameter=as.factor(rep(c("300","1000","2000","4000"),6)))​

X.samplenames.

<fctr>

donor

<fctr>

anisotrophy

<fctr>

diameter

<fctr>

 
i25-Healthy1-Aligned-300 1 aligned 300  
i26-Healthy1-Aligned-1000 1 aligned 1000  
i27-Healthy1-Aligned-2000 1 aligned 2000  
i28-Healthy1-Aligned-4000 1 aligned 4000  
i33-Healthy2-Aligned-300 2 aligned 300  
i34-Healthy2-Aligned-1000 2 aligned 1000  
i35-Healthy2-Aligned-2000 2 aligned 2000  
i36-Healthy2-Aligned-4000 2 aligned 4000  
i41-Healthy3-Aligned-300 3 aligned 300  
i42-Healthy3-Aligned-1000 3 aligned 1000

 

       
         

Q1: Having read the Deseq2 manual I think that in order to undertake the multiple comparisons, I need to perform the LRT analysis, is this correct?

With this in mind I have undertaken the following:

dds_multi<- DESeqDataSetFromMatrix(countData =healthy, colData=samples, design=~donor+anisotrophy+diameter+donor:anisotrophy+donor:diameter+anisotrophy:diameter)
keep <- rowSums(counts(dds_multi)) >= 10
dds <- dds_multi[keep,]
dds_multi <- DESeq(dds_multi, test="LRT", reduced=~donor + anisotrophy + diameter)​

This gives me the following results:

resultsNames(dds_multi)​
[1] "Intercept"                      "donor_2_vs_1"                  
 [3] "donor_3_vs_1"                   "anisotrophy_random_vs_aligned" 
 [5] "diameter_2000_vs_1000"          "diameter_300_vs_1000"          
 [7] "diameter_4000_vs_1000"          "donor2.anisotrophyrandom"      
 [9] "donor3.anisotrophyrandom"       "donor2.diameter2000"           
[11] "donor3.diameter2000"            "donor2.diameter300"            
[13] "donor3.diameter300"             "donor2.diameter4000"           
[15] "donor3.diameter4000"            "anisotrophyrandom.diameter2000"
[17] "anisotrophyrandom.diameter300"  "anisotrophyrandom.diameter4000"

I then report each comparison in turn until I get results for each of the 56 possible comparisons, for example:

res1<- results(dds_multi, alpha = 0.05, name = "diameter_2000_vs_1000", test="Wald")​ or 
res2<- results(dds_multi, alpha = 0.05, contrast=list(c("diameter_2000_vs_1000","anisotrophyrandom.diameter2000")),  test="Wald") 

Q2: Is this the correct way to do this? 

 

thanks for any help that can be offered,

Mat

 

deseq • 1.3k views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 2 minutes ago
United States

I’m not sure doing all pairwise comparisons of 8 levels is the best way to approach this (and the interaction model doesn’t seem to be what you want I’d guess). What is your end goal with this analysis?

ADD COMMENT
0
Entering edit mode

 

Thank Michael, 

Grateful for your help. The goal for the analysis are several fold:

1) Which factor (anisotrophy or diameter) produces the greatest change at an RNA level for tendon cells

2) Which genes are influenced by different anisotrophy (random vs aligned)

3) Which genes are influences by different diameter (300, 1000, 2000, 4000nm). And is the effect of diameter further modified by the scaffolds overall alignment (e.g. 300 random vs 300 random)

thanks, 

Mat

 

 

ADD REPLY
0
Entering edit mode

Did you consider modeling diameter as a numeric? Do you have any idea how expression will change over diameter?

ADD REPLY
0
Entering edit mode

Hi Michael, 

It would be fine to model as a diameter as a numeric e.g. 0.3, 1,2,4. We believe that the greatest difference will be for the smallest i.e. 0.3um and largest diameters i.e. 4um

ADD REPLY
0
Entering edit mode

So one option is ~donor + condition + diameter, with diameter as a numeric. This assumes no interaction between condition and diameter. Another option is ~donor + condition + condition:diameter, where you will get different diameter slopes for the two conditions.

ADD REPLY
0
Entering edit mode

Thanks Michael. 

I have thought further and noted in the vignette about grouping interactions. I have refined my 3 questions:

1) For scaffolds with aligned fibre orientation, what is the effect of fibre diameter ?( i.e. 300Aligned vs 1000Aligned, 300Aligned vs 2000Aligned and 300Aligned vs 4000Aligned)

2) For scaffolds with random fibre orientation, what is effect of fibre diameter ? (i.e. 300random vs 1000random, 300random vs 2000random  and 300random vs 4000random)

3) For a fixed fibre diameter, what effect does loss of scaffold alignment have? (i.e 300random vs 300 aligned, 1000random vs 1000aligned, 2000random vs 2000aligned, 4000random vs 4000aligned).

As such I have simplified the design to:

dds_disease<- DESeqDataSetFromMatrix(countData =disease, colData=samples_disease, design=~donor+anisotrophy+diameter+anisotrophy:diameter)
keep <- rowSums(counts(dds_disease)) >= 10
dds_disease <- dds_disease[keep,]
dds_disease$group <- factor(paste0(dds_disease$anisotrophy, dds_disease$diameter))
dds_disease$group <- relevel(dds_disease$group, ref = "aligned300")
design(dds_disease) <- ~ group
dds_disease <- DESeq(dds_disease)
resultsNames(dds_disease)

This produces the following comparisons:

[1] "Intercept"                      "group_aligned300_vs_random300"  "group_aligned1000_vs_random300" "group_aligned2000_vs_random300" "group_aligned4000_vs_random300" "group_random1000_vs_random300"  "group_random2000_vs_random300"  "group_random4000_vs_random300"

This will allow me to answer question 1. For example: 

res_disease_LFC <- lfcShrink(dds_disease, coef="group_aligned4000_vs_aligned300", type="apeglm")

However, since I cannot use the contrast argument under the lfcShrink function, to generate the comparisons I want I think I need to relevel and then rerun the Deseq2 function. Is this permitted? i.e. 

dds_disease$group <- relevel(dds_disease$group, ref = "random300")
dds_disease <- DESeq(dds_disease)
resultsNames(dds_disease)

This then generates the below comparisons, which I can use to answer question 2: 

"Intercept"                      "group_aligned300_vs_random300"  "group_aligned1000_vs_random300" "group_aligned2000_vs_random300"  "group_aligned4000_vs_random300" "group_random1000_vs_random300"  "group_random2000_vs_random300"  "group_random4000_vs_random300" 

thanks again for all the help. Very much appreciated, 

Mat

ADD REPLY
0
Entering edit mode

hi Mat,

Looks good. Yes, to use apeglm for LFC shrinkage, you need to relevel, but you only need to rerun nbinomWaldTest() actually, which should be much faster. So you don't need to rerun dispersion estimation.

ADD REPLY

Login before adding your answer.

Traffic: 845 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6