DESeq2 Factorial Design correcting for Batch + Lane + RIN
1
0
Entering edit mode
@choishingwan-9327
Last seen 8.4 years ago
Hong Kong

So recently we have got sample back with the following design

Condition Treatment Lane Batch RIN
Case Treated 1 1 7.7
Case Treated 2 1 7.7
Case Treated 1 4 7.6
Control Treated 2 4 8.1
Control Treated 1 1 7.8
Control Treated 2 1 7.7
Case Untreated 1 4 7.5
Case Untreated 2 3 7.8
Case Untreated 2 2 7.9
Control Untreated 2 1 7.4
Control Untreated 1 3 8
Control Untreated 1 2 7.8
data.frame(Condition=rep(rep(c("Case","Control"), each=3),2), Treatment=rep(c("Treated", "Untreated"),each=6), Lane=c(1,2,1,2,1,2,1,2,2,2,1,1), Batch=c(1,1,4,4,1,1,4,3,2,1,3,2), RIN=c(7.7,7.7,7.6,8.1,7.8,7.7,7.5,7.8,7.9,7.4,8,7.8))

What we would like to do is to:

1. Compare effect of treatment in case 

2. Compare effect of treatment in control

3. Compare and contrast the effect of treatment in case when compared to control.

From previous answers, it seems like there are two different model that we can use:

1.  ~Batch+Lane+RIN+Condition+Condition:Treatment

2. ~Batch+Lane+RIN+Condition+Group

Where group is <Treatment><Condition>

Take for example, when we try to compare the difference of condition when there is no treatment

e.g. Untreated Case vs Untreated Control

For 1, we use

dds = DESeqDataSetFromMatrix(countData=countdata, colData=coldata, design = ~Batch+RIN+Lane+Condition+Condition:Treatment)
dds$Condition = relevel(dds$Condition, ref="Control")
dds$Treatment = relevel(dds$Treatment, ref="Untreated")
dds = DESeq(dds)
results(dds, contrast=c("Condition", "Control", "Case"))

For 2 we use

coldata$Group = paste0(coldata$Treatment, "-", coldata$Condition)
dds = DESeqDataSetFromMatrix(countData=countdata, colData=coldata, design = ~Batch+RIN+Lane+Group)
dds = DESeq(dds)
results(dds, contrast=c("Group", "Untreated-Control", "Untreated-Case"))

However, I found that the two design actually give very different results.

Using the interaction term model, we will get around 438 genes with padj < 0.05 whereas none of the genes can anywhere close to significance when we use the group model.

Is there something that I did wrongly? Which one is the more appropriate method? And most importantly, why was there such a large difference?

deseq2 • 3.2k views
ADD COMMENT
4
Entering edit mode
@mikelove
Last seen 12 hours ago
United States

It's hard to say what's going on. Can you post the version number as well? If you haven't yet, make sure to update to the latest version 1.10.

I wouldn't ever add RIN to the design. This doesn't make sense in terms of the model and may be causing problems.

ADD COMMENT
0
Entering edit mode

Thank you Michael,

I am currently using DESeq2 version 1.10.0. I was adding the the RIN just to see if the RNA degradation might affect the results. I have tried to get the correlation of the p-value (which might not be correct but was only try to get an idea) of all possible combinations and the results from model 1 and model 2 doesn't agrees (the correlation file is here). So I am uncertain whether if there is some problem with my scripts...

(Forgot to mention, the rows with Interact as prefix are those that uses the interaction model).
 

Below is my R sessionInfo():

R version 3.2.2 (2015-08-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_Hong Kong SAR.1252  LC_CTYPE=English_Hong Kong SAR.1252    LC_MONETARY=English_Hong Kong SAR.1252
[4] LC_NUMERIC=C                           LC_TIME=English_Hong Kong SAR.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] DESeq2_1.10.0              RcppArmadillo_0.6.300.2.0  Rcpp_0.12.2                SummarizedExperiment_1.0.1
 [5] Biobase_2.30.0             GenomicRanges_1.22.1       GenomeInfoDb_1.6.1         IRanges_2.4.4             
 [9] S4Vectors_0.8.3            BiocGenerics_0.16.1       

loaded via a namespace (and not attached):
 [1] RColorBrewer_1.1-2   futile.logger_1.4.1  plyr_1.8.3           XVector_0.10.0       futile.options_1.0.0
 [6] tools_3.2.2          zlibbioc_1.16.0      rpart_4.1-10         digest_0.6.8         RSQLite_1.0.0       
[11] annotate_1.48.0      gtable_0.1.2         lattice_0.20-33      DBI_0.3.1            proto_0.3-10        
[16] gridExtra_2.0.0      genefilter_1.52.0    cluster_2.0.3        stringr_1.0.0        locfit_1.5-9.1      
[21] nnet_7.3-10          grid_3.2.2           AnnotationDbi_1.32.0 XML_3.98-1.3         survival_2.38-3     
[26] BiocParallel_1.4.0   foreign_0.8-65       latticeExtra_0.6-26  Formula_1.2-1        geneplotter_1.48.0  
[31] ggplot2_1.0.1        reshape2_1.4.1       lambda.r_1.1.7       magrittr_1.5         scales_0.3.0        
[36] Hmisc_3.17-0         MASS_7.3-43          splines_3.2.2        xtable_1.8-0         colorspace_1.2-6    
[41] stringi_1.0-1        acepack_1.3-3.3      munsell_0.4.2       
ADD REPLY
0
Entering edit mode

Repeat the analysis but definitely do not include RIN in the design. Then compare 1 and 2.

ADD REPLY
0
Entering edit mode

It is slightly better in that both doesn't return any significance, however, the p-value still differs in that their correlation is only around 0.6827998

Really thank you for your help Michael

ADD REPLY
0
Entering edit mode

Correlation on p-values isn't a great comparison. Better is a cross tabulation:

table(res1$padj < .1, res2$padj < .1)

Remember though, that they are not identical methods so the results are not identical (2 uses LFC shrinkage and the 1 does not). My advice is if you are interested in testing the interaction term use 1, if you are interested in comparing groups use 2.

ADD REPLY
0
Entering edit mode

Thank you Michael. None of them return any  significance so I tried to use pvalue < 0.05 instead

    Model 2  
    False True
Model1 False 29994 0
  True 20 51

If that is normal, maybe I will stick with model 2 just because it is much simpler to understand. Thanks again for your timely feedback!

ADD REPLY

Login before adding your answer.

Traffic: 506 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6