I want to know the most appropriate way to compare one group within another group when doing a differential expression test.
In our analysis, we have two cell populations (CD11b, TN) and two treatment conditions (UT, DBP.FITC). I want to know how I should compare different treatments within a cell population (e.g. within TN, compare UT vs DBP.FITC). Here's the column data:
Label Replicate Condition CellPop test 1500-1 TN.UT.1 1 UT TN TN.UT 1500-11-LY CD11b+.UT.4 4 UT CD11b+ CD11b+.UT 1500-12 CD11b+.UT.3 3 UT CD11b+ CD11b+.UT 1500-13 CD11b+.DBP+FITC.1 1 DBP+FITC CD11b+ CD11b+.DBP+FITC 1500-14 CD11b+.DBP+FITC.2 2 DBP+FITC CD11b+ CD11b+.DBP+FITC 1500-15 CD11b+.DBP+FITC.3 3 DBP+FITC CD11b+ CD11b+.DBP+FITC 1500-2 TN.UT.2 2 UT TN TN.UT 1500-3 TN.UT.3 3 UT TN TN.UT 1500-4 TN.DBP+FITC.1 1 DBP+FITC TN TN.DBP+FITC 1500-5 TN.DBP+FITC.2 2 DBP+FITC TN TN.DBP+FITC 1500-6 TN.DBP+FITC.3 3 DBP+FITC TN TN.DBP+FITC 1890-17 CD11b+.UT.r.2 2 UT CD11b+ CD11b+.UT
The DESeq2 manual suggests that combining the two factors is an appropriate way to do the analysis:
Many users begin to add interaction terms to the design formula, when in fact a much simpler approach would give all the results tables that are desired. We will explain this approach fi rst, because it is much
simpler to perform. If the comparisons of interest are, for example, the effect of a condition for diff erent sets of samples, a simpler approach than adding interaction terms explicitly to the design formula
is to perform the following steps:
1. combine the factors of interest into a single factor with all combinations of the original factors
2. change the design to include just this factor, e.g. '~ group'
Using this design is similar to adding an interaction term, in that it models multiple condition effects which can be easily extracted with 'results'.
So using this approach, I can compare my groups by only including the combined term 'test':
dds.sub <-DESeqDataSetFromMatrix( countData = collapsed.paper.mat, colData = paper.samples.df, design = ~ test);
And then results can be extracted from this:
> head(results(dds.sub, contrast=list(c("testTN.DBP.FITC"), c("testTN.UT")))) log2 fold change (MAP): testTN.DBP.FITC vs testTN.UT Wald test p-value: testTN.DBP.FITC vs testTN.UT DataFrame with 6 rows and 6 columns ...
But I want to make sure that I'm taking into account variation that is attributable to the cell population. Can I include the separate factors only?:
dds.sub <-DESeqDataSetFromMatrix( countData = collapsed.paper.mat, colData = paper.samples.df, design = ~ CellPop + Condition);
Or include an interaction term?:
dds.sub <-DESeqDataSetFromMatrix( countData = collapsed.paper.mat, colData = paper.samples.df, design = ~ CellPop + Condition + CellPop:Condition);
Once I've done that, how do I get results out of it? Unfortunately, my factor foo is not sufficient for me to be able to work out how to implement the interaction model and get the right results out. Here is my best attempt trying to pull out results without the interaction term:
"design = ~ CellPop + Condition"
> print(resultsNames(dds.sub));  "Intercept" "CellPopCD11b." "CellPopTN" "ConditionDBP.FITC" "ConditionUT" > head(results(dds.sub, contrast=list(c("CellPopTN", "ConditionDBP.FITC"), c("CellPopTN", "ConditionUT")))) Error in head(results(dds.sub, contrast = list(c("CellPopTN", "ConditionDBP.FITC"), : error in evaluating the argument 'x' in selecting a method for function 'head': Error in cleanContrast(object, contrast, expanded = isExpanded, listValues = listValues, : elements in the contrast list should only appear in the numerator (first element of contrast list) or the denominator (second element of contrast list), but not both
And also with the interaction term:
"design = ~ CellPop + Condition + CellPop:Condition"
> print(resultsNames(dds.sub));  "Intercept" "CellPop_TN_vs_CD11b." "Condition_UT_vs_DBP.FITC" "CellPopTN.ConditionUT" > head(results(dds.sub, contrast=list(c("CellPopTN.ConditionUT"), c("CellPopTN.ConditionDBP.FITC")))) Error in head(results(dds.sub, contrast = list(c("CellPopTN.ConditionUT"), : error in evaluating the argument 'x' in selecting a method for function 'head': Error in cleanContrast(object, contrast, expanded = isExpanded, listValues = listValues, : all elements of the contrast as a list of length 2 should be elements of 'resultsNames(object)'
How can I get this working?
Here's the session info, for reference:
> sessionInfo() R version 3.1.3 (2015-03-09) Platform: x86_64-redhat-linux-gnu (64-bit) Running under: Red Hat Enterprise Linux Server release 6.5 (Santiago) locale:  LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8  LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C  LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages:  grid parallel stats4 stats graphics grDevices utils datasets methods base other attached packages:  PKI_0.1-3 base64enc_0.1-3 VennDiagram_1.6.16 futile.logger_1.4.1 shiny_0.12.2 FactoMineR_1.31.3  stringr_1.0.0 DESeq2_1.6.3 RcppArmadillo_0.6.100.0.0 Rcpp_0.12.1 GenomicRanges_1.18.4 GenomeInfoDb_1.2.5  IRanges_2.0.1 S4Vectors_0.4.0 BiocGenerics_0.12.1 ellipse_0.3-8 car_2.0-25 loaded via a namespace (and not attached):  acepack_1.3-3.3 annotate_1.44.0 AnnotationDbi_1.28.2 BatchJobs_1.6 BBmisc_1.9 Biobase_2.26.0 BiocParallel_1.0.3  brew_1.0-6 checkmate_1.6.2 cluster_2.0.3 codetools_0.2-14 colorspace_1.2-6 DBI_0.3.1 digest_0.6.8  fail_1.3 flashClust_1.01-2 foreach_1.4.2 foreign_0.8-66 Formula_1.2-1 futile.options_1.0.0 genefilter_1.48.1  geneplotter_1.44.0 ggplot2_1.0.1 gridExtra_2.0.0 gtable_0.1.2 Hmisc_3.17-0 htmltools_0.2.6 httpuv_1.3.3  iterators_1.0.7 jsonlite_0.9.17 lambda.r_1.1.7 lattice_0.20-33 latticeExtra_0.6-26 leaps_2.9 lme4_1.1-9  locfit_1.5-9.1 magrittr_1.5 MASS_7.3-43 Matrix_1.2-2 MatrixModels_0.4-1 mgcv_1.8-7 mime_0.4  minqa_1.2.4 munsell_0.4.2 nlme_3.1-122 nloptr_1.0.4 nnet_7.3-10 pbkrtest_0.4-2 plyr_1.8.3  proto_0.3-10 quantreg_5.19 R6_2.1.1 RColorBrewer_1.1-2 reshape2_1.4.1 rpart_4.1-10 RSQLite_1.0.0  rstudio_0.98.1103 rstudioapi_0.3.1 scales_0.3.0 scatterplot3d_0.3-36 sendmailR_1.2-1 SparseM_1.7 splines_3.1.3  stringi_0.5-5 survival_2.38-3 tools_3.1.3 XML_3.98-1.3 xtable_1.7-4 XVector_0.6.0