Search
Question: Translating model formula from DESeq to DEXSeq
0
2.9 years ago by
United States

I am trying to run DEXSeq and running into an issue where I get an error when I try to run testForDEU:

unable to find an inherited method for function ‘mcols’ for signature ‘"matrix"’

After running through testForDEU with the debugger, I noticed the following issue with degrees of freedom, which doesn't actually end up being bubbled up through the normal output of testForDEU:

Browse[2]> splitObject
$1 <remote-error in nbinomLRT(x, reduced = reducedModelMatrix, full = fullModelMatrix): less than one degree of freedom, perhaps full and reduced models are not in the correct order> traceback() available as 'attr(x, "traceback")' I was trying to recapitulate in DEXSeq a model that I used for a DESeq analysis, and it looks like I didn't quite get it right. Here's what my DESeq model formula was: ~Strain + Treatment + Strain:Treatment The general concept was to use this to answer "Is the treatment effect different across strains?" (effectively testing for a difference in responsiveness to treatment between the strains). There are two strains, and two states for treatment (treated, untreated). I haven't previously used DEXSeq for more than just simple two-group comparisons, so after consulting the vignette I came up with the following for DEXSeq: fullModel <- ~sample + exon + Strain:exon + Treatment:exon + Strain:Treatment:exon reducedModel <- ~sample + exon + Strain:Treatment:exon dxd <- DEXSeqDataSetFromHTSeq(countFiles, sampleData = sampleTable, design = fullModel,flattenedfile = annot) dxd$Treatment <- relevel(dxd$Treatment, ref = "untreated") dxd$Strain <- relevel(dxd\$Strain, ref = "strain1")
dxd <- estimateSizeFactors(dxd)
dxd <- estimateDispersions(dxd, formula = fullModel)
dxd <- testForDEU(dxd, fullModel = design(dxd), reducedModel = reducedModel)

I imagine that the issue is that I have five parameters specified in the full model, and only three in the reduced model, but I'm not sure how to specify it to get what I'm looking for. Any assistance appreciated!

> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] BiocInstaller_1.20.0       openxlsx_3.0.0             DEXSeq_1.16.1              DESeq2_1.10.0              RcppArmadillo_0.6.200.2.0
[6] Rcpp_0.12.1                SummarizedExperiment_1.0.1 Biobase_2.30.0             BiocParallel_1.4.0         GenomicRanges_1.22.1
[11] GenomeInfoDb_1.6.1         IRanges_2.4.1              S4Vectors_0.8.1            BiocGenerics_0.16.1

loaded via a namespace (and not attached):
[1] genefilter_1.52.0    statmod_1.4.22       locfit_1.5-9.1       reshape2_1.4.1       splines_3.2.2        lattice_0.20-33
[7] colorspace_1.2-6     survival_2.38-3      XML_3.98-1.3         foreign_0.8-66       DBI_0.3.1            RColorBrewer_1.1-2
[13] lambda.r_1.1.7       plyr_1.8.3           stringr_1.0.0        zlibbioc_1.16.0      Biostrings_2.38.0    munsell_0.4.2
[19] gtable_0.1.2         futile.logger_1.4.1  hwriter_1.3.2        latticeExtra_0.6-26  geneplotter_1.48.0   biomaRt_2.26.0
[25] AnnotationDbi_1.32.0 proto_0.3-10         acepack_1.3-3.3      xtable_1.8-0         scales_0.3.0         Hmisc_3.17-0
[31] annotate_1.48.0      XVector_0.10.0       Rsamtools_1.22.0     gridExtra_2.0.0      ggplot2_1.0.1        digest_0.6.8
[37] stringi_1.0-1        grid_3.2.2           tools_3.2.2          bitops_1.0-6         magrittr_1.5         RCurl_1.95-4.7
[43] RSQLite_1.0.0        Formula_1.2-1        cluster_2.0.3        futile.options_1.0.0 MASS_7.3-45          rpart_4.1-10
[49] nnet_7.3-11   
modified 2.9 years ago by Alejandro Reyes1.6k • written 2.9 years ago by Cornwell, Adam110
3
2.9 years ago by
Alejandro Reyes1.6k
Dana-Farber Cancer Institute, Boston, USA
Alejandro Reyes1.6k wrote:

Thanks for the detailed report! I think the reduced model that you want is the following:

reducedModel <- ~sample + exon + Strain:exon + Treatment:exon

This reduced model does not contain the interaction between the exon, treatment and strain, which would be the differences in exon usage due to the treatments that are different between strains.

Let me know if this works!
Alejandro

ADD COMMENTlink modified 2.9 years ago • written 2.9 years ago by Alejandro Reyes1.6k

Just to confirm- I am looking for differential exon usage attributable to differences in the treatment across the strains- I don't need the triple interaction term to get that?

This is the first time I've had to do this full model/reduced model specification, and it's a little confusing.

1

OK! To find strain-specific differences in exon usage due to the treatment, you would need to use the following formulae:

fullModel <-  ~sample + exon + Strain:exon + Treatment:exon + Strain:Treatment:exon

reducedModel <- ~sample + exon + Strain:exon + Treatment:exon

To detect differences in exon usage that affect both strains in the same manner due to the treatment:

fullModel <-  ~sample + exon + Strain:exon + Treatment:exon

reducedModel <- ~sample + exon + Strain:exon

Alejandro

ADD REPLYlink modified 2.9 years ago • written 2.9 years ago by Alejandro Reyes1.6k

I'm trying the model you suggested, which got through testForDEU, but I'm further not sure what to do for calculating fold changes with estimateExonFoldChanges. DESeq2 allows fetching results for anything that comes up in resultsNames(). However, DEXSeq seems to only allow fetching foldchanges for coefficients represented by the factors that appear in colData() which in this case are strain and treatment individually. How would I go about fetching the results for the interaction, as I was able to do with DESeq2?

1

The results given by the DEXSeqResults function will contain the p-values of the LRT (comparing the reduced and full models). The fold change implementation of DEXSeq can be estimated based only for one variable (the one specified by fitExpToVar in estimateExonFoldChanges). Calculating relative exon usage based on two interacting terms is currently not implemented in DEXSeq.

We did a similar thing for this manuscript: http://www.pnas.org/content/110/38/15377.full.pdf
And defined relative exon usage coefficients based on the exon usage in a specific tissue-species combination compared to the average exon usage across all tissue-species combination (see Fig1). I think this also could be done with your data, for the interaction strain-treatment. Although this might involve modifying the code from the supplementary material and some additional coding...

ADD REPLYlink modified 2.9 years ago • written 2.9 years ago by Alejandro Reyes1.6k

Thanks for the info!