DESeq2 only runs interaction factor x continuous variable with old not new version
4
0
Entering edit mode
hmgeiger ▴ 10
@hmgeiger-9902
Last seen 5.4 years ago

Trying to run the following code to compare the interaction between three diagnosis groups (call them A, B, and C) and age (a continuous variable, not grouped into factors). The code works fine with DESeq2 1.6.3, but not with 1.10.1.

I checked and diagnosis is definitely a factor and age is definitely an integer.

design <- data.frame(Sample=paste(1:24,"D",sep=""),Diagnosis=factor(rep(c("A","B","C")),times=c(11,7,6)),Age=c(78,75,54,58,54,75,48,47,42,30,25,43,65,51,56,31,50,46,65,50,80,48,59,61))

dds <- DESeq(DESeqDataSetFromMatrix(countData=featureCounts[,paste(1:24,"D",sep="")],colData=design,design=~Diagnosis:Age))

First comparison is this, and is where I get the error. I would eventually also like to compare A vs. C and B vs. C.

results <- results(dds,contrast=c("Diagnosis","A.Age","B.Age"))

Error when running newer DESeq2:

Error in rowSums(cts.sub == 0) :

  'x' must be an array of at least two dimensions

SessionInfo:

R version 3.2.1 (2015-06-18)

Platform: x86_64-unknown-linux-gnu (64-bit)

Running under: CentOS Linux 7 (Core)


locale:

 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              

 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    

 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   

 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 

 [9] LC_ADDRESS=C               LC_TELEPHONE=C            

[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       


attached base packages:

[1] parallel  stats4    methods   stats     graphics  grDevices utils    

[8] datasets  base     


other attached packages:

 [1] DESeq2_1.10.1              RcppArmadillo_0.6.500.4.0 

 [3] Rcpp_0.12.3                SummarizedExperiment_1.0.2

 [5] Biobase_2.30.0             GenomicRanges_1.22.4      

 [7] GenomeInfoDb_1.6.3         IRanges_2.4.8             

 [9] S4Vectors_0.8.11           BiocGenerics_0.16.1       


loaded via a namespace (and not attached):

 [1] RColorBrewer_1.1-2   futile.logger_1.4.1  plyr_1.8.3          

 [4] XVector_0.10.0       futile.options_1.0.0 zlibbioc_1.16.0     

 [7] rpart_4.1-10         RSQLite_1.0.0        annotate_1.48.0     

[10] gtable_0.2.0         lattice_0.20-33      DBI_0.3.1           

[13] gridExtra_2.2.1      genefilter_1.52.1    cluster_2.0.3       

[16] locfit_1.5-9.1       grid_3.2.1           nnet_7.3-12         

[19] AnnotationDbi_1.32.3 XML_3.98-1.4         survival_2.38-3     

[22] BiocParallel_1.4.3   foreign_0.8-66       latticeExtra_0.6-28 

[25] Formula_1.2-1        geneplotter_1.48.0   ggplot2_2.1.0       

[28] lambda.r_1.1.7       Hmisc_3.17-2         scales_0.4.0        

[31] splines_3.2.1        colorspace_1.2-6     xtable_1.8-2        

[34] acepack_1.3-3.3      munsell_0.4.3       

 

deseq2 • 2.3k views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 3 hours ago
United States

You should include the main effects as well, I think you mean to use a design of ~Diagnosis*Age which expands to ~Diagnosis + Age + Diagnosis:Age.

Can you try this, and then see what you get with resultsNames(dds)?

ADD COMMENT
0
Entering edit mode

Also see my note in the vignette on using continuous covariates like age. It is in the FAQ.

ADD REPLY
0
Entering edit mode
hmgeiger ▴ 10
@hmgeiger-9902
Last seen 5.4 years ago

I tried this, but still get the same error at the results step. 

resultsNames(dds) now returns the following:

[1] "Intercept"               "Diagnosis_Normal_vs_MDD"

[3] "Diagnosis_SCZ_vs_MDD"    "Age"                    

[5] "DiagnosisNormal.Age"     "DiagnosisSCZ.Age"  

ADD COMMENT
0
Entering edit mode

Ah, it should be giving a more helpful error but it looks like my error checking code is broken and I need to fix this in the development branch. Thanks for tipping me to this.

In order to use the style of argument: contrast=c(x,y,z), x needs to be a variable in the design, and y and z need to be levels of that factor. A.Age and B.Age are not levels of Diagnosis, which should throw a helpful error.

From ?results:

contrast: this argument specifies what comparison to extract from the
          ‘object’ to build a results table. one of either:

            • a character vector with exactly three elements: the name
              of a factor in the design formula, the name of the
              numerator level for the fold change, and the name of the
              denominator level for the fold change (simplest case)

But can you tell me what you are trying to extract? The interaction terms are:

results(dds, name="DiagnosisNormal.Age")

results(dds, name="DiagnosisSCZ.Age")

ADD REPLY
0
Entering edit mode
hmgeiger ▴ 10
@hmgeiger-9902
Last seen 5.4 years ago

Oh, sorry, I was previously using aliases for some of the variables.

Continuing with the aliases, this should read:

[1] "Intercept"               "Diagnosis_B_vs_A"

[3] "Diagnosis_C_vs_A"    "Age"                    

[5] "DiagnosisB.Age"     "DiagnosisC.Age"  

ADD COMMENT
0
Entering edit mode
hmgeiger ▴ 10
@hmgeiger-9902
Last seen 5.4 years ago

The investigator wants to look at the interaction between age and diagnosis. They wouldn't be opposed to grouping age into factors as appropriate I don't think, though I also wanted to just see if it could be done keeping age as a continuous variable.

ADD COMMENT
0
Entering edit mode

A small note about using the support site: you can use the ADD REPLY or ADD COMMENT buttons to reply to a previous answer. You are adding answers to your own question.

So, you should now be able to extract the interaction terms. If you have questions about their interpretation, check the interaction section of the vignette.

ADD REPLY
0
Entering edit mode

Got it! Thanks.

This is how I ended up setting up my script. A is normal, B and C are my two test conditions.

dds <- DESeq(DESeqDataSetFromMatrix(countData=featureCounts[,paste(1:24,"D",sep="")],
colData=design,design=~Age + Diagnosis + Diagnosis:Age))

B_vs_control_plus_interaction_age <- results(dds,contrast=list(c("Diagnosis_B_vs_control","Age.B")))

C_vs_control_plus_interaction_age <- results(dds,contrast=list(c("Diagnosis_C_vs_control","Age.C")))

 

ADD REPLY
0
Entering edit mode

You may want to speak to a local biostatistician to help with the interpretation here.

Note that, if you code Age as a numeric, the "Age" term in the results is the log2 fold change for the A group associated with an increase in Age of 1 unit. And the interaction "Age.B" is the difference between the Age LFC for group B and the Age LFC group A.

ADD REPLY

Login before adding your answer.

Traffic: 798 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6