DESeq2 model design using a continuous variable only
NinaKons • 0
SanFran, CA

Hi Mike,

I am trying to design a DESeq2 model using solely a continuous variable.

However, it is treated as a factor inside the model, even after I specify it 'as.numeric'. 

  1.  How can "cont_var" be specified so that it is accepted as a numeric continuous variable inside the model?
  2.  How can this numeric continuous variable be used in results()?

Please see the reproducible example below. Many thanks in advance.

library(DESeq2)
library(pcaExplorer)  # provides pcaplot(), used for the plot below

# gene counts
cts3 <- matrix(c(0, 1, 0, 2, 3000, 0, 100, 200, 500), ncol=3)
colnames(cts3) <- c("samp1", "samp2", "samp3")
rownames(cts3) <- c("gene1", "gene2", "gene3")

# metadata
metadata3 <- matrix(c(0, 4, 130), ncol=1)
rownames(metadata3) <- c("samp1", "samp2", "samp3") 
colnames(metadata3) <- "cont_var"

# DESeq model, treats 'cont_var' as factor, why?
dds3 <- DESeqDataSetFromMatrix(countData = cts3,
                              colData = metadata3,
                              design = ~ as.numeric(cont_var)) 

dds3 <- DESeq(dds3)

# transformation
pca <- DESeq2::varianceStabilizingTransformation(dds3, blind=FALSE)

# plot
pcaplot(pca, intgroup="cont_var", text_labels=FALSE, point_size = 5)  

# contrast, how?
results = results(dds3, contrast=c("cont_var", ..,  ...), cooksCutoff = TRUE)


Many thanks Mike,

The reason I think 'cont_var' is not being treated as a continuous numeric variable is the plot above.

Specifically, I expected the legend to show a continuous scale rather than listing individual values.

Am I wrong? Thanks



"group" is turned into a factor just for this PCA plot, but not for the DESeq() analysis. You can confirm it is treated as a continuous variable by examining resultsNames(dds). There will be just a single coefficient for cont_var, not the two coefficients (4_vs_0 and 130_vs_0) that a factor would produce.
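A minimal sketch of that check, in case it helps. It assumes a simulated dataset from makeExampleDESeqDataSet() rather than the 3-gene example above, and the cont_var values are made up for illustration. It also pulls the PCA coordinates out with plotPCA(returnData = TRUE) so that ggplot2 can colour the points on a continuous scale, instead of relying on pcaplot().

```r
library(DESeq2)
library(ggplot2)

set.seed(1)

# Simulated counts: 1000 genes x 12 samples (illustrative only)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# Hypothetical continuous covariate, one value per sample
dds$cont_var <- seq(0, 5.5, by = 0.5)
design(dds) <- ~ cont_var

dds <- DESeq(dds)

# A numeric covariate yields one slope coefficient next to the intercept
resultsNames(dds)

# PCA coordinates with the covariate kept as a numeric column
vsd <- varianceStabilizingTransformation(dds, blind = FALSE)
pca_df <- plotPCA(vsd, intgroup = "cont_var", returnData = TRUE)

# ggplot2 picks a continuous colour scale for a numeric column
p <- ggplot(pca_df, aes(PC1, PC2, colour = cont_var)) +
  geom_point(size = 5)
print(p)
```

If cont_var were a factor with levels 0, 4 and 130, resultsNames() would list one coefficient per non-reference level instead of a single cont_var term.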


The following works to treat cont_var as continuous:

dds <- DESeqDataSetFromMatrix(countData = cts3, 
  colData = metadata3, 
  design = ~cont_var)

dds <- DESeq(dds)
res <- results(dds)

This prints a message:

  the design formula contains a numeric variable with integer values,
  specifying a model with increasing fold change for higher values.
  did you mean for this to be a factor? if so, first convert
  this variable to a factor using the factor() function

Note that this is only a message: DESeq2 kept 'cont_var' as a continuous variable, and the message simply explains how to convert it to a factor if you did not intend to treat it as continuous.
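On the second question (using the variable in results()), here is a hedged sketch. A contrast of the form c("variable", "levelA", "levelB") compares two levels of a factor, so it does not apply to a numeric covariate. The coefficient can instead be requested by name. The data below are simulated with makeExampleDESeqDataSet() and the cont_var values are invented for illustration.

```r
library(DESeq2)

set.seed(1)

# Simulated counts: 1000 genes x 12 samples (illustrative only)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# Hypothetical continuous covariate, one value per sample
dds$cont_var <- seq(0, 5.5, by = 0.5)
design(dds) <- ~ cont_var

dds <- DESeq(dds)

# Request the slope coefficient by the name shown in resultsNames(dds)
res <- results(dds, name = "cont_var")

# Genes ordered by p-value
head(res[order(res$pvalue), ])
```

The log2FoldChange column here is the estimated log2 fold change per unit increase in cont_var, so its size depends on the units of the covariate.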


Hi Michael,

I have a couple of questions about DESeq2.

1) Can it be used for repeated-measures outcomes? Do we only need to add a time variable to the design, as below?

dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata, design = ~ time + condition)

2) For the condition variable, can we use a continuous outcome here, or can we only use a factor variable for condition because of the NB distribution behind DESeq2?

3) I saw you mention that DESeq2 only works for data with replicates. Does "replicates" here mean biological replicates or technical replicates?

Thanks in advance.



Link to the other post:

DESeq2 for longitudinal data

