Question

Clarification about DESeq2 counts() function when dealing with covariates

rb123 • 0

I have a conceptual question about the counts() function in DESeq2.

counts() takes a DESeqDataSet object as input (usually named dds). When creating this object, we can add covariates to the design formula. Let's say the formula is ~ genotype + age.

Does counts() return normalized count values that are corrected for the covariates, or values that do NOT take the covariates into account?

Code example:


library(DESeq2)

dds <- DESeqDataSetFromMatrix(countData = counts_mat, colData = samples, design = ~ genotype + age)
dds <- DESeq(dds, parallel = TRUE)

counts(dds)

DESeq2 • 1.1k views

updated 19 months ago by swbarnes2 ★ 1.3k • written 19 months ago by rb123 • 0

score 1 · Answer 1 · 2022-09-13

Let's check!

## Make a fake-o
> dds <- makeExampleDESeqDataSet()
## get the colData to use with a second fake-o
> cd <- colData(dds)
## Add fake age action
> cd$age <- runif(12)*50
> cd
DataFrame with 12 rows and 2 columns
         condition       age
          <factor> <numeric>
sample1          A  29.46607
sample2          A  14.12943
sample3          A  43.81298
sample4          A  37.69926
sample5          A   4.87461
...            ...       ...
sample8          B   48.8576
sample9          B   19.1212
sample10         B   12.9558
sample11         B   43.8092
sample12         B   38.2382
## fake-o2
> dds2 <- DESeqDataSetFromMatrix(countData = assay(dds), colData = cd, design = ~condition + age)
  the design formula contains one or more numeric variables that have mean or
  standard deviation larger than 5 (an arbitrary threshold to trigger this message).
  Including numeric variables with large mean can induce collinearity with the intercept.
  Users should center and scale numeric variables in the design to improve GLM convergence.
> dds <- DESeq(dds)
estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing
> dds2 <- DESeq(dds2)
estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing

> all.equal(counts(dds), counts(dds2))
[1] TRUE

> all.equal(counts(dds, normalized = TRUE), counts(dds2, normalized = TRUE))
[1] TRUE

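Survey says: identical. Adding age to the design changes neither the raw counts nor the normalized counts, so counts() is not corrected for covariates.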

score 0 · Answer 2 · 2022-09-13

swbarnes2 ★ 1.3k

You should also be able to look up the normalization algorithm DESeq2 uses (the median-of-ratios method for estimating size factors) and apply it yourself to your data in R or Excel. You will then see that it does not use the design at all.
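
For anyone who wants to try that by hand, here is a rough base-R sketch of the median-of-ratios idea. It is only an illustration, not DESeq2's actual code: the toy matrix counts_mat and the helper median_of_ratios() are made up for this example. The thing to notice is that the function only ever sees the count matrix; no design formula or sample table is passed in.

```r
# Toy count matrix (made up for illustration): genes in rows, samples in columns.
# Each sample is the previous one doubled, so the size factors should come out
# in a 1 : 2 : 4 ratio.
counts_mat <- matrix(
  c(10,  20,  40,
    100, 200, 400,
    5,   10,  20,
    50,  100, 200),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3))
)

# Hypothetical helper: median-of-ratios size factors from counts alone.
median_of_ratios <- function(counts) {
  log_counts <- log(counts)
  # Log of the per-gene geometric mean across samples
  log_geo_means <- rowMeans(log_counts)
  # A gene with a zero in any sample has a -Inf log geometric mean; skip it
  usable <- is.finite(log_geo_means)
  # Per sample: median of the log ratios to the geometric mean, back-transformed
  apply(log_counts[usable, , drop = FALSE], 2, function(col) {
    exp(median(col - log_geo_means[usable]))
  })
}

size_factors <- median_of_ratios(counts_mat)

# Normalized counts: divide each sample (column) by its size factor
normalized <- sweep(counts_mat, 2, size_factors, "/")

print(size_factors)
print(normalized)
```

On a real dataset you could compare the result with sizeFactors(dds) and counts(dds, normalized = TRUE); they should be close as long as the default size factor estimation was used.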

19 months ago • swbarnes2 ★ 1.3k