Question

Clarifying what results are returned by submitting a list of names to results() when using a user-specified matrix

0

Entering edit mode

thadryan • 0

@thadryan-23400

Last seen 3.2 years ago

New England

Hello!

I am hoping to clarify the generation of results tables for analyses using a user-generated matrices, specifically the use of the contrasts argument. My design, which features two groups at two time points and adjusts for individual, is at the bottom of this post in case it helps.

What does submitting a name from resultsNames() to the contrast argument in results() give you in the context of a user-provided matrix? I created the design so I could control for individual in a study with a different number of individuals between groups. While I validated my design, I'm unsure of the DESeq2 interface for querrying it for results given that I can't use name=... anymore with "Group_GroupA_vs_GroupB".

It looks like just a name ("GroupGroupA") is testing for difference in the means between GroupA vs GroupB as they are encoded as 0s and 1s on that column.


mcols(results(dds, contrast = list(c("GroupGroupB"))))

                    type                                  description
                <character>                               <character>
baseMean       intermediate mean of normalized counts for all samples
log2FoldChange      results log2 fold change (MLE): GroupGroupA effect
lfcSE               results         standard error: GroupGroupA effect
stat                results         Wald statistic: GroupGroupA effect
pvalue              results      Wald test p-value: GroupGroupA effect
padj                results                      BH adjusted p-values

I would assume then, if I was interested in the effect of group after controlling for other factors via the design I could get the genes DE between those groups like this:

# DE based on groups
results(dds, contrast = list(c("GroupGroupB")))
# ...and for visits...
results(dds, contrast = list(c("VisitVisit2")))
# ...interaction...
results(dds, contrast = list(c("GroupB.VisitVisit2")))

I noticed that I get an output no matter how many names I put (indeed, I can just put list(resultsNames(dds))). So I did this:

# DE considering all in some way?
res <- results(dds, contrast = list(c("GroupGroupB", "VisitVisit2","GroupGroupB.VisitVisit2")))
mcols(res)

DataFrame with 6 rows and 2 columns
                       type                                                                  description
                <character>                                                                  <character>
baseMean       intermediate                                    mean of normalized counts for all samples
log2FoldChange      results log2 fold change (MLE): GroupB+VisitVisit2+GroupB.VisitVisit2 effect
lfcSE               results         standard error: GroupB+VisitVisit2+GroupB.VisitVisit2 effect
stat                results         Wald statistic: GroupB+VisitVisit2+GroupB.VisitVisit2 effect
pvalue              results      Wald test p-value: GroupB+VisitVisit2+GroupB.VisitVisit2 effect
padj                results                                                         BH adjusted p-values

Summarized:

1) Am I getting the main effects the way I think I am?

2) What is being testing when I pass multiple names?

Thanks!:

Thadryan

Design, in case it helps:


suppressMessages(library(tidyverse))
suppressMessages(library(DESeq2))

# create the matrix for reference if needed
samples <- tibble(
  ID = rep(seq(1,14), each = 2),
  Group = c(rep("GroupA", 12), rep("GroupB", 16)),
  Visit = c(rep(c("Visit1", "Visit2"), 14))
) %>% data.frame

# GroupA ID 1-6, GroupB ID 1-8, two visits each
samples$GroupID <- as.factor(c(rep(seq(1, 6), each = 2), rep(seq(1, 8), each = 2)))

# add main and interaction effects for desired conditions 
customMatrix <- model.matrix(~Group + Visit + Group:GroupID + Group:Visit, samples) %>%
  data.frame %>%
  # remove all-zero columns
  dplyr::select(-c("GroupGroupA.GroupID7","GroupGroupA.GroupID8")) %>%
  as.matrix

DESeq2 • 756 views

ADD COMMENT • link 4.0 years ago thadryan • 0

score 0 · Answer 1 · 2020-12-02

0

Entering edit mode

Michael Love 43k

@mikelove

Last seen 6 days ago

United States

"What does submitting a name from resultsNames() to the contrast argument in results() give you..."

When you provide characters to the list() input to resultsNames, the input is taken as (from ?results):

the names of the fold changes for the numerator, and the names of the fold changes for the denominator

You are specifying which coefficients should get a +1 and which should get a -1 in the contrast vector, c:

https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#contrasts-1

As far as how to interpret the coefficients from a particular design, I ask that users collaborate with local statisticians on this, because I am limited in the time I have to provide support here, and I have to restrict myself to software-related questions. Questions about the analysis plan and interpretation of results are up to the analyst.

ADD COMMENT • link 4.0 years ago Michael Love 43k

0

Entering edit mode

Thank you!

I'm confused though - that's from the help section about providing 2 character vectors and my question only deals with 1.

What does DESeq2 assume the denominator is when I do this?:

res <- results(dds, contrast = list(c("GroupGroupB")))

res %>% mcols

DataFrame with 6 rows and 2 columns
                       type                               description
                <character>                               <character>
baseMean       intermediate mean of normalized counts for all samples
log2FoldChange      results log2 fold change (MLE): GroupGroupB effect
lfcSE               results         standard error: GroupGroupB effect
stat                results         Wald statistic: GroupGroupB effect
pvalue              results      Wald test p-value: GroupGroupB effect
padj                results                      BH adjusted p-values

What has DESeq2 done in this case?

This makes it appear as though I am comparing the FCs from GroupA and GroupB (as one would in a non-custom matrix via the name argument with "Group_GroupA_vs_GroupB"), is this the case?

I can't do for example:

results(dds, contrast = list(c("GroupGroupB", "GroupGroupA")))

Because only one of them is included in the output of resultsNames(), so I assume the only way I could get the results of the comparison of GroupA vs GroupB is to do:

res <- results(dds, contrast = list(c("GroupGroupB")))

...but I don't know what DESeq2 assumes or does in this case. It calls it the "GroupGroupB effect" (or "GroupB+VisitVisit2 effect" and so on) so I'm not sure what it has done. It seems that the only thing it could mean was the effect of that coefficient, but it's not labeled "Group_GroupA_vs_GroupB" the way it normally would be so I'm not sure. My post-hoc testing suggests that it's returning results for the test of GroupA vs GroupB but I can't be sure if I don't know what it's assuming about what the numerator vs denominator is.

ADD REPLY • link 4.0 years ago thadryan • 0

0

Entering edit mode

If you don't provide a second list, then you are only specifying the coefficients that should get a +1 in the 'c' vector.