Question

DESeq2 contrasts using all relevant samples vs all samples

0

Entering edit mode

jeffrey.maurer • 0

@951550f1

Last seen 5 months ago

United States

Let's say I have an RNAseq experiment of 12 samples. There are 3 subjects, 2 cell types (B and T cell) and 2 diseases (SLE and HC). I combine cell type and disease into a single combined variable called group, such that it is a factor with 4 distinct levels. Let's say that there are 15k genes detected and each sample has a library depth of around 20M uniquely mapped reads.

I then go through the rigmarole of DESeq2:

coldata = files[, c("sample", "group")]
row.names(coldata) = coldata$sample

# Establish DESeq DGE object
dds = DESeqDataSetFromMatrix(countData=as.matrix(cts[, files$sample]),
                             colData=coldata,
                             design = as.formula(~group))

# Run DESeq fitting and get results
dds <- DESeq(dds)

Now let's say I just want to look at the difference between HC and disease within each cell type:

resB = results(dds, contrast = c("group", "B_SLE", "B_HC"),
                  pAdjustMethod = "fdr", independentFiltering = F)
resT = results(dds, contrast = c("group", "T_SLE", "T_HC"),
                  pAdjustMethod = "fdr", independentFiltering = F)

This typically how we do it. However, extracting the results in this manner means that DESeq2 is not just fitting to our variable of interest (SLE v HC), but also another axis (B v T). We can perform a similar analysis as such:

for (cellType in unique(files$cellType) {
    f = files[files$cellType==cellType, ]
    coldata = f[, c("sample", "disease")]
    row.names(coldata) = coldata$sample

    # Establish DESeq DGE object
    dds = DESeqDataSetFromMatrix(countData=as.matrix(cts[, f$sample]),
                                 colData=coldata,
                                 design = as.formula(~disease))

    # Run DESeq fitting and get results
    dds <- DESeq(dds) 
    res = results(dds, contrast = c("group", "SLE", "HC"),
                  pAdjustMethod = "fdr", independentFiltering = F)
}

However, this gives us a different result with different numbers of DEG and different types of DEG. There likely is not a correct answer to this question, only considerations for each method, but how do you know when one is correct and not the other?

DESeq2 • 2.2k views

ADD COMMENT • link 2.8 years ago jeffrey.maurer • 0

0

Entering edit mode

With PCA, what % of the variance is caused by the difference between B cells and T cells?

ADD REPLY • link 2.8 years ago swbarnes2 ★ 1.4k

score 2 · Accepted Answer · 2022-08-30

2

Entering edit mode

Michael Love 43k

@mikelove

Last seen 14 days ago

United States

We have a FAQ in the vignette that tries to answer this question.

ADD COMMENT • link 2.8 years ago Michael Love 43k

0

Entering edit mode

Yup, I see it now.

http://www.bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#if-i-have-multiple-groups-should-i-run-all-together-or-split-into-pairs-of-groups

Thank you.

ADD REPLY • link 2.8 years ago jeffrey.maurer • 0