Question

DESeq2 coefficients for APEGLM-shrinkage

0

Entering edit mode

zolotaryovgl • 0

@zolotaryovgl-18489

Last seen 5.2 years ago

Prague

Hi, everyone.

My question is about the coef argument for lfcShrink function, and the overall experimental design formula.

I'm now trying to create a list of ranked genes for GSEA, and I ran into unexpected troubles.

My experimental design is 2 cell lines (b16,tc1a9) * 3 replicates * 2 conditions (treated,control).

I want to estimate the DE genes for each cell line separately.

For extracting DESeq2 results it works fine:

dds<- DESeqDataSetFromMatrix(countData = data,
                              colData = metadat,
                              design = ~cell_line+treatment+cell_line:treatment)
> resultsNames(dds)
[1] "Intercept"                    "cell_line_TC1A9_vs_B16"      
[3] "treatment_ifng_vs_control"    "cell_lineTC1A9.treatmentifng"

B16.result <- results(dds, contrast=c("treatment",'treated',"control"))

Here I assume that DESeq2 treats B16 as a base level of cell_line. For the second cell line I do:

TC1A9.result <- results(dds, list( c("treatment_ifng_vs_control","cell_lineTC1A9.treatmentifng") ))

Next, I want to use apeglm shrunken LFC estimates for ranking. I create a dummy variable by collapsing original ones:

metadat_collapsed = data.frame(sample_name = metadat$sample_name,condition = paste0(metadat$cell_line,'_',metadat$treatment))
metadat_collapsed$condition <- factor(dds$condition, levels = c("untreated","treated"))

dds<- DESeqDataSetFromMatrix(countData = data,
                             colData = metadat_collapsed,
                             design = ~ condition)
dds = DESeq(dds)

Next, I'm extracting shrunken estimates:

b16.res = lfcShrink(dds, coef="condition_B16_ifng_vs_B16_control", type="apeglm")

And here the problem comes in: How to get shrunken estimates for another cell line? results function accepts the list of contrast coefficients, but lfcShrink does not.

Here are coefficient names:

resultsNames(dds)
[1] "Intercept"                             
[2] "condition_B16_ifng_vs_B16_control"     
[3] "condition_TC1A9_control_vs_B16_control"
[4] "condition_TC1A9_ifng_vs_B16_control"

I think, I'm doing something wrong and I would appreciate any help with this issue.

deseq2 apeglm • 2.6k views

ADD COMMENT • link updated 5.4 years ago by Michael Love 41k • written 5.4 years ago by zolotaryovgl • 0

score 0 · Answer 1 · 2018-11-26

There is a way to use a different design that is equivalent, such that two coefficients represent the two different treatment effects:

design(dds) <- ~cell_line + cell_line:treatment

This will give two coefficients that account for the cell_line differences, and then two coefficients, one for the treatment effect in each cell line. You can then use results(dds, name="...") and lfcShrink(dds, coef="...") to obtain results and shrunken LFCs for these two treatment effects.

At the end of this section here we have some strategies for re-arranging terms to obtain a coefficient for apeglm to shrink, but the important thing to note is that you don't have to re-run dispersion estimation if the design are equivalent, you only need to re-run nbinomWaldTest() to obtain new MLE coefficients to shrink.

https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#extended-section-on-shrinkage-estimators