Using betaPrior=F in results() in DESeq2?
1
0
Entering edit mode
@nik-fortelny-7458
Last seen 3.9 years ago
Canada

Hi there,

I have googled for a few days now but am not getting there. Please help!

Let's say I have the following dataset (2 conditions and 3 genotypes) and want to get differences between all genotypes (and ignore the condition). I run the following:

# Generate example Data
dds <- makeExampleDESeqDataSet(n=100,m=18)
dds$genotype <- factor(rep(rep(c("I","II", "III"),each=3),2))

# Run DESeq2
dds1 <- dds
design(dds1) <- ~ 0 + genotype + condition
dds1 <- DESeq(dds1, betaPrior=F) 
resultsNames(dds1)

# get differences between genotypes
results(dds1, contrast=c("genotype", "III","II") )
results(dds1, contrast=c("genotype", "III","I") )
results(dds1, contrast=c("genotype", "II","I") )

I understand I have to run DESeq with betaPrior=F because I don't want to shrink the betas to zero for the groups in the DESeq call. But then when I get meaningful differences (betas) in the call to "results", can I then tell DESeq2 to calculate the MAP instead of MLE betas?

Now to avoid this problem I tried to work with the idea of one group being chosen as intercept:

# With intercept
dd2 <- dds
design(dd2) <- ~ genotype + condition
dd2 <- DESeq(dd2, betaPrior=T)
resultsNames(dd2)
model.matrix(~ dds$genotype + dds$condition)

I did not expect to get these groups from resultsNames:

"Intercept"   "genotypeI"   "genotypeII"  "genotypeIII" "conditionA"  "conditionB" 

But rather something like what is returned by model.matrix:

(Intercept) genotypeII genotypeIII conditionB

The same is true when I only use condition in design

dd3 <- dds
design(dd3) <- ~ condition
dd3 <- DESeq(dd3, betaPrior=T) 
resultsNames(dd3)
model.matrix(~ dds$condition)

Again an resultNames gives me an intercept plus two conditions, where I only expected one. I used to use limma, so maybe I just didn't get some of the differences with DESeq2 but I'm obviously a bit confused.

Thank you for your help!

Nik

> sessionInfo() R version 3.3.1 (2016-06-21) Platform: x86_64-apple-darwin13.4.0 (64-bit) Running under: OS X 10.11.6 (El Capitan)

locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages: [1] parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages: [1] data.table_1.9.6 DESeq2_1.12.4 SummarizedExperiment_1.2.3 [4] Biobase_2.32.0 GenomicRanges_1.24.3 GenomeInfoDb_1.8.7 [7] IRanges_2.6.1 S4Vectors_0.10.3 BiocGenerics_0.18.0

deseq2 • 3.4k views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 2 days ago
United States

Small note: you always want to use full names for TRUE and FALSE, because T and F can be used as variable names in R.

There is often little difference in the test statistics and p-values between betaPrior=TRUE or FALSE, and it mostly has the effect to shrink non-informative LFCs to zero. If you want LFC moderation you can use betaPrior=TRUE. I don't have a function, although I am planning on writing such a function in this devel cycle, which will produce shrunken or moderated LFC after having run DESeq() with betaPrior=FALSE.

The implementation of LFC shrinkage requires that each level of the factor have it's own coefficient in addition to the intercept, so the shrinkage is symmetric regardless of the way the factors are coded. The ideas behind this are discussed in the Methods of the DESeq2 paper. But for your purposes, you can use the contrast argument of results() if you want to compare levels. It will take care of which coefficients to contrast, whether you have betaPrior=TRUE or FALSE.

ADD COMMENT

Login before adding your answer.

Traffic: 698 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6