Question: Unexpected MAplot after apeglm shrinkage in DESeq2
0
gravatar for ATPoint
6 months ago by
ATPoint0
ATPoint0 wrote:

Dear Michael,

I am analyzing an RNA-seq cohort of about 90 patients for differential expression between two disease subtypes. Low-level processing was done with Salmon/tximport followed by the standard workflow with DESeq2, using the disease subtype as the only covariate in the design. I was quiet surprised that in the MAplot with apeglm, there was an absence of the "typical" cloud of data points around LFC ~ 0, especially in comparison with the default method. Is this normal and expected?

 

 

I uploaded the Rdata object and a minimal script if that helps. If you need further details, please let me know.

best wishes,

Alexander

 

R version 3.5.1 (2018-07-02)

Platform: x86_64-pc-linux-gnu (64-bit)

Running under: CentOS Linux 7 (Core)

Matrix products: default

BLAS: foo/anaconda3/lib/R/lib/libRblas.so

LAPACK: foo/anaconda3/lib/R/lib/libRlapack.so

locale:

[1] en_US.UTF-8

attached base packages:

[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:

 [1] apeglm_1.2.1                DESeq2_1.20.0               SummarizedExperiment_1.10.1 DelayedArray_0.6.6          BiocParallel_1.14.2        

 [6] matrixStats_0.54.0          Biobase_2.40.0              GenomicRanges_1.32.7        GenomeInfoDb_1.16.0         IRanges_2.14.12            

[11] S4Vectors_0.18.3            BiocGenerics_0.26.0         data.table_1.11.8           RevoUtils_11.0.1            RevoUtilsMath_11.0.0       

loaded via a namespace (and not attached):

 [1] bit64_0.9-7            splines_3.5.1          Formula_1.2-3          assertthat_0.2.0       latticeExtra_0.6-28    blob_1.1.1             GenomeInfoDbData_1.1.0

 [8] yaml_2.2.0             numDeriv_2016.8-1      pillar_1.3.0           RSQLite_2.1.1          backports_1.1.2        lattice_0.20-35        glue_1.3.0            

[15] bbmle_1.0.20           digest_0.6.18          RColorBrewer_1.1-2     XVector_0.20.0         checkmate_1.8.5        colorspace_1.3-2       htmltools_0.3.6       

[22] Matrix_1.2-14          plyr_1.8.4             XML_3.98-1.16          pkgconfig_2.0.2        emdbook_1.3.10         genefilter_1.62.0      zlibbioc_1.26.0       

[29] purrr_0.2.5            xtable_1.8-3           scales_1.0.0           htmlTable_1.12         tibble_1.4.2           annotate_1.58.0        ggplot2_3.1.0         

[36] nnet_7.3-12            lazyeval_0.2.1         survival_2.42-6        magrittr_1.5           crayon_1.3.4           memoise_1.1.0          MASS_7.3-51           

[43] foreign_0.8-71         tools_3.5.1            stringr_1.3.1          locfit_1.5-9.1         munsell_0.5.0          cluster_2.0.7-1        AnnotationDbi_1.42.1  

[50] bindrcpp_0.2.2         compiler_3.5.1         rlang_0.3.0.1          grid_3.5.1             RCurl_1.95-4.11        rstudioapi_0.8         htmlwidgets_1.3       

[57] bitops_1.0-6           base64enc_0.1-3        gtable_0.2.0           DBI_1.0.0              R6_2.3.0               gridExtra_2.3          knitr_1.20            

[64] dplyr_0.7.7            bit_1.1-14             bindr_0.1.1            Hmisc_4.1-1            stringi_1.2.4          Rcpp_0.12.19           geneplotter_1.58.0    

[71] rpart_4.1-13           acepack_1.4.1          coda_0.19-2            tidyselect_0.2.5     
deseq2 rna-seq apeglm • 241 views
ADD COMMENTlink modified 6 months ago by Michael Love23k • written 6 months ago by ATPoint0

I'll take a look. Thanks. One difference is that estimated LFCs that are compatible with LFC=0 are moved more toward 0 by apeglm, where the Normal prior tends to move them only slightly. But I'll look to see if I see something about the code that is more informative here.

ADD REPLYlink modified 6 months ago • written 6 months ago by Michael Love23k
Answer: Unexpected MAplot after apeglm shrinkage in DESeq2
4
gravatar for Michael Love
6 months ago by
Michael Love23k
United States
Michael Love23k wrote:

hi Alexander,

I think because of the large sample size, you'll get better dispersion estimates, and then better shrinkage estimates, by subsetting off a number of the small count genes.

I got more reasonable shrinkage plots with the following code:

library(DESeq2)
library(apeglm)
load("bioc.RData")
dlbcl.dds$tmp.coo <- relevel(dlbcl.dds$tmp.coo, ref = "GCB")
dds <- dlbcl.dds
dds$condition <- dds$tmp.coo
design(dds) <- ~condition
table(dds$condition)

keep <- rowSums(counts(dds) >= 10) >= 5
table(keep)
dds <- dds[keep,]

vsd <- vst(dds)
plotPCA(vsd, "condition")

dds <- DESeq(dds, minRep=Inf) # ~4 min

res <- results(dds)
summary(res)
plotMA(res, ylim=c(-5,5))

res2 <- lfcShrink(dds, coef=3, type="normal")
res3 <- lfcShrink(dds, coef=3, type="apeglm")
res4 <- lfcShrink(dds, coef=3, type="ashr")

plotMA(res2, ylim=c(-5,5))
plotMA(res3, ylim=c(-5,5))
plotMA(res4, ylim=c(-5,5))
ADD COMMENTlink written 6 months ago by Michael Love23k

Thank you. Just out of interest, because I encountered this problem also on a different and unrelated dataset with larger sample sizes, suggesting that this is a rather general issue when n becomes large. Do you plan to make changes in future versions of apeglm to compensate for this issue internally without the need for external filtering?

ADD REPLYlink modified 3 months ago • written 3 months ago by ATPoint0

No we probably won’t ever do minimal filtering inside the function but instead I prefer that any filtering if it is being used can be seen by looking at the script.

ADD REPLYlink written 3 months ago by Michael Love23k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 150 users visited in the last hour