Different results in DESeq2 when using betaPrior=TRUE and betaPrior=FALSE
1
0
Entering edit mode
sven.schenk ▴ 20
@svenschenk-15837
Last seen 5.0 years ago
Vienna

Hello,

I am analysing a RNA-Seq experiment using DESeq2 trying to understand the different log2Fold shrinking algorithms available in DESeq2. After reading the documentation I understood that setting betaPrior=TRUE when calling DESeq would use the "normal" log2Fold shrinking as it was used as default in DESeq2 versions before <1.16 and that from v1.16 onwards the default setting would be betaPrior=FALSE and that the default workflow would be betaPrior=FALSE and then call lfcShrink to perform log2Fold shrinking for ranking and visualisation. I now ran DESeq with betaPrior=FALSE (as default) and then performed IHW on the results:

>dds<-DESeq(ddsFullCountTable, fitType="parametric", betaPrior=FALSE,parallel=F)
>res<-results(dds, contrast=c("pheno", "ctrl", "treat"), parallel=F, cooksCutoff=TRUE,
  + independentFiltering=TRUE, filterFun=ihw)
>head(res[order(res$padj),])
log2 fold change (MLE): pheno ctrl vs treat
Wald test p-value: pheno ctrl vs treat 
DataFrame with 6 rows and 7 columns
                            baseMean     log2FoldChange              lfcSE
                           <numeric>          <numeric>          <numeric>
comp2268393_c0_seq2 20.2323701630026   -21.166856612115   4.24861896850527
comp2273865_c0_seq2 1643.74563240146 -0.753207633363883  0.153241371180767
comp2277935_c1_seq8 286.910832769305 -0.691222223493662  0.169714446384218
comp2141997_c0_seq1 320.550667604245  -0.59060881408569  0.151468610465752
comp2268577_c0_seq9 1030.35777262206  0.450412167787341  0.118900113344764
Pdu_Vtg_1.1         2766.07425488488  0.363861647087163 0.0960080095221327
                                 stat               pvalue               padj
                            <numeric>            <numeric>          <numeric>
comp2268393_c0_seq2 -4.98205576188957 6.29122985060415e-07 0.0199758626049576
comp2273865_c0_seq2 -4.91517158558562 8.87047341413335e-07 0.0199758626049576
comp2277935_c1_seq8  -4.0728543634335 4.64404654102552e-05 0.0551478686234231
comp2141997_c0_seq1 -3.89921589872398  9.6504696838006e-05  0.085949208773278
comp2268577_c0_seq9  3.78815591606141 0.000151769598702969  0.114241548368527
Pdu_Vtg_1.1          3.78990928880034   0.0001507023086035  0.149298720611182
                              weight
                           <numeric>
comp2268393_c0_seq2                1
comp2273865_c0_seq2                1
comp2277935_c1_seq8 12.6425685091306
comp2141997_c0_seq1 12.6425685091306
comp2268577_c0_seq9 11.9668387790622
Pdu_Vtg_1.1         7.57707015104483

However, when when I run the same data set with betaPrior=TRUE I get different results:

>dds<-DESeq(ddsFullCountTable, fitType="parametric", betaPrior=TRUE,parallel=F)
>res<-results(dds, contrast=c("pheno", "ctrl", "treat"), parallel=F, cooksCutoff=TRUE,
+ independentFiltering=TRUE, filterFun=ihw)
>head(res[order(res$padj),])
log2 fold change (MAP): pheno ctrl vs treat 
Wald test p-value: pheno ctrl vs treat 
DataFrame with 6 rows and 7 columns
                                      baseMean     log2FoldChange              lfcSE
                                     <numeric>          <numeric>          <numeric>
comp2273865_c0_seq2           1643.74563240146 -0.669065523108693  0.136251802675902
comp2277935_c1_seq8           286.910832769305 -0.598085295272818   0.14708009415555
Pdu_Vtg_1.1                   2766.07425488488  0.346637211959349 0.0915191874241819
comp2224448_c0_seq1           5846.99678497238  0.278434407963567 0.0815422524249737
comp2252584_c0_seq1           2574.82319882063 -0.306324451519527 0.0914966906834413
gnl|Pdu_trscr_assembly_1|4219 1109.47286054481 -0.517737621443726  0.128467156601779
                                           stat               pvalue                padj
                                      <numeric>            <numeric>           <numeric>
comp2273865_c0_seq2           -4.91050767746668 9.08408992313057e-07 0.00281860960251004
comp2277935_c1_seq8           -4.06639184389078 4.77466288487251e-05  0.0865926427665106
Pdu_Vtg_1.1                    3.78759057762086 0.000152115238366902   0.100296223175306
comp2224448_c0_seq1            3.41460285536939 0.000638750715994235   0.292933653069786
comp2252584_c0_seq1            -3.3479292992065 0.000814177791399682   0.292933653069786
gnl|Pdu_trscr_assembly_1|4219 -4.03011660831417 5.57491897770106e-05   0.292933653069786
                                        weight
                                     <numeric>
comp2273865_c0_seq2           14.5156081808396
comp2277935_c1_seq8           12.4171081284368
Pdu_Vtg_1.1                   22.7696118687405
comp2224448_c0_seq1           14.5156081808396
comp2252584_c0_seq1           22.7696118687405
gnl|Pdu_trscr_assembly_1|4219                1

For me this means that using a log2FoldChange shriking algorithm during running DESeq2 is influencing the DE calls. If I understood the mathematical meaning of the beta prior correctly this makes kind of sense, but I am now not sure, which workflow to use. If I use betaPrior=TRUE when calling DESeq I would have the "normal" log2FoldChange shrinking influencing my results (would there be a possibility to use "apeglm" when setting betaPrior=TRUE?), and when using the default setting, i.e. betaPrior=FALSE I would have no shirnking impacting on my DE calls, but when using lfcShrink to visualise and rank data this "shrinking factor" would not be accounted for in the DE calls. Is there any advice on that? Or ideal way to proceed?

Thanks a lot for your help/advice

Sven

> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices datasets  utils     methods  
[9] base     

other attached packages:
 [1] IHW_1.10.1                  ggplot2_3.1.0               gplots_3.0.1.1             
 [4] RColorBrewer_1.1-2          DESeq2_1.22.2               SummarizedExperiment_1.12.0
 [7] DelayedArray_0.8.0          BiocParallel_1.16.6         matrixStats_0.54.0         
[10] Biobase_2.42.0              GenomicRanges_1.34.0        GenomeInfoDb_1.18.2        
[13] IRanges_2.16.0              S4Vectors_0.20.1            BiocGenerics_0.28.0        
[16] installr_0.21.0             stringr_1.4.0              

loaded via a namespace (and not attached):
 [1] bitops_1.0-6           bit64_0.9-7            numDeriv_2016.8-1     
 [4] tools_3.5.0            backports_1.1.3        R6_2.3.0              
 [7] rpart_4.1-13           KernSmooth_2.23-15     Hmisc_4.2-0           
[10] DBI_1.0.0              lazyeval_0.2.1         colorspace_1.4-0      
[13] nnet_7.3-12            apeglm_1.4.2           withr_2.1.2           
[16] gridExtra_2.3          bit_1.1-14             compiler_3.5.0        
[19] fdrtool_1.2.15         htmlTable_1.13.1       slam_0.1-44           
[22] caTools_1.17.1.1       scales_1.0.0           checkmate_1.9.1       
[25] genefilter_1.64.0      digest_0.6.18          foreign_0.8-71        
[28] XVector_0.22.0         base64enc_0.1-3        pkgconfig_2.0.2       
[31] htmltools_0.3.6        lpsymphony_1.10.0      bbmle_1.0.20          
[34] htmlwidgets_1.3        rlang_0.3.1            rstudioapi_0.9.0      
[37] RSQLite_2.1.1          bindr_0.1.1            gtools_3.8.1          
[40] acepack_1.4.1          RCurl_1.95-4.11        magrittr_1.5          
[43] GenomeInfoDbData_1.2.0 Formula_1.2-3          Matrix_1.2-15         
[46] Rcpp_1.0.0             munsell_0.5.0          stringi_1.2.4         
[49] MASS_7.3-51.1          zlibbioc_1.28.0        plyr_1.8.4            
[52] grid_3.5.0             blob_1.1.1             gdata_2.18.0          
[55] crayon_1.3.4           lattice_0.20-38        splines_3.5.0         
[58] annotate_1.60.0        locfit_1.5-9.1         knitr_1.21            
[61] pillar_1.3.1           geneplotter_1.60.0     XML_3.98-1.17         
[64] glue_1.3.0             latticeExtra_0.6-28    data.table_1.12.0     
[67] gtable_0.2.0           assertthat_0.2.0       emdbook_1.3.11        
[70] xfun_0.4               xtable_1.8-3           coda_0.19-2           
[73] survival_2.43-3        tibble_2.0.1           AnnotationDbi_1.44.0  
[76] memoise_1.1.0          bindrcpp_0.2.2         cluster_2.0.7-1       
>
deseq2 • 1.0k views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 2 hours ago
United States

Hi Sven,

I prefer the current recommended code and function defaults for the reasons outlined in the apeglm paper. If you want inference on the shrunken LFCs we do support that in lfcShrink, via svalue.

ADD COMMENT
0
Entering edit mode

Hi,

thanks a lot for clarifying this.

Sven

ADD REPLY

Login before adding your answer.

Traffic: 692 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6