Hi,
I'm struggling to wrap my head around the change in absolute number of DEGs calculated using different shrinkage methods when I specify independent hypothesis weight (ihw) in both instances. Is this difference expected, or am I feeding results into lfcShrink incorrectly? Possibly an assumption I'm missing?
Overview
Building my DESeq object
> dds <- DESeqDataSetFromMatrix(countData = toc, colData = colData, design = ~batch + Pop_Sex)
> dds <- estimateSizeFactors(dds)
> dds <- estimateDispersions(dds)
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
Hypothesis Testing
a) Hypothesis testing using normal shrinkage through BetaPrior=TRUE
> dds$Pop_Sex <- relevel(dds$Pop_Sex,"MRTD_F")
> dds <- nbinomWaldTest(dds, betaPrior=TRUE)
found results columns, replacing these
> resultsNames(dds)
[1] "Intercept" "batch1" "batch2" "Pop_SexMRTD_F" "Pop_SexCGRT_F" "Pop_SexCGRT_M"
[7] "Pop_SexCRTH_F" "Pop_SexCRTH_M" "Pop_SexMRTD_M" "Pop_SexTBAC_F" "Pop_SexTBAC_M" "Pop_SexTDNV_F"
[13] "Pop_SexTDNV_M"
> results_MRTD_norm <- results(dds, contrast=c("Pop_Sex","MRTD_M", "MRTD_F"), cooksCutoff=TRUE, filterFun=ihw)
> table(results_MRTD_norm$padj < 0.05)
FALSE TRUE
44233 15
b) Hypothesis testing using apeglm shrinkage through lfcShrink
> dds$Pop_Sex <- relevel(dds$Pop_Sex,"MRTD_F")
> dds <- nbinomWaldTest(dds, betaPrior=FALSE)
found results columns, replacing these
> resultsNames(dds)
[1] "Intercept" "batch_2_vs_1" "Pop_Sex_CGRT_F_vs_MRTD_F"
[4] "Pop_Sex_CGRT_M_vs_MRTD_F" "Pop_Sex_CRTH_F_vs_MRTD_F" "Pop_Sex_CRTH_M_vs_MRTD_F"
[7] "Pop_Sex_MRTD_M_vs_MRTD_F" "Pop_Sex_TBAC_F_vs_MRTD_F" "Pop_Sex_TBAC_M_vs_MRTD_F"
[10] "Pop_Sex_TDNV_F_vs_MRTD_F" "Pop_Sex_TDNV_M_vs_MRTD_F"
> res <- results(dds, name="Pop_Sex_MRTD_M_vs_MRTD_F", cooksCutoff=TRUE, filterFun=ihw)
> results_MRTD_ape <- lfcShrink(dds, coef="Pop_Sex_MRTD_M_vs_MRTD_F", res=res, type="apeglm")
> table(results_MRTD_ape$padj < 0.05)
FALSE TRUE
44122 126
# gives very similar result (112 padj < 0.05) when fed into lfcShrink
# res <- results(dds, contrast=c("Pop_Sex","MRTD_M", "MRTD_F"), cooksCutoff=TRUE, filterFun=ihw)
> DESeq2::plotMA(results_MRTD_norm)
![MAplot for normal shrinkage][1]
> DESeq2::plotMA(results_MRTD_ape)
![MAplot from apeglm shrinkage][1]
> head(results_MRTD_norm)
log2 fold change (MAP): Pop_Sex MRTD_M vs MRTD_F
Wald test p-value: Pop_Sex MRTD_M vs MRTD_F
DataFrame with 6 rows and 7 columns
baseMean log2FoldChange lfcSE stat pvalue padj weight
<numeric> <numeric> <numeric> <numeric> <numeric> <numeric> <numeric>
ENSMUSG00000000001 724.6171634 -0.107774 0.297979 -0.361685 0.717588 1 0.866199
ENSMUSG00000000003 0.0243594 0.000000 0.621820 0.000000 1.000000 1 0.000000
ENSMUSG00000000028 45.0620373 -0.165511 0.637214 -0.259741 0.795064 1 2.245600
ENSMUSG00000000031 698.5754183 -0.154553 1.053695 -0.146677 0.883387 1 1.121025
ENSMUSG00000000037 3.6064554 0.335497 0.659736 0.508532 0.611081 1 1.897672
ENSMUSG00000000049 249.3472568 0.145636 0.543124 0.268145 0.788587 1 1.606663
> head(results_MRTD_ape)
log2 fold change (MAP): Pop Sex MRTD M vs MRTD F
Wald test p-value: Pop Sex MRTD M vs MRTD F
DataFrame with 6 rows and 5 columns
baseMean log2FoldChange lfcSE pvalue padj
<numeric> <numeric> <numeric> <numeric> <numeric>
ENSMUSG00000000001 724.6171634 -1.23632e-06 0.00144268 0.716050 1
ENSMUSG00000000003 0.0243594 -7.51882e-09 0.00144270 0.994846 1
ENSMUSG00000000028 45.0620373 -3.66459e-07 0.00144269 0.802388 1
ENSMUSG00000000031 698.5754183 -7.02208e-08 0.00144269 0.943279 1
ENSMUSG00000000037 3.6064554 6.82196e-07 0.00144269 0.629760 1
ENSMUSG00000000049 249.3472568 4.22034e-07 0.00144269 0.825730 1
Session details
> sessionInfo( )
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252 LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C LC_TIME=English_United Kingdom.1252
attached base packages:
[1] grid stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] msigdbr_7.4.1 pathfindR_1.6.3 pathfindR.data_1.1.2 goseq_1.44.0
[5] geneLenDataBase_1.28.0 BiasedUrn_1.07 GenomicFeatures_1.44.2 org.Mm.eg.db_3.13.0
[9] AnnotationDbi_1.54.1 BiocParallel_1.26.1 ggrepel_0.9.1 clusterProfiler_4.0.5
[13] biomaRt_2.48.3 venneuler_1.1-0 rJava_1.0-6 ggbiplot_0.55
[17] scales_1.1.1 plyr_1.8.6 ggpubr_0.4.0 cowplot_1.1.1
[21] forcats_0.5.1 stringr_1.4.0 purrr_0.3.4 readr_2.1.1
[25] tidyr_1.1.4 tibble_3.1.3 tidyverse_1.3.1 viridis_0.6.2
[29] viridisLite_0.4.0 dplyr_1.0.7 ComplexHeatmap_2.8.0 RColorBrewer_1.1-2
[33] ggplot2_3.3.5 gplots_3.1.1 limma_3.48.3 IHW_1.20.0
[37] vsn_3.60.0 DESeq2_1.32.0 SummarizedExperiment_1.22.0 MatrixGenerics_1.4.3
[41] matrixStats_0.61.0 GenomicRanges_1.44.0 GenomeInfoDb_1.28.4 IRanges_2.26.0
[45] S4Vectors_0.30.0 GEOquery_2.60.0 Biobase_2.52.0 BiocGenerics_0.38.0
loaded via a namespace (and not attached):
[1] rappdirs_0.3.3 rtracklayer_1.52.1 coda_0.19-4 bit64_4.0.5 knitr_1.37
[6] DelayedArray_0.18.0 data.table_1.14.2 KEGGREST_1.32.0 RCurl_1.98-1.3 doParallel_1.0.17
[11] generics_0.1.2 snow_0.4-4 preprocessCore_1.54.0 RSQLite_2.2.7 shadowtext_0.1.1
[16] bit_4.0.4 tzdb_0.2.0 enrichplot_1.12.3 xml2_1.3.2 lubridate_1.8.0
[21] assertthat_0.2.1 apeglm_1.14.0 xfun_0.28 hms_1.1.1 babelgene_21.4
[26] evaluate_0.15 fansi_0.5.0 restfulr_0.0.13 progress_1.2.2 caTools_1.18.2
[31] dbplyr_2.1.1 readxl_1.3.1 igraph_1.2.7 DBI_1.1.2 geneplotter_1.70.0
[36] ellipsis_0.3.2 backports_1.4.0 annotate_1.70.0 vctrs_0.3.8 Cairo_1.5-12.2
[41] abind_1.4-5 cachem_1.0.5 withr_2.4.3 ggforce_0.3.3 bdsmatrix_1.3-4
[46] GenomicAlignments_1.28.0 treeio_1.16.2 fdrtool_1.2.17 prettyunits_1.1.1 cluster_2.1.2
[51] DOSE_3.18.3 ape_5.5 lazyeval_0.2.2 crayon_1.5.0 genefilter_1.74.0
[56] pkgconfig_2.0.3 slam_0.1-48 labeling_0.4.2 tweenr_1.0.2 nlme_3.1-152
[61] rlang_0.4.11 lifecycle_1.0.1 downloader_0.4 filelock_1.0.2 affyio_1.62.0
[66] BiocFileCache_2.0.0 modelr_0.1.8 cellranger_1.1.0 polyclip_1.10-0 Matrix_1.3-3
[71] aplot_0.1.2 carData_3.0-5 lpsymphony_1.20.0 reprex_2.0.1 GlobalOptions_0.1.2
[76] png_0.1-7 rjson_0.2.20 bitops_1.0-7 KernSmooth_2.23-20 Biostrings_2.60.1
[81] blob_1.2.2 shape_1.4.6 qvalue_2.24.0 rstatix_0.7.0 gridGraphics_0.5-1
[86] ggsignif_0.6.3 memoise_2.0.1 magrittr_2.0.1 zlibbioc_1.38.0 compiler_4.1.0
[91] scatterpie_0.1.7 BiocIO_1.2.0 bbmle_1.0.24 clue_0.3-60 Rsamtools_2.8.0
[96] cli_3.1.0 affy_1.70.0 XVector_0.32.0 patchwork_1.1.1 MASS_7.3-54
[101] mgcv_1.8-35 tidyselect_1.1.1 stringi_1.7.3 emdbook_1.3.12 yaml_2.3.5
[106] GOSemSim_2.18.1 locfit_1.5-9.4 fastmatch_1.1-3 tools_4.1.0 circlize_0.4.14
[111] rstudioapi_0.13 foreach_1.5.2 gridExtra_2.3 farver_2.1.0 ggraph_2.0.5
[116] digest_0.6.27 BiocManager_1.30.16 Rcpp_1.0.7 car_3.0-12 broom_0.7.12
[121] httr_1.4.2 colorspace_2.0-2 rvest_1.0.2 XML_3.99-0.6 fs_1.5.2
[126] splines_4.1.0 yulab.utils_0.0.4 tidytree_0.3.8 graphlayouts_0.7.1 ggplotify_0.1.0
[131] xtable_1.8-4 jsonlite_1.7.2 ggtree_3.0.4 tidygraph_1.2.0 ggfun_0.0.5
[136] R6_2.5.1 htmltools_0.5.2 pillar_1.7.0 glue_1.4.2 fastmap_1.1.0
[141] codetools_0.2-18 fgsea_1.18.0 mvtnorm_1.1-3 utf8_1.2.2 lattice_0.20-44
[146] numDeriv_2016.8-1.1 curl_4.3.2 gtools_3.9.2 GO.db_3.13.0 survival_3.2-11
[151] rmarkdown_2.11 munsell_0.5.0 DO.db_2.9 GetoptLong_1.0.5 GenomeInfoDbData_1.2.6
[156] iterators_1.0.14 haven_2.4.3 reshape2_1.4.4 gtable_0.3.0
MA plots didn't upload properly, here at the links for ref.
normal : MAplot_normal.png
apeglm : MAplot_apeglm.png
R markdown : Rmd file