Hello, I realized that when I run Deseq2 in the terminal, it gives me a different result (different values in the res table except for baseMean) compare to when I run in with Snakemake (I was able to reproduce the same issue also on a different computer). However, when I use only one CPU for R in terminal taskset --cpu-list 1 R
it gives me the same results as with Snakemake.
My code:
library("tximeta")
library("DESeq2")
library("dplyr")
library("ggplot2")
library("genefilter")
library("ggrepel")
library("DEGreport")
# Read metadata table
sample_metadata <- read.table('data/raw/samples.csv', sep=',', header=TRUE, stringsAsFactors = FALSE)
# Ad $files columns with path to the quant.sf files (result from salmon)
sample_metadata$files <- paste0("data/processed/30062020","/", sample_metadata$names, "/salmon_out/quant.sf")
# Parse salmon result with tximeta
se <- tximeta(sample_metadata)
# summarize transcript counts to genes
gse <- summarizeToGene(se)
# Construction of DESeqDataSet object
dds <- DESeqDataSet(gse, design = ~ intracellular_ig)
# Remove rows of the DESeqDataSet that have no counts, or only a single count across all samples
keep <- rowSums(counts(dds)) > 1
dds <- dds[keep,]
# differential expression pipeline
dds <- DESeq(dds)
# create a result table with False discovery rate 0.05
res <- results(dds, alpha = 0.05)
# order according to padj
resOrdered <- res[order(res$padj),]
resOrderedDF <- as.data.frame(resOrdered)
# filtResOrderedDF = filter(resOrderedDF, padj < 0.05)
write.csv(resOrderedDF, file = "results/test/deseq_result.csv")
I was able to see a difference after the estimateDispersion with different values dispFit, dispIter, dispMap columns. I was also checking my cores during these commands, and it uses more than one core.
Do you please know what could cause this kind of error?
sessionInfo( )
R version 4.0.0 (2020-04-24)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS
Matrix products: default
BLAS/LAPACK: /home/david/anaconda3/envs/differential_expression/lib/libmkl_rt.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=cs_CZ.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=cs_CZ.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=cs_CZ.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=cs_CZ.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] DEGreport_1.24.1 ggrepel_0.8.2
[3] genefilter_1.70.0 ggplot2_3.3.2
[5] dplyr_1.0.2 DESeq2_1.30.0
[7] SummarizedExperiment_1.18.2 DelayedArray_0.14.1
[9] matrixStats_0.57.0 Biobase_2.48.0
[11] GenomicRanges_1.40.0 GenomeInfoDb_1.24.2
[13] IRanges_2.22.2 S4Vectors_0.26.1
[15] BiocGenerics_0.34.0 tximeta_1.6.3
loaded via a namespace (and not attached):
[1] colorspace_2.0-0 rjson_0.2.20
[3] ellipsis_0.3.1 circlize_0.4.11
[5] XVector_0.28.0 ggdendro_0.1.22
[7] GlobalOptions_0.1.2 clue_0.3-57
[9] bit64_4.0.5 interactiveDisplayBase_1.26.3
[11] AnnotationDbi_1.50.3 xml2_1.3.2
[13] splines_4.0.0 logging_0.10-108
[15] mnormt_2.0.2 tximport_1.16.1
[17] knitr_1.30 geneplotter_1.66.0
[19] jsonlite_1.7.1 Nozzle.R1_1.1-1
[21] Rsamtools_2.4.0 broom_0.7.2
[23] annotate_1.66.0 cluster_2.1.0
[25] dbplyr_2.0.0 png_0.1-7
[27] shiny_1.5.0 BiocManager_1.30.10
[29] compiler_4.0.0 httr_1.4.2
[31] backports_1.2.0 assertthat_0.2.1
[33] Matrix_1.2-18 fastmap_1.0.1
[35] lazyeval_0.2.2 limma_3.44.3
[37] later_1.1.0.1 lasso2_1.2-21.1
[39] htmltools_0.5.0 prettyunits_1.1.1
[41] tools_4.0.0 gtable_0.3.0
[43] glue_1.4.2 GenomeInfoDbData_1.2.3
[45] rappdirs_0.3.1 Rcpp_1.0.5
[47] vctrs_0.3.5 Biostrings_2.56.0
[49] nlme_3.1-150 rtracklayer_1.48.0
[51] psych_2.0.9 xfun_0.19
[53] stringr_1.4.0 mime_0.9
[55] lifecycle_0.2.0 ensembldb_2.12.1
[57] XML_3.99-0.5 AnnotationHub_2.20.2
[59] edgeR_3.30.3 MASS_7.3-53
[61] zlibbioc_1.34.0 scales_1.1.1
[63] hms_0.5.3 promises_1.1.1
[65] ProtGenerics_1.20.0 AnnotationFilter_1.12.0
[67] RColorBrewer_1.1-2 ComplexHeatmap_2.4.3
[69] yaml_2.2.1 curl_4.3
[71] memoise_1.1.0 biomaRt_2.44.4
[73] reshape_0.8.8 stringi_1.5.3
[75] RSQLite_2.2.1 BiocVersion_3.11.1
[77] GenomicFeatures_1.40.1 BiocParallel_1.22.0
[79] shape_1.4.5 rlang_0.4.8
[81] pkgconfig_2.0.3 bitops_1.0-6
[83] lattice_0.20-41 purrr_0.3.4
[85] GenomicAlignments_1.24.0 cowplot_1.1.0
[87] bit_4.0.4 tidyselect_1.1.0
[89] plyr_1.8.6 magrittr_2.0.1
[91] R6_2.5.0 generics_0.1.0
[93] DBI_1.1.0 pillar_1.4.7
[95] withr_2.3.0 survival_3.2-7
[97] RCurl_1.98-1.2 tibble_3.0.4
[99] crayon_1.3.4 tmvnsim_1.0-2
[101] BiocFileCache_1.12.1 GetoptLong_1.0.4
[103] progress_1.2.2 locfit_1.5-9.4
[105] grid_4.0.0 blob_1.2.1
[107] ConsensusClusterPlus_1.52.0 digest_0.6.27
[109] xtable_1.8-4 tidyr_1.1.2
[111] httpuv_1.5.4 openssl_1.4.3
[113] munsell_0.5.0 askpass_1.1