very different DESeq2 results between two versions
1
0
Entering edit mode
sven.schenk ▴ 20
@svenschenk-15837
Last seen 5.0 years ago
Vienna

Hi,

last year I ran DESeq2 on RNASeq data comparing treated vs untreated cells. When I initially ran DESeq2 I found that I had rather low log2FC and a kind of high variation. However, I now re-ran DESeq2 on the same data with the same script

ddsFullCountTable <- DESeqDataSetFromMatrix(countData=countdata,colData=coldata,design = ~ pheno)
ddsFullCountTable <- DESeq(ddsFullCountTable, parallel=F) #parallel somehow doesn`t to work anymore...
res <- results(ddsFullCountTable,contrast=c("pheno","ctrl", "mf"), parallel=F)
# rank genes by pvalue
resOrdered <- res[order(res$padj),]
resSig <- subset(resOrdered, padj<pvalue_cutoff)

and  when I look at the results they look more or less comletely different, which might be illustrated by thes two volcano plots:

old (Aug 17):

 

new (Jun 18):

The gene in the new plot with a log2FC ~-21 is a transcript which in the DESeq normalised read counts has in the three control samples 0 counts and in two of the three replicates of the treatmean twice 0 and once 180, both in the "old" read count list and the "new" one:.

old:

> TS.28922.norm.old["comp2268393_c0_seq2",]
                    ctrl_50112 ctrl_50113 ctrl_50114 mf_50115 mf_50116 mf_50117
comp2268393_c0_seq2          0          0          0        0        0 180.3483

new:

> TS.28922.norm.new["comp2268393_c0_seq2",]
                    ctrl_50112 ctrl_50113 ctrl_50114 mf_50115 mf_50116 mf_50117
comp2268393_c0_seq2          0          0          0        0        0 180.3483

In the DESeq result from last year this was at the bottom of the result list marked as outlier (i.e. pvalue NA) with a log2FC of ~-0.046:

old:

> subset(DESeqResultOld, X=="comp2268393_c0_seq2")
                        X baseMean log2FoldChange      lfcSE      stat pvalue padj
28833 comp2268393_c0_seq2 20.23561    -0.04622691 0.04388205 -1.053435     NA   NA

new:

> subset(DESeqResultNew, X=="comp2268393_c0_seq2")
                    X baseMean log2FoldChange    lfcSE      stat       pvalue       padj
1 comp2268393_c0_seq2 20.23561      -21.16507 4.248619 -4.981636 6.304886e-07 0.01526217

So my question now is, how can this be? Has anything changed in the default DESeq() or reslults() calls between the August 17 version (I don`t know the version number anymore) and v1.20. that I am not aware off? I thing the example should clearly be considered as an outlier (I also tested with calling cooksCutoff = TRUE and cooksCutoff = FALSE, but the results are completely different)?

Cheers

Sven

> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices datasets  utils     methods  
[9] base     

other attached packages:
 [1] ggplot2_2.2.1               gplots_3.0.1                RColorBrewer_1.1-2         
 [4] DESeq2_1.20.0               SummarizedExperiment_1.10.1 DelayedArray_0.6.0         
 [7] BiocParallel_1.14.1         matrixStats_0.53.1          Biobase_2.40.0             
[10] GenomicRanges_1.32.3        GenomeInfoDb_1.16.0         IRanges_2.14.10            
[13] S4Vectors_0.18.3            BiocGenerics_0.26.0         genefilter_1.62.0          
[16] BiocInstaller_1.30.0        installr_0.20.0             stringr_1.3.1              

loaded via a namespace (and not attached):
 [1] bit64_0.9-7            splines_3.5.0          gtools_3.5.0          
 [4] Formula_1.2-3          latticeExtra_0.6-28    blob_1.1.1            
 [7] GenomeInfoDbData_1.1.0 pillar_1.2.3           RSQLite_2.1.1         
[10] backports_1.1.2        lattice_0.20-35        digest_0.6.15         
[13] XVector_0.20.0         checkmate_1.8.5        colorspace_1.3-2      
[16] htmltools_0.3.6        Matrix_1.2-14          plyr_1.8.4            
[19] XML_3.98-1.11          zlibbioc_1.26.0        xtable_1.8-2          
[22] snow_0.4-2             scales_0.5.0           gdata_2.18.0          
[25] htmlTable_1.12         tibble_1.4.2           annotate_1.58.0       
[28] nnet_7.3-12            lazyeval_0.2.1         survival_2.41-3       
[31] magrittr_1.5           memoise_1.1.0          foreign_0.8-70        
[34] tools_3.5.0            data.table_1.11.4      munsell_0.5.0         
[37] locfit_1.5-9.1         cluster_2.0.7-1        AnnotationDbi_1.42.1  
[40] compiler_3.5.0         caTools_1.17.1         rlang_0.2.1           
[43] grid_3.5.0             RCurl_1.95-4.10        rstudioapi_0.7        
[46] htmlwidgets_1.2        bitops_1.0-6           base64enc_0.1-3       
[49] gtable_0.2.0           DBI_1.0.0              gridExtra_2.3         
[52] knitr_1.20             bit_1.1-14             Hmisc_4.1-1           
[55] KernSmooth_2.23-15     stringi_1.1.7          Rcpp_0.12.17          
[58] geneplotter_1.58.0     rpart_4.1-13           acepack_1.4.1   

deseq2 rnaseq • 2.0k views
ADD COMMENT
1
Entering edit mode
@mikelove
Last seen 1 hour ago
United States

You can always check the NEWS file and the section of the vignette that lists important changes:

https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#methods-changes-since-the-2014-deseq2-paper

If you noticed the LFCs change and the previous version of DESeq2 was v1.14 or less (these versions are from October 2016 or earlier) this is because of the betaPrior default changing from TRUE to FALSE, and now shrinkage is accomplished by a separate function, lfcShrink().

I posted on the forum describing the motivation for the change:

New function lfcShrink() in DESeq2

ADD COMMENT
0
Entering edit mode

Thanks, I actually saw that betaPrior is now set to FALSE, but I ignored it as I thought I`d anyway never used it, but if it was TRUE by default. My mistake.

When I now run DESeq with betaPrior=TRUE and get the same result as before (Aug 17), which should give  the result same as if I had used the lfcShrink function, if I understood that correctly from the post you sent.

ADD REPLY

Login before adding your answer.

Traffic: 952 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6