DESeq2 outlier problems
aristotele_m ▴ 40
@aristotele_m-6821
Last seen 7.6 years ago
Italy

Dear all,

I have the following situation for an overexpressed gene:

Group A: average 85.33

Group B: average 23081.19

Overall gene average: 1930.54

In the differential expression results table:

baseMean: 1930.54, log2FoldChange: 5.115, lfcSE: 0.341

The reported fold change seems very different from the ratio of the group averages.

I see the same thing in another comparison:

   
Group A   Group B
89.267    21448.225

baseMean   log2FoldChange  lfcSE        stat          pvalue                 padj
1794.7910  1.70            0.175111091  9.7157783091  2.58268280887747E-22   3.97733152567131E-18
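
As a quick sanity check on the numbers above, the naive log2 ratio of the group averages can be computed directly. This is only a sketch in base R using the averages quoted in this post; the variable names are made up for illustration, and a ratio of group averages is not the same quantity as the model-based log2FoldChange that DESeq2 reports.

```r
# Group averages quoted in the post (first and second comparison)
mean_a <- c(first = 85.33, second = 89.267)
mean_b <- c(first = 23081.19, second = 21448.225)

# Naive log2 ratio of Group B over Group A
naive_lfc <- log2(mean_b / mean_a)

# log2FoldChange values reported in the two results tables
reported_lfc <- c(first = 5.115, second = 1.70)

# Naive ratio, and how far the reported value falls below it
print(round(naive_lfc, 2))
print(round(naive_lfc - reported_lfc, 2))
```

The naive ratios come out at roughly 8.1 and 7.9 on the log2 scale, well above the reported 5.115 and 1.70.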

Any idea?

 

 

 

R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.2 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=it_IT.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=it_IT.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=it_IT.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=it_IT.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] gplots_3.0.1               genefilter_1.54.2          limma_3.28.21             
 [4] biomaRt_2.28.0             reshape2_1.4.2             RColorBrewer_1.1-2        
 [7] ggplot2_2.2.1              pheatmap_1.0.8             DESeq2_1.12.4             
[10] SummarizedExperiment_1.2.3 Biobase_2.32.0             GenomicRanges_1.24.3      
[13] GenomeInfoDb_1.8.7         IRanges_2.6.1              S4Vectors_0.10.3          
[16] BiocGenerics_0.18.0       

loaded via a namespace (and not attached):
 [1] gtools_3.5.0         locfit_1.5-9.1       splines_3.3.2        lattice_0.20-35     
 [5] colorspace_1.3-2     htmltools_0.3.5      base64enc_0.1-3      survival_2.40-1     
 [9] XML_3.98-1.5         foreign_0.8-67       DBI_0.6-1            BiocParallel_1.6.6  
[13] plyr_1.8.4           stringr_1.2.0        zlibbioc_1.18.0      munsell_0.4.3       
[17] gtable_0.2.0         caTools_1.17.1       htmlwidgets_0.8      memoise_1.0.0       
[21] labeling_0.3         latticeExtra_0.6-28  knitr_1.15.1         geneplotter_1.50.0  
[25] AnnotationDbi_1.34.4 htmlTable_1.9        Rcpp_0.12.9          KernSmooth_2.23-15  
[29] acepack_1.4.1        xtable_1.8-2         backports_1.0.5      scales_0.4.1        
[33] checkmate_1.8.2      gdata_2.17.0         Hmisc_4.0-2          annotate_1.50.1     
[37] XVector_0.12.1       gridExtra_2.2.1      digest_0.6.12        stringi_1.1.2       
[41] grid_3.3.2           tools_3.3.2          bitops_1.0-6         magrittr_1.5        
[45] lazyeval_0.2.0       RCurl_1.95-4.8       tibble_1.2           RSQLite_1.1-2       
[49] Formula_1.2-1        cluster_2.0.6        Matrix_1.2-8         data.table_1.10.0   
[53] assertthat_0.1       rpart_4.1-10         nnet_7.3-12        
deseq2 • 1.1k views

Can you give us more details to understand the problem you are facing?


Group A expresses GFP and Group B expresses my target genes. I want to understand the effect of overexpressing my gene on the transcriptome.

I know they are overexpressed, and my counts also show that they are overexpressed, but the fold change is not close to the ratio I calculate from the counts. Is this normal?

@mikelove
Last seen 1 day ago
United States

If you used the latest version (1.16), you would get an MLE (maximum likelihood estimate) fold change. With older versions, you can set betaPrior=FALSE in DESeq() to get an MLE fold change.
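
A minimal sketch of the second suggestion, assuming a DESeqDataSet is at hand. Here `dds` is simulated with makeExampleDESeqDataSet() purely for illustration (it is not the poster's data), and the object names are arbitrary.

```r
library(DESeq2)

# Simulated data set standing in for the real one (assumption for illustration)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 6)

# Fit without the log fold change prior, so the reported LFC is the MLE
dds <- DESeq(dds, betaPrior = FALSE)
res_mle <- results(dds)

# Inspect the columns discussed in this thread
head(res_mle[, c("baseMean", "log2FoldChange", "lfcSE")])
```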


Thanks, it works now. But this is the only series where I need that command. What could be the reason? Is it a bug, or does it not matter much, since I always get a very low p-value (on the order of 1e-18)?

 


It's not a bug, it's a feature. The MLE fold change is likely driven by an outlier. We provide moderated LFC estimates (previously from DESeq() itself, now from lfcShrink() run after DESeq()). We can show that moderating the LFC produces more reproducible estimates:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4302049/figure/Fig3/
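
The two-step workflow described above might look roughly like this. It is a sketch on simulated data, not the poster's analysis: `dds` comes from makeExampleDESeqDataSet(), `coef = 2` picks the condition effect in that simulated design, and the `type = "normal"` argument assumes a DESeq2 release recent enough for lfcShrink() to accept `type` (with 1.16 it would probably have to be dropped).

```r
library(DESeq2)

# Simulated data with real condition effects (assumption for illustration)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 6, betaSD = 1)

# Step 1: fit the model; results() then reports MLE fold changes
dds <- DESeq(dds, betaPrior = FALSE)
res_mle <- results(dds)

# Step 2: moderated (shrunken) fold changes for the condition coefficient
res_shrunk <- lfcShrink(dds, coef = 2, type = "normal")

# Compare the spread of absolute LFCs before and after shrinkage
summary(abs(res_mle$log2FoldChange))
summary(abs(res_shrunk$log2FoldChange))
```

Genes whose MLE is large mainly because of a few extreme counts are the ones where the two columns would be expected to differ most.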
