Question: How to exactly replicate the results from plotMA in DESeq2 ?
gravatar for Yvan Wenger
9 months ago by
Yvan Wenger50
Yvan Wenger50 wrote:

Hello everyone,

I was comparing the graph generated by the function plotMA() from DESeq2, with the graph that I generated supposing

I extracted the data with this function:

ds2data <-,cooksCutoff = FALSE))

and plotted the result with ggplot2 using similar margins and color code. The points seem to be at the same positions, however, the colors are sometimes not matching (there are for example two red and three grey triangles on the lower boundary, while there are three red and two grey triangles on my plot).

Does this come to some additional parameters passed to results() ? Or some tweaking in plotMA()?



The images are:

plotMA with DESeq2:

My plot based on the results() function above:


R version 3.4.1 (2017-06-30)

Platform: x86_64-pc-linux-gnu (64-bit)

Running under: Ubuntu 14.04.5 LTS

Matrix products: default

BLAS: /usr/lib/atlas-base/atlas/

LAPACK: /usr/lib/lapack/


 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8   

 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            


attached base packages:

[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:

 [1] bindrcpp_0.2               tidyr_0.7.1                DESeq2_1.16.1              SummarizedExperiment_1.6.3 DelayedArray_0.2.7        

 [6] matrixStats_0.52.2         Biobase_2.36.2             GenomicRanges_1.28.5       GenomeInfoDb_1.12.2        IRanges_2.10.3            

[11] S4Vectors_0.14.4           BiocGenerics_0.22.0        gplots_3.0.1               dplyr_0.7.3                ggplot2_2.2.1             

[16] BiocInstaller_1.26.1      

loaded via a namespace (and not attached):

 [1] bit64_0.9-7             splines_3.4.1           gtools_3.5.0            Formula_1.2-2           assertthat_0.2.0       

 [6] latticeExtra_0.6-28     blob_1.1.0              GenomeInfoDbData_0.99.0 RSQLite_2.0             backports_1.1.0        

[11] lattice_0.20-35         glue_1.1.1              digest_0.6.12           RColorBrewer_1.1-2      XVector_0.16.0         

[16] checkmate_1.8.3         colorspace_1.3-2        htmltools_0.3.6         Matrix_1.2-11           plyr_1.8.4             

[21] XML_3.98-1.9            pkgconfig_2.0.1         genefilter_1.58.1       zlibbioc_1.22.0         purrr_0.2.3            

[26] xtable_1.8-2            scales_0.5.0            gdata_2.18.0            BiocParallel_1.10.1     htmlTable_1.9          

[31] tibble_1.3.4            annotate_1.54.0         nnet_7.3-12             lazyeval_0.2.0          survival_2.41-3        

[36] magrittr_1.5            memoise_1.1.0           foreign_0.8-69          tools_3.4.1             data.table_1.10.4      

[41] stringr_1.2.0           locfit_1.5-9.1          munsell_0.4.3           cluster_2.0.6           AnnotationDbi_1.38.2   

[46] compiler_3.4.1          caTools_1.17.1          rlang_0.1.2             grid_3.4.1              RCurl_1.95-4.8         

[51] htmlwidgets_0.9         labeling_0.3            bitops_1.0-6            base64enc_0.1-3         gtable_0.2.0           

[56] DBI_0.7                 R6_2.2.2                gridExtra_2.3           knitr_1.17              bit_1.1-12             

[61] bindr_0.1               Hmisc_4.0-3             KernSmooth_2.23-15      stringi_1.1.5           Rcpp_0.12.12           

[66] geneplotter_1.54.0      rpart_4.1-11            acepack_1.4.1           tidyselect_0.2.0  
ADD COMMENTlink modified 9 months ago • written 9 months ago by Yvan Wenger50


Right. In the graphical examples linked above I used ylim on plotMA to control this.

plotMA(dds, ylim = c(-7.5, 7.5))

But actually the points seems to be at the same positions. What I am not sure about, is why the significances are sometimes different.

Here is what I used for ggplot2 with the data from

ds2data <-,cooksCutoff = FALSE))

I only slightly modified ds2data so that abs(log2FoldChange values) does not go beyond ±7.5 (log2ChangeMAX).

ggplot(data = ds2data) +
    aes(x = baseMean, y = log2FoldChangeMAX),
    size = 1.25,
    color = ifelse(dge_FH$padj < 0.1, "red", "grey20"),
    alpha = 0.5,
    shape = ifelse(abs(dge_FH$log2FoldChange) >= 7.5, 17, 16)
  ) +

I played a bit with the p-adj threshold (Padj < 0.1, Padj <= 0.1, the vignette says "...colored red if the adjusted p value is less than 0.1" so I assume Padj<0.1 is correct). Anyway, this does not seem to be the cause of the difference because I had to significantly lower the Pajd threshold on my script so that this grey triangle lying against the -7.5 boundary that has a different color between the two plots became unsignificant (grey), by this time, there were many more differences between my plot and the DESeq2::plotMA result.


ADD REPLYlink written 9 months ago by Yvan Wenger50

I would avoid putting another data.frame in the geom_point() call. You could have a few rows shifted/missing which could give wrong coloring. Why not just put padj and log2FoldChange as columns of ds2data?

ADD REPLYlink written 9 months ago by Michael Love18k

Ooops my mistake... I changed the names in my code meanwhile and wanted to be consistent with the previous post but missed this one... all my data come from the same data.frame... Here is the full code.

ds2data <-,cooksCutoff = FALSE))

ds2data <-
  ds2data %>% dplyr::mutate(
      log2FoldChangeMAX = ifelse(
      abs(log2FoldChange) > 7.5,
      sign(log2FoldChange) * 7.5,

ggplot(data = ds2data) +
    aes(x = baseMean, y = log2FoldChangeMAX),
    size = 1.25,
    color = ifelse(ds2data$padj < 0.1, "red", "grey20"),
    alpha = 0.5,
    shape = ifelse(abs(ds2data$log2FoldChange) >= 7.5, 17, 16)
  ) +

as compared with

plotMA(dds, ylim = c(-7.5, 7.5))
ADD REPLYlink modified 9 months ago • written 9 months ago by Yvan Wenger50

That's not the recommended way of using ggplot() and I don't have much time to help debug here.

You shouldn't be using "$" in the ggplot() call or geoms, all the information should be in the data that is provided to ggplot()

ADD REPLYlink written 9 months ago by Michael Love18k

Dear Dr. Love, many thanks for the advice on how to use ggplot... however, the discrepancy does not come from it as it clearly reflects the dataset generated by DESeq2::results() using the cooksCutoff = F option that seems to be used also in plotMA()

I was in fact more asking about the details of the call of results() that is used to produce the MA-plot, but how to explore S4 objects is obscure to me, if anyone else from the community has a suggestion, it would be great.

ADD REPLYlink modified 9 months ago • written 9 months ago by Yvan Wenger50

Oh I see now that you didn't call results() when you use plotMA(). 

I recommend using:

res <- results(dds, ...)

and then


so you can control exactly how the results table is generated. This is the way we show plotMA in the user guides. Sorry I didn't catch if this was the intent of your original question.

ADD REPLYlink written 9 months ago by Michael Love18k

Many thanks! I get now perfectly identical plots if I use the result of results() into plotMA or into my ggplot2 plot.


ADD REPLYlink written 9 months ago by Yvan Wenger50
gravatar for Michael Love
9 months ago by
Michael Love18k
United States
Michael Love18k wrote:


If you want to compare the points on the boundary, you'll need to set the ylim to be exactly equal, right?

ADD COMMENTlink written 9 months ago by Michael Love18k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 424 users visited in the last hour