weird MAplot or volcano plot of DESeq2 diff result
1
0
Entering edit mode
@shangguandong1996-21805
Last seen 18 months ago
China

Hi, Dr Dr Love

I find a werid MAplot or volcano plot of DESeq reuslt. I am wondering whether you can give me some advice.

This diff result is from two cell type bulk RNA-seq. I use two specific marker to get these two cell type using Flow cytometer. I alreadly know these cell type have very different RNA composition.

As you can see, the MA-plot is not like DESeq2 manul example. And the volcano plot is more werid, it looks like -log10(padj) is the linear function of log2FoldChange.

MAplot volcano plot

If you want to repeat my data, here is my data link:

https://github.com/shangguandong1996/picture_link/blob/main/WFX_count_Rmatrix.txt

And here is my code

# Prepare -----------------------------------------------------------------

# load up the packages
library(DESeq2)
library(dplyr)

library(ggplot2)

# Set Options
options(stringsAsFactors = F)

# load up the data
data <- read.table("rawdata/count/WFX_count_Rmatrix.txt",
                   header = TRUE,
                   row.names = 1)

coldata <- data.frame(row.names = colnames(data),
                      type = rep(c("Fx593", "Fx600"), each = 2))


# DESeq2 ------------------------------------------------------------------

dds <- DESeqDataSetFromMatrix(countData = data,
                              colData = coldata,
                              design= ~ type)
# PCA
vsd <- vst(dds)
plotPCA(vsd, intgroup = "type")

dds <- DESeq(dds)

res_lfc <- lfcShrink(dds = dds,
                     type = "ashr",
                     coef = "type_Fx600_vs_Fx593")

plotMA(res_lfc)

as_tibble(res_lfc) %>% 
    mutate(padj = case_when(
        is.na(padj) ~ 1,
        TRUE ~ padj
    )) %>% 
    ggplot(aes(x = log2FoldChange, y = -log10(padj))) +
    geom_point()
> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS:   /opt/sysoft/R-3.6.1/lib64/R/lib/libRblas.so
LAPACK: /opt/sysoft/R-3.6.1/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods  
[9] base     

other attached packages:
 [1] ggplot2_3.3.3               dplyr_1.0.5                
 [3] DESeq2_1.26.0               SummarizedExperiment_1.16.0
 [5] DelayedArray_0.12.0         BiocParallel_1.19.6        
 [7] matrixStats_0.58.0          Biobase_2.46.0             
 [9] GenomicRanges_1.38.0        GenomeInfoDb_1.22.0        
[11] IRanges_2.20.0              S4Vectors_0.24.0           
[13] BiocGenerics_0.32.0        

loaded via a namespace (and not attached):
 [1] bitops_1.0-7           bit64_4.0.5            doParallel_1.0.15     
 [4] RColorBrewer_1.1-2     tools_3.6.1            backports_1.2.1       
 [7] utf8_1.2.1             R6_2.5.0               rpart_4.1-15          
[10] Hmisc_4.5-0            DBI_1.1.1              colorspace_2.0-1      
[13] nnet_7.3-16            withr_2.4.2            tidyselect_1.1.0      
[16] gridExtra_2.3          bit_4.0.4              compiler_3.6.1        
[19] cli_2.5.0              htmlTable_2.1.0        labeling_0.4.2        
[22] scales_1.1.1           checkmate_2.0.0        SQUAREM_2017.10-1     
[25] genefilter_1.68.0      mixsqp_0.2-2           stringr_1.4.0         
[28] digest_0.6.27          foreign_0.8-73         XVector_0.26.0        
[31] pscl_1.5.2             base64enc_0.1-3        jpeg_0.1-8.1          
[34] pkgconfig_2.0.3        htmltools_0.5.1.1      fastmap_1.1.0         
[37] htmlwidgets_1.5.3      rlang_0.4.11           rstudioapi_0.13       
[40] RSQLite_2.2.7          generics_0.1.0         farver_2.1.0          
[43] RCurl_1.98-1.3         magrittr_2.0.1         GenomeInfoDbData_1.2.2
[46] Formula_1.2-4          Matrix_1.3-2           Rcpp_1.0.6            
[49] munsell_0.5.0          fansi_0.4.2            lifecycle_1.0.0       
[52] stringi_1.5.3          MASS_7.3-54            zlibbioc_1.32.0       
[55] grid_3.6.1             blob_1.2.1             crayon_1.4.1          
[58] lattice_0.20-44        splines_3.6.1          annotate_1.64.0       
[61] locfit_1.5-9.4         knitr_1.33             pillar_1.6.0          
[64] geneplotter_1.64.0     codetools_0.2-18       XML_3.99-0.3          
[67] glue_1.4.1             packrat_0.5.0          latticeExtra_0.6-29   
[70] data.table_1.12.6      png_0.1-7              vctrs_0.3.8           
[73] foreach_1.4.7          gtable_0.3.0           purrr_0.3.3           
[76] assertthat_0.2.1       ashr_2.2-39            cachem_1.0.4          
[79] xfun_0.22              xtable_1.8-4           survival_3.2-11       
[82] truncnorm_1.0-8        tibble_3.1.1           iterators_1.0.12      
[85] AnnotationDbi_1.48.0   memoise_2.0.0          cluster_2.1.2         
[88] ellipsis_0.3.2
DESeq2 • 2.9k views
ADD COMMENT
1
Entering edit mode

This is probably a consequence of many rows with overall very low counts.

summary(rowSums((counts(dds) > 10)) >= 2)
   Mode   FALSE    TRUE 
logical   25902   11434 

Here you see that only about 1/3 of your rows actually have more than 10 raw counts in at least two of the four samples. I guess this is where some pre-filtering will help removing spurious calls.

ADD REPLY
0
Entering edit mode

Thanks, AT. After

keep <- rowSums(counts(dds)) >= 10
dds <- dds[keep,]

The MA or volcano plot is better. But I am wondering whether there are other advices for this data type, after all, this data type is "pseudo single-cell RNA-seq".

ADD REPLY
0
Entering edit mode

I also post a question in biostar

https://www.biostars.org/p/9484986/

ADD REPLY
0
Entering edit mode
@mikelove
Last seen 17 hours ago
United States

This has been addressed in your new post.

ADD COMMENT

Login before adding your answer.

Traffic: 1076 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6