Opposite sign of LFC in count plots of DEGs (DESeq2)
Entering edit mode
Last seen 23 days ago
South Korea

Hello, Thanks to you, I was able to easily enter bioinformatics.

I'm analyzing bulk RNA-seq for the first time. I've drawn the count plots of DEGs, but I don't understand the sign of the LFC.

Code should be placed in three backticks as shown below

# use DESeq()
dds_ver2 <- DESeqDataSetFromTximport(txi_ver2, colData = meta_ver2, design = ~sampletype_ver2)
dds_ver2 <- estimateSizeFactors(dds_ver2)
dds_ver2 <- DESeq(dds_ver2)

# use results()
res_h_d_ver2_noShrink <- results(dds_RUVg_ver2, contrast = c("sampletype_ver2","Lgr5_DTA","Homeostasis"), independentFiltering = F)

res_h_d_ver2_noShrink image

Then, I plotted count plot

# get DEGs
res_h_d_ver2_noShrink_tb <- res_h_d_ver2_noShrink %>% data.frame() %>% rownames_to_column(var='gene') %>% as_tibble()
res_h_d_ver2_noShrink_tb <- merge(res_h_d_ver2_noShrink_tb, vM34annot, by.x='gene', by.y='ensgene')
sig_h_d_up_ver2 <- res_h_d_tb_ver2_noShrink_tb %>% filter(padj < 0.05) %>% filter(log2FoldChange>=1)
sig_h_d_down_ver2 <- res_h_d_tb_ver2_noShrink_tb %>% filter(padj < 0.05) %>% filter(log2FoldChange<=-1)
sig_h_d_both_ver2 <- bind_rows(sig_h_d_up_ver2,sig_h_d_down_ver2)

# Count plots of DEGs
ordered_sig_h_d_ver2 <- sig_h_d_both_ver2[order(sig_h_d_both_ver2$log2FoldChange, decreasing =TRUE),]
sig_h_d_ver2_symbol <- ordered_sig_h_d_ver2$symbol
sig_h_d_ver2_ensname <- ordered_sig_h_d_ver2$gene
for (i in 1:29){
  DEG_h_d_ver2[[i]] <- plotCounts(dds_h_d_ver2,
                                  gene = sig_h_d_ver2_ensname[i],
                                  intgroup = "sampletype_h_d_ver2",
                                  returnData = TRUE)
for (i in 1:29){
  data <- DEG_h_d_ver2[[i]]
  title <- sig_h_d_ver2_symbol[i]
  q <- ggplot(data, aes(x=sampletype_h_d_ver2, y=count, color=sampletype_h_d_ver2))+
    geom_boxplot(width=0.3, show.legend = F)+
    geom_point(size=5, position=position_jitter(w=0.1,h=0))+
    geom_text_repel(aes(label=rownames(data)), size=4, show.legend = F)+
    ggtitle(paste("[DEG - Homeostasis vs. Lgr5-DTA] ", title))+
    theme(plot.title = element_text(hjust = 0.5), aspect.ratio = 1)+
    annotate("text", x=0.5, y=Inf,
             label = paste("padj = ", format(ordered_sig_h_d_ver2[i,]$padj, scientific = TRUE)),
             hjust = 0, vjust = 1)+
    annotate("text", x=0.5, y=Inf,
             label = paste("LFC = ", round((ordered_sig_h_d_ver2[i,]$log2FoldChange),digits =2)),
             hjust = 0, vjust = 3)+
    scale_color_manual(values = c("#f8766d","#00ba38"))

  ggsave(paste("plot_ver2/DEG_count_plots/1_sig_h_d_ver2/DEG_Hom_vs_Lgr5_DTA_",i,"_",title, ".png", sep = ""), plot = q, width = 8, height = 8, units = "in")

I understand that LFC converges to zero when the variance is large and that the LFC calculation in DESeq2 is not just treatment/control. Although there are some strange count plots the sign of the LFC are wrong. I'll show you one good plot and two strange plots. good count plot strange count plot 1 strange count plot 2

sessionInfo( )
> sessionInfo()
R version 4.3.2 (2023-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 11 x64 (build 22631)

Matrix products: default

[1] LC_COLLATE=Korean_Korea.utf8  LC_CTYPE=Korean_Korea.utf8    LC_MONETARY=Korean_Korea.utf8
[4] LC_NUMERIC=C                  LC_TIME=Korean_Korea.utf8    

time zone: Asia/Seoul
tzcode source: internal

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] EnhancedVolcano_1.20.0      lubridate_1.9.3             forcats_1.0.0               stringr_1.5.1              
 [5] purrr_1.0.2                 readr_2.1.5                 tidyr_1.3.1                 tibble_3.2.1               
 [9] tidyverse_2.0.0             patchwork_1.2.0             Seurat_5.0.3                SeuratObject_5.0.1         
[13] sp_2.1-3                    sva_3.50.0                  genefilter_1.84.0           mgcv_1.9-0                 
[17] nlme_3.1-163                RUVSeq_1.36.0               pheatmap_1.0.12             dplyr_1.1.4                
[21] ggrepel_0.9.5               DESeq2_1.42.0               ggplot2_3.5.0               EDASeq_2.36.0              
[25] ShortRead_1.60.0            GenomicAlignments_1.38.2    SummarizedExperiment_1.32.0 MatrixGenerics_1.14.0      
[29] matrixStats_1.2.0           Rsamtools_2.18.0            GenomicRanges_1.54.1        Biostrings_2.70.2          
[33] GenomeInfoDb_1.38.8         XVector_0.42.0              IRanges_2.36.0              S4Vectors_0.40.2           
[37] BiocParallel_1.36.0         Biobase_2.62.0              BiocGenerics_0.48.1         edgeR_4.0.16               
[41] limma_3.58.1               

loaded via a namespace (and not attached):
  [1] R.methodsS3_1.8.2       progress_1.2.3          urlchecker_1.0.1        goftest_1.2-3          
  [5] vctrs_0.6.5             spatstat.random_3.2-3   digest_0.6.34           png_0.1-8              
  [9] deldir_2.0-4            parallelly_1.37.1       renv_1.0.5              MASS_7.3-60            
 [13] reshape2_1.4.4          httpuv_1.6.14           qvalue_2.34.0           withr_3.0.0            
 [17] ggfun_0.1.4             ellipsis_0.3.2          survival_3.5-7          memoise_2.0.1          
 [21] clusterProfiler_4.10.1  gson_0.1.0              profvis_0.3.8           systemfonts_1.0.6      
 [25] ragg_1.3.0              tidytree_0.4.6          zoo_1.8-12              pbapply_1.7-2          
 [29] R.oo_1.26.0             prettyunits_1.2.0       KEGGREST_1.42.0         promises_1.2.1         
 [33] httr_1.4.7              restfulr_0.0.15         globals_0.16.3          fitdistrplus_1.1-11    
 [37] rstudioapi_0.16.0       miniUI_0.1.1.1          generics_0.1.3          DOSE_3.28.2            
 [41] curl_5.2.1              zlibbioc_1.48.0         ggraph_2.2.1            polyclip_1.10-6        
 [45] GenomeInfoDbData_1.2.11 SparseArray_1.2.4       xtable_1.8-4            S4Arrays_1.2.0         
 [49] BiocFileCache_2.10.2    hms_1.1.3               irlba_2.3.5.1           colorspace_2.1-0       
 [53] filelock_1.0.3          ROCR_1.0-11             reticulate_1.35.0       spatstat.data_3.0-4    
 [57] magrittr_2.0.3          lmtest_0.9-40           later_1.3.2             viridis_0.6.5          
 [61] ggtree_3.10.1           lattice_0.21-9          spatstat.geom_3.2-9     future.apply_1.11.2    
 [65] scattermore_1.2         XML_3.99-0.16.1         triebeard_0.4.1         shadowtext_0.1.3       
 [69] cowplot_1.1.3           RcppAnnoy_0.0.22        pillar_1.9.0            compiler_4.3.2         
 [73] RSpectra_0.16-1         stringi_1.8.3           tensor_1.5              devtools_2.4.5         
 [77] genekitr_1.2.5          plyr_1.8.9              crayon_1.5.2            abind_1.4-5            
 [81] BiocIO_1.12.0           gridGraphics_0.5-1      emdbook_1.3.13          locfit_1.5-9.8         
 [85] graphlayouts_1.1.1      bit_4.0.5               geneset_0.2.7           fastmatch_1.1-4        
 [89] textshaping_0.3.7       codetools_0.2-19        plotly_4.10.4           mime_0.12              
 [93] splines_4.3.2           Rcpp_1.0.12             fastDummies_1.7.3       dbplyr_2.5.0           
 [97] europepmc_0.4.3         HDO.db_0.99.1           interp_1.1-6            blob_1.2.4             
[101] utf8_1.2.4              apeglm_1.24.0           fs_1.6.3                listenv_0.9.1          
[105] pkgbuild_1.4.4          openxlsx_4.2.5.2        ggplotify_0.1.2         Matrix_1.6-5           
[109] statmod_1.5.0           tzdb_0.4.0              tweenr_2.0.3            pkgconfig_2.0.3        
[113] tools_4.3.2             cachem_1.0.8            RSQLite_2.3.5           viridisLite_0.4.2      
[117] DBI_1.2.2               numDeriv_2016.8-1.1     fastmap_1.1.1           scales_1.3.0           
[121] grid_4.3.2              usethis_2.2.3           ica_1.0-3               coda_0.19-4.1          
[125] BiocManager_1.30.22     dotCall64_1.1-1         RANN_2.6.1              farver_2.1.1           
[129] tidygraph_1.3.1         scatterpie_0.2.2        yaml_2.3.8              latticeExtra_0.6-30    
[133] rtracklayer_1.62.0      cli_3.6.2               leiden_0.4.3.1          lifecycle_1.0.4        
[137] uwot_0.1.16             mvtnorm_1.2-4           sessioninfo_1.2.2       annotate_1.80.0        
[141] timechange_0.3.0        gtable_0.3.4            rjson_0.2.21            ggridges_0.5.6         
[145] progressr_0.14.0        parallel_4.3.2          ape_5.7-1               jsonlite_1.8.8         
[149] RcppHNSW_0.6.0          bitops_1.0-7            bit64_4.0.5             Rtsne_0.17             
[153] yulab.utils_0.1.4       spatstat.utils_3.0-4    zip_2.3.1               urltools_1.7.3         
[157] bdsmatrix_1.3-7         GOSemSim_2.28.1         R.utils_2.12.3          lazyeval_0.2.2         
[161] shiny_1.8.1.1           htmltools_0.5.7         enrichplot_1.22.0       GO.db_3.18.0           
[165] sctransform_0.4.1       rappdirs_0.3.3          glue_1.7.0              ggvenn_0.1.10          
[169] spam_2.10-0             RCurl_1.98-1.14         treeio_1.26.0           jpeg_0.1-10            
[173] gridExtra_2.3           igraph_2.0.3            R6_2.5.1                labeling_0.4.3         
[177] GenomicFeatures_1.54.4  cluster_2.1.4           bbmle_1.0.25.1          pkgload_1.3.4          
[181] aplot_0.2.2             DelayedArray_0.28.0     tidyselect_1.2.1        ggforce_0.4.2          
[185] xml2_1.3.6              AnnotationDbi_1.64.1    future_1.33.2           munsell_0.5.1          
[189] KernSmooth_2.23-22      data.table_1.15.2       htmlwidgets_1.6.4       aroma.light_3.32.0     
[193] fgsea_1.28.0            RColorBrewer_1.1-3      hwriter_1.3.2.1         biomaRt_2.58.2         
[197] rlang_1.1.3             spatstat.sparse_3.0-3   spatstat.explore_3.2-7  remotes_2.5.0          
[201] ggnewscale_0.4.10       fansi_1.0.6
DESeq2 • 366 views
Entering edit mode
swbarnes2 ★ 1.4k
Last seen 21 hours ago
San Diego

Your contrast is comparing LGR5 to Homeostasis. The bottom 2 are fine, it's the top one that is wrong.

Are you sure this step isn't mislabeling things?

sig_h_d_ver2_symbol <- ordered_sig_h_d_ver2$symbol
sig_h_d_ver2_ensname <- ordered_sig_h_d_ver2$gene
Entering edit mode

Thank you for your comment!

1)"Your contrast is comparing LGR5 to Homeostasis." means (Lgr5)/(Homeostasis). Is it right?

Then, I think if LGR5 has a higher expression level than Homeostasis, LFC must be positive.

If LGR5 has a lower expression level than Homeostasis, LFC must be negative.

May I ask why you said "The bottom 2 are fine, it's the top one that is wrong"?

2)I think there are no problem in those codes.

sig_h_d_ver2_symbol <- ordered_sig_h_d_ver2$symbol
sig_h_d_ver2_ensname <- ordered_sig_h_d_ver2$gene
Entering edit mode

You will need to show some reproducible code for anybody to track this down. Please note that DESeq2 has been used by thousands of people (probably) millions of times, and to my knowledge nobody has ever reported that it randomly assigns logFC values that don't reflect the underlying differences. Because of that, it is far more likely that you have made a mistake somewhere, which nobody can diagnose just by looking at your code.

Entering edit mode

Thank you for your comment!

I'll review my code from top to bottom.


Login before adding your answer.

Traffic: 588 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6