Hi, I am using DESeq2 to perform differential gene expression analysis at the single-cell level. Clustering was performed before this step. After clustering, I took the gene expression data of 50 cells from cluster 1 and 50 cells from cluster 2 and ran DESeq2, together with the cell annotation data. However, in the final result the adjusted p-value (padj) was 1 for every gene, so no gene was significant. I understand that highly sparse data could be a problem, but I was surprised not to get even 5-6 significant genes, given that the cells belong to two different clusters. It would be very helpful if you could tell me what to do when the raw data passed to DESeq2 is highly sparse. Any hints or insights would be much appreciated. I look forward to your reply.
Output of sessionInfo():
R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22621)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] DESeq2_1.36.0 SummarizedExperiment_1.26.1 Biobase_2.56.0
[4] MatrixGenerics_1.8.1 matrixStats_0.62.0 GenomicRanges_1.48.0
[7] GenomeInfoDb_1.32.4 IRanges_2.30.1 S4Vectors_0.34.0
[10] BiocGenerics_0.42.0 forcats_0.5.2 stringr_1.4.1
[13] purrr_0.3.5 readr_2.1.3 tidyr_1.2.1
[16] tibble_3.1.8 ggplot2_3.3.6 tidyverse_1.3.2
[19] dplyr_1.0.10 R.matlab_3.7.0
loaded via a namespace (and not attached):
[1] bitops_1.0-7 fs_1.5.2 lubridate_1.8.0 bit64_4.0.5
[5] RColorBrewer_1.1-3 httr_1.4.4 tools_4.2.0 backports_1.4.1
[9] utf8_1.2.2 R6_2.5.1 DBI_1.1.3 colorspace_2.0-3
[13] withr_2.5.0 tidyselect_1.2.0 bit_4.0.4 compiler_4.2.0
[17] cli_3.4.1 rvest_1.0.3 xml2_1.3.3 DelayedArray_0.22.0
[21] scales_1.2.1 genefilter_1.78.0 R.utils_2.12.0 XVector_0.36.0
[25] pkgconfig_2.0.3 dbplyr_2.2.1 fastmap_1.1.0 rlang_1.0.6
[29] readxl_1.4.1 rstudioapi_0.14 RSQLite_2.2.18 generics_0.1.3
[33] jsonlite_1.8.2 BiocParallel_1.30.4 R.oo_1.25.0 googlesheets4_1.0.1
[37] RCurl_1.98-1.9 magrittr_2.0.3 GenomeInfoDbData_1.2.8 Matrix_1.5-1
[41] Rcpp_1.0.9 munsell_0.5.0 fansi_1.0.3 lifecycle_1.0.3
[45] R.methodsS3_1.8.2 stringi_1.7.8 zlibbioc_1.42.0 grid_4.2.0
[49] blob_1.2.3 parallel_4.2.0 crayon_1.5.2 lattice_0.20-45
[53] Biostrings_2.64.1 haven_2.5.1 splines_4.2.0 annotate_1.74.0
[57] hms_1.1.2 KEGGREST_1.36.3 locfit_1.5-9.6 pillar_1.8.1
[61] geneplotter_1.74.0 codetools_0.2-18 reprex_2.0.2 XML_3.99-0.11
[65] glue_1.6.2 modelr_0.1.9 png_0.1-7 vctrs_0.4.2
[69] tzdb_0.3.0 cellranger_1.1.0 gtable_0.3.1 assertthat_0.2.1
[73] cachem_1.0.6 xtable_1.8-4 broom_1.0.1 survival_3.3-1
[77] googledrive_2.0.0 gargle_1.2.1 AnnotationDbi_1.58.0 memoise_2.0.1
[81] ellipsis_0.3.2
Thanks, Michael, for the reply. I will certainly read the paper and will let you know once I have a result. Earlier, DESeq2 could not find any up- or down-regulated genes, and the adjusted p-values were very poor. Even preprocessing the data with different methods failed to bring padj below 0.1, which I am using as the false discovery rate cutoff.