Question

How to do differential gene expression analysis of very sparse single-cell data?


Subhasis • 0

@00ea5c2b

Last seen 17 months ago

India

Hi, I am using DESeq2 to perform differential gene expression analysis at the single-cell level. Clustering was performed before this step. After clustering, I took the gene expression data of 50 cells from cluster 1 and 50 cells from cluster 2 and ran DESeq2. We also had cell annotation data. However, the final result showed an adjusted p-value (padj) of 1 for every gene, so no gene was significant. Highly sparse data could be the problem, but I was surprised not to get even 5-6 significant genes, given that the cells belong to two different clusters. It would therefore be very helpful if you could tell me what to do when the raw data passed to DESeq2 are highly sparse. Any hints or insights would be much appreciated. I look forward to your reply.
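To make "highly sparse" concrete, a quick check along the following lines can help before running any test. This is only a minimal base R sketch: the `counts` matrix is simulated as a stand-in for the real data, and the threshold of 10 cells is a hypothetical choice, not a recommendation from this thread. One thing worth checking is whether any gene is free of zeros, since the default median-of-ratios size factor estimation in DESeq2 relies on such genes.

```r
# Simulated stand-in for the real data (assumption): 2000 genes x 100 cells,
# 50 cells per cluster, drawn from a negative binomial with a low mean.
set.seed(1)
n_genes <- 2000
n_cells <- 100
counts <- matrix(
  rnbinom(n_genes * n_cells, mu = 0.3, size = 0.5),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("cell", seq_len(n_cells)))
)
cluster <- factor(rep(c("cluster1", "cluster2"), each = 50))

# Overall fraction of zero entries in the count matrix.
zero_fraction <- mean(counts == 0)

# Number of cells in which each gene is detected (count > 0).
cells_detected <- rowSums(counts > 0)

# Genes that contain no zero at all across the 100 cells.
n_genes_without_zero <- sum(cells_detected == n_cells)

# Hypothetical pre-filter: keep genes detected in at least 10 cells.
min_cells <- 10
filtered <- counts[cells_detected >= min_cells, , drop = FALSE]

cat(sprintf("fraction of zeros: %.2f\n", zero_fraction))
cat(sprintf("genes without any zero: %d\n", n_genes_without_zero))
cat(sprintf("genes kept after filtering: %d of %d\n", nrow(filtered), n_genes))
```

If no gene is free of zeros, an alternative size factor estimator (for example `estimateSizeFactors(dds, type = "poscounts")` in DESeq2) is one option to consider.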


Output of my sessionInfo():

R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22621)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] DESeq2_1.36.0               SummarizedExperiment_1.26.1 Biobase_2.56.0             
 [4] MatrixGenerics_1.8.1        matrixStats_0.62.0          GenomicRanges_1.48.0       
 [7] GenomeInfoDb_1.32.4         IRanges_2.30.1              S4Vectors_0.34.0           
[10] BiocGenerics_0.42.0         forcats_0.5.2               stringr_1.4.1              
[13] purrr_0.3.5                 readr_2.1.3                 tidyr_1.2.1                
[16] tibble_3.1.8                ggplot2_3.3.6               tidyverse_1.3.2            
[19] dplyr_1.0.10                R.matlab_3.7.0             

loaded via a namespace (and not attached):
 [1] bitops_1.0-7           fs_1.5.2               lubridate_1.8.0        bit64_4.0.5           
 [5] RColorBrewer_1.1-3     httr_1.4.4             tools_4.2.0            backports_1.4.1       
 [9] utf8_1.2.2             R6_2.5.1               DBI_1.1.3              colorspace_2.0-3      
[13] withr_2.5.0            tidyselect_1.2.0       bit_4.0.4              compiler_4.2.0        
[17] cli_3.4.1              rvest_1.0.3            xml2_1.3.3             DelayedArray_0.22.0   
[21] scales_1.2.1           genefilter_1.78.0      R.utils_2.12.0         XVector_0.36.0        
[25] pkgconfig_2.0.3        dbplyr_2.2.1           fastmap_1.1.0          rlang_1.0.6           
[29] readxl_1.4.1           rstudioapi_0.14        RSQLite_2.2.18         generics_0.1.3        
[33] jsonlite_1.8.2         BiocParallel_1.30.4    R.oo_1.25.0            googlesheets4_1.0.1   
[37] RCurl_1.98-1.9         magrittr_2.0.3         GenomeInfoDbData_1.2.8 Matrix_1.5-1          
[41] Rcpp_1.0.9             munsell_0.5.0          fansi_1.0.3            lifecycle_1.0.3       
[45] R.methodsS3_1.8.2      stringi_1.7.8          zlibbioc_1.42.0        grid_4.2.0            
[49] blob_1.2.3             parallel_4.2.0         crayon_1.5.2           lattice_0.20-45       
[53] Biostrings_2.64.1      haven_2.5.1            splines_4.2.0          annotate_1.74.0       
[57] hms_1.1.2              KEGGREST_1.36.3        locfit_1.5-9.6         pillar_1.8.1          
[61] geneplotter_1.74.0     codetools_0.2-18       reprex_2.0.2           XML_3.99-0.11         
[65] glue_1.6.2             modelr_0.1.9           png_0.1-7              vctrs_0.4.2           
[69] tzdb_0.3.0             cellranger_1.1.0       gtable_0.3.1           assertthat_0.2.1      
[73] cachem_1.0.6           xtable_1.8-4           broom_1.0.1            survival_3.3-1        
[77] googledrive_2.0.0      gargle_1.2.1           AnnotationDbi_1.58.0   memoise_2.0.1         
[81] ellipsis_0.3.2

DESeq2 • 855 views


score 0 · Answer 1 · 2022-10-17


Michael Love 41k

@mikelove

Last seen 17 hours ago

United States

See the glmGamPoi paper and methods; you can run them from within DESeq2.
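As a rough illustration of what running glmGamPoi within DESeq2 can look like, here is a minimal sketch. It assumes the DESeq2 and glmGamPoi packages are installed and uses simulated counts (50 cells per cluster, with the first 25 genes given a higher mean in cluster 2) in place of real data. The particular settings (likelihood ratio test, "poscounts" size factors, no outlier replacement) are assumptions modeled on commonly suggested single-cell settings, not something specified in this answer.

```r
library(DESeq2)  # fitType = "glmGamPoi" also requires the glmGamPoi package

# Simulated stand-in for real data (assumption): 500 genes x 100 cells.
set.seed(1)
n_genes <- 500
n_cells <- 100
counts <- matrix(rnbinom(n_genes * n_cells, mu = 0.5, size = 0.5),
                 nrow = n_genes)
# Give the first 25 genes a higher mean in cluster 2 (cells 51 to 100).
counts[1:25, 51:100] <- rnbinom(25 * 50, mu = 5, size = 0.5)
storage.mode(counts) <- "integer"
rownames(counts) <- paste0("gene", seq_len(n_genes))
colnames(counts) <- paste0("cell", seq_len(n_cells))

coldata <- data.frame(
  cluster = factor(rep(c("cluster1", "cluster2"), each = 50)),
  row.names = colnames(counts)
)

dds <- DESeqDataSetFromMatrix(countData = counts,
                              colData = coldata,
                              design = ~ cluster)

# Drop genes with no counts at all.
dds <- dds[rowSums(counts(dds)) > 0, ]

# "poscounts" size factors can be computed even when every gene has a zero.
dds <- estimateSizeFactors(dds, type = "poscounts")

# Likelihood ratio test against an intercept-only model, with dispersions
# estimated by glmGamPoi and outlier replacement switched off.
dds <- DESeq(dds,
             test = "LRT",
             reduced = ~ 1,
             fitType = "glmGamPoi",
             minReplicatesForReplace = Inf)

res <- results(dds)
n_sig <- sum(res$padj < 0.1, na.rm = TRUE)
cat("genes with padj < 0.1:", n_sig, "\n")
```

With real data, `counts` and `coldata` would be replaced by the raw count matrix and the cluster annotation for the selected cells; size factors from a dedicated single-cell method could be supplied instead of "poscounts".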


Thanks for the reply, Michael. I will certainly read the paper and will let you know if I get a result. Earlier, DESeq2 did not find any up- or down-regulated genes, and the adjusted p-values were all very high. Even preprocessing the data with different methods failed to bring padj below 0.1, which I am using as the false discovery rate cutoff.

Subhasis • 18 months ago