scDblFinder and amulet
Bogdan ▴ 640
@bogdan-2367
Last seen 7 days ago
Palo Alto, CA, USA

Hi everyone,

After loading the library "scDblFinder", I cannot find the functions "amulet" and "clamulet".

Any suggestions, please? Thank you!

> library(scDblFinder)
> ?clamulet
No documentation for ‘clamulet’ in specified packages and libraries:
you could try ‘??clamulet’
> ?amulet
No documentation for ‘amulet’ in specified packages and libraries:
you could try ‘??amulet’
> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.3 LTS

Matrix products: default
BLAS:   /home/btanasa/R-4.1.2/R-4.1.2/lib/libRblas.so
LAPACK: /home/btanasa/R-4.1.2/R-4.1.2/lib/libRlapack.so

locale:
[1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8       LC_NAME=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] scDblFinder_1.8.0

loaded via a namespace (and not attached):
[1] locfit_1.5-9.5              ggrepel_0.9.1
[3] Rcpp_1.0.8.3                rsvd_1.0.5
[5] lattice_0.20-45             assertthat_0.2.1
[7] SingleCellExperiment_1.16.0 utf8_1.2.2
[9] R6_2.5.1                    GenomeInfoDb_1.30.1
[11] stats4_4.1.2                bluster_1.4.0
[13] ggplot2_3.3.6               pillar_1.7.0
[15] sparseMatrixStats_1.6.0     zlibbioc_1.40.0
[17] rlang_1.0.2                 data.table_1.14.2
[19] irlba_2.3.5                 S4Vectors_0.32.4
[21] Matrix_1.4-1                BiocNeighbors_1.12.0
[23] statmod_1.4.36              BiocParallel_1.28.3
[25] igraph_1.3.1                RCurl_1.98-1.6
[27] munsell_0.5.0               beachmat_2.10.0
[29] DelayedArray_0.20.0         vipor_0.4.5
[31] compiler_4.1.2              BiocSingular_1.10.0
[33] pkgconfig_2.0.3             BiocGenerics_0.40.0
[35] ggbeeswarm_0.6.0            tidyselect_1.1.2
[37] SummarizedExperiment_1.24.0 gridExtra_2.3
[39] tibble_3.1.7                GenomeInfoDbData_1.2.7
[41] edgeR_3.36.0                IRanges_2.28.0
[43] matrixStats_0.62.0          metapod_1.2.0
[45] viridisLite_0.4.0           fansi_1.0.3
[47] crayon_1.5.1                dplyr_1.0.9
[49] MASS_7.3-57                 bitops_1.0-7
[51] grid_4.1.2                  gtable_0.3.0
[53] lifecycle_1.0.1             DBI_1.1.2
[55] magrittr_2.0.3              dqrng_0.3.0
[57] scales_1.2.0                ScaledMatrix_1.2.0
[59] cli_3.3.0                   stringi_1.7.6
[61] scuttle_1.4.0               XVector_0.34.0
[63] viridis_0.6.2               limma_3.50.3
[65] scater_1.22.0               DelayedMatrixStats_1.16.0
[67] ellipsis_0.3.2              vctrs_0.4.1
[69] generics_0.1.2              xgboost_0.90.0.2
[71] tools_4.1.2                 beeswarm_0.4.0
[73] Biobase_2.54.0              glue_1.6.2
[75] scran_1.22.1                purrr_0.3.4
[77] MatrixGenerics_1.6.0        parallel_4.1.2
[79] colorspace_2.0-3            cluster_2.1.3
[81] GenomicRanges_1.46.1

Pierre-Luc • 0
@1790f926
Last seen 1 day ago
Switzerland

Doesn't seem to have registered as an answer, so:

Your session info says that your version of scDblFinder is 1.8.0, and those functions were added after 1.9.0. I'm assuming you're looking at the Bioconductor vignettes online, and that documentation doesn't match your slightly older installation.

You should have the functions once you install a newer version.
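
A quick way to check this locally is sketched below. It is only a sketch: the 1.9.0 cut-off is taken from the statement above, `needs_update` is a helper name made up for illustration, and the install call is left commented out because getting a newer scDblFinder usually means moving to a newer Bioconductor release (which may in turn require a newer R).

```r
# Hypothetical helper: TRUE if the package is missing or not newer than `minimum`.
# The 1.9.0 cut-off comes from the answer above ("added after 1.9.0").
needs_update <- function(installed, minimum = "1.9.0") {
  is.null(installed) || installed <= package_version(minimum)
}

# packageVersion() errors if the package is not installed, hence the tryCatch.
installed <- tryCatch(utils::packageVersion("scDblFinder"),
                      error = function(e) NULL)

if (needs_update(installed)) {
  message("scDblFinder is missing or too old for amulet()/clamulet().")
  # Updating goes through BiocManager; uncomment to run:
  # BiocManager::install("scDblFinder")
} else {
  # With a recent enough version, both functions should be exported.
  print(c("amulet", "clamulet") %in% getNamespaceExports("scDblFinder"))
}
```

If the last line prints `TRUE TRUE`, `?amulet` and `?clamulet` should also resolve.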


1. Which cut-off of the enrichment score should I use when selecting doublets with the Clamulet method?

2. Why is there such a large difference in the number of cells called by the Amulet vs the Clamulet method?

3. Why is there such a large difference in the number of cells called by Amulet vs CellRanger-atac?

Thank you,

Bogdan


Hi,

In the first place, I wouldn't recommend using the Clamulet method: as indicated in the vignette, it's inferior to alternatives (should I put this in bold?) and it's rather slow. It's there chiefly because it was a strategy worth trying. If you do use it, the score can be interpreted like the scDblFinder score, meaning that it's a rough probability of the cell being a doublet (1) rather than a singlet (0). You can set your threshold to 0.5, or adjust it according to which error (missing a doublet or misclassifying a singlet) you deem worse.
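
As a minimal sketch of what adjusting the threshold means in practice (the scores below are made up for illustration, and `call_doublets` is a hypothetical helper, not a scDblFinder function):

```r
# Toy scores, made up for illustration (not real output).
scores <- c(cell1 = 0.05, cell2 = 0.48, cell3 = 0.52, cell4 = 0.97)

# Hypothetical helper: label cells at or above `threshold` as doublets.
call_doublets <- function(score, threshold = 0.5) {
  factor(ifelse(score >= threshold, "doublet", "singlet"),
         levels = c("singlet", "doublet"))
}

# Default cut-off of 0.5.
table(call_doublets(scores))

# Lower cut-off: fewer missed doublets, at the cost of discarding more singlets.
table(call_doublets(scores, threshold = 0.4))
```

Lowering the threshold trades missed doublets for lost singlets, and raising it does the opposite.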

The Amulet method has an interesting and elegant rationale, but it's rather rough. As we describe in the revised scDblFinder manuscript (which will be available shortly, but in the meantime the manuscript can be accessed here), a big problem is that it's highly dependent on library size. A consequence of this is that the number of doublets called can be much too low with low library sizes, or too high with high library sizes. The main value, instead, is the capacity to identify homotypic doublets.

If interested chiefly in kicking out heterotypic doublets, the scDblFinder-aggregation strategy or ArchR do a fairly good job. If interested in homotypic doublets as well, our current recommendation is to run one of these two methods along with Amulet, and combine the p-values as described in the vignette.
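
For illustration, one standard way to combine two p-values per cell is Fisher's method. Its use here is an assumption, so please check the vignette for the exact procedure. The input numbers are made up, `fisher_combine` is a hypothetical helper, and flooring very small p-values at 0.001 is likewise a choice made for this sketch.

```r
# Hypothetical helper: Fisher's method for combining independent p-values.
# Very small p-values are floored so that a single 0 does not dominate.
fisher_combine <- function(p, floor = 0.001) {
  p <- pmax(p, floor)
  stat <- -2 * sum(log(p))                       # chi-squared statistic
  pchisq(stat, df = 2 * length(p), lower.tail = FALSE)
}

# Made-up per-cell p-values: one set from Amulet, one from a second method
# (however that method's p-values are derived).
amulet_p <- c(cell1 = 0.80, cell2 = 0.01, cell3 = 0.0001)
other_p  <- c(cell1 = 0.60, cell2 = 0.30, cell3 = 0.02)

# Combine the two p-values cell by cell.
combined <- mapply(function(a, b) fisher_combine(c(a, b)), amulet_p, other_p)
print(combined)
```

A cell that looks suspicious to both methods ends up with a much smaller combined p-value than one flagged by neither.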


Dear Pierre-Luc,

That is a very comprehensive answer, thank you! It would be great if you could advise us on an example that we have encountered:

<> in sample A, the median # of fragments / cell is ~10,000 to 20,000
<> in sample B, the median # of fragments / cell is ~60,000

The number of predicted doublets in sample A is approximately the same as the number of predicted doublets in sample B (i.e. approx. 5-10%).

Why is there such a large difference in the median # of fragments / cell between sample A and sample B?

Thank you,

~ Bogdan


Dear Bogdan,

When you say the number of predicted doublets, you mean with Amulet?

60k fragments/cell is a setting in which Amulet probably works well. I'd be curious to see a plot of nAbove2 vs nFrags, if you could share one (both are output by scDblFinder::amulet).
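
A minimal sketch of such a plot, assuming the result is a data.frame with `nFrags` and `nAbove2` columns as mentioned above. `plot_amulet_qc` is a hypothetical helper, and the data below are simulated only so that the example runs; they say nothing about real samples.

```r
# Hypothetical helper: scatter nAbove2 against nFrags on log axes.
# 1 is added to nAbove2 so that zero counts can be shown on a log scale.
plot_amulet_qc <- function(res) {
  stopifnot(all(c("nFrags", "nAbove2") %in% colnames(res)))
  plot(res$nFrags, res$nAbove2 + 1, log = "xy", pch = 16, cex = 0.5,
       xlab = "nFrags (fragments per cell)",
       ylab = "nAbove2 + 1")
  invisible(res)
}

# Placeholder data, simulated for illustration only. In practice, `toy`
# would be replaced by the output of scDblFinder::amulet() on your sample.
set.seed(1)
toy <- data.frame(nFrags = round(10^runif(200, 3.5, 5)))
toy$nAbove2 <- rpois(200, lambda = toy$nFrags / 5000)

plot_amulet_qc(toy)
```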

The difference in the median # of fragments is typically just an effect of sequencing depth: most likely your sample B was sequenced deeper, or had fewer cells (resulting in more reads per cell). How many cells do the datasets have?
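
As a back-of-the-envelope illustration with hypothetical numbers (and ignoring fragments that fall outside cell barcodes), the same sequencing output spread over fewer cells gives proportionally more fragments per cell:

```r
# Hypothetical helper and numbers, for illustration only.
frags_per_cell <- function(total_fragments, n_cells) {
  total_fragments / n_cells
}

# Same total output, four times fewer cells, four times more fragments per cell.
many_cells <- frags_per_cell(3e8, n_cells = 20000)
few_cells  <- frags_per_cell(3e8, n_cells = 5000)
print(c(many_cells = many_cells, few_cells = few_cells))
```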

Pierre-Luc