I am currently using emptyDrops() to call cells, after applying swappedDrops() to remove barcode swapping from several 10X scRNA-seq datasets. After experiencing long running times for emptyDrops() on these data, I am wondering how the function's performance scales.
As mentioned in , emptyDrops() requires approximately 1-2 minutes to run on each of the tested datasets. I confirmed this with the example dataset "placenta1" (33,694 features x 737,280 barcodes), which took around 75 seconds to complete. However, emptyDrops() takes much longer than expected on my datasets of interest: e.g., for a dataset of 33,538 features x 737,280 barcodes (i.e., 156 fewer genes), the running time is around 245 seconds. To try to explain this difference, I also looked at the sparsity of each dataset: "placenta1" has 18,763,564 non-zero elements, whereas my example dataset has 10,793,183, so my dataset is both slightly smaller and sparser, yet takes more than three times as long. Do any of these factors (dimensions, sparsity, etc.) or others influence the running time of emptyDrops()? Is it possible to reduce it somehow? I apply this function several times across several datasets, hence my interest in the matter.
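For reference, here is a minimal base-R sketch of the sparsity comparison, using only the dimensions and non-zero counts quoted above. The variable names are mine and purely illustrative; nothing here calls emptyDrops() itself.

```r
# Dimensions and non-zero counts as quoted in this post.
placenta_dims <- c(features = 33694, barcodes = 737280)
mine_dims     <- c(features = 33538, barcodes = 737280)

placenta_nnz <- 18763564
mine_nnz     <- 10793183

# Density = fraction of matrix entries that are non-zero.
density_placenta <- placenta_nnz / prod(placenta_dims)
density_mine     <- mine_nnz / prod(mine_dims)

# Difference in the number of features between the two datasets.
gene_diff <- unname(placenta_dims["features"] - mine_dims["features"])

# Ratio of the reported running times (245 s vs 75 s).
time_ratio <- 245 / 75

cat(sprintf("placenta1 density: %.4f%%\n", 100 * density_placenta))
cat(sprintf("my dataset density: %.4f%%\n", 100 * density_mine))
cat(sprintf("feature difference: %d, time ratio: %.2f\n", gene_diff, time_ratio))
```

By this measure the densities are about 0.076% for "placenta1" and 0.044% for my dataset, so neither the dimensions nor the number of non-zero entries seems to explain the longer running time.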
Thank you in advance!