Does hashedDrops support 'imbalanced' combinations of HTOs?
Peter Hickey ▴ 470
@petehaitch
Walter and Eliza Hall Institute of Medi…

I've got a bit of a weird 10x scRNA-seq dataset with 6 samples that are labelled using HTOs as follows:

• Samples 1-5 labelled with a single, unique HTO (e.g., sample 1 labelled with human-1, sample 2 labelled with human-2, etc.)
• Sample 6 labelled with all 5 HTOs (i.e. human-1, human-2, ..., human-5)

Does DropletUtils::hashedDrops() with combinations support this? Thanks

DropletUtils scRNAseq • 300 views
Aaron Lun ★ 27k
@alun
The city by the bay

tl;dr No. Also, that sounds insane.

Sample 6 will never be picked up by any parametrization of hashedDrops(), which relies on the majority of HTOs not being present in a given cell. You'll have to use a cruder strategy, e.g., fit a two-component distribution to each HTO, call a cell positive if it belongs in the "high" mode of any HTO, and then use that to generate calls based on the combinations that you see. This approach is implemented in CiteFuse but I don't know whether they support wacky designs like this; you'll probably have to write code yourself.

FYI, you can reach deep inside the package to get DropletUtils:::.get_lower_dist, which is the k-means-based two-component fitting code used by ambientProfileBimodal(). One could use this to get present/absent calls for each HTO in each cell, and then proceed from there.
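For illustration, here is a rough sketch of the cruder strategy described above, run on simulated counts. It uses base R's kmeans() as a stand-in for the package's internal fitting code, so it is not what .get_lower_dist does; the simulated data, the call_positive() helper and the mapping rule at the end are all assumptions made for this example.

```r
set.seed(42)

n_hto <- 5
n_per_sample <- 50
hto_names <- paste0("human-", seq_len(n_hto))

# Hypothetical simulated counts: rows are HTOs, columns are cells.
# Every HTO starts at an ambient level of about 20 counts per cell.
n_cells <- 6 * n_per_sample
counts <- matrix(rpois(n_hto * n_cells, lambda = 20), nrow = n_hto,
                 dimnames = list(hto_names, NULL))
truth <- rep(1:6, each = n_per_sample)

# Samples 1-5 carry one HTO each; sample 6 carries all five.
for (s in 1:5) {
  idx <- which(truth == s)
  counts[s, idx] <- rpois(length(idx), lambda = 1000)
}
idx6 <- which(truth == 6)
counts[, idx6] <- rpois(n_hto * length(idx6), lambda = 1000)

# Two-component fit for one HTO (assumed helper): k-means with two centres
# on log counts, returning TRUE for cells in the higher-centre cluster.
call_positive <- function(x) {
  lx <- log1p(x)
  km <- kmeans(lx, centers = range(lx))
  km$cluster == which.max(km$centers)
}

# Present/absent call for each HTO (rows) in each cell (columns).
positive <- t(apply(counts, 1, call_positive))

# Tabulate the combinations of HTOs that are actually observed.
pattern <- apply(positive, 2, function(p) paste(which(p), collapse = "+"))
print(table(pattern))

# Map patterns back to this design: exactly one HTO means samples 1-5,
# all five HTOs means sample 6, and anything else is left unassigned.
n_pos <- colSums(positive)
first_pos <- apply(positive, 2, function(p) which(p)[1])
assigned <- ifelse(n_pos == 1, first_pos,
                   ifelse(n_pos == n_hto, 6L, NA_integer_))
```

Note that under this rule any doublet involving a sample 6 cell also shows all five HTOs, so it would be called as sample 6; real data would likely need extra checks on top of this.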


Yeah it's definitely wacky and not something I want to deal with again. I've got a set of labels from an ad-hoc approach that clusters the HTOs followed by manually assigning each cluster to a sample; not fun.

In general, for when there are fewer HTOs available than samples, labelling each sample with a unique combination of, say, 2 HTOs would be compatible with hashedDrops() via combinations, right?


Yes, that would be the plan, provided that you have more than 4 HTOs. Otherwise, doublets containing all HTOs would be indistinguishable from deeply-sequenced empty droplets.
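To make the "more than 4 HTOs" point concrete, here is a small base R sketch. It builds one possible combinations matrix (one row per sample, each row holding the indices of the two HTOs used for that sample) and checks how many distinct HTOs a doublet of two such samples can carry. Using every possible pair as a sample and the doublet_size() helper are assumptions for illustration only.

```r
# All pairs drawn from 5 HTOs: one row per sample, entries are HTO indices.
n_hto <- 5
combos <- t(combn(n_hto, 2))
n_samples <- nrow(combos)  # choose(5, 2) = 10 samples can be encoded

# A doublet of two pair-labelled cells carries the union of both pairs.
doublet_size <- function(i, j, combos) {
  length(union(combos[i, ], combos[j, ]))
}

# Size of the HTO set for every possible doublet between distinct samples.
sample_pairs <- combn(n_samples, 2)
sizes <- apply(sample_pairs, 2, function(ij) {
  doublet_size(ij[1], ij[2], combos)
})

# Doublets carry either 3 or 4 distinct HTOs, depending on whether the
# two samples share a tag.
print(table(sizes))
max_doublet_htos <- max(sizes)
```

A doublet can carry up to 4 distinct HTOs, so with only 4 HTOs in total some doublets would contain every HTO, which is the situation described above.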

There is a way to rewrite the hashedDrops() algorithm to overcome this restriction, at the cost of (i) assuming that all HTOs exhibit a bimodal profile and (ii) increasing errors due to variation in sequencing depth across cells.


See the documentation on the constant.ambient=TRUE option in version 1.11.20 of DropletUtils, which supports situations where the expected number of HTOs per cell is greater than or equal to half the total number of HTOs.
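As a usage sketch only: the wrapper below shows how such a call might look for a design with few HTOs. The function name demultiplex_combos and its arguments are hypothetical; hto_counts is assumed to be an HTO-by-cell count matrix and combos an integer matrix with one row per sample. Check the hashedDrops() documentation for your installed version before relying on it.

```r
# Hypothetical wrapper around hashedDrops() for combination-labelled samples.
# hto_counts: HTO-by-cell count matrix (assumed input).
# combos: integer matrix, one row per sample, giving the row indices of
#         the HTOs used to label that sample (assumed input).
demultiplex_combos <- function(hto_counts, combos) {
  DropletUtils::hashedDrops(
    hto_counts,
    combinations = combos,
    constant.ambient = TRUE
  )
}

# Example design: 4 HTOs, each sample labelled with a unique pair.
combos <- t(combn(4, 2))
```

With a count matrix in hand, this would be called as demultiplex_combos(hto_counts, combos); it requires DropletUtils to be installed.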