Question: Any pointers on simulating cell outliers?
0
6 months ago by
luke.zappia50
luke.zappia50 wrote:

I was wondering if it is possible to use Splatter to simulate cell outliers.

The Splatter documentation describes expression outlier parameters, but I did not find any specific information on cell outliers. Apologies if I missed something; I only searched the documentation for the keyword "outlier".

I thought about generating very small DE groups with large DE scale in Splatter as cell outliers, but I am not sure if it is appropriate or not.

I am trying to benchmark methods for detecting cell outliers. Although it is not very meaningful to identify a very small number of outliers in scRNA-seq datasets that usually contain thousands of cells, it may be a straightforward approach to reduce the noise in the datasets.

Any pointers would be appreciated!

By the way, I have been using Splatter to benchmark clustering methods, and it has worked very smoothly. I appreciate your efforts in making the package reliable and easy to use.

1
6 months ago by
luke.zappia50
luke.zappia50 wrote:

Cell outliers aren't part of the current model for the Splat simulation. I think your idea of having some groups with very small probabilities and relatively large DE factors is probably a good approach to try. Maybe something like:



library(splatter)

# Three "real" groups followed by ten rare "outlier" groups (probabilities sum to 1)
sim <- splatSimulateGroups(group.prob = c(0.5, 0.4, 0.09, rep(0.001, 10)))



That would give you three "real" clusters and ten kinds of "outlier" cells. You will probably need to play around with the exact probabilities to get something that looks like what you want and you might want to do something similar with the DE parameters.
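As a rough sketch of "something similar with the DE parameters", the example below gives the ten rare groups larger DE factors than the three main groups. The specific values (5000 cells, `de.facLoc = 1` and `de.facScale = 0.8` for the outlier groups, `seed = 1`) are illustrative assumptions and not recommendations; the values used for the main groups are what I understand to be the package defaults. Note that with a probability of 0.001 you need a few thousand cells before each outlier group is expected to contain more than a handful of cells, and because group membership is sampled, some outlier groups may still end up empty.

```r
library(splatter)

n.real    <- 3     # number of "real" clusters
n.outlier <- 10    # number of rare "outlier" groups
n.cells   <- 5000  # assumed total cell count; adjust to your benchmark

# Group probabilities: three main groups plus ten rare groups
group.prob <- c(0.5, 0.4, 0.09, rep(0.001, n.outlier))
stopifnot(isTRUE(all.equal(sum(group.prob), 1)))

# Expected number of cells per group (about 5 per outlier group here)
expected.cells <- n.cells * group.prob

# Per-group DE factor parameters: the outlier groups get a larger
# location and scale (illustrative values, not tuned)
de.facLoc   <- c(rep(0.1, n.real), rep(1.0, n.outlier))
de.facScale <- c(rep(0.4, n.real), rep(0.8, n.outlier))

sim <- splatSimulateGroups(
    batchCells  = n.cells,
    group.prob  = group.prob,
    de.facLoc   = de.facLoc,
    de.facScale = de.facScale,
    seed        = 1,
    verbose     = FALSE
)

# Realised group sizes; these vary around expected.cells
table(sim$Group)
```

Comparing `table(sim$Group)` against `expected.cells` is a quick way to see whether the simulation actually contains enough outlier cells to benchmark a detection method on.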