I've been using MAST to analyze single-cell qPCR data, and I'm familiar with its use on "traditional" single-cell RNA-seq data, where reads from the full length of each transcript are converted to digital gene expression (via counts). I was wondering whether anyone has considered potential issues with using MAST on single-cell data from droplet-based platforms like 10X or DropSeq, where counts are estimated using UMIs but only from either the 3' or 5' end of transcripts (and never with any data from elsewhere in a transcript). From these approaches you get a raw UMI count, and they recommend first filtering unexpressed genes, then normalizing the gene-specific UMI counts by the median number of UMIs obtained per cell, and finally log-transforming the gene/cell matrix (this all seems very similar to what we would do with output from RSEM or edgeR).
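To make the normalization concrete, here is a minimal Python/NumPy sketch of the three steps as I understand them. The function name `normalize_umi`, the `min_cells` threshold, the choice of log2(x + 1), and the reading of "normalize by the median" as "scale each cell's total to the median total across cells" are all my own assumptions for illustration, not something taken from the 10X, DropSeq, or MAST documentation.

```python
import numpy as np

def normalize_umi(counts, min_cells=1):
    """Filter, median-scale, and log-transform a genes x cells UMI matrix.

    Hypothetical helper: the name, threshold, and log2(x + 1) choice are
    assumptions made for this sketch.
    """
    counts = np.asarray(counts, dtype=float)

    # Step 1: drop unexpressed genes (detected in fewer than `min_cells` cells).
    keep = (counts > 0).sum(axis=1) >= min_cells
    filtered = counts[keep]

    # Step 2: scale every cell so its total UMI count equals the median
    # total across cells. Assumes cells with zero UMIs were already removed.
    umis_per_cell = filtered.sum(axis=0)
    median_umis = np.median(umis_per_cell)
    scaled = filtered / umis_per_cell * median_umis

    # Step 3: log-transform with a pseudocount of 1 so zeros stay at zero.
    return np.log2(scaled + 1.0), keep

# Toy matrix: 3 genes (rows) x 3 cells (columns); gene 2 is never detected.
counts = np.array([[4, 0, 2],
                   [0, 0, 0],
                   [6, 10, 2]])
log_expr, keep = normalize_umi(counts)
```

The resulting matrix keeps exact zeros for undetected gene/cell pairs, which I assume matters here because MAST models detection and expression level separately.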
From my perspective I can't see any obvious issue here, but I wanted to know whether anyone else has thoughts on whether this sort of data might (perhaps because of the UMI approach, the 5'/3'-specific sequencing, or this particular normalization) violate assumptions underlying the MAST framework.
Thanks for reading!