When I run a DE test on 10 cells vs. another 10 cells, I find almost no DE genes. After I increased the number of cells in the test (including more than 20 cells), I get hundreds of DEGs. So may I ask: is there a minimum number of cells required for MAST?
There is no minimum a priori, with the following caveats:
- The tests rely on asymptotic theory to establish the correctness of the p-values, and 20 cells may be on the low side. In https://www.ncbi.nlm.nih.gov/pubmed/23267174 we found that when the frequency of expression times the number of cells was greater than 16, we didn't have any issues. If you aren't in this regime, the tests will still run, but the p-values will probably be anti-conservative.
- The residual degrees of freedom are probably more important than the raw number of cells. In particular, the design might not even be identifiable with 20 cells, or there could be some other convergence issue with the logistic regression. You can check whether this might be the case by looking at the converged slot of the fitted model:

      library(MAST)
      data(vbetaFA)
      zz = zlm(~Stim.Condition, sca = vbetaFA)
      zz@converged
- Otherwise, what you report is consistent with how statistical testing works. Smaller n means less power.
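
To make the first caveat concrete, here is a rough sketch in base R of how one might check which genes fall in the "frequency of expression times number of cells greater than 16" regime. The helper name `flag_low_expression`, the toy matrix, and the choice to treat any value above 0 as "expressed" are all assumptions for illustration, not part of MAST.

```r
# Hypothetical helper (not a MAST function): for a genes x cells matrix,
# count the expressing cells per gene and flag genes above a threshold.
# Assumption: a value > 0 means the gene is expressed in that cell.
flag_low_expression <- function(expr, threshold = 16) {
  n_cells <- ncol(expr)
  n_expressing <- unname(rowSums(expr > 0))  # frequency * number of cells
  data.frame(
    gene = rownames(expr),
    freq = n_expressing / n_cells,           # frequency of expression
    n_expressing = n_expressing,
    in_regime = n_expressing > threshold,    # TRUE if above the rule of thumb
    stringsAsFactors = FALSE
  )
}

# Toy data: 3 genes x 20 cells, made up for illustration only
toy <- rbind(
  geneA = c(rep(1, 18), rep(0, 2)),
  geneB = c(rep(2, 10), rep(0, 10)),
  geneC = c(rep(3, 3), rep(0, 17))
)
res <- flag_low_expression(toy)
print(res)
```

Genes with `in_regime` equal to `FALSE` are the ones where, per the caveat above, the p-values are more likely to be anti-conservative. Note that with only 20 cells in total a gene has to be expressed in nearly every cell to pass.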