Single-cell RNA-seq analysis by MAST with unbalanced group sizes
shao ▴ 100
@shao-6241
Last seen 6.9 years ago
Germany

We have single-cell sequencing data from two groups of cells, with 100 and 3000 cells respectively. We want to find the genes that are differentially expressed between these two groups, and we would like to try MAST.

We are worried about the unbalanced group sizes, i.e., 100 vs 3000. Will this affect the differential analysis, and if so, how? Is there a way to attenuate the effect?

Thanks!

Shao

 

mast single-cell rna-seq differential analysis • 2.3k views
@andrew_mcdavid-11488
Last seen 7 weeks ago
United States

Statistical validity is unaffected by the unbalanced group sizes. Just as in an ANOVA or linear regression, the standard error of the estimate accounts for the lower precision in the smaller group. Unbalanced groups do reduce statistical power compared to an experiment with balanced groups (and the same overall N). But 100 isn't such a small group in the first place, so I wouldn't be worried about it.
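To make the precision argument concrete, here is a minimal sketch in Python. It is not MAST's hurdle model; it assumes a plain two-group comparison of means with a common standard deviation `sigma`, where the standard error of the difference is `sigma * sqrt(1/n1 + 1/n2)`. The function names are hypothetical and chosen only for illustration.

```python
import math


def se_of_difference(n1, n2, sigma=1.0):
    """Standard error of the difference between two group means,
    assuming both groups share the same standard deviation sigma."""
    return sigma * math.sqrt(1.0 / n1 + 1.0 / n2)


def effective_per_group_n(n1, n2):
    """Per-group size of a balanced design that would give the same
    standard error as the unbalanced design (n1, n2)."""
    return 2.0 * n1 * n2 / (n1 + n2)


# The design from the question: 100 vs 3000 cells (3100 in total).
unbalanced = se_of_difference(100, 3000)
# A balanced design with the same total number of cells.
balanced = se_of_difference(1550, 1550)
effective = effective_per_group_n(100, 3000)

print(f"SE, 100 vs 3000:   {unbalanced:.4f}")
print(f"SE, 1550 vs 1550:  {balanced:.4f}")
print(f"Balanced per-group n with the same SE as 100 vs 3000: {effective:.1f}")
```

Under these assumptions the smaller group dominates the standard error: adding more cells to the large group barely helps, while the test itself remains valid because the larger standard error is part of the test statistic.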


To follow up on the question: I have one extreme case with only 3 cells in group 1 and 500 cells in group 2. Am I still on the safe side? How should I estimate the minimal disparity in the sample sizes? I have other cases with 20 to 50 cells in group 1 and 500 cells in group 2. Thank you.


Statistical validity (calibration of p-values) is unaffected by the unbalanced group sizes, but as above, power will be low. Can you clarify what is meant by "minimal disparity in sample sizes"?
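As a rough illustration of how low the power can get, the sketch below uses a normal approximation to a two-sided, two-sample test. It is not MAST's model. The effect size of 0.5 standard deviations is an arbitrary assumption, `approx_power` is a hypothetical helper, and with only 3 cells the normal approximation is optimistic, so treat the numbers as an upper bound on what to expect.

```python
import math
from statistics import NormalDist


def approx_power(n1, n2, effect_size, alpha=0.05):
    """Approximate two-sided power of a two-sample z-test.

    effect_size is the true mean difference in units of the common
    standard deviation (an assumed value, not estimated from data).
    """
    z = NormalDist()
    se = math.sqrt(1.0 / n1 + 1.0 / n2)   # SE of the difference, sigma = 1
    crit = z.inv_cdf(1.0 - alpha / 2.0)   # two-sided critical value
    shift = effect_size / se              # expected value of the z statistic
    return z.cdf(shift - crit) + z.cdf(-shift - crit)


# Group sizes from the follow-up question, each compared against 500 cells.
for n1 in (3, 20, 50):
    power = approx_power(n1, 500, effect_size=0.5)
    print(f"{n1:>3} vs 500 cells: approximate power = {power:.3f}")

# With no true effect, the rejection rate equals alpha, whatever the imbalance.
print(f"null rejection rate, 3 vs 500: {approx_power(3, 500, 0.0):.3f}")
```

The last line reflects the calibration point: under the null the rejection rate stays at `alpha` for any group sizes, so imbalance costs power rather than validity.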
