Question: single-cell RNA-seq analysis by MAST with unbalanced group sizes
10 weeks ago by
shao80 wrote:

We have single-cell sequencing data from two groups of cells, with 100 cells and 3000 cells, respectively. We want to find the genes that are differentially expressed between these two groups, and we would like to try MAST.

We are worried about the unbalanced group sizes (100 vs. 3000). Will this affect the differential expression analysis, and if so, how? Is there a way to attenuate the effect?




10 weeks ago by
Andrew_McDavid100 wrote:

Statistical validity is unaffected by the unbalanced group sizes.  Just like in an ANOVA or linear regression, the standard error of the estimate will account for the lower precision in the smaller group.  Unbalanced groups do reduce statistical power compared to an experiment with balanced groups (and the same overall N).  But 100 isn't such a small group in the first place, so I wouldn't be worried about it.
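To put a rough number on the power cost, here is a minimal sketch in Python. It assumes a plain two-group comparison of means with a common standard deviation, which is a simplification and not MAST's hurdle model; the function name `se_of_difference` and the value `sigma=1.0` are made up for illustration. It compares the standard error of the group difference for 100 vs. 3000 cells against a balanced split of the same 3100 cells.

```python
import math

def se_of_difference(n1, n2, sigma=1.0):
    """Standard error of a difference in group means.

    Assumes both groups share the same standard deviation `sigma`
    (an illustrative assumption, not something MAST requires).
    """
    return sigma * math.sqrt(1.0 / n1 + 1.0 / n2)

# Design from the question: 100 vs. 3000 cells.
unbalanced = se_of_difference(100, 3000)

# Hypothetical balanced design with the same total of 3100 cells.
balanced = se_of_difference(1550, 1550)

# How much wider the standard error is in the unbalanced design.
ratio = unbalanced / balanced

print(f"SE unbalanced (100 vs 3000):  {unbalanced:.4f}")
print(f"SE balanced (1550 vs 1550):   {balanced:.4f}")
print(f"ratio unbalanced / balanced:  {ratio:.2f}")
```

Under these assumptions the standard error is larger in the unbalanced design, and it is dominated by the 1/100 term from the smaller group. The test remains valid; it is just less sensitive than a balanced experiment of the same total size would be.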


To follow up on the question: I have one extreme case with only 3 cells in group 1 and 500 cells in group 2. Am I still on the safe side?  How should I estimate the minimal disparity in the sample sizes? I have other cases with 20 - 50 cells in group 1 and 500 cells in group 2.  Thank you.

written 28 days ago by grl01030

Statistical validity (calibration of p-values) is unaffected by the unbalanced group sizes, but as above, power will be low.  Can you clarify what is meant by "minimal disparity in sample sizes"?
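One way to get a feel for how low the power will be is to convert each unbalanced design into the balanced group size that would give the same standard error for the group difference. The sketch below again assumes a simple equal-variance comparison of means rather than MAST's actual model, and `effective_group_size` is a hypothetical helper name.

```python
def effective_group_size(n1, n2):
    """Per-group size of a balanced design with the same standard error.

    Solves 2 / n_eff = 1 / n1 + 1 / n2, i.e. n_eff is the harmonic
    mean of n1 and n2 (equal-variance assumption).
    """
    return 2.0 * n1 * n2 / (n1 + n2)

# Group sizes mentioned in the follow-up question, each against 500 cells.
for n1 in (3, 20, 50):
    n_eff = effective_group_size(n1, 500)
    print(f"{n1} vs 500 cells is roughly a balanced {n_eff:.1f} vs {n_eff:.1f}")
```

By this rough measure the effective size is always less than twice the smaller group, so adding more cells to the large group helps very little once it is already much bigger than the small one.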

written 27 days ago by Andrew_McDavid100