I am using the est_power_distribution() function from RnaSeqSampleSize to estimate the power at different sample sizes for an RNA-seq experiment. What is not clear to me is on what basis to choose m1, the "expected number of prognostic genes". Is it the number of significant genes I identified in my own dataset? In the example below, the power increases when I set a higher m1, e.g. 3 or 4. Why is that?
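My tentative guess at the m1 effect, assuming the package follows the usual FDR-based power calculation (an assumption on my part, not checked against the package source): the per-gene significance level is derived from the FDR level f, the number of non-prognostic genes, and the expected number of true discoveries. The symbols \alpha^*, m_0 and r_1 are introduced here for illustration only and are not package parameters.

```latex
\begin{align*}
\alpha^* &= \frac{r_1 f}{m_0 (1 - f)}, \qquad m_0 = m - m_1, \qquad r_1 = m_1 \cdot \text{power} \\[4pt]
% Values from my example: m = 12, f = 0.05
m_1 = 2:\quad \alpha^* &= \frac{2 \cdot \text{power} \cdot 0.05}{10 \cdot 0.95} \approx 0.0105 \cdot \text{power} \\
m_1 = 4:\quad \alpha^* &= \frac{4 \cdot \text{power} \cdot 0.05}{8 \cdot 0.95} \approx 0.0263 \cdot \text{power}
\end{align*}
```

If that relation holds, a larger m1 gives a less strict per-gene threshold at the same FDR, which would raise the estimated power. I would still like confirmation that this is what the function actually does.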
Another question that came up: why can't I set the parameter k (ratio of sample size between two groups) for est_power_distribution(), as I can for estimate_samples()?
library(RnaSeqSampleSize)

# Estimate the gene read count and dispersion distribution
dataMatrixDistribution <- est_count_dispersion(counts, group = groups)

# Power estimation by read count and dispersion distribution
est_power_distribution(
  n = 20,                          # Number of samples in each group
  m = 12,                          # Total number of genes for testing
  m1 = 2,                          # Expected number of prognostic genes
  f = 0.05,                        # FDR level
  rho = 2,                         # Minimum fold change for prognostic genes between two groups
  repNumber = 100,
  minAveCount = 5,                 # Minimal average read count per gene; genes with smaller counts are not used
  selectedGenes = selected_genes,  # 11 genes
  distributionObject = dataMatrixDistribution
)