I am using RnaSeqSampleSize to estimate the power for different sample sizes in an RNA-seq experiment. What is not clear to me is on what basis to choose m1, the "expected number of prognostic genes". Is it the number of significant genes I identified in my own dataset? In the example below, the power increases when I set a higher m1, e.g. 3 or 4. Why is that?
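To make the m1 question concrete, here is a minimal sketch of the relation I suspect is behind this behaviour. It assumes the usual FDR-based sample size reasoning, where the FDR level f is approximately m0 * alpha / (m0 * alpha + m1 * power), with m0 = m - m1. The helper name alpha_from_fdr and the power value of 0.8 are my own assumptions for illustration and are not part of the package API; I have not confirmed that est_power_distribution() uses exactly this formula internally.

```r
# Per-gene significance level implied by an FDR target.
# Assumption: f ~ m0 * alpha / (m0 * alpha + m1 * power), solved for alpha.
alpha_from_fdr <- function(m, m1, f, power) {
  m0 <- m - m1        # genes assumed to be truly null
  r1 <- m1 * power    # expected number of true discoveries
  r1 * f / (m0 * (1 - f))
}

# Same m and f as in my call, with an assumed power of 0.8; only m1 varies
m1_values <- 2:4
alphas <- sapply(
  m1_values,
  function(m1) alpha_from_fdr(m = 12, m1 = m1, f = 0.05, power = 0.8)
)
print(data.frame(m1 = m1_values, alpha = alphas))
```

If this relation holds, a larger m1 gives a less stringent per-gene alpha at the same FDR level, which would explain why the estimated power goes up with m1.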
Another question that came up: why can't I set the parameter k (ratio of sample sizes between the two groups) for est_power_distribution() as for
    library(RnaSeqSampleSize)

    # Estimate the gene read count and dispersion distribution
    dataMatrixDistribution <- est_count_dispersion(counts, group = groups)

    # Power estimation by read count and dispersion distribution
    est_power_distribution(
      n = 20,           # Number of samples in each group
      m = 12,           # Total number of genes for testing
      m1 = 2,           # Expected number of prognostic genes
      f = 0.05,         # FDR level
      rho = 2,          # Minimum fold change for prognostic genes between two groups
      repNumber = 100,
      minAveCount = 5,  # Minimal average read count for each gene; genes with smaller read counts are not used
      selectedGenes = selected_genes,  # 11 genes
      distributionObject = dataMatrixDistribution
    )