Question

Number of prognostic genes in RnaSeqSampleSize

Matthias Munz ▴ 20

@matmu

Last seen 5 months ago

Germany

I am using the est_power_distribution() function from RnaSeqSampleSize to estimate the power at different sample sizes for an RNA-seq experiment. What is not clear to me is on what basis to choose m1, the "expected number of prognostic genes". Is it the number of significant genes I identified in my own dataset? In the example below, the power increases when I set a higher m1, e.g. 3 or 4. Why is that?

A second question: why can't I set the parameter k (ratio of sample sizes between the two groups) in est_power_distribution(), as I can in estimate_samples()?


library(RnaSeqSampleSize)

# counts, groups and selected_genes come from my own dataset (not shown)

# Estimate the gene read count and dispersion distribution
dataMatrixDistribution <- est_count_dispersion(counts, group = groups)

# Power estimation by read count and dispersion distribution
est_power_distribution(
    n = 20,            # Number of samples in each group
    m = 12,            # Total number of genes for testing
    m1 = 2,            # Expected number of prognostic genes
    f = 0.05,          # FDR level
    rho = 2,           # Minimum fold change for prognostic genes between the two groups
    repNumber = 100,
    minAveCount = 5,   # Minimal average read count per gene; genes with smaller counts are not used
    selectedGenes = selected_genes,  # 11 genes
    distributionObject = dataMatrixDistribution)
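
One possible explanation for the m1 effect can be sketched in base R. I am assuming here that the package turns the FDR level f into a per-gene significance level using the relation from Jung (2005), alpha* = r1 * f / ((m - m1) * (1 - f)), where r1 is the expected number of true discoveries. I have not checked this against the package source. The helper `fdr_alpha` and the fixed `target_power` of 0.8 are hypothetical names and values used only for illustration; they are not part of RnaSeqSampleSize.

```r
# Illustrative sketch only, under the assumption stated above:
#   alpha* = r1 * f / ((m - m1) * (1 - f)),  with r1 = m1 * power
# fdr_alpha() and target_power are hypothetical, not RnaSeqSampleSize API.
fdr_alpha <- function(m, m1, f, target_power = 0.8) {
  r1 <- m1 * target_power          # expected number of true discoveries
  r1 * f / ((m - m1) * (1 - f))    # implied per-gene type I error level
}

# Same m and f as in my call above, with m1 = 2, 3, 4
m1_values <- c(2, 3, 4)
alphas <- vapply(
  m1_values,
  function(m1) fdr_alpha(m = 12, m1 = m1, f = 0.05),
  numeric(1)
)

print(data.frame(m1 = m1_values, alpha_star = signif(alphas, 3)))
```

If that relation is what is used internally, a larger m1 gives a less strict per-gene threshold at the same FDR, which would by itself push the estimated power up.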

RnaSeqSampleSize • 743 views

3.8 years ago Matthias Munz ▴ 20