Hello, I am trying to estimate the sample size for an RNA-seq experiment I am planning, using the ssizeRNA package in R. The package estimates the sample size needed to reach a target power from the average read count and dispersion, the proportion of differentially expressed genes (DEGs), and the total number of genes mapped. Here is a link to the vignette (https://cran.r-project.org/web/packages/ssizeRNA/vignettes/ssizeRNA.pdf). I used a publicly available dataset related to my topic/tissue of interest to estimate the required parameters, similar to what is done in the last part of the vignette.
However, I am getting very high numbers per group (400+). I am not sure whether I am doing it right, and not many people seem to estimate sample sizes before RNA-seq experiments. I also noticed that RNA-seq papers in humans use a relatively high number of samples per group, but nothing as high as what my analysis gave. Has anyone used this package before, and do you have any tips? Are there other tools, packages, or websites you can recommend for this? Thanks
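For readers unfamiliar with the package, here is a minimal sketch of a single-parameter call. The values are placeholders assumed to match the vignette's introductory example, not estimates from any dataset in this thread, and the call is skipped if ssizeRNA is not installed.

```r
# Placeholder planning values (assumed, for illustration only).
params <- list(
  nGenes = 10000, # total number of genes
  pi0    = 0.8,   # proportion of non-DE genes
  m      = 200,   # pseudo sample size used in the internal simulation
  mu     = 10,    # average read count in the control group
  disp   = 0.1,   # dispersion
  fc     = 2,     # fold change for DE genes
  fdr    = 0.05,  # target false discovery rate
  power  = 0.8,   # target power
  maxN   = 20     # largest per-group sample size to consider
)

if (requireNamespace("ssizeRNA", quietly = TRUE)) {
  set.seed(2016)
  # ssizeRNA_single() also draws a power-versus-sample-size plot.
  size_default <- do.call(ssizeRNA::ssizeRNA_single, params)
  print(size_default$ssize)
} else {
  message("ssizeRNA is not installed; try install.packages('ssizeRNA')")
}
```

Note that maxN caps the sample sizes the function evaluates, so it has to be raised when the required size is expected to be large.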
Hi James, thanks for your response. This is the code that I used:
For fold change I used:
fc <- function(x){exp(rnorm(x, log(2), 0.5*log(2)))}
as provided in the vignette. For dispersion and mu, I calculated them from a publicly available dataset, GSE1285873, using the code:
where Sharpton 2019 is an expression dataset I generated, with gene IDs as rows and the corresponding read counts for each sample as columns.
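The code for this step is not shown in the post. As a rough illustration only, the sketch below builds the same three kinds of inputs (a fold-change generator, per-gene means, per-gene dispersions) from a simulated count matrix. The matrix `counts` and all simulation settings are hypothetical. The dispersion is a simple method-of-moments estimate in base R, swapped in for the edgeR tagwise dispersion that the vignette uses, so the numbers will differ from an edgeR-based estimate.

```r
set.seed(42)

# Fold-change generator quoted in the post: log-normal draws with median 2.
fc <- function(x) { exp(rnorm(x, log(2), 0.5 * log(2))) }
fc_draws <- fc(1e5)

# Hypothetical count matrix: genes in rows, samples in columns.
n_genes   <- 2000
n_samples <- 10
true_mu   <- rgamma(n_genes, shape = 2, rate = 0.02)
counts <- matrix(
  rnbinom(n_genes * n_samples, mu = rep(true_mu, n_samples), size = 10),
  nrow = n_genes
)

# Drop genes with no reads in any sample.
counts <- counts[rowSums(counts) > 0, , drop = FALSE]

# Per-gene mean and sample variance of the raw counts.
mu <- rowMeans(counts)
v  <- apply(counts, 1, var)

# Negative binomial: variance = mu + disp * mu^2, so solve for disp
# and truncate negative estimates at zero.
disp <- pmax((v - mu) / mu^2, 0)

print(median(fc_draws))
print(summary(mu))
print(summary(disp))
```

In the vignette's final example, `ssizeRNA_vary` is given a vector of means, a vector of dispersions, and a fold-change function of this form. Inspecting the summaries first can help show whether very low counts or very large dispersions dominate the inputs.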
In your experience, is it necessary to use a preliminary dataset or is it okay to use the default values for dispersion and average read counts?
Oh wait. You used ssize as the tag, so I assumed you had a typo when you called it ssizeRNA. That's a CRAN package, so you are in the wrong place. This support site is meant for Bioconductor packages. You might try asking on biostars.org instead.
Alright. I have also posted on Biostars and am waiting for a response. I just looked up ssize and it seems to be exclusive to microarray data. Do you know of a similar Bioconductor package that can be used for RNA-seq sample size estimation?
If you are using the limma-voom pipeline, ssize is fine. You could also try PROPER.