From the vignette:

The default statistical test in ballgown is a parametric F-test comparing nested linear models; details are available in the Ballgown manuscript (Frazee et al. (2014)). These models are conceptually similar to the models used by Smyth (2005) in the `limma` package. In `limma`, more sophisticated empirical Bayes shrinkage methods are used, and generally a single linear model is fit per feature instead of doing a nested model comparison, but the flavor is similar (and in fact, `limma` can easily be run on any of the data matrices in a `ballgown` object).

Ballgown's statistical models are implemented with the `stattest` function. Two models are fit to each feature, using expression as the outcome: one including the covariate of interest (e.g., case/control status or time) and one not including that covariate. An F statistic and p-value are calculated using the fits of the two models. A significant p-value means the model including the covariate of interest fits significantly better than the model without that covariate, indicating differential expression. We adjust for multiple testing by reporting q-values (Storey & Tibshirani (2003)) for each transcript in addition to p-values: reporting features with, say, q < 0.05 means the false discovery rate should be controlled at about 5%.
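To make the nested-model comparison concrete, here is a minimal sketch of the F statistic for a single feature. It is not ballgown's code. It assumes the simplest case: a null model with only an intercept and a full model that adds one categorical covariate. The function names and expression values are made up for illustration, and it ignores any other adjustment variables a real analysis would include.

```python
def rss_about_means(values, groups):
    """Residual sum of squares after fitting one mean per group label.

    Returns the RSS and the number of fitted parameters (one per group).
    """
    by_group = {}
    for value, group in zip(values, groups):
        by_group.setdefault(group, []).append(value)
    rss = 0.0
    for vals in by_group.values():
        mean = sum(vals) / len(vals)
        rss += sum((v - mean) ** 2 for v in vals)
    return rss, len(by_group)


def nested_f_statistic(expression, covariate):
    """F statistic comparing an intercept-only model to intercept + covariate."""
    n = len(expression)
    # Null model: a single shared mean, i.e. the covariate is left out.
    rss0, p0 = rss_about_means(expression, [0] * n)
    # Full model: one mean per covariate level, i.e. the covariate is included.
    rss1, p1 = rss_about_means(expression, covariate)
    df1 = p1 - p0  # extra parameters in the full model
    df2 = n - p1   # residual degrees of freedom of the full model
    f_stat = ((rss0 - rss1) / df1) / (rss1 / df2)
    return f_stat, df1, df2


# Hypothetical log-expression values for one transcript: 3 controls, 3 cases.
expression = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
status = ["control", "control", "control", "case", "case", "case"]

f_stat, df1, df2 = nested_f_statistic(expression, status)
print(f_stat, df1, df2)
```

The p-value is the upper-tail probability of this statistic under an F distribution with `df1` and `df2` degrees of freedom, which takes a statistics library to evaluate. A large F means that dropping the covariate makes the fit much worse, which is the "fits significantly better" statement in the vignette text.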

If you want to know what you are doing, there is no substitute for reading the original manuscript.