Voom variance values: Some minor confusion about which variance is fed to lowess when using a design matrix of covariates
Brian • 0
@3b7c269e
United States

I am working through the voom method and am hung up on one point. I've run voom on a table of counts with and without a design matrix of covariates (X). I can confirm that when I run voom without the design matrix, the resulting plot and lowess fit show:

on the x axis: the mean of the log2(counts + 0.5)

and on the y axis: the square root of the log2(CPM) standard deviations.

This is all well and good.

Now...

When I run the model with a design matrix (X) and fit my coefficients (B), I can see that the x values on the voom plot still correspond to the mean of the log2(counts + 0.5). I cannot figure out where the variances come from. They are not the variances of the predicted CPM values, nor are they the variances of the predicted counts when I convert back to counts (log2(CPM_pred) + log2(library_size + 1) - log2(1e6)). I also checked the RMSE between the actual log2(CPM) and fitted log2(CPM_pred) values, and that does not match either.

I have confirmed this by running voom with and without the design matrix:

y <- voom(counts = counts, design = design_matrix, plot = TRUE, save.plot = TRUE, span = 0.66666)
z <- voom(counts = counts, plot = TRUE, save.plot = TRUE, span = 0.66666)


and simply examining the y$voom.xy$x, y$voom.xy$y and z$voom.xy$x, z$voom.xy$y values.

I've also fit my own OLS model to a few rows of the data and confirmed that I can recreate all values except for these dang variances in y$voom.xy$y!

My question is simply:

What values are used to calculate the stdev (y$voom.xy$y) for the lowess fit when a design/covariates matrix is provided?

Happy to post more code or info if needed.

@gordon-smyth
WEHI, Melbourne, Australia

The voom method is described by Law et al (2014) http://www.statsci.org/smyth/pubs/VoomPreprint.pdf and by the limma documentation help("voom"). The y-values in the voom plot are the square-root residual standard deviations after regressing the logCPM on the design matrix.
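
As a rough illustration of the statement above, here is a minimal base-R sketch that computes those y-values by hand. It assumes the defaults (library sizes taken as column sums, no between-sample normalization, no observation weights) and uses simulated counts and a made-up two-group design; the names counts, group, design, sx and sy are illustrative only. The key point is that the residual standard deviation divides the residual sum of squares by the residual degrees of freedom (number of samples minus number of design columns), which is why the variance of the fitted values or a plain RMSE will not reproduce it.

```r
# Simulated data for illustration only: 200 genes x 6 samples, two groups.
set.seed(1)
n_genes <- 200
n_samples <- 6
counts <- matrix(rnbinom(n_genes * n_samples, mu = 50, size = 5), nrow = n_genes)
group <- factor(rep(c("a", "b"), each = 3))
design <- model.matrix(~ group)

# log2-CPM with the 0.5 count offset and +1 library-size offset.
lib_size <- colSums(counts)
log_cpm <- t(log2(t(counts + 0.5) / (lib_size + 1) * 1e6))

# Ordinary least squares for every gene at once (samples in rows of y).
fit <- lm.fit(x = design, y = t(log_cpm))
df_resid <- n_samples - ncol(design)

# Residual standard deviation per gene: sqrt(RSS / residual df).
sigma <- sqrt(colSums(fit$residuals^2) / df_resid)

# x: mean log2-CPM shifted back to the log2-count scale.
sx <- rowMeans(log_cpm) + mean(log2(lib_size + 1)) - log2(1e6)
# y: square root of the residual standard deviation.
sy <- sqrt(sigma)

# Genes with all-zero counts are excluded before fitting the trend.
keep <- rowSums(counts) > 0
trend <- lowess(sx[keep], sy[keep], f = 0.5)
```

With real data, sx and sy computed this way can be compared against y$voom.xy$x and y$voom.xy$y from a voom call that uses the same design. When no design is supplied, the same calculation applies with an intercept-only design, so the residual standard deviation reduces to the ordinary standard deviation of the log2-CPM values.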