Search
Question: Expression profile from $E from Voom 0 13 months ago by anpham0 anpham0 wrote: Following the voom-limma workflow and using RNA-seq count data, I created the DGEList, ran the calcNormFactor (TMM) step, ran Voom step, and got the v object. I then extracted the$E object from v.

Here is my code:

x <- DGEList(counts=RNA)

x <- calcNormFactors(x, method = "TMM")

par(mfrow=c(1,2))
v <- voom(x, design, plot=TRUE)
Elist <-v$E My questions are: 1) expression profile in Elist has performed log(CPM) and been normalized for library size, right?; 2) Has the expression profile in Elist incorporated the precision weights (represented as$weights in v object) yet? Thank you.

modified 13 months ago by Steve Lianoglou12k • written 13 months ago by anpham0
0
13 months ago by
United States
James W. MacDonald48k wrote:

You could just read the help page for voom to find this out. From ?voom

Details:

This function is intended to process RNA-Seq or ChIP-Seq data
prior to linear modelling in limma.

voom  is an acronym for mean-variance modelling at the
observational level. The key concern is to estimate the
mean-variance relationship in the data, then use this to compute
appropriate weights for each observation. Count data almost show
non-trivial mean-variance relationships. Raw counts show
increasing variance with increasing count size, while log-counts
typically show a decreasing mean-variance trend. This function
estimates the mean-variance trend for log-counts, then assigns a
weight to each observation based on its predicted variance. The
weights are then used in the linear modelling process to adjust
for heteroscedasticity.

voom  performs the following specific calculations. First, the
counts are converted to logCPM values, adding 0.5 to all the
counts to avoid taking the logarithm of zero. The matrix of logCPM
values is then optionally normalized. The  lmFit  function is used
to fit row-wise linear models. The  lowess  function is then used
to fit a trend to the square-root-standard-deviations as a
function of average logCPM. The trend line is then used to predict
the variance of each logCPM value as a function of its fitted
value, and the inverse variances become the estimated precision
weights.
1

But I should point out that the object  you call "v" is an EList that has the weights, and the thing you call Elist is a regular matrix that just contains the E list item from v.

You should not strip things out of objects like that unless you know what you are doing. The EList object (called v) is something that lmFit will know how to use, and will do something sensible with. If you use your Elist object with lmFit, it won't know about the weights, and will fit a conventional unweighted model.

0
13 months ago by
Denali
Steve Lianoglou12k wrote:

1) expression profile in Elist has performed log(CPM) and been normalized for library size, right?

Yes

2) Has the expression profile in Elist incorporated the precision weights (represented as \$weights in v object) yet?

No

Note that if you want to use the expression data in your voomed "v" object for anything except the limma differential expression pipeline, the recommendation is to rather get logged cpms from your DGEList using a higher prior.count value, ie.

E <- cpm(x, log=TRUE, prior.count=5)

Or you can take your counts matrix and shoot it through DESeq2::vst