Expression profile from $E from Voom
2
0
Entering edit mode
anpham ▴ 60
@anpham-7402
Last seen 6.6 years ago

Following the voom-limma workflow and using RNA-seq count data, I created the DGEList, ran the calcNormFactor (TMM) step, ran Voom step, and got the v object. I then extracted the $E object from v.

Here is my code:

x <- DGEList(counts=RNA)

x <- calcNormFactors(x, method = "TMM")

par(mfrow=c(1,2))
v <- voom(x, design, plot=TRUE)
Elist <-v$E 

My questions are: 1) expression profile in Elist has performed log(CPM) and been normalized for library size, right?; 2) Has the expression profile in Elist incorporated the precision weights (represented as $weights in v object) yet? Thank you.

 

 

 

Voom $EList • 2.6k views
ADD COMMENT
1
Entering edit mode
@james-w-macdonald-5106
Last seen 2 days ago
United States

You could just read the help page for voom to find this out. From ?voom

Details:

     This function is intended to process RNA-Seq or ChIP-Seq data
     prior to linear modelling in limma.

      voom  is an acronym for mean-variance modelling at the
     observational level. The key concern is to estimate the
     mean-variance relationship in the data, then use this to compute
     appropriate weights for each observation. Count data almost show
     non-trivial mean-variance relationships. Raw counts show
     increasing variance with increasing count size, while log-counts
     typically show a decreasing mean-variance trend. This function
     estimates the mean-variance trend for log-counts, then assigns a
     weight to each observation based on its predicted variance. The
     weights are then used in the linear modelling process to adjust
     for heteroscedasticity.

      voom  performs the following specific calculations. First, the
     counts are converted to logCPM values, adding 0.5 to all the
     counts to avoid taking the logarithm of zero. The matrix of logCPM
     values is then optionally normalized. The  lmFit  function is used
     to fit row-wise linear models. The  lowess  function is then used
     to fit a trend to the square-root-standard-deviations as a
     function of average logCPM. The trend line is then used to predict
     the variance of each logCPM value as a function of its fitted
     value, and the inverse variances become the estimated precision
     weights.
ADD COMMENT
2
Entering edit mode

But I should point out that the object  you call "v" is an EList that has the weights, and the thing you call Elist is a regular matrix that just contains the E list item from v.

You should not strip things out of objects like that unless you know what you are doing. The EList object (called v) is something that lmFit will know how to use, and will do something sensible with. If you use your Elist object with lmFit, it won't know about the weights, and will fit a conventional unweighted model.

ADD REPLY
1
Entering edit mode
@steve-lianoglou-2771
Last seen 14 months ago
United States

1) expression profile in Elist has performed log(CPM) and been normalized for library size, right?

Yes

2) Has the expression profile in Elist incorporated the precision weights (represented as $weights in v object) yet?

No

Note that if you want to use the expression data in your voomed "v" object for anything except the limma differential expression pipeline, the recommendation is to rather get logged cpms from your DGEList using a higher prior.count value, ie.

E <- cpm(x, log=TRUE, prior.count=5)

Or you can take your counts matrix and shoot it through DESeq2::vst

 

ADD COMMENT
0
Entering edit mode

Thank you very much for your helpful response, Steve, especially the comment on DESeq2::vst. You helped me solve the problem I was trying to figure out.

ADD REPLY

Login before adding your answer.

Traffic: 466 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6