Question

Expression profile from $E from Voom

0

Entering edit mode

anpham ▴ 60

@anpham-7402

Last seen 6.6 years ago

Following the voom-limma workflow and using RNA-seq count data, I created the DGEList, ran the calcNormFactor (TMM) step, ran Voom step, and got the v object. I then extracted the $E object from v.

Here is my code:

x <- DGEList(counts=RNA)

x <- calcNormFactors(x, method = "TMM")

par(mfrow=c(1,2))
v <- voom(x, design, plot=TRUE)
Elist <-v$E

My questions are: 1) expression profile in Elist has performed log(CPM) and been normalized for library size, right?; 2) Has the expression profile in Elist incorporated the precision weights (represented as $weights in v object) yet? Thank you.

Voom $EList • 2.6k views

ADD COMMENT • link updated 6.6 years ago by Steve Lianoglou ★ 13k • written 6.6 years ago by anpham ▴ 60

score 1 · Answer 1 · 2017-09-22

You could just read the help page for voom to find this out. From ?voom

Details:

     This function is intended to process RNA-Seq or ChIP-Seq data
     prior to linear modelling in limma.

      voom  is an acronym for mean-variance modelling at the
     observational level. The key concern is to estimate the
     mean-variance relationship in the data, then use this to compute
     appropriate weights for each observation. Count data almost show
     non-trivial mean-variance relationships. Raw counts show
     increasing variance with increasing count size, while log-counts
     typically show a decreasing mean-variance trend. This function
     estimates the mean-variance trend for log-counts, then assigns a
     weight to each observation based on its predicted variance. The
     weights are then used in the linear modelling process to adjust
     for heteroscedasticity.

      voom  performs the following specific calculations. First, the
     counts are converted to logCPM values, adding 0.5 to all the
     counts to avoid taking the logarithm of zero. The matrix of logCPM
     values is then optionally normalized. The  lmFit  function is used
     to fit row-wise linear models. The  lowess  function is then used
     to fit a trend to the square-root-standard-deviations as a
     function of average logCPM. The trend line is then used to predict
     the variance of each logCPM value as a function of its fitted
     value, and the inverse variances become the estimated precision
     weights.

score 1 · Answer 2 · 2017-09-22

1

Entering edit mode

Steve Lianoglou ★ 13k

@steve-lianoglou-2771

Last seen 13 months ago

United States

1) expression profile in Elist has performed log(CPM) and been normalized for library size, right?

Yes

2) Has the expression profile in Elist incorporated the precision weights (represented as $weights in v object) yet?

No

Note that if you want to use the expression data in your voomed "v" object for anything except the limma differential expression pipeline, the recommendation is to rather get logged cpms from your DGEList using a higher prior.count value, ie.

E <- cpm(x, log=TRUE, prior.count=5)

Or you can take your counts matrix and shoot it through DESeq2::vst

ADD COMMENT • link 6.6 years ago Steve Lianoglou ★ 13k

0

Entering edit mode

Thank you very much for your helpful response, Steve, especially the comment on DESeq2::vst. You helped me solve the problem I was trying to figure out.

ADD REPLY • link 6.6 years ago anpham ▴ 60