Question: Selecting genes based on expressional variance
Asked 5.2 years ago by Jon Bråte (Norway):

Hi,

I am trying to follow the procedure outlined in Liao, Q. et al., NAR 2011 ("Large-scale prediction of long non-coding RNA functions in a coding-non-coding gene co-expression network") and want to select the genes with "expressional variance ranked in the top 75 percentile of each data set". I have a matrix of count data that has been variance-stabilized with DESeq2 (recommended as input for the WGCNA package), but I am unsure how to select the genes with the highest variance. My matrix consists of datasets representing different biological conditions, most of them in triplicate.

Thanks,

Jon

Tags: geneexpression, rnaseq, wgcna
Answer: Selecting genes based on expressional variance
Answered 5.2 years ago by Laurent Gatto (Belgium):

Here is a suggestion that uses the rowVars() function from the genefilter package:

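> set.seed(1)
> ## simulated matrix: 100 genes (rows) x 10 samples (columns)
> m <- matrix(rnorm(1000, 10), ncol = 10)
> dim(m)
[1] 100  10
> library("genefilter")
> ## per-gene variance across samples
> rv <- rowVars(m)
> summary(rv)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.2754  0.7198  0.9987  1.0590  1.3720  2.7310 
> ## 75th percentile of the per-gene variances
> (q75 <- quantile(rv, .75))
     75% 
1.372284 
> ## keep the genes whose variance lies above that cutoff (the top quarter)
> m2 <- m[rv > q75, ]
> dim(m2)
[1] 25 10
> summary(rowVars(m2))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  1.375   1.477   1.564   1.708   1.848   2.731 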

DESeq2 might have some functionality to do similar or more appropriate filtering, but I'll leave it to the experts.
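
For readers who prefer not to depend on genefilter, the same filter can be sketched in base R. This is only an illustration under stated assumptions: the helper name filter_by_variance and its prob argument are hypothetical and do not come from any package or from the thread. The second call shows the cutoff that would apply if the paper's phrase were read as "keep the top 75% of genes" rather than "keep genes above the 75th percentile"; which reading the paper intends is not settled here.

```r
# Hypothetical helper (not part of genefilter or DESeq2): keep the rows of a
# numeric matrix whose variance is strictly above a chosen quantile of all
# row variances.
filter_by_variance <- function(mat, prob = 0.75) {
  rv <- apply(mat, 1, var)                        # per-row sample variance
  cutoff <- quantile(rv, probs = prob, na.rm = TRUE)
  keep <- which(rv > cutoff)                      # which() skips rows with NA variance
  mat[keep, , drop = FALSE]                       # drop = FALSE always returns a matrix
}

# Same simulated data as in the transcript above
set.seed(1)
m <- matrix(rnorm(1000, 10), ncol = 10)

# Genes above the 75th percentile of variance (the most variable quarter)
top_quarter <- filter_by_variance(m, prob = 0.75)

# Alternative reading: the most variable 75% of genes (cutoff at the 25th percentile)
top_three_quarters <- filter_by_variance(m, prob = 0.25)

dim(top_quarter)
dim(top_three_quarters)
```

With a real DESeq2 variance-stabilized object, the numeric matrix would first have to be extracted (typically with assay()) before being passed to such a helper, with genes in rows and samples in columns.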


Reply from Jon Bråte, 5.2 years ago:

Thanks, works perfectly!