Question

Using RNA-Seq raw count data in Weighted Gene Co-Expression Network Analysis

Jon Bråte ▴ 260

@jon-brate-6263

Norway

Hi,

I have gene expression count data generated by HTSeq, and I wonder how I can use them in the WGCNA package. I think one of the datasets from the tutorial is described as: "the ratio of the mean log10 intensity (mlratio) relative to the pool derived from 150 mice". Could I use the voom transformation in limma, for instance?

Thanks,

Jon

rnaseq network limma

updated 10.1 years ago by Steve Lianoglou ★ 13k • written 10.1 years ago by Jon Bråte ▴ 260

score 6 · Accepted Answer · 2014-09-23

You wouldn't use a "voom transformation" ... voom doesn't perform much of a transformation at all: it simply provides something like a +0.5 smoothed logCPM estimate for the counts from its input DGEList (though I will grant that this is a transformation! :-).

The magic of voom is the "sister" weights matrix that it provides, and for that to be useful, your downstream method would have to be one that can leverage these observational weights.
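To make the "+0.5 smoothed logCPM" concrete, here is a minimal sketch of the arithmetic in plain Python rather than R. It assumes the commonly documented form log2((count + 0.5) / (library size + 1) * 1e6), takes library sizes as plain column sums (no normalization factors), and ignores the weights entirely. The function name and the toy matrix are made up for illustration.

```python
import math

def voom_style_log_cpm(counts, lib_sizes=None):
    """Return log2-CPM values using the +0.5 / +1 offsets described for voom.

    counts: list of rows (genes), each a list of per-sample counts.
    lib_sizes: optional per-sample library sizes; defaults to column sums
    (assumption: no normalization factors are applied in this sketch).
    """
    n_samples = len(counts[0])
    if lib_sizes is None:
        lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [
        [math.log2((row[j] + 0.5) / (lib_sizes[j] + 1.0) * 1e6)
         for j in range(n_samples)]
        for row in counts
    ]

# Toy matrix: 3 genes x 2 samples (made-up numbers, library size 999 each).
toy_counts = [[0, 3], [10, 12], [989, 984]]
toy_log_cpm = voom_style_log_cpm(toy_counts)
```

The point is only that the values themselves are ordinary log-CPMs with a small offset; whatever voom adds beyond that lives in the weights, which this sketch does not compute.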

You likely want some type of "variance stabilizing transformation" of your count data, though. In the edgeR/limma world, this would involve calling `cpm` on your count matrix with `log=TRUE` and a value somewhere between 2 and 5 for the "prior.count" argument. Sorry, but I can't give you better guidance on the choice of "prior.count": picking "the right" value (if there can be one) seems like a bit of voodoo for the time being, but perhaps Gordon can chime in. Cf.:

Section 2.11 of the edgeR User's Guide (Clustering, heatmaps, etc.); and
Previous posts on the Bioconductor forum, e.g. "A: Remove batch effect in small RNASeq study (SVA or others?)" (among others)
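As a rough illustration of why a larger "prior.count" damps the noise in low counts, here is a simplified sketch in plain Python. It is not a reimplementation of edgeR's `cpm`: scaling the prior count by relative library size, and adding twice the scaled prior to the library size, is my understanding of the documented behaviour and should be treated as an assumption. The function names and toy numbers are hypothetical.

```python
import math

def log_cpm_with_prior(counts, prior_count=2.0):
    """Log2-CPM with a prior count added to every observation.

    counts: list of rows (genes), each a list of per-sample counts.
    Assumption: the prior is scaled by each sample's library size relative
    to the mean library size, and library sizes are plain column sums.
    """
    n_samples = len(counts[0])
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    mean_lib = sum(lib_sizes) / n_samples
    scaled_prior = [prior_count * lib / mean_lib for lib in lib_sizes]
    adj_lib = [lib + 2.0 * p for lib, p in zip(lib_sizes, scaled_prior)]
    return [
        [math.log2((row[j] + scaled_prior[j]) / adj_lib[j] * 1e6)
         for j in range(n_samples)]
        for row in counts
    ]

# Toy matrix: 3 genes x 2 samples, library size 1000 each (made-up numbers).
# The first gene has counts 0 and 4, i.e. mostly sampling noise.
toy_counts = [[0, 4], [500, 496], [500, 500]]

def low_count_spread(prior_count):
    """Absolute log2 difference between the two samples for the first gene."""
    row = log_cpm_with_prior(toy_counts, prior_count)[0]
    return abs(row[1] - row[0])

spread_small_prior = low_count_spread(0.5)
spread_large_prior = low_count_spread(5.0)
```

With the small prior, the 0-versus-4 gene shows a much larger log-fold difference between samples than with the larger prior, which is the shrinkage you want before feeding the values to a correlation-based method. How much shrinkage is "right" remains a judgment call.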

Alternatively, you could use the `varianceStabilizingTransformation` or `rlog` transformations from DESeq2; see the "Data transformations and visualization" section of the "Differential analysis of count data" vignette in the DESeq2 package.