Correlation between CpG methylation and gene expression from DESeq2: should counts be normalized with VST plus FPKM/RPKM?
@676d8098

As the title states, I've got some RRBS data and am looking to evaluate a correlation between gene expression and CpG island methylation.

In prepping the expression data (from a completed DESeq2 run), I'm planning to normalize the counts with the variance stabilizing transformation (VST) before exporting them and moving forward. However, as I understand it, the VST accounts for the library size factors and inter-sample count variance BUT does not normalize for feature length. Is this indeed the case? And if so, am I right to conclude that the counts should also be normalized using FPKM or an analogous method before being used for downstream analysis?

I've been reading the documentation and source code, as well as previous related answers, but I'm still not 100% sure. The question may seem silly, since the only recommendations made in the vignettes and in previous questions for preparing these data for export point to the rlog or VST transformations, but I wanted to be certain and don't have a good bioinformatics mentor to ask.

I sincerely appreciate your input.

DESeq2
@mikelove

In this case, can you say why you want to normalize for gene length? Our tximport/tximeta pipeline corrects for differential gene length (e.g. if the effective gene length changes across samples), but we don't need to divide out a common gene length factor from the entire row. Doing so wouldn't affect a correlation anyway, since correlation is scale invariant.
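
To make the scale-invariance point concrete, here is a minimal sketch in plain Python (not DESeq2 code). Everything in it is made up for illustration: the methylation values, the normalized counts, and the 2.5 kb gene length are hypothetical, and `pearson` is a small helper written for this example. It checks that, for a single gene, the per-gene correlation across samples is unchanged when the whole row is divided by a constant length, or when a constant is subtracted on a log scale.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


random.seed(0)
n_samples = 12

# Hypothetical data for ONE gene across samples (not real RRBS/RNA-seq values).
methylation = [random.uniform(0.1, 0.9) for _ in range(n_samples)]
norm_counts = [500 * (1 - m) + random.uniform(0, 50) for m in methylation]

# Assumed constant gene length, identical in every sample.
gene_length_kb = 2.5

# Dividing the whole row by a constant (an FPKM-style length scaling).
length_scaled = [c / gene_length_kb for c in norm_counts]

# On a log scale, a constant length factor becomes a constant shift.
log_counts = [math.log2(c) for c in norm_counts]
log_shifted = [v - math.log2(gene_length_kb) for v in log_counts]

r_raw = pearson(methylation, norm_counts)
r_scaled = pearson(methylation, length_scaled)
r_log = pearson(methylation, log_counts)
r_log_shifted = pearson(methylation, log_shifted)

print(f"count scale: {r_raw:.6f} vs length-scaled {r_scaled:.6f}")
print(f"log scale:   {r_log:.6f} vs length-shifted {r_log_shifted:.6f}")
```

Each pair of printed values should agree up to floating-point error. This only holds when the length factor is the same for every sample; if the effective gene length differs across samples, the scaling is no longer a constant per row, which is the case the tximport/tximeta pipeline addresses.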
