Can fry() be used on Infinium methylation array data to call differential methylation?
@markziemann-21777
Australia

I'm investigating ways to summarise methylation array data in a gene-centric way in order to perform downstream gene set/pathway enrichment analysis.

As each gene has many probes, it should be possible to apply an enrichment test to score differential gene methylation. So far I've been experimenting with two different approaches: (1) a GSEA-like test using the limma t-statistics, and (2) the fry test.
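To make the "each gene is a set of probes" idea concrete, here is a minimal sketch of grouping probes by gene. The probe IDs, gene names and the semicolon-separated annotation format are hypothetical stand-ins for a real array manifest, and `probes_by_gene` is an illustrative helper, not a function from any package.

```python
from collections import defaultdict

# Hypothetical annotation: probe ID -> semicolon-separated gene symbols.
# An empty string marks a probe that is not annotated to any gene.
probe_annotation = {
    "cg0001": "GENE_A",
    "cg0002": "GENE_A;GENE_B",
    "cg0003": "GENE_B",
    "cg0004": "",
    "cg0005": "GENE_A",
}


def probes_by_gene(annotation, min_probes=1):
    """Map each gene to the row positions (0-based) of its probes."""
    groups = defaultdict(list)
    for row, genes in enumerate(annotation.values()):
        # A probe can belong to several genes; skip empty gene names.
        for gene in set(filter(None, genes.split(";"))):
            groups[gene].append(row)
    return {g: rows for g, rows in groups.items() if len(rows) >= min_probes}


gene_index = probes_by_gene(probe_annotation)
for gene in sorted(gene_index):
    print(gene, gene_index[gene])
```

Each gene then plays the role of a "set" whose members are rows of the probe-level matrix. Note that the positions here are 0-based, whereas R indexes rows from 1.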

It looks as if the GSEA test is over-estimating the number of differentially methylated genes as a result of the high degree of correlation of probes belonging to a gene, and also the number of probes per gene strongly biases the results.

On the other hand the fry test results seem more in line with the expected "true" results. My question is, is it statistically correct to use fry() in this way? Secondly, would it be possible to perform GSEA, CAMERA or another gene set test downstream of fry in this way?

Any help is much appreciated.

Tags: limma, infinium, fry
@gordon-smyth
WEHI, Melbourne, Australia

> It looks as if the GSEA test is over-estimating the number of differentially methylated genes as a result of the high degree of correlation of probes belonging to a gene, and also the number of probes per gene strongly biases the results.

I am guessing that you mean preranked-GSEA rather than true GSEA. Preranked-GSEA has never been published and is not recommended even by the GSEA authors themselves. As you suspect, it completely ignores inter-probe correlations and gives wildly inflated significance levels. One of my postdocs did a little simulation recently to confirm that.
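The inflation from ignoring correlation is easy to reproduce in a toy setting. The sketch below is not the simulation mentioned above and is not GSEA itself: it assumes null probe statistics that are standard normal with one shared pairwise correlation `rho`, and applies a naive mean-based test that treats probes as independent. All names and parameter values are illustrative.

```python
import math
import random


def naive_false_positive_rate(n_probes, rho, n_sims=4000, seed=1):
    """Share of null genes called significant by a test that assumes
    independent probes.

    Each probe statistic is N(0, 1) under the null; probes in one gene
    share a common component so that every pair has correlation `rho`.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        shared = rng.gauss(0.0, 1.0)  # component common to the whole gene
        stats = [
            math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
            for _ in range(n_probes)
        ]
        # If probes were independent, sum / sqrt(n) would be N(0, 1).
        z = sum(stats) / math.sqrt(n_probes)
        if abs(z) > 1.96:  # nominal two-sided 5% threshold
            hits += 1
    return hits / n_sims


fpr_independent = naive_false_positive_rate(n_probes=20, rho=0.0)
fpr_few_probes = naive_false_positive_rate(n_probes=5, rho=0.5)
fpr_many_probes = naive_false_positive_rate(n_probes=20, rho=0.5)

print(f"rho=0.0, 20 probes: {fpr_independent:.3f}")
print(f"rho=0.5,  5 probes: {fpr_few_probes:.3f}")
print(f"rho=0.5, 20 probes: {fpr_many_probes:.3f}")
```

With independent probes the rate should sit near the nominal 5%. With correlated probes the variance of `z` is 1 + (n_probes - 1) * rho rather than 1, so the false positive rate rises well above 5% and grows with the number of probes per gene, which matches the probe-count bias described in the question.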

> is it statistically correct to use fry() in this way?

Yes, that's a valid use of fry().

> Secondly, would it be possible to perform GSEA, CAMERA or another gene set test downstream of fry in this way?

I suppose you could use preranked CAMERA (cameraPR() in limma) with a pre-set inter-gene correlation. Preranked-GSEA would suffer the same problems at the gene level that it does at the probe level.
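To illustrate what a pre-set inter-gene correlation does, here is a simplified sketch of a set-versus-rest test on gene-level statistics in which the variance of the set mean is scaled by the variance inflation factor 1 + (m - 1) * rho for a set of m genes. This is an assumption-laden illustration, not limma's implementation: the function names, the simulated statistics, the use of the overall sample variance, and the correlation value 0.05 are all chosen here for the example.

```python
import math
import random
import statistics


def variance_inflation_factor(set_size, inter_gene_cor):
    """Variance inflation of a set mean when all pairs share one correlation."""
    return 1.0 + (set_size - 1) * inter_gene_cor


def set_vs_rest_z(gene_stats, in_set, inter_gene_cor):
    """Compare the mean statistic inside a set with the mean outside it."""
    inside = [s for s, flag in zip(gene_stats, in_set) if flag]
    outside = [s for s, flag in zip(gene_stats, in_set) if not flag]
    m, m2 = len(inside), len(outside)
    delta = statistics.mean(inside) - statistics.mean(outside)
    # Simplification: the overall sample variance stands in for a pooled one.
    var_stat = statistics.variance(gene_stats)
    vif = variance_inflation_factor(m, inter_gene_cor)
    # Only the in-set mean has its variance inflated by the correlation.
    return delta / math.sqrt(var_stat * (vif / m + 1.0 / m2))


# Simulated gene-level statistics: 200 genes, the first 10 form the set
# and are shifted upwards by 1.
rng = random.Random(7)
in_set = [i < 10 for i in range(200)]
gene_stats = [rng.gauss(1.0 if flag else 0.0, 1.0) for flag in in_set]

z_naive = set_vs_rest_z(gene_stats, in_set, inter_gene_cor=0.0)
z_adjusted = set_vs_rest_z(gene_stats, in_set, inter_gene_cor=0.05)
print(f"assuming independence: z = {z_naive:.2f}")
print(f"with correlation 0.05: z = {z_adjusted:.2f}")
```

Setting the correlation to zero recovers a test that assumes independent genes. Any positive value shrinks the test statistic, and the shrinkage is stronger for larger sets, which is the opposite of the size bias seen with preranked-GSEA.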


Thanks for this Gordon, it is very helpful and greatly appreciated!
