What are logFC and "B" in methylation analysis?
AST ▴ 60

Hi all,

Being new to 450k data analysis, there are certain output terms that I don't understand.

Can someone please explain what we can infer from logFC, AveExpr and B in the 450k methylation output from limma, and how these values are calculated from Beta/M-values?

Do these values have any significance while inferring the 450K result?

I understand that these terms are important for expression data, but they don't make sense to me for 450k data.

Also, how does limma decide which group to use as the reference when fitting contrasts between groups?

 

limma champ logfc • 6.1k views
Aaron Lun ★ 28k

I haven't worked with methylation data, but from what I understand, you should be applying limma on M-values. These are more accurately modelled under normality, at least in regard to the range of values, the mean-variance relationship, etc. Some people seem to do all the linear modelling with M-values, and then report back the fold-changes, etc. for significant probes in terms of beta-values, which are easier to interpret.
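To make the Beta/M-value relationship concrete, here is a minimal plain-Python sketch (not limma or minfi code). It assumes the usual logit-style definition M = log2(beta / (1 - beta)) with no offset term; the function names `beta_to_m` and `m_to_beta` are made up for illustration.

```python
import math

def beta_to_m(beta):
    """Convert a beta-value (0 < beta < 1) to an M-value: log2(beta / (1 - beta))."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return math.log2(beta / (1.0 - beta))

def m_to_beta(m):
    """Invert the transform: beta = 2^M / (2^M + 1)."""
    return 2.0 ** m / (2.0 ** m + 1.0)

# Round-trip a few illustrative beta-values.
for beta in (0.1, 0.5, 0.8):
    m = beta_to_m(beta)
    print(f"beta={beta:.2f}  M={m:+.3f}  back to beta={m_to_beta(m):.2f}")
```

A beta-value of 0.5 maps to M = 0, and the M scale stretches out values near 0 and 1. That is part of why it tends to suit a linear model better.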

Anyway, the logFC field should represent the difference in the average M-value between conditions, which - I think - is interpretable as a change in the log-odds of methylation. For example, since M-values are on a log2 scale, a logFC of 1 would indicate that the odds of being methylated versus nonmethylated are twice as high in one condition as in the other. Or, in the simplest terms: larger absolute logFC = stronger differential methylation. The AveExpr field would be the average M-value across all samples, which gives you a measure of the overall amount of methylation for each probe. The B-statistic is the log-odds that the probe is differentially methylated rather than not (note, not the log-odds of methylation to nonmethylation, which is the M-value itself). I tend not to use the B-statistic much for DE analyses as I find it a bit unintuitive, but to each his own.
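As a rough illustration of those three columns, here is a plain-Python sketch for a single probe in a simple two-group comparison. The M-values are made up, and this is only the arithmetic behind logFC and AveExpr, not what limma runs internally (limma fits a linear model and moderates the variances). The helper `b_to_probability` is a hypothetical name; it assumes B is a natural-log odds, as described in the limma User's Guide.

```python
import math
from statistics import mean

# Made-up M-values for one probe (illustrative only, not real data).
m_control = [-1.2, -0.9, -1.0, -1.1]
m_tumour = [0.1, -0.2, 0.0, -0.1]

# logFC: difference in mean M-value between the two groups.
log_fc = mean(m_tumour) - mean(m_control)

# AveExpr: mean M-value over all samples, ignoring group labels.
ave_expr = mean(m_control + m_tumour)

# M-values are log2 odds, so 2**logFC is the ratio of the odds between groups.
odds_ratio = 2.0 ** log_fc

def b_to_probability(b):
    """Turn a B-statistic (natural-log odds) into a probability."""
    return math.exp(b) / (1.0 + math.exp(b))

print(f"logFC={log_fc:.2f}  AveExpr={ave_expr:.2f}  odds ratio={odds_ratio:.2f}")
print(f"B=1.5 corresponds to a probability of about {b_to_probability(1.5):.2f}")
```

A B of 0 corresponds to a 50-50 chance that the probe is differentially methylated. The value also depends on the prior proportion of changing probes assumed when the statistic is computed, which is one reason it can be awkward to interpret.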

Finally, the chosen reference depends on the parametrization of the design matrix. If you have a one-way layout and you construct a design with an intercept via model.matrix, the first level of the grouping factor will be the reference; by default, that is the alphabetically first group.
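To show what "reference" means in a one-way layout with an intercept, here is a small plain-Python sketch. It only mimics the default behaviour by sorting the group labels (R's actual ordering of factor levels can depend on locale), and the labels and M-values are hypothetical.

```python
from statistics import mean

# Hypothetical group labels and M-values for one probe (illustrative only).
groups = ["tumour", "control", "tumour", "control"]
m_values = [0.4, -1.0, 0.6, -1.2]

# Mimic default factor levels: unique labels in sorted order.
levels = sorted(set(groups))
reference = levels[0]

def group_mean(level):
    """Mean M-value of the samples belonging to one group."""
    return mean(m for g, m in zip(groups, m_values) if g == level)

# With an intercept, the least-squares fit of a one-way layout gives:
#   intercept       = mean of the reference group
#   each other coef = that group's mean minus the reference mean
intercept = group_mean(reference)
coefficients = {lvl: group_mean(lvl) - intercept for lvl in levels[1:]}

print("reference:", reference)
print("intercept:", intercept)
print("coefficients:", coefficients)
```

Here "control" sorts first, so the "tumour" coefficient is tumour minus control. If you want a different reference, you would reorder the factor levels before building the design, or use a design without an intercept and state the contrast explicitly.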


To add further to Aaron's comments, with methylation data the individual CpG sites may not be as informative as the local methylation status of all CpGs in a region. Using something like bumphunter to detect regions that appear to be consistently differentially methylated, and then fitting models based on a regional measure of methylation may be a more appropriate way to proceed. See the minfi vignette for more information.
