Question

Detecting significant trends in variability

Gustavo Fernández Bayón ▴ 440

@gustavo-fernandez-bayon-5300

Last seen 8.3 years ago

Spain

Hi everybody,

I have a question about the analysis of methylation microarray data. I first asked it on Biostar (http://www.biostars.org/p/64405/#64521), and somebody there suggested that I post it here as well. The question is:

"[..] I am currently working on a DNA methylation microarray analysis project. I have 20 samples measured on an Illumina 450k array. After some initial preprocessing and non-specific filtering, I reduced the data to 47k probes. Using minfi, I fit a linear regression model to each probe, with sample age as the only continuous predictor and the methylation level as the response (as M-values, i.e. logit transformations of the beta values). The p-values are then adjusted using FDR, and I keep the significant probes as the final subset of differentially methylated probes.

Now we want to divide these probes into several groups according to their variability trend. That is, for a given probe, we want to detect whether the methylation values converge or diverge with age.

At first I thought about using the White test, or an equivalent heteroskedasticity test, to check whether the squared residuals change with age in this way. But then I realized that unusual behavior of the squared residuals could be due to several other factors, such as outliers or influential points. Am I right to distrust this approach, or could the White test work in this context?

A colleague told me that another option would be to use mixed models with a variance function. That way I could model not only the change in methylation level but also the change in variability. If I go this way, I would have to define some age groups and partition the samples among them, wouldn't I? Is this a better approach in this case than basic linear regression? [..]"

ADDITIONAL NOTES: I really like the mixed model approach, and I have experimented a little with the nlme package and the varFunc family of classes to study the heteroskedasticity, but I still think I am missing something. I have also been reading excerpts from "Mixed-Effects Models in S and S-PLUS" by Pinheiro and Bates. I think I understand the examples, but I find it hard to adapt them to the methylation scenario.

For example, say I have the methylation values for one probe. The obvious simple linear model is "meth ~ age". So far, so good. But if I want to turn it into a mixed model, which covariate can be declared as a random effect? I have tried using age as a random factor, but I am not sure that is a good model.

In the end, what I want is to call lme() with a varFunc and see whether it can fit a model for the variability trend. If this cannot be modeled as a mixed model, is there any tool that fits a linear model with a variance function, just as lme() does?

Any help will be much appreciated.

Regards,
Gus
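To make the last question concrete, here is a minimal sketch of one possible approach, not a definitive recipe. The nlme package also provides gls(), which fits a linear model without any random effects and accepts the same varFunc objects through its weights argument. The data below are simulated for a single hypothetical probe; the seed, the data frame `d`, and the names `fit_const`, `fit_var`, and `delta` are illustrative assumptions and not part of the original analysis. The choice of varExp() is also just one option from the varFunc family.

```r
library(nlme)

# Simulated data for ONE hypothetical probe (assumption: 20 samples,
# M-values whose spread grows with age). Replace with real values.
set.seed(1)
n    <- 20
age  <- seq(20, 80, length.out = n)
meth <- 0.5 + 0.02 * age + rnorm(n, sd = 0.1 * exp(0.02 * age))
d    <- data.frame(age = age, meth = meth)

# Model 1: ordinary linear model with constant residual variance.
fit_const <- gls(meth ~ age, data = d)

# Model 2: same mean model, but the residual variance is allowed to
# change with age: Var(e_i) = sigma^2 * exp(2 * delta * age_i).
fit_var <- gls(meth ~ age, data = d, weights = varExp(form = ~ age))

# Likelihood ratio test: does the variance function improve the fit?
# (Both models share the same fixed effects, so comparing REML fits is valid.)
print(anova(fit_const, fit_var))

# Estimated variance parameter delta on its natural scale.
delta <- coef(fit_var$modelStruct$varStruct, unconstrained = FALSE)
print(delta)
```

Under this parameterization, a positive delta would suggest that variability grows with age (divergent), and a negative delta that it shrinks (convergent). With only 20 samples the test has limited power, and the per-probe p-values would need the same kind of multiple-testing adjustment as the mean-trend ones.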

Microarray Preprocessing probe minfi • 1.2k views

Posted 11.2 years ago by Gustavo Fernández Bayón